Improved Visual Story Generation with Adaptive Context Modeling

05/26/2023
by Zhangyin Feng, et al.

Diffusion models built on top of powerful text-to-image generators such as Stable Diffusion have achieved remarkable success in visual story generation. However, the best-performing approach treats previously generated results as flattened memory cells, ignoring the fact that not all preceding images contribute equally to generating the characters and scenes at the current stage. To address this, we present a simple method that improves the leading system with adaptive context modeling, which is not only incorporated in the encoder but also adopted as additional guidance in the sampling stage to boost the global consistency of the generated story. We evaluate our model on the PororoSV and FlintstonesSV datasets and show that our approach achieves state-of-the-art FID scores in both the story visualization and story continuation scenarios. We also conduct a detailed model analysis and show that our model excels at generating semantically consistent images for stories.
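The core intuition, that preceding frames should not all contribute equally, can be illustrated with a small sketch. This is not the paper's implementation: the function names, the use of cosine similarity, the softmax weighting, and the `temperature` parameter are all assumptions chosen for illustration, and the toy vectors stand in for whatever frame or caption embeddings a real system would use.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def adaptive_context(current, history, temperature=1.0):
    """Weight each history vector by its similarity to the current step.

    Hypothetical helper: returns (weights, context), where weights is a
    softmax over cosine similarities and context is the weighted mean of
    the history vectors. A flattened memory would instead use equal weights.
    """
    if not history:
        return [], [0.0] * len(current)
    scores = [cosine(current, h) / temperature for h in history]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    context = [
        sum(w * h[i] for w, h in zip(weights, history))
        for i in range(len(current))
    ]
    return weights, context


# Toy example: the first history frame matches the current step, the second does not.
current = [1.0, 0.0]
history = [[1.0, 0.0], [0.0, 1.0]]
weights, context = adaptive_context(current, history)
print(weights, context)
```

In this toy case the history frame that resembles the current step receives the larger weight, so the pooled context leans toward it; lowering `temperature` sharpens that preference.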

research
12/06/2018

StoryGAN: A Sequential Conditional GAN for Story Visualization

In this work we propose a new task called Story Visualization. Given a m...
research
11/23/2022

Make-A-Story: Visual Memory Conditioned Consistent Story Generation

There has been a recent explosion of impressive generative models that c...
research
10/17/2020

Consistency and Coherency Enhanced Story Generation

Story generation is a challenging task, which demands to maintain consis...
research
10/16/2022

Character-Centric Story Visualization via Visual Planning and Token Alignment

Story visualization advances the traditional text-to-image generation by...
research
09/18/2023

Causal-Story: Local Causal Attention Utilizing Parameter-Efficient Tuning For Visual Story Synthesis

The excellent text-to-image synthesis capability of diffusion models has...
research
01/07/2023

Visual Story Generation Based on Emotion and Keywords

Automated visual story generation aims to produce stories with correspon...
research
11/20/2022

Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

Conditioned diffusion models have demonstrated state-of-the-art text-to-...
