Imagen Video: High Definition Video Generation with Diffusion Models

by Jonathan Ho, et al.

We present Imagen Video, a text-conditional video generation system based on a cascade of video diffusion models. Given a text prompt, Imagen Video generates high definition videos using a base video generation model and a sequence of interleaved spatial and temporal video super-resolution models. We describe how we scale up the system as a high definition text-to-video model including design decisions such as the choice of fully-convolutional temporal and spatial super-resolution models at certain resolutions, and the choice of the v-parameterization of diffusion models. In addition, we confirm and transfer findings from previous work on diffusion-based image generation to the video generation setting. Finally, we apply progressive distillation to our video models with classifier-free guidance for fast, high quality sampling. We find Imagen Video not only capable of generating videos of high fidelity, but also having a high degree of controllability and world knowledge, including the ability to generate diverse videos and text animations in various artistic styles and with 3D object understanding. See for samples.
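Two of the ingredients named in the abstract, the v-parameterization and classifier-free guidance, can be illustrated with a small sketch. It is not the authors' implementation. It assumes the standard variance-preserving setup in which a noisy sample is z = alpha * x + sigma * eps with alpha^2 + sigma^2 = 1, the usual definition v = alpha * eps - sigma * x, and the common guidance rule (1 + w) * cond - w * uncond. The function names, the guidance weight `w`, and the toy vectors are hypothetical and chosen only for illustration.

```python
import math

def v_target(x, eps, alpha, sigma):
    """v-parameterization target: v = alpha * eps - sigma * x (elementwise)."""
    return [alpha * e - sigma * xi for xi, e in zip(x, eps)]

def x_from_v(z, v, alpha, sigma):
    """Recover the clean sample from noisy z and v: x = alpha * z - sigma * v."""
    return [alpha * zi - sigma * vi for zi, vi in zip(z, v)]

def guided(cond, uncond, w):
    """Classifier-free guidance: (1 + w) * cond - w * uncond (elementwise)."""
    return [(1.0 + w) * c - w * u for c, u in zip(cond, uncond)]

# Assumed variance-preserving schedule point: alpha^2 + sigma^2 = 1.
phi = 0.3
alpha, sigma = math.cos(phi), math.sin(phi)

x = [0.5, -1.0, 2.0]     # toy "clean" data (hypothetical values)
eps = [0.1, 0.2, -0.3]   # toy noise (hypothetical values)

# Forward noising, then a round trip through the v-parameterization.
z = [alpha * xi + sigma * e for xi, e in zip(x, eps)]
v = v_target(x, eps, alpha, sigma)
x_rec = x_from_v(z, v, alpha, sigma)
```

With a perfect v prediction, `x_rec` matches `x` up to floating-point error, and `guided(cond, uncond, 0.0)` returns the conditional prediction unchanged; larger `w` pushes the output further from the unconditional prediction.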


VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation

In this paper, we present VideoGen, a text-to-video generation approach,...

Video Diffusion Models

Generating temporally coherent high fidelity video is an important miles...

SimDA: Simple Diffusion Adapter for Efficient Video Generation

The recent wave of AI-generated content has witnessed the great developm...

HoloFusion: Towards Photo-realistic 3D Generative Modeling

Diffusion-based image generators can now produce high-quality and divers...

Probabilistic Adaptation of Text-to-Video Models

Large text-to-video models trained on internet-scale data have demonstra...

GD-VDM: Generated Depth for better Diffusion-based Video Generation

The field of generative models has recently witnessed significant progre...

Scaling Autoregressive Video Models

Due to the statistical complexity of video, the high degree of inherent ...
