Dreamix: Video Diffusion Models are General Video Editors

02/02/2023
by Eyal Molad, et al.

Text-driven image and video diffusion models have recently achieved unprecedented generation realism. While diffusion models have been successfully applied to image editing, very few works have done so for video editing. We present the first diffusion-based method that can perform text-based motion and appearance editing of general videos. Our approach uses a video diffusion model to combine, at inference time, the low-resolution spatio-temporal information from the original video with new, high-resolution information that it synthesizes to align with the guiding text prompt. Since high fidelity to the original video requires retaining some of its high-resolution information, we add a preliminary stage of finetuning the model on the original video, which significantly boosts fidelity. To improve motion editability, we propose a new mixed objective that jointly finetunes with full temporal attention and with temporal attention masking. We further introduce a new framework for image animation: we first transform the image into a coarse video by simple image processing operations such as replication and perspective geometric projections, and then use our general video editor to animate it. As a further application, our method can be used for subject-driven video generation. Extensive qualitative and numerical experiments showcase the remarkable editing ability of our method and establish its superior performance over baseline methods.
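The abstract describes two ideas that can be illustrated with a short sketch: degrading the input video so only low-resolution spatio-temporal information survives before text-guided denoising, and turning a single image into a coarse video by replication and perspective warps before editing it. The Python code below is a minimal, hedged illustration of those preprocessing steps, not the authors' implementation; the function `denoise_with_text` is a hypothetical placeholder for the finetuned text-conditioned video diffusion model, which is not publicly available, and all parameter values are illustrative assumptions.

```python
import numpy as np
import cv2


def corrupt_video(frames, scale=0.25, noise_std=0.3):
    """Keep only coarse information: downsample each frame, upsample back,
    and add Gaussian noise. A diffusion model would later replace the lost
    detail with content aligned to the text prompt."""
    corrupted = []
    for f in frames:
        h, w = f.shape[:2]
        low = cv2.resize(f, (int(w * scale), int(h * scale)),
                         interpolation=cv2.INTER_AREA)
        up = cv2.resize(low, (w, h), interpolation=cv2.INTER_LINEAR)
        noisy = up.astype(np.float32) / 255.0 + np.random.normal(0, noise_std, up.shape)
        corrupted.append(np.clip(noisy, 0.0, 1.0))
    return np.stack(corrupted)


def image_to_coarse_video(image, num_frames=16, max_shift=0.05):
    """Turn a single image into a crude video: replicate the frame and apply
    a slowly increasing perspective warp to fake camera motion."""
    h, w = image.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    frames = []
    for t in range(num_frames):
        d = max_shift * t / max(num_frames - 1, 1)
        dst = np.float32([[w * d, h * d], [w * (1 - d), h * d],
                          [w, h], [0, h]])
        M = cv2.getPerspectiveTransform(src, dst)
        frames.append(cv2.warpPerspective(image, M, (w, h)))
    return np.stack(frames)


def denoise_with_text(video, prompt, noise_level):
    """Hypothetical stand-in for the finetuned video diffusion model; a real
    implementation would run the reverse diffusion process from `noise_level`
    under text guidance. Here it simply returns its input."""
    return video


def edit_video(frames, prompt):
    """Inference-time editing: corrupt the video, then denoise it with text guidance."""
    corrupted = corrupt_video(frames)
    return denoise_with_text(corrupted, prompt, noise_level=0.7)


def animate_image(image, prompt):
    """Image animation: build a coarse video, then edit it like any other video."""
    coarse = image_to_coarse_video(image)
    return edit_video(list(coarse), prompt)
```

The sketch only covers the inference path; in the method described above, the finetuning stage on the original video (with the mixed objective over full and masked temporal attention) would take place before this procedure is applied.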

Related research

StableVideo: Text-driven Consistency-aware Diffusion Video Editing (08/18/2023)
Diffusion-based methods can generate realistic images and videos, but th...

VidEdit: Zero-Shot and Spatially Aware Text-Driven Video Editing (06/14/2023)
Recently, diffusion-based generative models have achieved remarkable suc...

MagicEdit: High-Fidelity and Temporally Coherent Video Editing (08/28/2023)
In this report, we present MagicEdit, a surprisingly simple yet effectiv...

Make-A-Protagonist: Generic Video Editing with An Ensemble of Experts (05/15/2023)
The text-driven image and video diffusion models have achieved unprecede...

Edit-A-Video: Single Video Editing with Object-Aware Consistency (03/14/2023)
Despite the fact that text-to-video (TTV) model has recently achieved re...

Towards Consistent Video Editing with Text-to-Image Diffusion Models (05/27/2023)
Existing works have advanced Text-to-Image (TTI) diffusion models for vi...

ControlVideo: Adding Conditional Control for One Shot Text-to-Video Editing (05/26/2023)
In this paper, we present ControlVideo, a novel method for text-driven v...
