Style-A-Video: Agile Diffusion for Arbitrary Text-based Video Style Transfer

05/09/2023
by Nisha Huang, et al.

Large-scale text-to-video diffusion models have demonstrated an exceptional ability to synthesize diverse videos. However, because extensive text-to-video datasets and the computational resources needed for training are lacking, applying these models directly to video stylization remains difficult. In addition, because the noise addition process applied to the input content is random and destructive, meeting the content preservation requirement of style transfer is challenging. This paper proposes a zero-shot video stylization method named Style-A-Video, which combines a generative pre-trained transformer with an image latent diffusion model to achieve concise text-controlled video stylization. We improve the guidance condition in the denoising process, establishing a balance between artistic expression and structure preservation. Furthermore, to reduce inter-frame flicker and avoid introducing additional artifacts, we employ a sampling optimization and a temporal consistency module. Extensive experiments show that we attain superior content preservation and stylistic performance while consuming fewer resources than previous solutions. Code will be available at https://github.com/haha-lisa/Style-A-Video.
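The abstract does not spell out how the guidance condition is modified, so the following is only a generic illustration of guidance in a denoising step, not the method of Style-A-Video. It sketches standard classifier-free guidance, in which a scale factor trades off the unconditional noise prediction against the text-conditioned one. The function name `guided_noise` and the toy three-element vectors are assumptions made for this example; a real model would produce these predictions as tensors.

```python
def guided_noise(eps_uncond, eps_cond, scale):
    """Standard classifier-free guidance (illustrative, not the paper's rule).

    eps_uncond: noise predicted without the text prompt
    eps_cond:   noise predicted with the text prompt
    scale:      guidance scale; 0 ignores the prompt, 1 uses the conditional
                prediction as-is, larger values push further toward the prompt
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy noise predictions standing in for model outputs (hypothetical values).
eps_uncond = [0.0, 1.0, -1.0]
eps_cond = [1.0, 1.0, 0.0]

weak = guided_noise(eps_uncond, eps_cond, 1.0)    # equals eps_cond
strong = guided_noise(eps_uncond, eps_cond, 7.5)  # amplifies the text direction
```

In this generic form, a larger scale follows the text prompt more closely at the cost of drifting from the unconditional prediction, which is one way to picture the balance between artistic expression and structure preservation that the abstract describes.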

research
03/15/2023

Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer

Diffusion models have shown great promise in text-guided image style tra...
research
04/29/2019

Style Transfer by Relaxed Optimal Transport and Self-Similarity

Style transfer algorithms strive to render the content of one image usin...
research
05/14/2019

Style Transformer: Unpaired Text Style Transfer without Disentangled Latent Representation

Disentangling the content and style in the latent space is prevalent in ...
research
08/04/2023

Painterly Image Harmonization using Diffusion Model

Painterly image harmonization aims to insert photographic objects into p...
research
09/14/2023

Masked Diffusion with Task-awareness for Procedure Planning in Instructional Videos

A key challenge with procedure planning in instructional videos lies in ...
research
04/12/2023

Improving Diffusion Models for Scene Text Editing with Dual Encoders

Scene text editing is a challenging task that involves modifying or inse...
research
10/28/2022

MagicMix: Semantic Mixing with Diffusion Models

Have you ever imagined what a corgi-alike coffee machine or a tiger-alik...
