Video Colorization with Pre-trained Text-to-Image Diffusion Models

06/02/2023
by   Hanyuan Liu, et al.

Video colorization is a challenging task that involves inferring plausible and temporally consistent colors for grayscale frames. In this paper, we present ColorDiffuser, an adaptation of a pre-trained text-to-image latent diffusion model for video colorization. With the proposed adapter-based approach, we repurpose the pre-trained text-to-image model to accept input grayscale video frames, along with an optional text description, for video colorization. To enhance temporal coherence and maintain the vividness of colorization across frames, we propose two novel techniques: Color Propagation Attention and an Alternated Sampling Strategy. Color Propagation Attention enables the model to refine its colorization decisions based on a reference latent frame, while the Alternated Sampling Strategy captures spatiotemporal dependencies by alternately using the next and previous adjacent latent frames as the reference during the generative diffusion sampling steps. This encourages bidirectional color information propagation between adjacent video frames, leading to improved color consistency across frames. We conduct extensive experiments on benchmark datasets, and the results demonstrate the effectiveness of our proposed framework. The evaluations show that ColorDiffuser achieves state-of-the-art performance in video colorization, surpassing existing methods in terms of color fidelity, temporal consistency, and visual quality.
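The alternated-reference idea from the abstract can be sketched as a small helper that picks which adjacent latent frame serves as the reference at each diffusion step. This is a minimal illustration of the scheduling logic only, not the authors' implementation; the function name, the even/odd step convention, and the boundary clamping are assumptions.

```python
def pick_reference(frame_idx: int, num_frames: int, step: int) -> int:
    """Sketch of an alternated sampling schedule: on even diffusion steps,
    reference the previous adjacent latent frame; on odd steps, the next one.
    Alternating directions across steps lets color information propagate
    bidirectionally through the video. Indices are clamped at the sequence
    boundaries (a simplifying assumption, not necessarily the paper's choice).
    """
    if step % 2 == 0:
        ref = frame_idx - 1  # propagate color information forward
    else:
        ref = frame_idx + 1  # propagate color information backward
    # Clamp so the first/last frames fall back to themselves at a boundary.
    return min(max(ref, 0), num_frames - 1)
```

For example, frame 3 in a 10-frame clip would reference frame 2 on even steps and frame 4 on odd steps, so over many sampling steps colors diffuse in both temporal directions.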


Related research

- 04/21/2023 · Improved Diffusion-based Image Colorization via Piggybacked Models
- 09/05/2023 · Hierarchical Masked 3D Diffusion Model for Video Outpainting
- 08/09/2018 · Deep Video Color Propagation
- 04/26/2021 · VCGAN: Video Colorization with Hybrid Generative Adversarial Network
- 08/24/2023 · APLA: Additional Perturbation for Latent Noise with Adversarial Training Enables Consistency
- 06/01/2023 · Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
- 08/03/2023 · DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models
