Towards Smooth Video Composition

12/14/2022
by Qihang Zhang, et al.

Video generation requires synthesizing consistent and persistent frames with dynamic content over time. This work investigates modeling the temporal relations for composing videos of arbitrary length, from a few frames to even infinitely many, using generative adversarial networks (GANs). First, towards composing adjacent frames, we show that the alias-free operation for single-image generation, together with adequately pre-learned knowledge, brings a smooth frame transition without compromising per-frame quality. Second, by incorporating the temporal shift module (TSM), originally designed for video understanding, into the discriminator, we advance the generator in synthesizing more consistent dynamics. Third, we develop a novel B-spline-based motion representation that ensures temporal smoothness and achieves infinite-length video generation, going beyond the number of frames used in training. We further propose a low-rank temporal modulation to alleviate repeated content in long video generation. We evaluate our approach on various datasets and show substantial improvements over video generation baselines. Code and models will be publicly available at https://genforce.github.io/StyleSV.
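
The abstract's second point can be made concrete: a temporal shift module mixes information across neighboring frames by displacing a fraction of feature channels along the time axis, which lets an otherwise image-level discriminator perceive dynamics. Below is a minimal PyTorch sketch of that operation; the tensor layout and the `shift_div` fraction are our illustrative assumptions, not details taken from the paper.

```python
import torch

def temporal_shift(x: torch.Tensor, shift_div: int = 8) -> torch.Tensor:
    """Shift a fraction of channels along the time axis (TSM, Lin et al. 2019).

    x: activations of shape (batch, time, channels, height, width).
    1/shift_div of the channels move one step forward in time,
    another 1/shift_div move one step backward; the rest stay put.
    """
    fold = x.shape[2] // shift_div
    out = torch.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                  # channels carrying past frames
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]  # channels carrying future frames
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]             # unshifted channels
    return out
```

In practice such a shift would sit before the convolutions inside discriminator blocks: reshape (batch * time, C, H, W) activations to (batch, time, C, H, W), shift, reshape back, and convolve, adding temporal awareness at essentially zero extra computation.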
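
For the third point, a B-spline motion representation keeps the motion trajectory smooth at any continuous timestamp, so frames can be sampled past the clip length seen in training. The sketch below evaluates a uniform cubic B-spline over motion latents; the function name, shapes, and uniform knot spacing are assumptions for illustration rather than the authors' exact formulation.

```python
import torch

def eval_motion_spline(ctrl: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Evaluate a uniform cubic B-spline through motion control points.

    ctrl: (num_ctrl, dim) control points, e.g. sampled motion latents.
    t:    (num_frames,) continuous timestamps in [0, num_ctrl - 3).
    Returns (num_frames, dim) temporally smooth motion codes.
    """
    i = t.floor().long().clamp(max=ctrl.shape[0] - 4)  # spline segment index
    u = (t - i.float()).unsqueeze(-1)                  # local coordinate in [0, 1)
    # Uniform cubic B-spline basis (sums to 1, C2-continuous across segments).
    b0 = (1 - u) ** 3 / 6
    b1 = (3 * u**3 - 6 * u**2 + 4) / 6
    b2 = (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6
    b3 = u**3 / 6
    return b0 * ctrl[i] + b1 * ctrl[i + 1] + b2 * ctrl[i + 2] + b3 * ctrl[i + 3]
```

Because additional control points can be appended indefinitely while the spline remains smooth across segment boundaries, such a representation naturally extends to videos longer than those used for training.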

