Jointly Trained Image and Video Generation using Residual Vectors

12/17/2019
by   Yatin Dandi, et al.

In this work, we propose a modeling technique for jointly training image and video generation models by simultaneously learning to map latent variables with a fixed prior onto real images and to interpolate over images to generate videos. The proposed approach models variations in representations using residual vectors that encode the change at each time step relative to a summary vector for the entire video. We use this technique to jointly train an image generation model with a fixed prior alongside a video generation model free of constraints such as disentanglement. Joint training enables the image generator to exploit temporal information while the video generation model learns to flexibly share information across frames. Moreover, experimental results verify the approach's compatibility with pre-training on videos or images and with training on datasets containing a mixture of both. A comprehensive set of quantitative and qualitative evaluations reveals improvements in sample quality and diversity over both video generation and image generation baselines. We further demonstrate the technique's ability to exploit feature similarity across frames by applying it to a model that decomposes the video into motion and content. The proposed model allows minor variations in content across frames while maintaining temporal dependence through latent vectors encoding pose or motion features.
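The core idea of the residual-vector parameterization can be illustrated with a minimal sketch. The code below is an assumption-laden toy (the names `summary`, `residuals`, `frame_latents`, the latent dimensionality, and the residual scale are all illustrative, not from the paper): a single summary vector drawn from the fixed prior represents the whole video, and small per-frame residual vectors encode the change at each time step, so each frame's latent is the summary plus its residual.

```python
import numpy as np

rng = np.random.default_rng(0)

latent_dim = 8   # dimensionality of the latent space (illustrative)
num_frames = 5   # frames per video (illustrative)

# Summary vector for the entire video, drawn from the fixed prior.
summary = rng.standard_normal(latent_dim)

# Residual vectors encoding the change at each time step relative to
# the summary vector; small-scale perturbations in this toy example.
residuals = 0.1 * rng.standard_normal((num_frames, latent_dim))

# Per-frame latents: summary vector plus per-step residual.  A decoder
# applied to `summary` alone plays the role of the image generator,
# while decoding the sequence `frame_latents` yields the video.
frame_latents = summary + residuals

assert frame_latents.shape == (num_frames, latent_dim)
```

Because every frame latent shares the same summary component, the decoder can reuse content information across frames while the residuals carry only the per-step variation, which is what lets the image and video models be trained jointly with a shared prior.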


Related research:

- MoCoGAN: Decomposing Motion and Content for Video Generation (07/17/2017)
- Dual-MTGAN: Stochastic and Deterministic Motion Transfer for Image-to-Video Synthesis (02/26/2021)
- Autoencoding Video Latents for Adversarial Video Generation (01/18/2022)
- Signs in time: Encoding human motion as a temporal image (08/06/2016)
- Towards Smooth Video Composition (12/14/2022)
- StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2 (12/29/2021)
- Attentive VQ-VAE (09/20/2023)
