InMoDeGAN: Interpretable Motion Decomposition Generative Adversarial Network for Video Generation

01/08/2021
by   Yaohui Wang, et al.

In this work, we introduce an unconditional video generative model, InMoDeGAN, designed to (a) generate high-quality videos and (b) allow for interpretation of the latent space. For the latter, we place emphasis on interpreting and manipulating motion. Towards this, we decompose motion into semantic sub-spaces, which allow for control of generated samples. We design the architecture of the InMoDeGAN generator in accordance with the proposed Linear Motion Decomposition, which carries the assumption that motion can be represented by a dictionary whose vectors form an orthogonal basis in the latent space. Each vector in the basis represents a semantic sub-space. In addition, a Temporal Pyramid Discriminator analyzes videos at different temporal resolutions. Extensive quantitative and qualitative analysis shows that our model systematically and significantly outperforms state-of-the-art methods on the VoxCeleb2-mini and BAIR-robot datasets w.r.t. video quality, addressing (a). Towards (b), we present experimental results confirming that the decomposed sub-spaces are interpretable and, moreover, that generated motion is controllable.
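To make the Linear Motion Decomposition concrete, below is a minimal PyTorch sketch of the core idea as described in the abstract. It is our own illustration, not the authors' released code: all names, shapes, and the random dictionary are assumptions. An orthonormal dictionary D spans the motion space, per-frame coefficients a_t (here random, in the model predicted by the generator) weight its directions, and rescaling one coefficient manipulates a single semantic sub-space.

```python
import torch

# Illustrative sizes (hypothetical): dictionary size, latent dim, frames.
K, dim, T = 8, 512, 16

# Orthonormal dictionary D: QR decomposition of a random matrix yields
# orthonormal columns, so D.T @ D = I and each column d_k is one basis direction.
D = torch.linalg.qr(torch.randn(dim, K)).Q   # (dim, K)

# Per-frame coefficients a_t; in the model these would come from the generator.
A = torch.randn(T, K)                        # (T, K)

# Linear Motion Decomposition: m_t = sum_k a_{t,k} * d_k for every frame t.
motion = A @ D.T                             # (T, dim) motion codes

# Manipulation: scaling one coefficient alters only the motion component
# spanned by that basis vector, leaving the other sub-spaces untouched.
A_edit = A.clone()
A_edit[:, 0] *= 2.0                          # exaggerate the motion tied to d_0
motion_edit = A_edit @ D.T
```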
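The Temporal Pyramid Discriminator can be read in a similarly minimal way: the same clip is evaluated at several temporal resolutions. The sketch below, again with hypothetical names and with discriminator architectures omitted, builds such a pyramid by frame subsampling.

```python
import torch

# A generated clip in the common (batch, channels, frames, height, width) layout.
video = torch.randn(1, 3, 16, 64, 64)

# One temporal resolution per pyramid level: strides 1, 2, 4 keep 16, 8, 4 frames.
pyramid = [video[:, :, ::stride] for stride in (1, 2, 4)]

# Each level would then be scored by its own discriminator, e.g.:
# scores = [d(clip) for d, clip in zip(discriminators, pyramid)]
```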


Related research

03/17/2022 · Latent Image Animator: Learning to Animate Images via Latent Space Navigation
Due to the remarkable progress of deep generative models, animating imag...

04/23/2023 · LaMD: Latent Motion Diffusion for Video Generation
Generating coherent and natural movement is the key challenge in video g...

12/11/2019 · G^3AN: This video does not exist. Disentangling motion and appearance for video generation
Creating realistic human videos introduces the challenge of being able t...

03/30/2023 · LatentForensics: Towards lighter deepfake detection in the StyleGAN latent space
The classification of forged videos has been a challenge for the past fe...

01/18/2022 · Autoencoding Video Latents for Adversarial Video Generation
Given the three dimensional complexity of a video signal, training a rob...

12/01/2022 · VIDM: Video Implicit Diffusion Models
Diffusion models have emerged as a powerful generative method for synthe...

11/05/2022 · Disentangling Content and Motion for Text-Based Neural Video Manipulation
Giving machines the ability to imagine possible new objects or scenes fr...
