Probabilistic Video Generation using Holistic Attribute Control

03/21/2018
by   Jiawei He, et al.
2

Videos express highly structured spatio-temporal patterns of visual data. A video can be thought of as being governed by two factors: (i) temporally invariant (e.g., person identity), or slowly varying (e.g., activity), attribute-induced appearance, encoding the persistent content of each frame, and (ii) an inter-frame motion or scene dynamics (e.g., encoding evolution of the person ex-ecuting the action). Based on this intuition, we propose a generative framework for video generation and future prediction. The proposed framework generates a video (short clip) by decoding samples sequentially drawn from a latent space distribution into full video frames. Variational Autoencoders (VAEs) are used as a means of encoding/decoding frames into/from the latent space and RNN as a wayto model the dynamics in the latent space. We improve the video generation consistency through temporally-conditional sampling and quality by structuring the latent space with attribute controls; ensuring that attributes can be both inferred and conditioned on during learning/generation. As a result, given attributes and/orthe first frame, our model is able to generate diverse but highly consistent sets ofvideo sequences, accounting for the inherent uncertainty in the prediction task. Experimental results on Chair CAD, Weizmann Human Action, and MIT-Flickr datasets, along with detailed comparison to the state-of-the-art, verify effectiveness of the framework.

READ FULL TEXT

page 2

page 13

page 14

page 18

page 19

page 20

page 21

research
09/07/2021

Simple Video Generation using Neural ODEs

Despite having been studied to a great extent, the task of conditional g...
research
11/14/2017

Prediction Under Uncertainty with Error-Encoding Networks

In this work we introduce a new framework for performing temporal predic...
research
03/24/2018

VOS-GAN: Adversarial Learning of Visual-Temporal Dynamics for Unsupervised Dense Prediction in Videos

Recent GAN-based video generation approaches model videos as the combina...
research
04/20/2022

Sound-Guided Semantic Video Generation

The recent success in StyleGAN demonstrates that pre-trained StyleGAN la...
research
06/07/2021

Multi-frame sequence generator of 4D human body motion

We examine the problem of generating temporally and spatially dense 4D h...
research
10/05/2018

Deep Probabilistic Video Compression

We propose a variational inference approach to deep probabilistic video ...
research
02/21/2020

Stochastic Latent Residual Video Prediction

Designing video prediction models that account for the inherent uncertai...

Please sign up or login with your details

Forgot password? Click here to reset