Efficient training for future video generation based on hierarchical disentangled representation of latent variables

06/07/2021
by Naoya Fushishita, et al.

Generating videos that predict the future of a given sequence has been an active area of research in recent years. However, an essential problem remains unsolved: most methods incur high computational cost and memory usage during training. In this paper, we propose a novel method for generating future prediction videos that uses less memory than conventional methods. This is a critical stepping stone towards generating videos with high image quality, comparable to that achieved by recent work in image generation. We achieve high efficiency by training our method in two stages: (1) image reconstruction, which encodes video frames into latent variables, and (2) latent variable prediction, which generates the future sequence. Our method compresses video efficiently into low-dimensional latent variables by decomposing each frame according to its hierarchical structure. That is, we assume that a video can be separated into background and foreground objects, and that each object carries time-varying and time-independent information that can be represented separately. Our experiments show that the proposed method can efficiently generate future prediction videos, even for complex datasets that previous methods cannot handle.
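
To make the memory argument in the abstract concrete, the following is a minimal sketch of how a hierarchical latent layout might be counted against raw pixels. It is not the authors' implementation: the class `LatentLayout`, the helper `pixels_per_video`, every dimension used, and the choice to treat the background as a single time-independent code are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatentLayout:
    """Hypothetical latent sizes; none of these numbers come from the paper."""
    num_objects: int      # foreground objects per video (assumed fixed)
    background_dim: int   # one background code shared by all frames (assumption)
    static_dim: int       # time-independent code per object
    dynamic_dim: int      # time-varying code per object, stored once per frame

    def floats_per_video(self, num_frames: int) -> int:
        # Time-independent parts are stored once for the whole video.
        static = self.background_dim + self.num_objects * self.static_dim
        # Time-varying parts are stored for every frame.
        dynamic = num_frames * self.num_objects * self.dynamic_dim
        return static + dynamic


def pixels_per_video(num_frames: int, height: int, width: int, channels: int = 3) -> int:
    # Number of values needed to store the same video in pixel space.
    return num_frames * height * width * channels


# Illustrative sizes only.
layout = LatentLayout(num_objects=2, background_dim=64, static_dim=32, dynamic_dim=8)
latent_size = layout.floats_per_video(num_frames=20)
pixel_size = pixels_per_video(num_frames=20, height=64, width=64)
print(latent_size, pixel_size)
```

Under these assumed sizes, a second-stage predictor would only need to model the small per-frame time-varying codes rather than full frames, which is the kind of saving the two-stage design is aiming for.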


