Predicting Video with VQVAE

03/02/2021
by Jacob Walker, et al.

In recent years, the task of video prediction (forecasting future video given past video frames) has attracted attention in the research community. In this paper we propose a novel approach to this problem with Vector Quantized Variational AutoEncoders (VQ-VAE). With VQ-VAE we compress high-resolution videos into a hierarchical set of multi-scale discrete latent variables. Compared to pixels, this compressed latent space has dramatically reduced dimensionality, allowing us to apply scalable autoregressive generative models to predict video. In contrast to previous work that has largely emphasized highly constrained datasets, we focus on very diverse, large-scale datasets such as Kinetics-600. To our knowledge, we predict video on unconstrained videos at a higher resolution, 256x256, than any previous method. We further validate our approach against prior work via a crowdsourced human evaluation.
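
The compression step described above rests on vector quantization: each continuous encoder output is replaced by the index of its nearest entry in a learned codebook, so a frame becomes a small grid of integers that an autoregressive model can predict. The following is a minimal NumPy sketch of that lookup only, not the paper's implementation. The function name `quantize`, the codebook size (256 codes of dimension 16), and the 8x8 latent grid are all hypothetical values chosen for illustration, and random arrays stand in for a trained encoder and codebook.

```python
import numpy as np


def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry.

    latents:  (N, D) array of continuous vectors (stand-in for encoder outputs)
    codebook: (K, D) array of embedding vectors (stand-in for a learned codebook)
    Returns (indices, quantized) with shapes (N,) and (N, D).
    """
    # Squared Euclidean distance between every latent and every code: (N, K).
    diffs = latents[:, None, :] - codebook[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    # The discrete latent is the index of the closest code.
    indices = dists.argmin(axis=1)
    return indices, codebook[indices]


# Hypothetical sizes, for illustration only.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(256, 16))    # K=256 codes, D=16 dimensions
latents = rng.normal(size=(8 * 8, 16))   # one 8x8 grid of latent vectors

indices, quantized = quantize(latents, codebook)
print(indices.shape, quantized.shape)
```

Under these assumed sizes, the 64 integer indices are what a downstream autoregressive model would be trained to predict, instead of raw pixel values.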
