Fully Context-Aware Video Prediction

by Wonmin Byeon, et al.

This paper proposes a new neural network design for unsupervised learning through video prediction. Current video prediction models based on convolutional networks, recurrent networks, and their combinations often produce blurry predictions. Recent work has attempted to address this issue with techniques such as separate modeling of background and foreground, motion flow learning, or adversarial training. We highlight that a contributing factor to this problem is the failure of current architectures to fully capture the past information relevant for accurately predicting the future. To address this shortcoming, we introduce a fully context-aware architecture, which captures the entire available past context for each pixel using Parallel Multi-Dimensional LSTM units and aggregates it using context blending blocks. Our model is simple, efficient, and directly applicable to high-resolution video frames. It yields state-of-the-art performance for next-step prediction on three challenging real-world video datasets (Human 3.6M, Caltech Pedestrian, and UCF-101) and produces sharp predictions of high visual quality.





Disentangling Propagation and Generation for Video Prediction

Learning to predict future video frames is a challenging task. Recent ap...

Context-Aware Trajectory Prediction

Human motion and behaviour in crowded spaces is influenced by several fa...

FitVid: Overfitting in Pixel-Level Video Prediction

An agent that is capable of predicting what happens next can perform a v...

High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks

Predicting future video frames is extremely challenging, as there are ma...

Transformation-based Adversarial Video Prediction on Large-Scale Data

Recent breakthroughs in adversarial generative modeling have led to mode...

Wide and Narrow: Video Prediction from Context and Motion

Video prediction, forecasting the future frames from a sequence of input...

Learning Occupancy Priors of Human Motion from Semantic Maps of Urban Environments

Understanding and anticipating human activity is an important capability...