Disentangling Propagation and Generation for Video Prediction

12/02/2018
by Hang Gao, et al.

Learning to predict future video frames is a challenging task. Recent approaches for natural scenes predict pixels directly by inferring appearance flow and applying flow-guided warping. Such models excel when motion estimates are accurate, but the motion may be ambiguous or erroneous in many real scenes. When scene motion exposes new regions of the scene, motion-based prediction yields poor results. However, learning to predict novel pixels directly can also require a prohibitive amount of training. In this work, we present a confidence-aware spatial-temporal context encoder for video prediction called Flow-Grounded Video Prediction (FGVP), in which motion propagation and novel pixel generation are first disentangled and then fused according to a computed flow uncertainty map. For regions where motion-based prediction shows low confidence, our model uses a conditional context encoder to hallucinate appropriate content. We test our methods on the standard CalTech Pedestrian dataset and on the more challenging KITTI Flow dataset, which has larger motions and occlusions. Compared with previous work, our methods produce predictions that are both sharp and natural, achieving state-of-the-art performance on both datasets.
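The abstract does not spell out how the two branches are fused, so the following is only a minimal sketch of one plausible reading: a per-pixel convex blend in which a confidence map (assumed here to lie in [0, 1], with 1 meaning the flow is fully trusted) weights the flow-propagated frame against the generated frame. The function name `fuse_predictions` and all array names are hypothetical and are not taken from the paper.

```python
import numpy as np


def fuse_predictions(propagated, generated, confidence):
    """Blend two candidate frames per pixel (illustrative assumption only).

    propagated: frame obtained by flow-guided warping, shape (H, W, C)
    generated:  frame produced by a generative branch, shape (H, W, C)
    confidence: flow confidence in [0, 1], shape (H, W) or (H, W, 1)
    """
    propagated = np.asarray(propagated, dtype=np.float64)
    generated = np.asarray(generated, dtype=np.float64)
    confidence = np.clip(np.asarray(confidence, dtype=np.float64), 0.0, 1.0)
    if confidence.ndim == propagated.ndim - 1:
        # Add a channel axis so the map broadcasts over colour channels.
        confidence = confidence[..., None]
    # High confidence keeps the warped pixel; low confidence falls back
    # to the generated pixel.
    return confidence * propagated + (1.0 - confidence) * generated


# Toy 2x2 RGB example: warped frame is all ones, generated frame all zeros.
propagated = np.ones((2, 2, 3))
generated = np.zeros((2, 2, 3))
confidence = np.array([[1.0, 0.0],
                       [0.5, 0.25]])
fused = fuse_predictions(propagated, generated, confidence)
```

With these toy inputs each fused pixel simply equals its confidence value, which makes the weighting easy to inspect. A learned model would typically predict the confidence map and both candidate frames jointly; that part is outside the scope of this sketch.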


