Wide and Narrow: Video Prediction from Context and Motion

10/22/2021
by Jaehoon Cho, et al.

Video prediction, the task of forecasting future frames from a sequence of input frames, is challenging because view changes are influenced by several factors, such as the global context surrounding the scene and local motion dynamics. In this paper, we propose a new framework that integrates these complementary attributes to predict complex pixel dynamics with deep networks. We present global context propagation networks that iteratively aggregate non-local neighboring representations to preserve contextual information across past frames. To capture the local motion patterns of objects, we also devise local filter memory networks that generate adaptive filter kernels by storing the prototypical motion of moving objects in a memory. By utilizing the outputs of both networks, the proposed framework can address blurry predictions and color distortion. We conduct experiments on the Caltech Pedestrian and UCF101 datasets and demonstrate state-of-the-art results. For multi-step prediction in particular, we obtain strong performance in both quantitative and qualitative evaluation.
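The two ideas named in the abstract, non-local aggregation of context and adaptive filter kernels read from a memory, can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the function names, the dot-product attention, the residual update, the softmax-normalized kernels, and all shapes are assumptions chosen only to make the concepts concrete.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def non_local_aggregate(feat, steps=2):
    # feat: (N, C) features, one row per spatial position (hypothetical layout).
    # Each step mixes every position with all others, weighted by similarity.
    for _ in range(steps):
        attn = softmax(feat @ feat.T / np.sqrt(feat.shape[1]), axis=1)  # (N, N)
        feat = feat + attn @ feat  # residual update (an assumption)
    return feat

def read_filter_memory(query, mem_keys, mem_kernels):
    # query: (N, C), mem_keys: (M, C), mem_kernels: (M, k*k).
    # Soft-address the memory slots, then blend their stored kernels.
    weights = softmax(query @ mem_keys.T, axis=1)   # (N, M)
    kernels = weights @ mem_kernels                 # (N, k*k)
    return softmax(kernels, axis=1)                 # each kernel sums to 1

def apply_local_filters(frame, kernels, k=3):
    # frame: (H, W), kernels: (H*W, k*k).
    # Every output pixel is a weighted sum of its own k x k neighbourhood.
    h, w = frame.shape
    pad = k // 2
    padded = np.pad(frame, pad, mode="edge")
    patches = np.stack(
        [padded[i:i + h, j:j + w] for i in range(k) for j in range(k)],
        axis=-1,
    )                                               # (H, W, k*k)
    out = (patches.reshape(h * w, k * k) * kernels).sum(axis=1)
    return out.reshape(h, w)

# Toy usage with random stand-ins for learned features and memory.
rng = np.random.default_rng(0)
H, W, C, M, K = 4, 5, 8, 6, 3
features = rng.normal(size=(H * W, C))
context = non_local_aggregate(features)
kernels = read_filter_memory(
    context, rng.normal(size=(M, C)), rng.normal(size=(M, K * K))
)
pred = apply_local_filters(np.ones((H, W)), kernels, K)
```

Because each per-pixel kernel is normalized to sum to one, a constant input frame passes through unchanged. That makes a convenient sanity check for a filtering step like this, though whether the paper normalizes its kernels this way is not stated in the abstract.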


research
04/01/2020

Future Video Synthesis with Object Motion Prediction

We present an approach to predict future video frames given a sequence o...
research
12/02/2018

Disentangling Propagation and Generation for Video Prediction

Learning to predict future video frames is a challenging task. Recent ap...
research
12/26/2018

3D PersonVLAD: Learning Deep Global Representations for Video-based Person Re-identification

In this paper, we introduce a global video representation to video-based...
research
04/20/2021

Learning Semantic-Aware Dynamics for Video Prediction

We propose an architecture and training scheme to predict video frames b...
research
10/23/2017

Fully Context-Aware Video Prediction

This paper proposes a new neural network design for unsupervised learnin...
research
10/27/2019

Non-Local ConvLSTM for Video Compression Artifact Reduction

Video compression artifact reduction aims to recover high-quality videos...
research
11/04/2019

Disentangling Human Dynamics for Pedestrian Locomotion Forecasting with Noisy Supervision

We tackle the problem of Human Locomotion Forecasting, a task for jointl...
