Taylor saves for later: disentanglement for video prediction using Taylor representation

05/24/2021
by Ting Pan, et al.

Video prediction is a challenging task with wide application prospects in meteorology and robot systems. Existing works struggle to balance short-term and long-term prediction performance and to extract robust laws of latent dynamics from video frames. We propose a two-branch seq-to-seq deep model that disentangles the Taylor feature and the residual feature in video frames using a novel recurrent prediction module (TaylorCell) and a residual module. TaylorCell expands the high-dimensional features of video frames into a finite Taylor series to describe the latent laws. Within TaylorCell, we propose the Taylor prediction unit (TPU) and the memory correction unit (MCU). TPU uses the derivative information of the first input frame to predict future frames, avoiding error accumulation. MCU distills the information of all past frames to correct the Taylor feature predicted by TPU. Correspondingly, the residual module extracts the residual feature complementary to the Taylor feature. On three datasets (Moving MNIST, TaxiBJ, Human3.6M), our model matches or outperforms state-of-the-art models, and ablation experiments demonstrate its effectiveness in long-term prediction.
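
As a rough illustration of the finite Taylor series idea in the abstract, the sketch below evaluates a truncated expansion around the first frame to obtain features at several future steps. It is not the paper's TaylorCell: the function name `taylor_predict`, the toy quadratic feature, and the assumption that the derivatives at the first frame are already known (in the model they would be learned) are all hypothetical choices made for this example.

```python
import math
import numpy as np

def taylor_predict(derivs, steps):
    """Predict features at time offsets 1..steps from one expansion point.

    derivs[k] is the (assumed known) k-th time derivative of the feature
    at the first frame; derivs[0] is the feature itself.
    """
    preds = []
    for dt in range(1, steps + 1):
        # Every step is evaluated directly from the expansion point,
        # never from an earlier prediction, so errors are not fed back.
        pred = sum(d * dt**k / math.factorial(k) for k, d in enumerate(derivs))
        preds.append(pred)
    return np.stack(preds)

# Toy 2-dimensional feature following f(t) = a + b*t + c*t**2 (hypothetical data).
a = np.array([1.0, 2.0])
b = np.array([0.5, -1.0])
c = np.array([0.25, 0.0])

# Derivatives of f at t = 0: f(0) = a, f'(0) = b, f''(0) = 2c.
derivs = [a, b, 2 * c]
future = taylor_predict(derivs, steps=3)
```

Because the toy feature is exactly quadratic, a second-order expansion reproduces it without error. Real video features are not polynomial, which is presumably where the memory correction and the residual branch described above come in.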
