Analyzing and Improving the Pyramidal Predictive Network for Future Video Frame Prediction

01/13/2023
by   Chaofan Ling, et al.
0

The pyramidal predictive network (PPNV1) proposes an interesting temporal pyramid architecture and yields promising results on the task of future video-frame prediction. We expose and analyze its signal dissemination and characteristic artifacts, and propose corresponding improvements in model architecture and training strategies to address them. Although the PPNV1 theoretically mimics the workings of human brain, its careless signal processing leads to aliasing in the network. We redesign the network architecture to solve the problems. In addition to improving the unreasonable information dissemination, the new architecture also aims to solve the aliasing in neural networks. Different inputs are no longer simply concatenated, and the downsampling and upsampling components have also been redesigned to ensure that the network can more easily construct images from Fourier features of low-frequency inputs. Finally, we further improve the training strategies, to alleviate the problem of input inconsistency during training and testing. Overall, the improved model is more interpretable, stronger, and the quality of its predictions is better. Code is available at https://github.com/Ling-CF/PPNV2.

READ FULL TEXT

page 1

page 2

page 8

page 9

page 10

research
01/25/2022

Capturing Temporal Information in a Single Frame: Channel Sampling Strategies for Action Recognition

We address the problem of capturing temporal information for video class...
research
08/15/2022

Pyramidal Predictive Network: A Model for Visual-frame Prediction Based on Predictive Coding Theory

Visual-frame prediction is a pixel-dense prediction task that infers fut...
research
12/22/2022

Predictive Coding Based Multiscale Network with Encoder-Decoder LSTM for Video Prediction

We are introducing a multi-scale predictive model for video prediction h...
research
11/08/2021

Triple-level Model Inferred Collaborative Network Architecture for Video Deraining

Video deraining is an important issue for outdoor vision systems and has...
research
03/26/2023

Frame Flexible Network

Existing video recognition algorithms always conduct different training ...
research
07/26/2019

Multi-Stage Prediction Networks for Data Harmonization

In this paper, we introduce multi-task learning (MTL) to data harmonizat...
research
02/25/2018

A Dataset To Evaluate The Representations Learned By Video Prediction Models

We present a parameterized synthetic dataset called Moving Symbols to su...

Please sign up or login with your details

Forgot password? Click here to reset