Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction

02/23/2020
by   Beibei Jin, et al.

Video prediction is a pixel-wise dense prediction task that infers future frames from past frames. Missing appearance details and motion blur remain two major problems for current predictive models, leading to image distortion and temporal inconsistency. In this paper, we point out the necessity of exploring multi-frequency analysis to deal with these two problems. Inspired by the frequency band decomposition characteristic of the Human Vision System (HVS), we propose a video prediction network based on multi-level wavelet analysis that handles spatial and temporal information in a unified manner. Specifically, the multi-level spatial discrete wavelet transform decomposes each video frame into anisotropic sub-bands with multiple frequencies, helping to enrich structural information and preserve fine details. Meanwhile, the multi-level temporal discrete wavelet transform, which operates along the time axis, decomposes the frame sequence into sub-band groups of different frequencies to accurately capture multi-frequency motions under a fixed frame rate. Extensive experiments on diverse datasets demonstrate that our model achieves significant improvements in fidelity and temporal consistency over state-of-the-art works.
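
The two decompositions named in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's network, and it assumes a Haar wavelet, which the abstract does not specify. The function names (`haar_1d`, `spatial_dwt`, `temporal_dwt`) and the random test clip are hypothetical, chosen only for illustration. The spatial transform splits one frame into an approximation plus three directional detail sub-bands per level, while the temporal transform splits a clip along the time axis into slow-motion and fast-motion groups.

```python
import numpy as np

def haar_1d(x, axis):
    """One level of the orthonormal Haar DWT along `axis` (length must be even)."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)    # local averages: low-frequency band
    high = (even - odd) / np.sqrt(2.0)   # local differences: high-frequency band
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def spatial_dwt(frame, levels):
    """Multi-level 2D Haar DWT of one (H, W) frame.

    Returns the coarsest approximation and, per level (finest first),
    a tuple of three directional detail sub-bands.
    """
    approx, details = frame, []
    for _ in range(levels):
        lo_w, hi_w = haar_1d(approx, axis=1)   # filter along the width
        ll, lh = haar_1d(lo_w, axis=0)         # low width: low / high along height
        hl, hh = haar_1d(hi_w, axis=0)         # high width: low / high along height
        details.append((lh, hl, hh))
        approx = ll                            # recurse on the approximation only
    return approx, details

def temporal_dwt(video, levels):
    """Multi-level 1D Haar DWT along the time axis of a (T, H, W) clip.

    Returns the low-frequency (slow motion) group and, per level
    (finest first), the high-frequency (fast motion) group.
    """
    approx, details = video, []
    for _ in range(levels):
        approx, high = haar_1d(approx, axis=0)
        details.append(high)
    return approx, details

# Hypothetical input: 8 random grayscale frames of size 16x16.
rng = np.random.default_rng(0)
video = rng.random((8, 16, 16))

ll, spatial_details = spatial_dwt(video[0], levels=2)
t_low, temporal_details = temporal_dwt(video, levels=2)

print(ll.shape, [bands[0].shape for bands in spatial_details])
print(t_low.shape, [band.shape for band in temporal_details])
```

Each level halves the resolution along the transformed axes, so the temporal sub-band groups cover progressively slower motion even though the input frame rate is fixed. Because the Haar filters used here are orthonormal, the total signal energy is preserved across the sub-bands.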

research
12/26/2018

Spatial and Temporal Mutual Promotion for Video-based Person Re-identification

Video-based person re-identification is a crucial task of matching video...
research
07/27/2022

Efficient Video Deblurring Guided by Motion Magnitude

Video deblurring is a highly under-constrained problem due to the spatia...
research
08/15/2022

Pyramidal Predictive Network: A Model for Visual-frame Prediction Based on Predictive Coding Theory

Visual-frame prediction is a pixel-dense prediction task that infers fut...
research
07/25/2018

Flow-Grounded Spatial-Temporal Video Prediction from Still Images

Existing video prediction methods mainly rely on observing multiple hist...
research
06/14/2021

Group-based Bi-Directional Recurrent Wavelet Neural Networks for Video Super-Resolution

Video super-resolution (VSR) aims to estimate a high-resolution (HR) fra...
research
06/29/2023

Low-Light Enhancement in the Frequency Domain

Decreased visibility, intensive noise, and biased color are the common p...
research
11/15/2022

Dynamic Temporal Filtering in Video Models

Video temporal dynamics is conventionally modeled with 3D spatial-tempor...
