CMS-LSTM: Context-Embedding and Multi-Scale Spatiotemporal-Expression LSTM for Video Prediction

02/06/2021
by Zenghao Chai, et al.

Extracting variations and spatiotemporal features from a limited number of frames remains an unsolved and challenging problem in video prediction, and the inherent uncertainty among consecutive frames makes long-term prediction even harder. To tackle this problem, we focus on capturing context correlations and multi-scale spatiotemporal flows, and propose CMS-LSTM, which integrates two effective and lightweight blocks, the Context-Embedding (CE) block and the Spatiotemporal-Expression (SE) block, into a ConvLSTM backbone. The CE block is designed for abundant context interactions, while the SE block focuses on multi-scale spatiotemporal expression in hidden states. The newly introduced blocks also help other spatiotemporal models (e.g., PredRNN, SA-ConvLSTM) produce representative implicit features for video prediction. Qualitative and quantitative experiments demonstrate the effectiveness and flexibility of the proposed method. With fewer parameters, we reach state-of-the-art results on the Moving MNIST and TaxiBJ datasets across a number of metrics. All source code is available at https://github.com/czh-98/CMS-LSTM.


