STAU: A SpatioTemporal-Aware Unit for Video Prediction and Beyond

04/20/2022
by Zheng Chang, et al.

Video prediction aims to predict future frames by modeling the complex spatiotemporal dynamics in videos. However, most existing methods model the temporal and spatial information of videos independently and do not fully explore the correlations between the two. In this paper, we propose a SpatioTemporal-Aware Unit (STAU) for video prediction and beyond, which exploits the significant spatiotemporal correlations in videos. On the one hand, motion-aware attention weights are learned from the spatial states to help aggregate the temporal states in the temporal domain. On the other hand, appearance-aware attention weights are learned from the temporal states to help aggregate the spatial states in the spatial domain. In this way, the temporal and spatial information become aware of each other in both domains, and the spatiotemporal receptive field is also broadened for more reliable spatiotemporal modeling. Experiments are conducted not only on traditional video prediction tasks but also on tasks beyond video prediction, including early action recognition and object detection. Experimental results show that STAU outperforms other methods on all tasks in terms of both performance and computational efficiency.
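The cross-domain idea in the abstract (weights computed from one kind of state, then used to aggregate the other kind) can be illustrated with a small attention sketch. This is not the STAU implementation: the function name `cross_domain_aggregate`, the scaled dot-product scoring, the use of a short window of past states, and the toy vectors are all assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_domain_aggregate(query, keys, values):
    """Aggregate `values` using weights derived from `query` and `keys`.

    Hypothetical helper: `query` and `keys` come from one domain
    (e.g. spatial states), while `values` come from the other domain
    (e.g. temporal states). Scaled dot-product scoring is an assumption.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    aggregated = [
        sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)
    ]
    return aggregated, weights


# Toy states (made-up numbers): two past steps, two features each.
spatial_past = [[1.0, 0.0], [0.0, 1.0]]
temporal_past = [[0.5, 0.5], [1.0, -1.0]]
spatial_now = [1.0, 0.2]
temporal_now = [0.3, 0.9]

# "Motion-aware" weights: from spatial states, applied to temporal states.
temporal_agg, motion_w = cross_domain_aggregate(
    spatial_now, spatial_past, temporal_past
)

# "Appearance-aware" weights: from temporal states, applied to spatial states.
spatial_agg, appearance_w = cross_domain_aggregate(
    temporal_now, temporal_past, spatial_past
)

print(motion_w, temporal_agg)
print(appearance_w, spatial_agg)
```

In this sketch the two calls are symmetric: swapping which domain supplies the weights and which supplies the aggregated states is what lets each domain be informed by the other.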
