TCTN: A 3D-Temporal Convolutional Transformer Network for Spatiotemporal Predictive Learning

12/02/2021
by Ziao Yang, et al.

Spatiotemporal predictive learning aims to generate future frames given a sequence of historical frames. Conventional algorithms are mostly based on recurrent neural networks (RNNs). However, RNNs incur a heavy computational burden, including long training time and a long back-propagation path, because of the serial nature of the recurrent structure. Recently, Transformer-based methods have also been investigated, in the form of either an encoder-decoder or a plain encoder, but the encoder-decoder form requires very deep networks and the plain encoder fails to capture short-term dependencies. To tackle these problems, we propose an algorithm named the 3D-temporal convolutional Transformer network (TCTN), in which a Transformer-based encoder with temporal convolutional layers is employed to capture both short-term and long-term dependencies. The proposed algorithm is easy to implement and trains much faster than RNN-based methods thanks to the parallel mechanism of the Transformer. To validate the algorithm, we conduct experiments on the MovingMNIST and KTH datasets and show that TCTN outperforms state-of-the-art (SOTA) methods in both predictive performance and training speed.
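
The idea of pairing a temporal convolution (short-term context) with masked self-attention (long-term context) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes one scalar feature per time step instead of 3D video tensors, a hand-picked kernel, no learned weights, and hypothetical function names (`causal_temporal_conv`, `causal_self_attention`, `tiny_block`). It only shows that each output step depends on past and present inputs, which is the property that lets all time steps be computed in parallel during training.

```python
import math

def causal_temporal_conv(seq, kernel):
    """Causal 1D convolution over time: output[t] mixes seq[t-k+1 .. t]."""
    k = len(kernel)
    # Left-pad with zeros so that no future value leaks into output[t].
    padded = [0.0] * (k - 1) + list(seq)
    return [sum(kernel[j] * padded[t + j] for j in range(k))
            for t in range(len(seq))]

def causal_self_attention(queries, keys, values):
    """Scalar self-attention in which step t attends only to steps <= t."""
    out = []
    for t, q in enumerate(queries):
        scores = [q * keys[s] for s in range(t + 1)]    # causal mask
        top = max(scores)                               # for numerical stability
        weights = [math.exp(s - top) for s in scores]   # softmax numerator
        total = sum(weights)
        out.append(sum(w * values[s] for s, w in enumerate(weights)) / total)
    return out

def tiny_block(seq, kernel=(0.25, 0.25, 0.5)):
    """Toy block (assumed layout): temporal conv, causal attention, residual."""
    local = causal_temporal_conv(seq, kernel)            # short-term dependencies
    attended = causal_self_attention(local, local, local)  # long-term dependencies
    return [x + a for x, a in zip(seq, attended)]

if __name__ == "__main__":
    print(tiny_block([1.0, 2.0, 3.0, 4.0]))
```

Changing only the last input leaves all earlier outputs unchanged, which is a quick way to check that the block is causal.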


