Video Prediction by Efficient Transformers

12/12/2022
by Xi Ye, et al.

Video prediction is a challenging computer vision task with a wide range of applications. In this work, we present a new family of Transformer-based models for video prediction. First, we propose an efficient local spatial-temporal separation attention mechanism that reduces the complexity of standard Transformers. We then build three models on this efficient Transformer: a fully autoregressive model, a partially autoregressive model, and a non-autoregressive model. The partially autoregressive model performs similarly to the fully autoregressive model but offers faster inference. The non-autoregressive model also achieves faster inference and mitigates the quality degradation problem of its autoregressive counterparts, at the cost of additional parameters and an additional loss function for learning. Because all three variants share the same attention mechanism, we conduct a comprehensive study comparing them. Experiments show that the proposed video prediction models are competitive with more complex state-of-the-art convolutional-LSTM-based models. The source code is available at https://github.com/XiYe20/VPTR.
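
To give a rough sense of why separating local spatial attention from temporal attention can reduce complexity, the sketch below counts query-key pairs for a grid of video tokens. It is an illustration under stated assumptions, not the authors' implementation: it assumes spatial attention is restricted to non-overlapping square windows within each frame and temporal attention runs across frames at each spatial location. The function names and the example sizes are hypothetical.

```python
def full_attention_pairs(t, h, w):
    """Query-key pairs when every token attends to every token in space and time."""
    n = t * h * w
    return n * n


def separated_attention_pairs(t, h, w, window):
    """Query-key pairs for an assumed separated scheme.

    Spatial step: each token attends only to the window * window tokens
    in its own local window of the same frame.
    Temporal step: each token attends to the t tokens at the same
    spatial location across frames.
    """
    if h % window or w % window:
        raise ValueError("window must divide both h and w in this sketch")
    n = t * h * w
    spatial = n * window * window
    temporal = n * t
    return spatial + temporal


if __name__ == "__main__":
    # Hypothetical sizes: 10 frames of 16 x 16 feature tokens, 4 x 4 windows.
    t, h, w, window = 10, 16, 16, 4
    full = full_attention_pairs(t, h, w)
    sep = separated_attention_pairs(t, h, w, window)
    print(f"full: {full}, separated: {sep}, ratio: {full / sep:.1f}")
```

Under these assumptions, the joint cost grows quadratically in the total number of tokens, while the separated cost grows linearly in the number of tokens times the window area plus the number of frames.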


research
03/29/2022

VPTR: Efficient Transformers for Video Prediction

In this paper, we propose a new Transformer block for video future frame...
research
06/08/2021

FastSeq: Make Sequence Generation Faster

Transformer-based models have made tremendous impacts in natural languag...
research
06/17/2021

Semi-Autoregressive Transformer for Image Captioning

Current state-of-the-art image captioning models adopt autoregressive de...
research
04/02/2023

FANS: Fast Non-Autoregressive Sequence Generation for Item List Continuation

User-curated item lists, such as video-based playlists on Youtube and bo...
research
03/14/2023

Implicit Stacked Autoregressive Model for Video Prediction

Future frame prediction has been approached through two primary methods:...
research
09/08/2021

Highly Parallel Autoregressive Entity Linking with Discriminative Correction

Generative approaches have been recently shown to be effective for both ...
research
10/11/2022

Continuous conditional video synthesis by neural processes

We propose a unified model for multiple conditional video synthesis task...
