Anticipative Video Transformer

06/03/2021
by Rohit Girdhar, et al.

We propose Anticipative Video Transformer (AVT), an end-to-end attention-based video modeling architecture that attends to the previously observed video in order to anticipate future actions. We train the model jointly to predict the next action in a video sequence, while also learning frame feature encoders that are predictive of successive future frames' features. Compared to existing temporal aggregation strategies, AVT maintains the sequential progression of observed actions while still capturing long-range dependencies, both of which are critical for the anticipation task. Through extensive experiments, we show that AVT obtains the best reported performance on four popular action anticipation benchmarks (EpicKitchens-55, EpicKitchens-100, EGTEA Gaze+, and 50-Salads), and it wins first place in the EpicKitchens-100 CVPR'21 challenge.
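The two ideas in the abstract, attending only to previously observed frames and training the outputs to predict the next frame's features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `causal_attention` and `feature_prediction_loss` are hypothetical, the attention uses a single head with identity projections (no learned weights), and mean squared error stands in for whatever feature loss the paper actually uses.

```python
import math


def causal_attention(frames):
    """Single-head self-attention where step t sees only frames 0..t.

    Assumption: queries, keys, and values are the raw frame features
    (identity projections), purely for illustration.
    """
    dim = len(frames[0])
    outputs = []
    for t, query in enumerate(frames):
        visible = frames[: t + 1]  # causal restriction: no future frames
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in visible
        ]
        # Numerically stable softmax over the visible frames.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        outputs.append(
            [sum(w * v[i] for w, v in zip(weights, visible)) for i in range(dim)]
        )
    return outputs


def feature_prediction_loss(outputs, frames):
    """Mean squared error between the output at step t and the feature of
    frame t+1 (a stand-in for a future-feature prediction objective)."""
    pairs = list(zip(outputs[:-1], frames[1:]))
    squared = [
        (o - f) ** 2 for out, nxt in pairs for o, f in zip(out, nxt)
    ]
    return sum(squared) / len(squared)


frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs = causal_attention(frames)
loss = feature_prediction_loss(outputs, frames)
```

Because each step only looks backward, replacing a later frame leaves all earlier outputs unchanged, which is the property an anticipation model needs at test time when the future is unobserved. In a full model, a next-action classification loss on the final output would be added to this feature loss.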


research
05/27/2022

Future Transformer for Long-term Action Anticipation

The task of predicting future actions from a video is crucial for a real...
research
03/07/2020

TTPP: Temporal Transformer with Progressive Prediction for Efficient Action Anticipation

Video action anticipation aims to predict future action categories from ...
research
10/20/2022

Rethinking Learning Approaches for Long-Term Action Anticipation

Action anticipation involves predicting future actions having observed t...
research
06/09/2022

GateHUB: Gated History Unit with Background Suppression for Online Action Detection

Online action detection is the task of predicting the action as soon as ...
research
05/26/2022

Efficient U-Transformer with Boundary-Aware Loss for Action Segmentation

Action classification has made great progress, but segmenting and recogn...
research
03/30/2021

Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation

Temporal action proposal generation (TAPG) is a fundamental and challeng...
research
07/04/2023

Technical Report for Ego4D Long Term Action Anticipation Challenge 2023

In this report, we describe the technical details of our approach for th...
