OadTR: Online Action Detection with Transformers

06/21/2021
by Xiang Wang, et al.

Most recent approaches to online action detection apply Recurrent Neural Networks (RNNs) to capture long-range temporal structure. However, RNNs suffer from non-parallelism and vanishing gradients, and are therefore hard to optimize. In this paper, we propose a new Transformer-based encoder-decoder framework, named OadTR, to tackle these problems. The encoder, together with an attached task token, captures the relationships and global interactions between historical observations. The decoder extracts auxiliary information by aggregating anticipated future clip representations. OadTR can therefore recognize current actions by encoding historical information and predicting future context simultaneously. We extensively evaluate the proposed OadTR on three challenging datasets: HDD, TVSeries, and THUMOS14. The experimental results show that OadTR achieves higher training and inference speeds than current RNN-based approaches and significantly outperforms state-of-the-art methods in terms of both mAP and mcAP. Code is available at https://github.com/wangxiang1230/OadTR.
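
The architecture described above can be read, at a high level, as a Transformer encoder-decoder with two extra learnable inputs: a task token prepended to the historical clip features, and a set of queries that stand in for anticipated future clips. The PyTorch sketch below illustrates that reading only; the class and parameter names (OadTRSketch, feat_dim, num_future, the mean-pooled fusion of task token and decoder output) are illustrative assumptions rather than the authors' implementation, and positional encodings, the auxiliary anticipation loss, and training details are omitted. See the linked repository for the actual code.

    # Hypothetical sketch of an OadTR-style encoder-decoder (not the official code).
    import torch
    import torch.nn as nn

    class OadTRSketch(nn.Module):
        def __init__(self, feat_dim=2048, d_model=512, num_classes=22,
                     num_future=8, nhead=8, num_layers=3):
            super().__init__()
            self.input_proj = nn.Linear(feat_dim, d_model)        # project pre-extracted clip features
            self.task_token = nn.Parameter(torch.randn(1, 1, d_model) * 0.02)        # learnable task token
            self.future_queries = nn.Parameter(torch.randn(1, num_future, d_model) * 0.02)  # anticipated-clip queries
            enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
            dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
            self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
            self.classifier = nn.Linear(2 * d_model, num_classes)  # fuse history summary + future context

        def forward(self, history):
            # history: (B, T, feat_dim) features of past video clips
            x = self.input_proj(history)
            task = self.task_token.expand(x.size(0), -1, -1)
            memory = self.encoder(torch.cat([task, x], dim=1))    # global interactions among past clips
            task_feat = memory[:, 0]                              # task-token summary of the history
            queries = self.future_queries.expand(x.size(0), -1, -1)
            future = self.decoder(queries, memory)                # anticipate future clip representations
            future_feat = future.mean(dim=1)                      # aggregate the anticipated context
            return self.classifier(torch.cat([task_feat, future_feat], dim=-1))

    # Usage: classify the current action from 64 past clip features.
    model = OadTRSketch()
    scores = model(torch.randn(2, 64, 2048))   # -> (2, num_classes)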

Related research

- 11/18/2018: Temporal Recurrent Networks for Online Action Detection
- 12/02/2021: TCTN: A 3D-Temporal Convolutional Transformer Network for Spatiotemporal Predictive Learning
- 03/30/2018: Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present
- 06/21/2023: ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining
- 07/27/2021: Exploring Sequence Feature Alignment for Domain Adaptive Detection Transformers
- 06/05/2023: DecompX: Explaining Transformers Decisions by Propagating Token Decomposition
- 08/15/2023: Memory-and-Anticipation Transformer for Online Action Understanding
