Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation

03/30/2021
by Shuning Chang, et al.

Temporal action proposal generation (TAPG) is a fundamental and challenging task in video understanding, especially in temporal action detection. Most previous works focus on capturing the local temporal context and can accurately locate simple action instances with clean frames and clear boundaries. However, they generally fail in complicated scenarios where the actions of interest involve irrelevant frames and background clutter, and the local temporal context becomes less effective. To deal with these problems, we present an augmented transformer with adaptive graph network (ATAG) that exploits both long-range and local temporal contexts for TAPG. Specifically, we enhance the vanilla transformer by equipping it with a snippet actionness loss and a front block; the resulting augmented transformer is better at capturing long-range dependencies and learning robust features for noisy action instances. Moreover, an adaptive graph convolutional network (GCN) is proposed to build local temporal context by mining the position information and the difference between adjacent features. The features from the two modules carry rich semantic information about the video and are fused for effective sequential proposal generation. Extensive experiments are conducted on two challenging datasets, THUMOS14 and ActivityNet1.3, and the results demonstrate that our method outperforms state-of-the-art TAPG methods. Our code will be released soon.
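The abstract does not spell out how the snippet actionness loss is computed. As a rough illustration only, one common way to supervise per-snippet actionness is to label each snippet by whether its centre lies inside a ground-truth action segment and apply a binary cross-entropy loss. The function names, the centre-based labelling rule, and the choice of binary cross-entropy below are all assumptions made for this sketch, not the authors' formulation.

```python
import math

def actionness_labels(num_snippets, snippet_duration, segments):
    """Hypothetical labelling rule: a snippet is positive (1.0) if its
    temporal centre falls inside any ground-truth (start, end) segment."""
    labels = []
    for i in range(num_snippets):
        centre = (i + 0.5) * snippet_duration
        inside = any(start <= centre <= end for start, end in segments)
        labels.append(1.0 if inside else 0.0)
    return labels

def snippet_actionness_loss(probs, labels, eps=1e-7):
    """Assumed loss: mean binary cross-entropy between predicted
    per-snippet actionness probabilities and the 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / len(labels)

# Six one-second snippets with a single action from 1.0 s to 3.0 s.
labels = actionness_labels(6, 1.0, [(1.0, 3.0)])
uninformed = snippet_actionness_loss([0.5] * 6, labels)
```

In this toy setup an uninformed predictor that outputs 0.5 everywhere incurs a loss of ln 2 per snippet, and the loss approaches zero as the predicted probabilities approach the labels.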


