Class Semantics-based Attention for Action Detection

09/06/2021
by Deepak Sridhar, et al.

Action localization networks are often structured as a feature-encoder sub-network and a localization sub-network, where the feature encoder learns to transform an input video into features that are useful for the localization sub-network to generate reliable action proposals. While some of the encoded features may be more useful than others for generating action proposals, prior action localization approaches do not include any attention mechanism that enables the localization sub-network to attend more to the more important features. In this paper, we propose a novel attention mechanism, Class Semantics-based Attention (CSA), that learns from the temporal distribution of the semantics of the action classes present in an input video to compute importance scores for the encoded features, which are then used to attend more to the more useful ones. We demonstrate on two popular action detection datasets that incorporating our novel attention mechanism provides considerable performance gains on competitive action detection models (e.g., around 6.2% absolute improvement over the BMN action detection baseline, to obtain 47.5% mAP at IoU 0.5 on the THUMOS-14 dataset), and a new state-of-the-art of 36.25% mAP on the ActivityNet v1.3 dataset. Further, the CSA localization model family, which includes BMN-CSA, was part of the second-placed submission at the 2021 ActivityNet action localization challenge. Our attention mechanism outperforms prior self-attention modules such as squeeze-and-excitation on the action detection task. We also observe that our attention mechanism is complementary to such self-attention modules: performance improves further when both are used together.
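The idea above can be sketched roughly as follows. This is a hypothetical, heavily simplified illustration, not the paper's implementation: the function name, the learned projection `w_proj`, and the temporal-averaging and sigmoid choices are all assumptions. It shows the general pattern of deriving per-channel importance scores from per-timestep class-semantic scores and using them to reweight the encoded features.

```python
import numpy as np

def csa_sketch(features, class_scores, w_proj):
    """Hypothetical sketch of class-semantics-based feature attention.

    features:     (T, C) encoded features over T time steps
    class_scores: (T, K) per-step class-semantic scores (e.g. classifier logits)
    w_proj:       (K, C) assumed learned projection from class semantics to
                  per-channel importance (stands in for the learned attention
                  sub-network described in the abstract)
    """
    # Summarize the temporal distribution of class semantics:
    # softmax over classes at each step, then average over time.
    probs = np.exp(class_scores - class_scores.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)               # (T, K)
    semantics = probs.mean(axis=0)                          # (K,)

    # Map the class semantics to per-channel importance scores in (0, 1).
    importance = 1.0 / (1.0 + np.exp(-(semantics @ w_proj)))  # (C,)

    # Attend: scale each encoded feature channel by its importance score.
    return features * importance                            # (T, C)

rng = np.random.default_rng(0)
T, C, K = 100, 256, 20
out = csa_sketch(rng.normal(size=(T, C)),
                 rng.normal(size=(T, K)),
                 rng.normal(size=(K, C)) * 0.1)
print(out.shape)
```

In the paper the importance scores are learned end-to-end with the localization sub-network; here `w_proj` is just a stand-in for that learned mapping.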

