Interaction-aware Spatio-temporal Pyramid Attention Networks for Action Classification

08/03/2018
by   Yang Du, et al.

Local features at neighboring spatial positions in a feature map are highly correlated because their receptive fields often overlap. Self-attention usually computes the weight score of each local feature as a weighted sum (or another function) of that feature's own elements, which ignores interactions among local features. To address this, we propose an effective interaction-aware self-attention model, inspired by PCA, to learn attention maps. Furthermore, since different layers in a deep network produce feature maps at different scales, we use these feature maps to construct a spatial pyramid and exploit the multi-scale information to obtain more accurate attention scores; these scores weight the local features at all spatial positions of the feature maps to compute the attention maps. Moreover, our spatial pyramid attention places no restriction on the number of input feature maps, so it extends easily to a spatio-temporal version. Finally, we embed our model in general CNNs to form end-to-end attention networks for action classification. Experimental results show that our method achieves state-of-the-art results on UCF101, HMDB51, and the untrimmed Charades dataset.
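
As a rough illustration of the baseline the abstract contrasts itself with, the sketch below scores each local feature using only that feature's own elements (a weighted sum), normalizes the scores with a softmax over positions, and pools the features with the resulting weights. It is not the paper's interaction-aware model, whose details the abstract does not give. The names `attention_pool` and `pyramid_attention_pool` and the scoring vector `w` are hypothetical, and the joint softmax over all positions of all maps is only one simple, assumed way to accept any number of input feature maps.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, w):
    """Baseline self-attention pooling (illustrative, not the paper's model).

    features: list of N local feature vectors, each of length C.
    w: hypothetical scoring vector of length C.
    Each score depends only on that feature's own elements, so
    interactions among local features are ignored.
    """
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in features]
    weights = softmax(scores)
    pooled = [
        sum(a * x[c] for a, x in zip(weights, features))
        for c in range(len(w))
    ]
    return weights, pooled


def pyramid_attention_pool(feature_maps, w):
    """Assumed multi-map variant: flatten the positions of every map
    (any number of scales or frames) and normalize them jointly."""
    all_features = [x for fmap in feature_maps for x in fmap]
    return attention_pool(all_features, w)


if __name__ == "__main__":
    fine = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]  # 4 positions
    coarse = [[0.5, 0.5]]                                     # 1 position
    weights, pooled = pyramid_attention_pool([fine, coarse], w=[1.0, 1.0])
    print(weights)
    print(pooled)
```

Because `pyramid_attention_pool` only flattens a list of maps, adding feature maps from further layers or further video frames requires no change to the function, which mirrors the abstract's point that the number of input maps is not fixed.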

research
05/28/2022

Feature Pyramid Attention based Residual Neural Network for Environmental Sound Classification

Environmental sound classification (ESC) is a challenging problem due to...
research
08/05/2018

Self-Attention Recurrent Network for Saliency Detection

Feature maps in deep neural network generally contain different semantic...
research
07/10/2022

Self-attention on Multi-Shifted Windows for Scene Segmentation

Scene segmentation in images is a fundamental yet challenging problem in...
research
08/10/2023

Vision Backbone Enhancement via Multi-Stage Cross-Scale Attention

Convolutional neural networks (CNNs) and vision transformers (ViTs) have...
research
12/23/2018

Chinese Herbal Recognition based on Competitive Attentional Fusion of Multi-hierarchies Pyramid Features

Convolutional neural networks (CNNs) are successfully applied in image rec...
research
09/01/2020

SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization

We present a novel framework, Spatial Pyramid Attention Network (SPAN) f...
research
11/17/2016

AutoScaler: Scale-Attention Networks for Visual Correspondence

Finding visual correspondence between local features is key to many comp...
