Region-based Non-local Operation for Video Classification

07/17/2020
by   Guoxi Huang, et al.
0

Convolutional Neural Networks (CNNs) model long-range dependencies by deeply stacking convolution operations with small window sizes, which makes the optimizations difficult. This paper presents region-based non-local operation (RNL), a family of self-attention mechanisms, which can directly capture long-range dependencies without a deep stack of local operations. Given an intermediate feature map, our method recalibrates the feature at a position by aggregating information from the neighboring regions of all positions. By combining a channel attention module with the proposed RNL, we design an attention chain, which can be integrated into off-the-shelf CNNs for end-to-end training. We evaluate our method on two video classification benchmarks. The experimental result of our method outperforms other attention mechanisms, and we achieve state-of-the-art performance on Something-Something V1.

READ FULL TEXT

page 2

page 6

research
03/17/2020

Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation

Convolution exploits locality for efficiency at a cost of missing long r...
research
07/23/2020

AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification

Convolutional operations have two limitations: (1) do not explicitly mod...
research
02/24/2023

Spatial Bias for Attention-free Non-local Neural Networks

In this paper, we introduce the spatial bias to learn global knowledge w...
research
11/07/2020

Non-local convolutional neural networks (nlcnn) for speaker recognition

Speaker recognition is the process of identifying a speaker based on the...
research
04/05/2019

Relation-Aware Global Attention

Attention mechanism aims to increase the representation power by focusin...
research
11/23/2022

GhostNetV2: Enhance Cheap Operation with Long-Range Attention

Light-weight convolutional neural networks (CNNs) are specially designed...
research
06/12/2021

Structure-Regularized Attention for Deformable Object Representation

Capturing contextual dependencies has proven useful to improve the repre...

Please sign up or login with your details

Forgot password? Click here to reset