Structure-Regularized Attention for Deformable Object Representation

06/12/2021
by Shenao Zhang, et al.

Capturing contextual dependencies has proven useful for improving the representational power of deep neural networks. Recent approaches that focus on modeling global context, such as self-attention and the non-local operation, achieve this goal by enabling unconstrained pairwise interactions between elements. In this work, we consider learning representations for deformable objects, which can benefit from context exploitation by modeling the structural dependencies that the data intrinsically possesses. To this end, we propose a novel structure-regularized attention mechanism, which formalizes feature interaction as structural factorization through a pair of lightweight operations. The instantiated building blocks can be directly incorporated into modern convolutional neural networks to boost representational power in an efficient manner. Comprehensive studies on multiple tasks and empirical comparisons with modern attention mechanisms demonstrate the gains brought by our method in terms of both performance and model complexity. We further investigate its effect on feature representations, showing that our trained models can capture diversified representations characterizing object parts without resorting to extra supervision.
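The abstract does not give implementation details, but the contrast it draws can be illustrated with a minimal PyTorch sketch: a standard non-local block, in which every spatial position attends to every other position (unconstrained pairwise interaction), next to a hypothetical factorized variant in which interactions are routed through a small set of K latent nodes via two lightweight aggregate-and-redistribute operations. The class names, the choice of K, and the particular factorization are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn


class NonLocalBlock(nn.Module):
    """Standard non-local / self-attention block: every position attends
    to every other position, giving an (HW x HW) attention map."""

    def __init__(self, channels, reduction=2):
        super().__init__()
        inner = channels // reduction
        self.query = nn.Conv2d(channels, inner, 1)
        self.key = nn.Conv2d(channels, inner, 1)
        self.value = nn.Conv2d(channels, inner, 1)
        self.out = nn.Conv2d(inner, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)        # (B, HW, C')
        k = self.key(x).flatten(2)                           # (B, C', HW)
        v = self.value(x).flatten(2).transpose(1, 2)         # (B, HW, C')
        attn = torch.softmax(q @ k / k.shape[1] ** 0.5, -1)  # (B, HW, HW)
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)


class FactorizedAttentionBlock(nn.Module):
    """Illustrative factorized variant: features are softly aggregated into
    K latent nodes and then redistributed back to all positions, so the
    pairwise interaction is factored through the nodes.
    NOTE: an assumption for illustration, not the paper's exact method."""

    def __init__(self, channels, num_nodes=8, reduction=2):
        super().__init__()
        inner = channels // reduction
        self.assign = nn.Conv2d(channels, num_nodes, 1)  # soft assignment to K nodes
        self.value = nn.Conv2d(channels, inner, 1)
        self.out = nn.Conv2d(inner, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        a = torch.softmax(self.assign(x).flatten(2), -1)  # (B, K, HW) pooling weights
        v = self.value(x).flatten(2).transpose(1, 2)      # (B, HW, C')
        nodes = a @ v                                     # aggregate:   (B, K, C')
        y = a.transpose(1, 2) @ nodes                     # redistribute: (B, HW, C')
        y = y.transpose(1, 2).reshape(b, -1, h, w)
        return x + self.out(y)
```

Both blocks preserve the input shape, e.g. `FactorizedAttentionBlock(256)(torch.randn(2, 256, 32, 32))` returns a (2, 256, 32, 32) tensor, so either can be dropped into a CNN stage. The non-local block's attention cost scales as O((HW)^2), whereas routing through K nodes scales as O(HW x K), which is one plausible reading of the efficiency claim above.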
