An Empirical Study of Spatial Attention Mechanisms in Deep Networks

04/11/2019
by Xizhou Zhu, et al.

Attention mechanisms have become a popular component in deep neural networks, yet there has been little examination of how different influencing factors, and the methods for computing attention from these factors, affect performance. Toward a better general understanding of attention mechanisms, we present an empirical study that ablates various spatial attention elements within a generalized attention formulation, encompassing the dominant Transformer attention as well as the prevalent deformable convolution and dynamic convolution modules. Conducted on a variety of applications, the study yields significant findings about spatial attention in deep networks, some of which run counter to conventional understanding. For example, we find that the query and key content comparison in Transformer attention is negligible for self-attention, but vital for encoder-decoder attention. A proper combination of deformable convolution with key-content-only saliency achieves the best accuracy-efficiency tradeoff in self-attention. Our results suggest that there exists much room for improvement in the design of attention mechanisms.
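For a concrete picture of the factor ablations described above, the sketch below illustrates the kind of generalized attention weight the study decomposes: the pre-softmax attention energy is written as a sum of four terms (query and key content, query content and relative position, key content only, and relative position only), and individual terms can be switched off. This is a minimal single-head NumPy illustration, not the authors' implementation; the function name, projection matrices, and term labels are assumptions chosen for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_weights(query, keys, rel_pos_emb, Wq_c, Wk_c, Wq_r, u, v,
                      use=("E1", "E2", "E3", "E4")):
    """Illustrative spatial attention weights as a sum of four energy terms.

    query:       (d,)   content feature of the query position
    keys:        (n, d) content features of the n key positions
    rel_pos_emb: (n, d) relative-position embeddings of each key w.r.t. the query
    Wq_c, Wk_c:  (d, d) projections for the content terms
    Wq_r:        (d, d) projection of the query for the position term
    u, v:        (d,)   query-independent vectors for the key-only terms
    use:         which energy terms to keep, mimicking a factor ablation
    """
    q_c = Wq_c @ query          # projected query content
    k_c = keys @ Wk_c.T         # projected key content, shape (n, d)
    q_r = Wq_r @ query          # projected query for the relative-position term

    terms = {
        "E1": k_c @ q_c,         # query content & key content
        "E2": rel_pos_emb @ q_r, # query content & relative position
        "E3": k_c @ u,           # key content only
        "E4": rel_pos_emb @ v,   # relative position only
    }
    energy = sum(terms[name] for name in use)
    return softmax(energy)

# Toy usage: drop the query-key content term ("E1") to mimic the ablation
# behind the finding that this comparison matters little in self-attention.
rng = np.random.default_rng(0)
d, n = 8, 5
w = attention_weights(
    query=rng.normal(size=d),
    keys=rng.normal(size=(n, d)),
    rel_pos_emb=rng.normal(size=(n, d)),
    Wq_c=rng.normal(size=(d, d)),
    Wk_c=rng.normal(size=(d, d)),
    Wq_r=rng.normal(size=(d, d)),
    u=rng.normal(size=d),
    v=rng.normal(size=d),
    use=("E2", "E3", "E4"),
)
print(w.round(3), w.sum())
```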


Related research

12/27/2019  Is Attention All What You Need? – An Empirical Investigation on Convolution-Based Active Memory and Self-Attention
06/03/2022  EAANet: Efficient Attention Augmented Convolutional Networks
01/09/2023  DeMT: Deformable Mixer Transformer for Multi-Task Learning of Dense Prediction
11/18/2019  Affine Self Convolution
07/06/2021  COVID-19 Pneumonia Severity Prediction using Hybrid Convolution-Attention Neural Architectures
12/23/2021  Assessing the Impact of Attention and Self-Attention Mechanisms on the Classification of Skin Lesions
08/30/2021  A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP
