Global Self-Attention Networks for Image Recognition

10/06/2020
by Shen Zhuoran, et al.

Recently, a series of works in computer vision have shown promising results on various image and video understanding tasks using self-attention. However, due to the quadratic computational and memory complexity of self-attention, these works either apply attention only to low-resolution feature maps in the later stages of a deep network or restrict the receptive field of attention in each layer to a small local region. To overcome these limitations, this work introduces a new global self-attention module, referred to as the GSA module, which is efficient enough to serve as the backbone component of a deep network. The module consists of two parallel layers: a content attention layer that attends to pixels based only on their content, and a positional attention layer that attends to pixels based on their spatial locations. The output of the module is the sum of the outputs of the two layers. Based on the proposed GSA module, we introduce new standalone global attention-based deep networks that use GSA modules, instead of convolutions, to model pixel interactions. Because the GSA module has global extent, a GSA network can model long-range pixel interactions throughout the network. Our experimental results show that GSA networks significantly outperform the corresponding convolution-based networks on the CIFAR-100 and ImageNet datasets while using fewer parameters and less computation. The proposed GSA networks also outperform various existing attention-based networks on the ImageNet dataset.
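The abstract's description of the module (two parallel layers whose outputs are summed) maps naturally onto code. The sketch below is a minimal, single-head PyTorch rendering of that structure, not the authors' implementation: the content layer here uses a linear-complexity factorization (softmax-normalized keys aggregated into a global context matrix), the positional layer uses attention along each spatial axis with learned relative-position embeddings, and every name and hyperparameter (GSAModule, dim_qk, max_len=32) is an illustrative assumption.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GSAModule(nn.Module):
    """Global self-attention block: content attention + positional attention."""

    def __init__(self, dim, dim_qk=32, max_len=32):
        super().__init__()
        # 1x1 convolutions produce per-pixel queries, keys, and values.
        self.to_q = nn.Conv2d(dim, dim_qk, 1, bias=False)
        self.to_k = nn.Conv2d(dim, dim_qk, 1, bias=False)
        self.to_v = nn.Conv2d(dim, dim, 1, bias=False)
        # Learned relative-position embeddings, one table per spatial axis,
        # covering offsets -(max_len-1)..(max_len-1).
        self.rel_h = nn.Parameter(torch.randn(2 * max_len - 1, dim_qk) * 0.02)
        self.rel_w = nn.Parameter(torch.randn(2 * max_len - 1, dim_qk) * 0.02)

    def _content(self, q, k, v):
        # Content layer: keys are softmax-normalized over all pixels and
        # aggregated with the values into a global context matrix, which is
        # then read out by every query. Linear in the number of pixels,
        # which is what makes global attention affordable in a backbone.
        b, _, h, w = q.shape
        q, v = q.flatten(2), v.flatten(2)          # (b, c, n)
        k = F.softmax(k.flatten(2), dim=-1)        # attention over pixels
        ctx = torch.einsum('bkn,bvn->bkv', k, v)   # (b, dim_qk, dim)
        out = torch.einsum('bkn,bkv->bvn', q, ctx)
        return out.reshape(b, -1, h, w)

    def _axial_pos(self, q, v, rel, along_h):
        # Positional layer (one axis): attention logits depend only on the
        # query and the relative offset, not on key content.
        b, cq, h, w = q.shape
        if along_h:  # each column attends over its own pixels
            qa = q.permute(0, 3, 1, 2).reshape(b * w, cq, h)
            va = v.permute(0, 3, 1, 2).reshape(b * w, -1, h)
        else:        # each row attends over its own pixels
            qa = q.permute(0, 2, 1, 3).reshape(b * h, cq, w)
            va = v.permute(0, 2, 1, 3).reshape(b * h, -1, w)
        L = qa.shape[-1]
        idx = torch.arange(L, device=q.device)
        offs = idx[None, :] - idx[:, None] + rel.shape[0] // 2  # center table
        r = rel[offs]                                  # (L, L, dim_qk)
        attn = F.softmax(torch.einsum('bci,ijc->bij', qa, r), dim=-1)
        out = torch.einsum('bij,bcj->bci', attn, va)   # (b', dim, L)
        if along_h:
            return out.reshape(b, w, -1, h).permute(0, 2, 3, 1)
        return out.reshape(b, h, -1, w).permute(0, 2, 1, 3)

    def forward(self, x):
        q, k, v = self.to_q(x), self.to_k(x), self.to_v(x)
        content = self._content(q, k, v)
        pos = (self._axial_pos(q, v, self.rel_h, True)
               + self._axial_pos(q, v, self.rel_w, False))
        return content + pos  # sum of the two parallel layers

A GSAModule(dim=64) applied to a feature map of shape (2, 64, 16, 16) returns a tensor of the same shape, so the block is a drop-in replacement for a spatial convolution inside a ResNet-style backbone, matching the abstract's use of GSA modules instead of convolutions. The paper's exact normalization and its column/row decomposition of the positional layer differ in detail; the sketch only conveys the overall structure of one content path and one positional path summed.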
