Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation

03/17/2020
by Huiyu Wang, et al.

Convolution exploits locality for efficiency at the cost of missing long-range context. Self-attention has been adopted to augment CNNs with non-local interactions. Recent works prove it possible to stack self-attention layers to obtain a fully attentional network by restricting the attention to a local region. In this paper, we attempt to remove this constraint by factorizing 2D self-attention into two 1D self-attentions. This reduces computation complexity and allows performing attention within a larger or even global region. In addition, we propose a position-sensitive self-attention design. Combining both yields our position-sensitive axial-attention layer, a novel building block that one could stack to form axial-attention models for image classification and dense prediction. We demonstrate the effectiveness of our model on four large-scale datasets. In particular, our model outperforms all existing stand-alone self-attention models on ImageNet. Our Axial-DeepLab improves 2.8% PQ over the bottom-up state-of-the-art on COCO test-dev. This previous state-of-the-art is attained by our small variant that is 3.8x parameter-efficient and 27x computation-efficient. Axial-DeepLab also achieves state-of-the-art results on Mapillary Vistas and Cityscapes.
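
The core idea factorizes one expensive 2D self-attention over all H×W positions into a 1D attention along the height axis followed by a 1D attention along the width axis. Below is a minimal PyTorch sketch of that factorization; it is not the authors' reference implementation, all class and variable names are illustrative, and the paper's position-sensitive relative-position terms (added to queries, keys, and values) are omitted for brevity.

```python
# Minimal axial-attention sketch, assuming NCHW tensors; illustrative names
# only, not the authors' code (position-sensitive terms omitted).
import torch
import torch.nn as nn

class AxialAttention1D(nn.Module):
    """Single-head self-attention along one spatial axis of a feature map."""
    def __init__(self, dim, axis):
        super().__init__()
        self.axis = axis                      # 2 = height, 3 = width (NCHW)
        self.qkv = nn.Conv2d(dim, dim * 3, kernel_size=1, bias=False)

    def forward(self, x):                     # x: (N, C, H, W)
        q, k, v = self.qkv(x).chunk(3, dim=1)
        # Fold the other spatial axis into the batch so attention runs in 1D.
        perm = (0, 3, 1, 2) if self.axis == 2 else (0, 2, 1, 3)
        q, k, v = (t.permute(*perm) for t in (q, k, v))  # (N, other, C, axis)
        attn = torch.einsum('boci,bocj->boij', q, k) / q.shape[2] ** 0.5
        out = torch.einsum('boij,bocj->boci', attn.softmax(dim=-1), v)
        # Undo the permutation, back to (N, C, H, W).
        inv = (0, 2, 3, 1) if self.axis == 2 else (0, 2, 1, 3)
        return out.permute(*inv)

class AxialBlock(nn.Module):
    """2D attention factorized as height-axis then width-axis 1D attention."""
    def __init__(self, dim):
        super().__init__()
        self.height = AxialAttention1D(dim, axis=2)
        self.width = AxialAttention1D(dim, axis=3)

    def forward(self, x):
        return self.width(self.height(x))

x = torch.randn(2, 32, 16, 16)                # toy feature map
print(AxialBlock(32)(x).shape)                # torch.Size([2, 32, 16, 16])
```

Stacking the two axes means each pixel still aggregates information from the whole map, while the attention cost drops from O((HW)^2) for full 2D self-attention to roughly O(HW(H+W)).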


Related research:

06/13/2019 · Stand-Alone Self-Attention in Vision Models
Convolutions are a fundamental building block of modern computer vision ...

07/17/2020 · Region-based Non-local Operation for Video Classification
Convolutional Neural Networks (CNNs) model long-range dependencies by de...

10/31/2021 · A Simple Approach to Image Tilt Correction with Self-Attention MobileNet for Smartphones
The main contributions of our work are two-fold. First, we present a Sel...

05/13/2022 · Local Attention Graph-based Transformer for Multi-target Genetic Alteration Prediction
Classical multiple instance learning (MIL) methods are often based on th...

11/24/2021 · MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video
Self-attention has become an integral component of the recent network ar...

03/29/2022 · Domain Invariant Siamese Attention Mask for Small Object Change Detection via Everyday Indoor Robot Navigation
The problem of image change detection via everyday indoor robot navigati...
