CCNet: Criss-Cross Attention for Semantic Segmentation

11/28/2018
by Zilong Huang, et al.

Long-range dependencies can capture useful contextual information to benefit visual understanding problems. In this work, we propose a Criss-Cross Network (CCNet) for obtaining such contextual information in a more effective and efficient way. Concretely, for each pixel, CCNet harvests the contextual information of the pixels on its criss-cross path (the pixels in the same row and the same column) through a novel criss-cross attention module. By applying this operation recurrently, each pixel can finally capture long-range dependencies from all pixels. Overall, CCNet has the following merits: 1) GPU memory friendly. Compared with the non-local block, the recurrent criss-cross attention module requires 11× less GPU memory. 2) High computational efficiency. In computing long-range dependencies, the recurrent criss-cross attention reduces FLOPs by about 85% compared with the non-local block. 3) State-of-the-art performance. We conduct extensive experiments on popular semantic segmentation benchmarks, including Cityscapes and ADE20K, and on the instance segmentation benchmark COCO. In particular, CCNet achieves mIoU scores of 81.4 and 45.22 on the Cityscapes test set and the ADE20K validation set, respectively, which are new state-of-the-art results. We make the code publicly available at <https://github.com/speedinghzl/CCNet>.
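
The criss-cross idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (that is in the linked repository); it is a minimal single-image version written under several assumptions: the function names `criss_cross_attention` and `recurrent_criss_cross`, the projection matrices `wq`, `wk`, `wv`, the plain residual connection, and the choice of two recurrent steps with shared weights are all illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax; exp(-inf) evaluates to 0 for masked entries.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def criss_cross_attention(x, wq, wk, wv):
    """One criss-cross attention pass (illustrative sketch).

    x:  (H, W, C) feature map
    wq, wk: (C, Ck) query/key projections (hypothetical names)
    wv: (C, C) value projection (hypothetical name)
    """
    H, W, C = x.shape
    q, k, v = x @ wq, x @ wk, x @ wv
    # Energies between pixel (i, j) and every pixel in row i.
    e_row = np.einsum("ijc,ikc->ijk", q, k)   # (H, W, W)
    # Energies between pixel (i, j) and every pixel in column j.
    e_col = np.einsum("ijc,kjc->ijk", q, k)   # (H, W, H)
    # The pixel itself is already counted in its row, so mask it out of the
    # column part: each pixel then attends to H + W - 1 positions.
    self_mask = np.eye(H, dtype=bool)[:, None, :]          # (H, 1, H)
    e_col = np.where(self_mask, -np.inf, e_col)
    # A single softmax over the whole criss-cross path.
    attn = softmax(np.concatenate([e_row, e_col], axis=-1), axis=-1)
    a_row, a_col = attn[..., :W], attn[..., W:]
    out = (np.einsum("ijk,ikc->ijc", a_row, v)
           + np.einsum("ijk,kjc->ijc", a_col, v))
    # Assumption: plain residual connection, no learned scale.
    return x + out

def recurrent_criss_cross(x, wq, wk, wv, steps=2):
    # Repeating the pass lets information travel row -> column (or
    # column -> row), so after two steps every pixel can reach every other.
    for _ in range(steps):
        x = criss_cross_attention(x, wq, wk, wv)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H, W, C = 4, 5, 3
    x = rng.normal(size=(H, W, C))
    wq = 0.1 * rng.normal(size=(C, 2))
    wk = 0.1 * rng.normal(size=(C, 2))
    wv = rng.normal(size=(C, C))
    y = recurrent_criss_cross(x, wq, wk, wv)
    print(y.shape)  # (4, 5, 3)
```

In this sketch each pixel holds H + W - 1 non-zero attention weights per pass, whereas a full non-local block holds one weight for every one of the H × W positions. The memory and FLOPs figures quoted in the abstract are the authors' measurements and are not derived from this code.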


