Boosting Salient Object Detection with Transformer-based Asymmetric Bilateral U-Net

08/17/2021
by   Yu Qiu, et al.

Existing salient object detection (SOD) methods mainly rely on CNN-based U-shaped structures with skip connections to combine the global contexts and local spatial details that are crucial for locating salient objects and refining object details, respectively. Despite great successes, the ability of CNN in learning global contexts is limited. Recently, the vision transformer has achieved revolutionary progress in computer vision owing to its powerful modeling of global dependencies. However, directly applying the transformer to SOD is suboptimal because the transformer lacks the ability to learn local spatial representations. To this end, this paper explores the combination of transformer and CNN to learn both global and local representations for SOD. We propose a transformer-based Asymmetric Bilateral U-Net (ABiU-Net). The asymmetric bilateral encoder has a transformer path and a lightweight CNN path, where the two paths communicate at each encoder stage to learn complementary global contexts and local spatial details, respectively. The asymmetric bilateral decoder also consists of two paths to process features from the transformer and CNN encoder paths, with communication at each decoder stage for decoding coarse salient object locations and fine-grained object details, respectively. Such communication between the two encoder/decoder paths enables ABiU-Net to learn complementary global and local representations, taking advantage of the natural properties of transformer and CNN, respectively. Hence, ABiU-Net provides a new perspective for transformer-based SOD. Extensive experiments demonstrate that ABiU-Net performs favorably against previous state-of-the-art SOD methods. The code will be released.
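To make the bilateral-encoder idea concrete, here is a minimal NumPy sketch of one encoder stage: a transformer-style path applies global self-attention over all spatial positions, a CNN-style path applies a local 3x3 filter, and the two paths communicate by exchanging their outputs. The specific operations (identity attention projections, a fixed averaging filter, fusion by element-wise addition) are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def softmax(s, axis=-1):
    e = np.exp(s - s.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_attention(x):
    """Transformer-path op: self-attention over all H*W positions,
    so every output location can aggregate global context."""
    H, W, C = x.shape
    tokens = x.reshape(H * W, C)
    attn = softmax(tokens @ tokens.T / np.sqrt(C))
    return (attn @ tokens).reshape(H, W, C)

def local_conv3x3(x):
    """CNN-path op: a fixed 3x3 averaging filter standing in for a
    learned convolution; each output sees only a local neighborhood."""
    H, W, C = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += padded[i:i + H, j:j + W] / 9.0
    return out

def bilateral_stage(t_feat, c_feat):
    """One encoder stage: each path processes its own features, then
    the paths communicate (here: element-wise addition, an assumed
    fusion) so both carry global and local information forward."""
    t_out = global_attention(t_feat)
    c_out = local_conv3x3(c_feat)
    return t_out + c_out, c_out + t_out

x = np.random.rand(8, 8, 16)
t_next, c_next = bilateral_stage(x, x)
print(t_next.shape, c_next.shape)
```

Stacking several such stages (with downsampling between them) would yield the U-shaped encoder described above; the decoder would mirror this with two communicating upsampling paths.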


