TANet: Transformer-based Asymmetric Network for RGB-D Salient Object Detection

07/04/2022
by Chang Liu, et al.

Existing RGB-D salient object detection (SOD) methods mainly rely on a symmetric two-stream CNN-based network to extract RGB and depth features separately. However, this conventional symmetric structure has two problems: first, CNNs have limited ability to learn global context; second, the symmetric two-stream structure ignores the inherent differences between the modalities. In this paper, we propose a Transformer-based asymmetric network (TANet) to tackle these issues. We use a Transformer (PVTv2) to extract global semantic information from RGB data and design a lightweight CNN backbone (LWDepthNet) to extract spatial structure information from depth data without pre-training. The resulting asymmetric hybrid encoder (AHE) reduces the number of parameters in the model and increases speed without sacrificing performance. Then, we design a cross-modal feature fusion module (CMFFM), which mutually enhances and fuses RGB and depth features. Finally, we add edge prediction as an auxiliary task and propose an edge enhancement module (EEM) to generate sharper contours. Extensive experiments demonstrate that our method achieves superior performance over 14 state-of-the-art RGB-D methods on six public datasets. Our code will be released at https://github.com/lc012463/TANet.
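To give a rough sense of why an asymmetric encoder can save parameters, the following back-of-the-envelope sketch counts the weights in two encoder layouts. It is only an illustration under loud assumptions: both streams are modelled as plain stacks of 3x3 convolutions (the actual RGB stream is a PVTv2 Transformer, which this does not model), and the stage widths `HEAVY_WIDTHS` and `LIGHT_WIDTHS` are hypothetical values, not the configurations of PVTv2 or LWDepthNet.

```python
def conv_params(c_in, c_out, k=3, bias=True):
    """Learnable parameters in one k x k convolution layer."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def stream_params(in_channels, widths, k=3):
    """Parameters of a plain stack of k x k convolutions, one per stage."""
    total, c_in = 0, in_channels
    for c_out in widths:
        total += conv_params(c_in, c_out, k)
        c_in = c_out
    return total


# Hypothetical stage widths, chosen only for illustration.
HEAVY_WIDTHS = [64, 128, 320, 512]   # stand-in for a large backbone
LIGHT_WIDTHS = [16, 32, 64, 128]     # stand-in for a lightweight depth backbone

rgb_stream = stream_params(3, HEAVY_WIDTHS)    # RGB input: 3 channels
depth_heavy = stream_params(1, HEAVY_WIDTHS)   # depth input: 1 channel
depth_light = stream_params(1, LIGHT_WIDTHS)

# Symmetric layout: the depth stream mirrors the heavy RGB stream.
symmetric_total = rgb_stream + depth_heavy
# Asymmetric layout: the depth stream uses the lightweight stack instead.
asymmetric_total = rgb_stream + depth_light

saving = 1 - asymmetric_total / symmetric_total
print(f"symmetric:  {symmetric_total:,} parameters")
print(f"asymmetric: {asymmetric_total:,} parameters")
print(f"relative saving in this toy setup: {saving:.1%}")
```

The numbers printed depend entirely on the assumed widths; the point is only that shrinking one of two mirrored streams removes a large share of the encoder's weights, which is the kind of saving the abstract attributes to the AHE.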

research
07/14/2020

A Single Stream Network for Robust and Real-time RGB-D Salient Object Detection

Existing RGB-D salient object detection (SOD) approaches concentrate on ...
research
06/24/2019

Cross-Channel Correlation Preserved Three-Stream Lightweight CNNs for Demosaicking

Demosaicking is a procedure to reconstruct full RGB images from Color Fi...
research
04/12/2022

SwinNet: Swin Transformer drives edge-aware RGB-D and RGB-T salient object detection

Convolutional neural networks (CNNs) are good at extracting contexture f...
research
08/17/2021

Boosting Salient Object Detection with Transformer-based Asymmetric Bilateral U-Net

Existing salient object detection (SOD) methods mainly rely on CNN-based...
research
03/09/2022

Fast Road Segmentation via Uncertainty-aware Symmetric Network

The high performance of RGB-D based road segmentation methods contrasts ...
research
08/02/2023

WCCNet: Wavelet-integrated CNN with Crossmodal Rearranging Fusion for Fast Multispectral Pedestrian Detection

Multispectral pedestrian detection achieves better visibility in challen...
research
05/05/2022

BasicTAD: an Astounding RGB-Only Baseline for Temporal Action Detection

Temporal action detection (TAD) is extensively studied in the video unde...
