Transformer-based Network for RGB-D Saliency Detection

12/01/2021
by   Yue Wang, et al.
RGB-D saliency detection integrates information from both RGB images and depth maps to improve the prediction of salient regions under challenging conditions. The key to RGB-D saliency detection is to fully mine and fuse information at multiple scales across the two modalities. Previous approaches tend to apply multi-scale and multi-modal fusion separately via local operations, which fails to capture long-range dependencies. Here we propose a transformer-based network to address this issue. Our proposed architecture is composed of two modules: a transformer-based within-modality feature enhancement module (TWFEM) and a transformer-based feature fusion module (TFFM). TFFM performs thorough feature fusion by integrating features from multiple scales and both modalities over all positions simultaneously. Before TFFM, TWFEM enhances the features at each scale by selecting and integrating complementary information from other scales within the same modality. We show that the transformer is a uniform operation that is highly effective for both feature fusion and feature enhancement, and that it simplifies the model design. Extensive experimental results on six benchmark datasets demonstrate that our proposed network performs favorably against state-of-the-art RGB-D saliency detection methods.
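
To make the idea of fusing "multiple scales and two modalities over all positions simultaneously" concrete, the following is a minimal NumPy sketch, not the authors' TFFM. It assumes every feature map already has the same channel width, flattens each map into one token per spatial position, concatenates all tokens from both modalities, and applies a single head of scaled dot-product self-attention. The function names (`flatten_to_tokens`, `self_attention`, `fuse`), the feature sizes, and the random projection matrices are hypothetical and chosen only for illustration; positional encodings, multiple heads, normalization, and feed-forward layers are omitted.

```python
import numpy as np


def flatten_to_tokens(feature_map):
    """Turn a (C, H, W) feature map into (H*W, C) tokens, one per position."""
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).T


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


def fuse(rgb_feats, depth_feats, w_q, w_k, w_v):
    """Hypothetical fusion step: attend jointly over every scale and modality."""
    all_maps = rgb_feats + depth_feats  # list concatenation
    tokens = np.concatenate([flatten_to_tokens(f) for f in all_maps], axis=0)
    return self_attention(tokens, w_q, w_k, w_v)


# Assumed toy setup: 3 scales per modality, 8 channels each.
rng = np.random.default_rng(0)
C = 8
sizes = [(8, 8), (4, 4), (2, 2)]
rgb_feats = [rng.standard_normal((C, h, w)) for h, w in sizes]
depth_feats = [rng.standard_normal((C, h, w)) for h, w in sizes]
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))

fused, weights = fuse(rgb_feats, depth_feats, w_q, w_k, w_v)
print(fused.shape, weights.shape)  # (168, 8) (168, 168)
```

Because the attention matrix spans all 168 tokens, every position at every scale in either modality can draw on every other position in one step, which is the long-range behavior that separate local fusion operations lack.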
