TransCMD: Cross-Modal Decoder Equipped with Transformer for RGB-D Salient Object Detection

12/04/2021
by Youwei Pang, et al.

Most existing RGB-D salient object detection (SOD) methods rely on convolution and build complex interwoven fusion structures to integrate cross-modal information. The inherent local connectivity of convolution places a ceiling on the performance of these convolution-based methods. In this work, we rethink the task from the perspective of global information alignment and transformation. Specifically, the proposed method (TransCMD) cascades several cross-modal integration units to construct a top-down transformer-based information propagation path (TIPP). TransCMD treats multi-scale and multi-modal feature integration as a sequence-to-sequence context propagation and update process built on the transformer. In addition, because the transformer's complexity is quadratic in the number of input tokens, we design a patch-wise token re-embedding strategy (PTRE) that keeps the computational cost acceptable. Experimental results on seven RGB-D SOD benchmark datasets demonstrate that a simple two-stream encoder-decoder framework equipped with the TIPP can surpass state-of-the-art purely CNN-based methods.
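The abstract does not spell out how PTRE works, so the following is only an illustrative sketch of why grouping pixels into patch tokens eases the quadratic cost. It assumes non-overlapping square patches whose feature vectors are concatenated into one token; the function names (`token_count`, `attention_pairs`, `patchify`) and the 32 x 32 feature-map size are hypothetical and not taken from the paper.

```python
def token_count(height, width, patch=1):
    """Number of tokens when each non-overlapping patch x patch block is one token."""
    if height % patch or width % patch:
        raise ValueError("height and width must be divisible by patch")
    return (height // patch) * (width // patch)


def attention_pairs(num_tokens):
    """Pairwise interactions among N tokens grow as N * N."""
    return num_tokens * num_tokens


def patchify(feature, patch):
    """Regroup an H x W grid of C-dim vectors into patch tokens (assumed scheme).

    feature: list of H rows, each a list of W pixels, each a list of C floats.
    Returns (H / patch) * (W / patch) tokens, each of length patch * patch * C.
    """
    height, width = len(feature), len(feature[0])
    token_count(height, width, patch)  # raises if the grid is not divisible
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            token = []
            for dy in range(patch):
                for dx in range(patch):
                    token.extend(feature[top + dy][left + dx])
            tokens.append(token)
    return tokens


if __name__ == "__main__":
    # Hypothetical 32 x 32 feature map: compare token counts and pair counts.
    for p in (1, 2, 4):
        n = token_count(32, 32, p)
        print(f"patch={p}: tokens={n}, pairwise interactions={attention_pairs(n)}")
```

Under these assumptions, a patch size of p divides the token count by p² and the number of pairwise interactions by p⁴, at the price of a longer per-token vector.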

research
02/16/2023

Hierarchical Cross-modal Transformer for RGB-D Salient Object Detection

Most of existing RGB-D salient object detection (SOD) methods follow the...
research
04/25/2021

Visual Saliency Transformer

Recently, massive saliency detection methods have achieved promising res...
research
01/25/2021

RGB-D Salient Object Detection via 3D Convolutional Neural Networks

RGB-D salient object detection (SOD) recently has attracted increasing r...
research
01/29/2021

Self-Supervised Representation Learning for RGB-D Salient Object Detection

Existing CNNs-Based RGB-D Salient Object Detection (SOD) networks are al...
research
07/13/2020

Hierarchical Dynamic Filtering Network for RGB-D Salient Object Detection

The main purpose of RGB-D salient object detection (SOD) is how to bette...
research
01/29/2017

MSCM-LiFe: Multi-scale cross modal linear feature for horizon detection in maritime images

This paper proposes a new method for horizon detection called the multi-...
research
03/19/2020

Depth Potentiality-Aware Gated Attention Network for RGB-D Salient Object Detection

There are two main issues in RGB-D salient object detection: (1) how to ...