HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness

01/18/2023
by   Zongwei Wu, et al.

RGB-D saliency detection aims to fuse multi-modal cues to accurately localize salient regions. Existing works often adopt attention modules for feature modeling, with few methods explicitly leveraging fine-grained details to merge with semantic cues. Thus, despite the auxiliary depth information, it is still challenging for existing models to distinguish objects with similar appearances but at distinct camera distances. In this paper, from a new perspective, we propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection. Our motivation comes from the observation that the multi-granularity properties of geometric priors correlate well with the neural network hierarchies. To realize multi-modal and multi-level fusion, we first use a granularity-based attention scheme to strengthen the discriminatory power of RGB and depth features separately. Then we introduce a unified cross dual-attention module for multi-modal and multi-level fusion in a coarse-to-fine manner. The encoded multi-modal features are gradually aggregated into a shared decoder. Further, we exploit a multi-scale loss to take full advantage of the hierarchical information. Extensive experiments on challenging benchmark datasets demonstrate that our HiDAnet performs favorably over the state-of-the-art methods by large margins.
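The abstract mentions a multi-scale loss but does not give its form. As a rough illustration only, the sketch below assumes one plausible reading: each decoder stage emits a saliency map at its own resolution, the ground-truth mask is resized to match, and per-scale binary cross-entropy terms are summed. The function names, the nearest-neighbour resizing, the choice of binary cross-entropy, and the equal default weights are all assumptions made for this example and are not taken from the paper.

```python
import math


def downsample_nearest(mask, out_h, out_w):
    """Nearest-neighbour resize of a 2D list to (out_h, out_w)."""
    in_h, in_w = len(mask), len(mask[0])
    return [
        [mask[(i * in_h) // out_h][(j * in_w) // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two same-sized 2D lists."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
            count += 1
    return total / count


def multi_scale_loss(preds, gt, weights=None):
    """Weighted sum of per-scale BCE terms (hypothetical formulation).

    preds:   list of 2D probability maps, one per scale, any resolution
    gt:      full-resolution 2D binary mask
    weights: optional per-scale weights; equal weights are assumed if omitted
    """
    if weights is None:
        weights = [1.0] * len(preds)
    loss = 0.0
    for w, pred in zip(weights, preds):
        # Resize the ground truth to the resolution of this prediction.
        target = downsample_nearest(gt, len(pred), len(pred[0]))
        loss += w * bce(pred, target)
    return loss


if __name__ == "__main__":
    gt = [[1, 1, 0, 0]] * 4                      # 4x4 mask, left half salient
    coarse = [[0.9, 0.1], [0.8, 0.2]]            # 2x2 prediction
    fine = [[0.9, 0.8, 0.2, 0.1]] * 4            # 4x4 prediction
    print(multi_scale_loss([coarse, fine], gt))
```

Supervising every scale in this way gives coarse stages a direct training signal instead of relying only on gradients from the final output, which is the usual motivation for such losses; whether HiDAnet weights or formulates the terms this way is not stated in the abstract.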

research
03/22/2021

Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion

RGB-D salient object detection (SOD) is usually formulated as a problem ...
research
08/18/2021

Specificity-preserving RGB-D Saliency Detection

RGB-D saliency detection has attracted increasing attention, due to its ...
research
08/13/2021

Modal-Adaptive Gated Recoding Network for RGB-D Salient Object Detection

The multi-modal salient object detection model based on RGB-D informatio...
research
08/02/2022

Robust RGB-D Fusion for Saliency Detection

Efficiently exploiting multi-modal inputs for accurate RGB-D saliency de...
research
07/06/2020

BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network

Multi-level feature fusion is a fundamental topic in computer vision for...
research
06/18/2021

Multi-Granularity Network with Modal Attention for Dense Affective Understanding

Video affective understanding, which aims to predict the evoked expressi...
research
07/13/2022

Symmetry-Aware Transformer-based Mirror Detection

Mirror detection aims to identify the mirror regions in the given input ...
