Depth Quality-Inspired Feature Manipulation for Efficient RGB-D and Video Salient Object Detection

08/08/2022
by Wenbo Zhang, et al.

CNN-based RGB-D salient object detection (SOD) has recently improved significantly in detection accuracy. However, existing models rarely achieve high efficiency and high accuracy at the same time, which limits their use on mobile devices and in many real-world applications. To bridge the accuracy gap between lightweight and large RGB-D SOD models, this paper proposes an efficient module that greatly improves accuracy while adding little computation. Inspired by the fact that depth quality is a key factor influencing accuracy, we propose an efficient depth quality-inspired feature manipulation (DQFM) process, which dynamically filters depth features according to depth quality. DQFM relies on the alignment of low-level RGB and depth features, together with holistic attention from the depth stream, to explicitly control and enhance cross-modal fusion. We embed DQFM into an efficient lightweight RGB-D SOD model called DFM-Net, for which we additionally design a tailored depth backbone and a two-stage decoder as basic components. Extensive experiments on nine RGB-D datasets demonstrate that DFM-Net outperforms recent efficient models, running at about 20 FPS on CPU with a model size of only 8.5Mb, while being 2.9/2.4 times faster and 6.7/3.1 times smaller than the latest best models, A2dele and MobileSal, respectively. It also maintains state-of-the-art accuracy even when compared with non-efficient models. Interestingly, further statistics and analyses verify that DQFM can distinguish depth maps of various qualities without any quality labels. Finally, we apply DFM-Net to video SOD (VSOD), achieving performance comparable to recent efficient models while being 3 times faster and 2.3 times smaller than the prior best in this field. Our code is available at https://github.com/zwbx/DFM-Net.
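The idea of filtering depth features according to depth quality can be illustrated with a small NumPy sketch. This is not the DQFM implementation from the paper or its repository. It assumes, purely for illustration, that quality is estimated as a Dice-style overlap between thresholded edge maps of a low-level RGB feature and a low-level depth feature, and that the resulting scalar, together with a spatial attention map from the depth stream, scales the depth feature before additive fusion. The names `edge_map`, `alignment_score`, `gated_fusion` and the `thresh` parameter are hypothetical.

```python
import numpy as np


def edge_map(feat):
    """Crude edge strength of a 2-D map: gradient magnitude scaled to [0, 1]."""
    gy, gx = np.gradient(feat.astype(np.float64))
    mag = np.hypot(gx, gy)
    peak = mag.max()
    return mag / peak if peak > 0 else mag


def alignment_score(rgb_feat, depth_feat, thresh=0.5):
    """Assumed quality proxy: Dice-style overlap of binarised edge maps, in [0, 1]."""
    rgb_edges = edge_map(rgb_feat) > thresh
    depth_edges = edge_map(depth_feat) > thresh
    total = rgb_edges.sum() + depth_edges.sum()
    if total == 0:
        return 0.0
    overlap = np.logical_and(rgb_edges, depth_edges).sum()
    return float(2.0 * overlap / total)


def gated_fusion(rgb_feat, depth_feat, attention):
    """Scale the depth feature by the quality score and an attention map, then add."""
    alpha = alignment_score(rgb_feat, depth_feat)
    fused = rgb_feat + alpha * attention * depth_feat
    return fused, alpha


# Toy inputs: a square "object" in the RGB feature, one depth map whose edges
# coincide with it, and one flat depth map that carries no structure.
rgb = np.zeros((16, 16))
rgb[4:12, 4:12] = 1.0
good_depth = rgb.copy()
flat_depth = np.full((16, 16), 0.3)
attention = np.ones((16, 16))

_, alpha_good = gated_fusion(rgb, good_depth, attention)
fused_flat, alpha_flat = gated_fusion(rgb, flat_depth, attention)
print(alpha_good, alpha_flat)
```

In this toy setting the well-aligned depth map receives a high weight and the flat one is suppressed, so the fused feature falls back to the RGB feature alone. A learned model would predict such weights from features rather than compute them with a fixed rule.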


