Depth-Cooperated Trimodal Network for Video Salient Object Detection

02/12/2022
by Yukang Lu, et al.

Depth can provide useful geometric cues for salient object detection (SOD), and it has proven helpful in recent RGB-D SOD methods. However, existing video salient object detection (VSOD) methods rely only on spatiotemporal information and seldom exploit depth. In this paper, we propose a depth-cooperated trimodal network, called DCTNet, for VSOD, a pioneering work that incorporates depth information to assist VSOD. To this end, we first generate depth from RGB frames and then propose an approach that treats the three modalities unequally. Specifically, a multi-modal attention module (MAM) is designed to model multi-modal long-range dependencies between the main modality (RGB) and the two auxiliary modalities (depth and optical flow). We also introduce a refinement fusion module (RFM) that suppresses noise in each modality and dynamically selects useful information for further feature refinement. Lastly, a progressive fusion strategy is applied to the refined features to achieve the final cross-modal fusion. Experiments on five benchmark datasets demonstrate the superiority of our depth-cooperated model over 12 state-of-the-art methods, and they also validate the necessity of depth.
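To make the idea of unequal modalities concrete, the following is a minimal sketch, not the authors' MAM: it assumes plain scaled dot-product attention in which RGB features act as queries and an auxiliary modality (depth or optical flow) supplies keys and values, followed by a residual connection that keeps RGB as the main signal. The function names `cross_modal_attention` and `fuse_trimodal`, the flattened `(N, C)` feature layout, and the simple averaging step are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_modal_attention(main, aux):
    """Hypothetical cross-modal attention (an assumption, not the paper's MAM).

    main: (N, C) features of the main modality (RGB), used as queries.
    aux:  (N, C) features of an auxiliary modality, used as keys and values.
    Returns the enhanced main features and the (N, N) attention weights.
    """
    scale = np.sqrt(main.shape[-1])
    # Each RGB position attends to every auxiliary position (long-range).
    weights = softmax(main @ aux.T / scale, axis=-1)
    # Residual connection: RGB stays the main signal, aux only adds to it.
    enhanced = main + weights @ aux
    return enhanced, weights


def fuse_trimodal(rgb, depth, flow):
    """Average the depth-enhanced and flow-enhanced RGB features.

    Plain averaging is a placeholder assumption; the paper describes a
    refinement fusion module and progressive fusion instead.
    """
    from_depth, _ = cross_modal_attention(rgb, depth)
    from_flow, _ = cross_modal_attention(rgb, flow)
    return 0.5 * (from_depth + from_flow)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # 6 spatial positions with 4 channels each, random stand-in features.
    rgb, depth, flow = (rng.standard_normal((6, 4)) for _ in range(3))
    fused = fuse_trimodal(rgb, depth, flow)
    print(fused.shape)
```

The asymmetry lies in the roles: only RGB produces queries and receives the residual path, so an uninformative auxiliary input (for example, all zeros) leaves the RGB features unchanged in this sketch.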
