Cross-Modal Attentional Context Learning for RGB-D Object Detection

10/30/2018
by   Guanbin Li, et al.

Recognizing objects from simultaneously sensed photometric (RGB) and depth channels is a fundamental yet practical problem in many machine vision applications such as robot grasping and autonomous driving. In this paper, we address this problem by developing a Cross-Modal Attentional Context (CMAC) learning framework, which enables the full exploitation of the context information from both RGB and depth data. Compared to existing RGB-D object detection frameworks, our approach has several appealing properties. First, it includes an attention-based global context model that exploits adaptive contextual information and incorporates it into a region-based CNN (e.g., Fast R-CNN) framework to improve object detection performance. Second, our CMAC framework contains a fine-grained object part attention module that harnesses multiple discriminative object parts inside each candidate object region for a superior local feature representation. Besides greatly improving the accuracy of RGB-D object detection, the cross-modal information fusion and attentional context modeling in our model also provide an interpretable visualization scheme. Experimental results demonstrate that the proposed method significantly improves upon the state of the art on all public benchmarks.
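To make the idea of attention-based global context concrete, the following is a minimal sketch and not the CMAC architecture itself. It assumes the simplest possible choices: RGB and depth features are fused by per-location concatenation, attention scores are plain dot products between a region feature and each location, and the resulting context vector is concatenated onto the region feature. All function names (`attend`, `fuse_modalities`, `region_with_context`) are hypothetical and introduced only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_modalities(rgb_features, depth_features):
    """Assumed fusion: concatenate RGB and depth features at each location."""
    return [rgb + depth for rgb, depth in zip(rgb_features, depth_features)]


def attend(features, query):
    """Soft attention: score each location by its dot product with the query,
    then return the attention-weighted average feature and the weights."""
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, features))
        for d in range(dim)
    ]
    return context, weights


def region_with_context(rgb_features, depth_features, region_feature):
    """Append an attention-pooled global context vector to a region feature."""
    fused = fuse_modalities(rgb_features, depth_features)
    context, weights = attend(fused, region_feature)
    return region_feature + context, weights


# Toy input: three spatial locations, 2-D RGB features, 1-D depth features.
rgb = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
depth = [[0.0], [0.0], [2.0]]
region = [0.0, 0.0, 1.0]  # hypothetical region feature in the fused space

descriptor, weights = region_with_context(rgb, depth, region)
```

The attention weights can be inspected directly, which is the sense in which such a model lends itself to visualization: locations with larger weights contributed more context to the region's descriptor.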


