Informative Data Selection with Uncertainty for Multi-modal Object Detection

04/23/2023
by   Xinyu Zhang, et al.

Noise has always been a non-negligible problem in object detection: it creates confusion in model reasoning and thereby reduces the informativeness of the data. It can lead to inaccurate recognition because it shifts the observed pattern, which demands robust generalization from the models. To implement a general vision model, we need to develop deep learning models that can adaptively select valid information from multi-modal data. There are two main reasons for this: multi-modal learning can overcome the inherent defects of single-modal data, and adaptive information selection can reduce chaos in multi-modal data. To tackle this problem, we propose a universal uncertainty-aware multi-modal fusion model. It adopts a multi-pipeline, loosely coupled architecture to combine the features and results from point clouds and images. To quantify the correlation in multi-modal information, we model the uncertainty, as the inverse of data information, in different modalities and embed it in the bounding box generation. In this way, our model reduces the randomness in fusion and generates reliable output. Moreover, we conducted a comprehensive investigation on the KITTI 2D object detection dataset and its derived dirty data. Our fusion model is shown to resist severe noise interference such as Gaussian noise, motion blur, and frost, with only slight degradation. The experimental results demonstrate the benefits of our adaptive fusion. Our analysis of the robustness of multi-modal fusion will provide further insights for future research.
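
The idea of treating uncertainty as the inverse of information and embedding it in bounding box generation can be illustrated with a small sketch. The abstract does not give the actual fusion rule, so the code below is not the authors' method: it assumes a generic inverse-variance weighting of per-coordinate box predictions, and the function name `fuse_boxes`, the `(x1, y1, x2, y2)` box layout, and the per-coordinate variance inputs are all hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumed box layout: (x1, y1, x2, y2) in pixels. Hypothetical, not from the paper.
Box = Tuple[float, float, float, float]


def fuse_boxes(
    boxes: Sequence[Box],
    variances: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Fuse box predictions from several modalities by inverse-variance weighting.

    Each modality contributes one box and one predicted variance per
    coordinate. A low variance (high information) yields a high weight.
    Returns the fused box and the fused per-coordinate variance.
    """
    if not boxes or len(boxes) != len(variances):
        raise ValueError("need one variance vector per box")
    for var in variances:
        if len(var) != 4 or any(v <= 0.0 for v in var):
            raise ValueError("each variance vector needs 4 positive values")

    fused: List[float] = []
    fused_var: List[float] = []
    for k in range(4):
        # Weight of each modality for coordinate k is 1 / variance.
        weights = [1.0 / var[k] for var in variances]
        total = sum(weights)
        fused.append(sum(w * box[k] for w, box in zip(weights, boxes)) / total)
        # The fused estimate is never less certain than the best single input.
        fused_var.append(1.0 / total)
    return fused, fused_var


if __name__ == "__main__":
    # Illustrative numbers only: a confident image box and a noisier point-cloud box.
    image_box = (100.0, 100.0, 200.0, 200.0)
    lidar_box = (110.0, 110.0, 210.0, 210.0)
    box, var = fuse_boxes(
        [image_box, lidar_box],
        [[1.0] * 4, [4.0] * 4],
    )
    print(box, var)
```

Under this assumed rule, a modality corrupted by noise would report a larger variance and so pull the fused box less, which is one simple way to read "adaptive information selection".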
