Bridging the View Disparity of Radar and Camera Features for Multi-modal Fusion 3D Object Detection

08/25/2022
by   Taohua Zhou, et al.

Environmental perception with multi-modal fusion of radar and camera is crucial in autonomous driving to increase accuracy, completeness, and robustness. This paper focuses on how to utilize millimeter-wave (MMW) radar and camera sensor fusion for 3D object detection. A novel method is proposed that realizes feature-level fusion under bird's-eye view (BEV) for a better feature representation. First, radar features are augmented with temporal accumulation and sent to a temporal-spatial encoder for radar feature extraction. Meanwhile, multi-scale 2D image features adapted to various spatial scales are obtained by an image backbone and neck. The image features are then transformed to BEV with the designed view transformer. In addition, this work fuses the multi-modal features with a two-stage fusion model consisting of point fusion and ROI fusion. Finally, a detection head regresses object categories and 3D locations. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on the key detection metrics, mean average precision (mAP) and nuScenes detection score (NDS), on the challenging nuScenes dataset.
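The pipeline described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; all function names, grid sizes, and feature dimensions are hypothetical, and the encoders are reduced to simple placeholders (scatter-to-grid for the radar branch, per-cell concatenation for point fusion, box-average pooling for ROI fusion):

```python
import numpy as np

def accumulate_radar(frames):
    """Temporal accumulation: stack several past radar sweeps into one point set.
    Each frame is an (N, feat_dim) array; the real method also compensates ego motion."""
    return np.concatenate(frames, axis=0)

def radar_to_bev(points, grid=(128, 128), feat_dim=4, extent=50.0):
    """Scatter radar points into a BEV grid (a stand-in for the temporal-spatial encoder).
    points[:, 0:2] are x, y in meters within [-extent, extent]."""
    bev = np.zeros((grid[0], grid[1], feat_dim))
    ix = ((points[:, 0] + extent) / (2 * extent) * grid[0]).astype(int).clip(0, grid[0] - 1)
    iy = ((points[:, 1] + extent) / (2 * extent) * grid[1]).astype(int).clip(0, grid[1] - 1)
    bev[ix, iy] = points[:, :feat_dim]  # last point per cell wins in this toy version
    return bev

def point_fusion(radar_bev, image_bev):
    """First fusion stage: per-cell concatenation of the two BEV feature maps."""
    return np.concatenate([radar_bev, image_bev], axis=-1)

def roi_fusion(fused_bev, rois):
    """Second fusion stage: pool fused features inside each proposal box (x0, y0, x1, y1)."""
    return np.stack([fused_bev[x0:x1, y0:y1].mean(axis=(0, 1))
                     for x0, y0, x1, y1 in rois])

# Toy end-to-end pass with random data (image BEV features assumed precomputed
# by the backbone/neck and view transformer).
frames = [np.random.uniform(-50, 50, size=(10, 4)) for _ in range(3)]
radar_bev = radar_to_bev(accumulate_radar(frames))
image_bev = np.random.rand(128, 128, 8)
fused = point_fusion(radar_bev, image_bev)          # (128, 128, 12)
roi_feats = roi_fusion(fused, [(0, 0, 16, 16), (32, 32, 64, 64)])  # (2, 12)
```

The key design point the abstract emphasizes is that both modalities are brought into the same BEV grid before fusion, so the per-cell concatenation in `point_fusion` is spatially aligned; the ROI stage then refines per-object features from the fused map.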
