SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection

04/27/2023
by Yichen Xie, et al.

By identifying four important components of existing LiDAR-camera 3D object detection methods (LiDAR candidates, camera candidates, transformation, and fusion outputs), we observe that all existing methods either find dense candidates or yield dense representations of scenes. However, given that objects occupy only a small part of a scene, finding dense candidates and generating dense representations is noisy and inefficient. We propose SparseFusion, a novel multi-sensor 3D detection method that exclusively uses sparse candidates and sparse representations. Specifically, SparseFusion utilizes the outputs of parallel detectors in the LiDAR and camera modalities as sparse candidates for fusion. We transform the camera candidates into the LiDAR coordinate space by disentangling the object representations. Then, we fuse the multi-modality candidates in a unified 3D space with a lightweight self-attention module. To mitigate negative transfer between modalities, we propose novel semantic and geometric cross-modality transfer modules that are applied prior to the modality-specific detectors. SparseFusion achieves state-of-the-art performance on the nuScenes benchmark while also running at the fastest speed, even outperforming methods with stronger backbones. We perform extensive experiments to demonstrate the effectiveness and efficiency of our modules and overall method pipeline. Our code will be made publicly available at https://github.com/yichen928/SparseFusion.
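
The fusion step described above (concatenating sparse LiDAR and camera candidates and mixing them with self-attention) can be illustrated with a minimal sketch. This is not the authors' implementation: the single attention head, the random untrained projection weights, the feature width `d = 4`, the candidate counts, and every function name below are assumptions chosen only to keep the example small and dependency-free. It also assumes the camera candidates have already been mapped into the LiDAR coordinate space.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def random_matrix(rows, cols, rng):
    """Stand-in for learned weights or features (hypothetical, untrained)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def fuse_candidates(lidar_feats, camera_feats, w_q, w_k, w_v):
    """Single-head self-attention over the concatenated sparse candidates."""
    # Token sequence of shape (N_lidar + N_cam) x d: one token per candidate.
    tokens = lidar_feats + camera_feats
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(w_q[0]))
    k_t = [list(col) for col in zip(*k)]
    # Scaled dot-product scores between every pair of candidates.
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    attn = [softmax(row) for row in scores]
    # Each fused candidate is an attention-weighted mix of all candidates.
    return matmul(attn, v), attn


rng = random.Random(0)
d = 4                                    # assumed feature width
lidar_feats = random_matrix(3, d, rng)   # 3 hypothetical LiDAR candidates
camera_feats = random_matrix(2, d, rng)  # 2 hypothetical camera candidates
w_q, w_k, w_v = (random_matrix(d, d, rng) for _ in range(3))
fused, attn = fuse_candidates(lidar_feats, camera_feats, w_q, w_k, w_v)
print(len(fused), len(fused[0]))
```

Because attention runs over a handful of object candidates rather than a dense grid, the cost grows with the square of the number of candidates, not with the size of the scene, which is the efficiency argument the abstract makes for sparse representations.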


