From One to Many: Dynamic Cross Attention Networks for LiDAR and Camera Fusion

09/25/2022
by Rui Wan, et al.

LiDAR and cameras are two complementary sensors for 3D perception in autonomous driving. LiDAR point clouds carry accurate spatial and geometric information, while RGB images provide texture and color for context reasoning. To exploit LiDAR and cameras jointly, existing fusion methods tend to align each 3D point to exactly one projected image pixel based on calibration, i.e., a one-to-one mapping. However, the performance of these approaches relies heavily on calibration quality, which is sensitive to the temporal and spatial synchronization of the sensors. Therefore, we propose a Dynamic Cross Attention (DCA) module with a novel one-to-many cross-modality mapping that learns multiple offsets from the initial projection towards its neighborhood and thus develops tolerance to calibration error. Moreover, a dynamic query enhancement is proposed to perceive the model-independent calibration, which further strengthens DCA's tolerance to the initial misalignment. The whole fusion architecture, named Dynamic Cross Attention Network (DCAN), exploits multi-level image features and adapts to multiple representations of point clouds, which allows DCA to serve as a plug-in fusion module. Extensive experiments on nuScenes and KITTI prove DCA's effectiveness. The proposed DCAN outperforms state-of-the-art methods on the nuScenes detection challenge.
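The core idea of the one-to-many mapping can be illustrated with a small sketch: instead of reading the image feature at the single calibrated projection of a LiDAR point, sample K offset locations around it and aggregate them with attention weights. The code below is a minimal NumPy illustration of that sampling scheme, not the authors' implementation; in DCA the offsets and weights are predicted by learned layers, whereas here they are simply passed in as arrays.

```python
import numpy as np

def bilinear_sample(feat, u, v):
    """Bilinearly sample a (H, W, C) feature map at float pixel coords (u, v)."""
    H, W, _ = feat.shape
    u = np.clip(u, 0, W - 1)
    v = np.clip(v, 0, H - 1)
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    u1, v1 = min(u0 + 1, W - 1), min(v0 + 1, H - 1)
    du, dv = u - u0, v - v0
    return ((1 - du) * (1 - dv) * feat[v0, u0]
            + du * (1 - dv) * feat[v0, u1]
            + (1 - du) * dv * feat[v1, u0]
            + du * dv * feat[v1, u1])

def one_to_many_fusion(feat, proj_uv, offsets, scores):
    """One-to-many cross-modality mapping (illustrative sketch).

    feat:    (H, W, C) image feature map
    proj_uv: (N, 2)    initial calibrated projections of N LiDAR points
    offsets: (N, K, 2) pixel offsets around each projection
                       (learned in DCA; given as input here)
    scores:  (N, K)    unnormalized attention scores (softmax-normalized below)
    returns: (N, C)    fused image features, tolerant to projection error
    """
    N, K, _ = offsets.shape
    out = np.zeros((N, feat.shape[2]))
    for i in range(N):
        # softmax over the K sampling locations of this point
        w = np.exp(scores[i] - scores[i].max())
        w /= w.sum()
        for k in range(K):
            u = proj_uv[i, 0] + offsets[i, k, 0]
            v = proj_uv[i, 1] + offsets[i, k, 1]
            out[i] += w[k] * bilinear_sample(feat, u, v)
    return out
```

Because the K samples cover a neighborhood of the initial projection, a small calibration error merely shifts which samples dominate the attention weights instead of corrupting the single fetched feature, which is the tolerance property the abstract describes.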

Related Research

03/30/2023 · BEVFusion4D: Learning LiDAR-Camera Fusion Under Bird's-Eye-View via Cross-Modality Guidance and Temporal Aggregation
Integrating LiDAR and Camera information into Bird's-Eye-View (BEV) has ...

09/22/2022 · FusionRCNN: LiDAR-Camera Fusion for Two-stage 3D Object Detection
3D object detection with multi-sensors is essential for an accurate and ...

03/13/2023 · A Generalized Multi-Modal Fusion Detection Framework
LiDAR point clouds have become the most common data source in autonomous...

04/19/2023 · UniCal: a Single-Branch Transformer-Based Model for Camera-to-LiDAR Calibration and Validation
We introduce a novel architecture, UniCal, for Camera-to-LiDAR (C2L) ext...

03/17/2023 · LCE-Calib: Automatic LiDAR-Frame/Event Camera Extrinsic Calibration With A Globally Optimal Solution
The combination of LiDARs and cameras enables a mobile robot to perceive...

02/28/2022 · Large-Scale 3D Semantic Reconstruction for Automated Driving Vehicles with Adaptive Truncated Signed Distance Function
The Large-scale 3D reconstruction, texturing and semantic mapping are no...

03/30/2022 · Interactive Multi-scale Fusion of 2D and 3D Features for Multi-object Tracking
Multiple object tracking (MOT) is a significant task in achieving autono...
