BEVFusion4D: Learning LiDAR-Camera Fusion Under Bird's-Eye-View via Cross-Modality Guidance and Temporal Aggregation

03/30/2023
by   Hongxiang Cai, et al.

Integrating LiDAR and camera information into a Bird's-Eye-View (BEV) representation has become an essential topic for 3D object detection in autonomous driving. Existing methods mostly adopt an independent dual-branch framework to generate LiDAR and camera BEV features and then perform adaptive modality fusion. Since point clouds provide more accurate localization and geometry, they can serve as a reliable spatial prior for acquiring relevant semantic information from the images. We therefore design a LiDAR-Guided View Transformer (LGVT) to effectively obtain the camera representation in BEV space and thus benefit the whole dual-branch fusion system. LGVT takes the camera BEV as the primitive semantic query and repeatedly leverages the spatial cue of the LiDAR BEV to extract image features across multiple camera views. Moreover, we extend our framework into the temporal domain with our proposed Temporal Deformable Alignment (TDA) module, which aggregates BEV features from multiple historical frames. With these two modules, our framework, dubbed BEVFusion4D, achieves state-of-the-art results in 3D object detection, with 72.0 NDS on the nuScenes test set.
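The core idea of LGVT — using the LiDAR BEV as a spatial prior that guides how a camera BEV query attends to image features — can be illustrated with a minimal single-head cross-attention sketch. This is an assumption-laden toy, not the paper's implementation: the function name, shapes, and the simple additive fusion of the prior into the query are all illustrative choices, and the real module operates over multiple camera views, iterates the guidance, and uses learned projections.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def lidar_guided_attention(cam_bev_query, lidar_bev_prior, img_feats):
    """One simplified LiDAR-guided cross-attention step (illustrative only).

    cam_bev_query:   (N, d) camera BEV queries, one per BEV cell
    lidar_bev_prior: (N, d) LiDAR BEV features at the same cells,
                     used here as a spatial prior added to the query
                     (a hypothetical fusion; the paper's differs)
    img_feats:       (M, d) flattened image features to attend over
    """
    # Inject the LiDAR spatial cue into the semantic query.
    guided_query = cam_bev_query + lidar_bev_prior            # (N, d)
    # Scaled dot-product attention over the image features.
    d = img_feats.shape[1]
    scores = guided_query @ img_feats.T / np.sqrt(d)          # (N, M)
    attn = softmax(scores, axis=-1)                           # rows sum to 1
    # Each BEV cell gathers a convex combination of image features.
    return attn @ img_feats                                   # (N, d)
```

Because the attention weights are a convex combination, each output BEV feature stays within the range of the image features it gathered — the LiDAR prior only redistributes *where* the query looks, which is the intuition the abstract describes.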


