BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View

by Junjie Huang, et al.

Autonomous driving perceives the surrounding environment for decision making, which makes it one of the most complicated scenarios for visual perception. The success of paradigm innovation in 2D object detection inspires us to seek an elegant, feasible, and scalable paradigm for pushing the performance boundary in this area. To this end, we contribute the BEVDet paradigm in this paper. BEVDet follows the principle of detecting 3D objects in Bird-Eye-View (BEV), where route planning can be conveniently performed. In this paradigm, four kinds of modules with different roles are applied in succession: an image-view encoder for encoding features in image view, a view transformer for transforming features from image view to BEV, a BEV encoder for further encoding features in BEV, and a task-specific head for predicting the targets in BEV. We merely reuse existing modules to construct BEVDet and make it feasible for multi-camera 3D object detection by designing a dedicated data augmentation strategy. The proposed paradigm works well in multi-camera 3D object detection and offers a good trade-off between computing budget and performance. BEVDet with a 704x256 image size (1/8 of the competitors') scores 29.4% mAP on the evaluation set, which is comparable with FCOS3D (i.e., 2008.2 GFLOPs, 1.7 FPS, 29.5% mAP and 37.2% NDS), while running 4.3 times faster. Scaling up the input size to 1408x512, BEVDet scores 34.9% mAP and surpasses FCOS3D by 5.4% mAP, which demonstrates the power of paradigm innovation.
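
The four-module layout described in the abstract can be illustrated with a minimal sketch that only tracks tensor shapes through the stages. It is not the authors' implementation: the function names, the stride of 16, the channel counts, the 128x128 BEV grid, the 10 output channels, and the six-camera rig are all assumptions chosen for illustration. Only the module order and the 704x256 input size come from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Shape = Tuple[int, ...]
Stage = Callable[[Shape], Shape]


def image_view_encoder(images: Shape) -> Shape:
    """(cams, 3, H, W) -> (cams, C, H/16, W/16). Stride 16 and C=256 are assumed."""
    cams, _, height, width = images
    return (cams, 256, height // 16, width // 16)


def view_transformer(features: Shape) -> Shape:
    """Fuse all camera features into one BEV grid (C, X, Y). Grid size is assumed."""
    _, channels, _, _ = features
    return (channels // 4, 128, 128)


def bev_encoder(bev: Shape) -> Shape:
    """Further encode in BEV. Here the grid is kept and channels are widened."""
    channels, x, y = bev
    return (channels * 4, x, y)


def detection_head(bev: Shape) -> Shape:
    """Predict per-cell targets in BEV. 10 output channels is an arbitrary choice."""
    _, x, y = bev
    return (10, x, y)


@dataclass
class FourStageDetector:
    image_view_encoder: Stage
    view_transformer: Stage
    bev_encoder: Stage
    head: Stage

    def trace(self, images: Shape) -> List[Shape]:
        """Run the stages in succession and record every intermediate shape."""
        shapes = [images]
        for stage in (self.image_view_encoder, self.view_transformer,
                      self.bev_encoder, self.head):
            shapes.append(stage(shapes[-1]))
        return shapes


detector = FourStageDetector(image_view_encoder, view_transformer,
                             bev_encoder, detection_head)

# Six cameras (assumed) at the abstract's 704x256 input size (width x height).
for shape in detector.trace((6, 3, 256, 704)):
    print(shape)
```

Because each stage is an independent callable, any one of them can be swapped without touching the others, which mirrors the abstract's point that the paradigm reuses existing modules.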


SA-BEV: Generating Semantic-Aware Bird's-Eye-View Feature for Multi-view 3D Object Detection

Recently, the pure camera-based Bird's-Eye-View (BEV) perception provide...

OCBEV: Object-Centric BEV Transformer for Multi-View 3D Object Detection

Multi-view 3D object detection is becoming popular in autonomous driving...

PolarFormer: Multi-camera 3D Object Detection with Polar Transformers

3D object detection in autonomous driving aims to reason "what" and "whe...

SOGDet: Semantic-Occupancy Guided Multi-view 3D Object Detection

In the field of autonomous driving, accurate and comprehensive perceptio...

Multi-Camera Calibration Free BEV Representation for 3D Object Detection

In advanced paradigms of autonomous driving, learning Bird's Eye View (B...

PersDet: Monocular 3D Detection in Perspective Bird's-Eye-View

Currently, detecting 3D objects in Bird's-Eye-View (BEV) is superior to ...

Gaining Scale Invariance in UAV Bird's Eye View Object Detection by Adaptive Resizing

In this work, we introduce a new preprocessing step applicable to UAV bi...
