DeepAI AI Chat
Log In Sign Up

Fast-BEV: Towards Real-time On-vehicle Bird's-Eye View Perception

by   Bin Huang, et al.
Beijing University of Posts and Telecommunications
SenseTime Corporation
The University of Texas at Austin
The University of Hong Kong

Recently, the pure camera-based Bird's-Eye-View (BEV) perception removes expensive Lidar sensors, making it a feasible solution for economical autonomous driving. However, most existing BEV solutions either suffer from modest performance or require considerable resources to execute on-vehicle inference. This paper proposes a simple yet effective framework, termed Fast-BEV, which is capable of performing real-time BEV perception on the on-vehicle chips. Towards this goal, we first empirically find that the BEV representation can be sufficiently powerful without expensive view transformation or depth representation. Starting from M2BEV baseline, we further introduce (1) a strong data augmentation strategy for both image and BEV space to avoid over-fitting (2) a multi-frame feature fusion mechanism to leverage the temporal information (3) an optimized deployment-friendly view transformation to speed up the inference. Through experiments, we show Fast-BEV model family achieves considerable accuracy and efficiency on edge. In particular, our M1 model (R18@256x704) can run over 50FPS on the Tesla T4 platform, with 47.0 (R101@900x1600) establishes a new state-of-the-art 53.5 validation set. The code is released at:


page 4

page 5


Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline

Recently, perception task based on Bird's-Eye View (BEV) representation ...

BEVPoolv2: A Cutting-edge Implementation of BEVDet Toward Deployment

We release a new codebase version of the BEVDet, dubbed branch dev2.0. W...

Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer

Learning Bird's Eye View (BEV) representation from surrounding-view came...

RoboBEV: Towards Robust Bird's Eye View Perception under Corruptions

The recent advances in camera-based bird's eye view (BEV) representation...

BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers

3D visual perception tasks, including 3D detection and map segmentation ...

HFT: Lifting Perspective Representations via Hybrid Feature Transformation

Autonomous driving requires accurate and detailed Bird's Eye View (BEV) ...

MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception

This paper proposes an efficient multi-camera to Bird's-Eye-View (BEV) v...