HeightFormer: Explicit Height Modeling without Extra Data for Camera-only 3D Object Detection in Bird's Eye View

07/25/2023
by Yiming Wu, et al.

Vision-based Bird's Eye View (BEV) representation is an emerging perception formulation for autonomous driving. The core challenge is constructing the BEV space from multi-camera features, which is a one-to-many, ill-posed problem. Surveying previous BEV representation generation methods, we find that most fall into two types: modeling depths in image views or modeling heights in the BEV space, mostly in an implicit way. In this work, we propose to explicitly model heights in the BEV space, which requires no extra data such as LiDAR and, unlike depth modeling, can fit arbitrary camera rigs and types. Theoretically, we prove the equivalence between height-based and depth-based methods. Given this equivalence and the advantages of modeling heights, we propose HeightFormer, which models heights and uncertainties in a self-recursive way. Without any extra data, HeightFormer can estimate heights in BEV accurately. Benchmark results show that HeightFormer achieves state-of-the-art (SOTA) performance among camera-only methods.
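To make the height/depth relationship concrete, the following is a minimal geometric sketch and not the paper's proof or implementation. It assumes an idealized pinhole camera that looks straight ahead with no pitch or roll, mounted at a known height above a flat ground plane. The function names and the parameters `fy`, `cy`, and `cam_height` are hypothetical and chosen only for illustration. Under these assumptions, along a fixed image row a depth value determines a height and vice versa, except on the horizon row where the ray is horizontal.

```python
def height_from_depth(v, depth, fy, cy, cam_height):
    """Height above ground of the point seen at image row v with the given depth.

    Assumes a level pinhole camera (no pitch/roll); all names are illustrative.
    """
    # Downward offset from the optical axis, in metres, at this depth.
    y_cam = (v - cy) * depth / fy
    return cam_height - y_cam


def depth_from_height(v, height, fy, cy, cam_height):
    """Depth of the point seen at image row v that lies at the given height.

    Inverse of height_from_depth; undefined on the horizon row (v == cy),
    where every depth maps to the same height.
    """
    if v == cy:
        raise ValueError("ray is horizontal: height does not determine depth")
    return (cam_height - height) * fy / (v - cy)


# Hypothetical camera: focal length 1000 px, principal row 500, mounted 1.5 m high.
fy, cy, cam_height = 1000.0, 500.0, 1.5

h = height_from_depth(v=550, depth=20.0, fy=fy, cy=cy, cam_height=cam_height)
d = depth_from_height(v=550, height=h, fy=fy, cy=cy, cam_height=cam_height)
print(h, d)
```

In this toy setting the two quantities carry the same information along a pixel ray, which is the intuition behind treating height modeling and depth modeling as interchangeable; real camera rigs add rotation and per-camera extrinsics that this sketch deliberately omits.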

