Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction

02/15/2023
by Yuanhui Huang, et al.

Modern methods for vision-centric autonomous driving perception widely adopt the bird's-eye-view (BEV) representation to describe a 3D scene. Although BEV is more efficient than a voxel representation, a single plane has difficulty describing the fine-grained 3D structure of a scene. To address this, we propose a tri-perspective view (TPV) representation, which complements BEV with two additional perpendicular planes. We model each point in 3D space by summing its projected features on the three planes. To lift image features to the 3D TPV space, we further propose a transformer-based TPV encoder (TPVFormer) that obtains the TPV features effectively. We employ the attention mechanism to aggregate the image features corresponding to each query in each TPV plane. Experiments show that our model, trained with sparse supervision, effectively predicts the semantic occupancy of all voxels. We demonstrate for the first time that a model using only camera inputs can achieve performance comparable to LiDAR-based methods on the LiDAR segmentation task on nuScenes. Code: https://github.com/wzzheng/TPVFormer.
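
The point-feature rule described in the abstract (sum a point's projected features over the three planes) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the plane names (`plane_hw`, `plane_dh`, `plane_wd`), the axis order, the toy grid sizes, and the use of integer grid indices in place of any interpolated sampling are all assumptions made for illustration.

```python
import numpy as np

def tpv_point_feature(plane_hw, plane_dh, plane_wd, point):
    """Sum the features that one grid point projects to on three planes.

    Assumed (hypothetical) layout:
      plane_hw: (H, W, C) top-down plane, indexed by (h, w)
      plane_dh: (D, H, C) side plane, indexed by (d, h)
      plane_wd: (W, D, C) front plane, indexed by (w, d)
      point:    integer grid coordinates (h, w, d)
    """
    h, w, d = point
    # Projecting onto a plane simply drops the perpendicular coordinate.
    return plane_hw[h, w] + plane_dh[d, h] + plane_wd[w, d]

def tpv_dense_volume(plane_hw, plane_dh, plane_wd):
    """Broadcast the three planes into a full (H, W, D, C) feature volume."""
    hw = plane_hw[:, :, None, :]                       # (H, W, 1, C)
    dh = plane_dh.transpose(1, 0, 2)[:, None, :, :]    # (H, 1, D, C)
    wd = plane_wd[None, :, :, :]                       # (1, W, D, C)
    return hw + dh + wd

# Toy sizes, chosen arbitrarily for the example.
H, W, D, C = 4, 5, 3, 2
rng = np.random.default_rng(0)
plane_hw = rng.standard_normal((H, W, C))
plane_dh = rng.standard_normal((D, H, C))
plane_wd = rng.standard_normal((W, D, C))

feat = tpv_point_feature(plane_hw, plane_dh, plane_wd, (1, 2, 0))
volume = tpv_dense_volume(plane_hw, plane_dh, plane_wd)
```

Two points that share `(h, w)` but differ in `d` receive the same contribution from the top-down plane, so it is the two additional planes that let the summed feature vary with height. The sketch stores three planes of size H×W, D×H, and W×D rather than a full H×W×D grid, which is the efficiency argument the abstract makes against voxels.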

research
08/31/2023

PointOcc: Cylindrical Tri-Perspective View for Point-based 3D Semantic Occupancy Prediction

Semantic segmentation in autonomous driving has been undergoing an evolu...
research
04/11/2023

OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction

The vision-based perception for autonomous driving has undergone a trans...
research
03/16/2023

SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving

3D scene understanding plays a vital role in vision-based autonomous dri...
research
07/25/2023

HeightFormer: Explicit Height Modeling without Extra Data for Camera-only 3D Object Detection in Bird's Eye View

Vision-based Bird's Eye View (BEV) representation is an emerging percept...
research
01/29/2023

Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline

Recently, perception task based on Bird's-Eye View (BEV) representation ...
research
06/27/2023

Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

3D Semantic Scene Completion (SSC) has emerged as a nascent and pivotal ...
research
06/09/2022

Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer

Learning Bird's Eye View (BEV) representation from surrounding-view came...
