CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers

07/05/2022
by Runsheng Xu, et al.

Bird's eye view (BEV) semantic segmentation plays a crucial role in spatial sensing for autonomous driving. Although recent work has made significant progress on BEV map understanding, it is all based on single-agent camera systems, which struggle to handle occlusions and to detect distant objects in complex traffic scenes. Vehicle-to-Vehicle (V2V) communication technologies have enabled autonomous vehicles to share sensing information, which can dramatically improve perception performance and range compared to single-agent systems. In this paper, we propose CoBEVT, the first generic multi-agent multi-camera perception framework that can cooperatively generate BEV map predictions. To efficiently fuse camera features from multi-view and multi-agent data in an underlying Transformer architecture, we design a fused axial attention (FAX) module, which captures sparse local and global spatial interactions across views and agents. Extensive experiments on the V2V perception dataset OPV2V demonstrate that CoBEVT achieves state-of-the-art performance for cooperative BEV semantic segmentation. Moreover, CoBEVT is shown to generalize to other tasks, including 1) BEV segmentation with single-agent multi-camera systems and 2) 3D object detection with multi-agent LiDAR systems, achieving state-of-the-art performance with real-time inference speed in both.
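
The abstract does not spell out how the FAX module makes attention sparse, so the following is only an illustrative sketch of one common way to combine local and global interactions cheaply: tokens on a feature map are grouped either into contiguous windows (local) or into strided grids (sparse global), and attention would then be computed only within each group. The function names `window_groups` and `grid_groups`, and the choice of this particular partitioning, are assumptions for illustration and are not taken from the paper's code.

```python
# Illustrative sketch only: groups of token coordinates that would attend to
# each other under a local-window scheme and a sparse strided-grid scheme.
# Names and partitioning choices are hypothetical, not the authors' code.

def window_groups(height, width, size):
    """Local: split the map into non-overlapping size x size windows."""
    groups = {}
    for r in range(height):
        for c in range(width):
            # Tokens in the same window share the same (row block, col block).
            groups.setdefault((r // size, c // size), []).append((r, c))
    return list(groups.values())


def grid_groups(height, width, size):
    """Sparse global: each group holds size x size tokens spread evenly
    over the whole map, i.e. tokens separated by a fixed stride."""
    stride_r, stride_c = height // size, width // size
    groups = {}
    for r in range(height):
        for c in range(width):
            # Tokens with the same offset inside their stride cell form a group.
            groups.setdefault((r % stride_r, c % stride_c), []).append((r, c))
    return list(groups.values())


if __name__ == "__main__":
    print("local :", window_groups(4, 4, 2)[0])
    print("global:", grid_groups(4, 4, 2)[0])
```

On a 4 x 4 map with group size 2, the first local group contains four neighbouring tokens, while the first global group contains four tokens spaced two cells apart. In both cases each token attends to only four others rather than all sixteen, which is the sense in which such attention is sparse. In a multi-agent or multi-view setting, one would presumably extend each group along the agent or view axis as well; that extension is likewise an assumption here.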

research
04/20/2023

HM-ViT: Hetero-modal Vehicle-to-Vehicle Cooperative perception with vision transformer

Vehicle-to-Vehicle technologies have enabled autonomous vehicles to shar...
research
11/15/2022

Monocular BEV Perception of Road Scenes via Front-to-Top View Projection

HD map reconstruction is crucial for autonomous driving. LiDAR-based met...
research
03/15/2023

DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception

BEV perception is of great importance in the field of autonomous driving...
research
04/07/2023

A Cross-Scale Hierarchical Transformer with Correspondence-Augmented Attention for inferring Bird's-Eye-View Semantic Segmentation

As bird's-eye-view (BEV) semantic segmentation is simple-to-visualize an...
research
06/09/2019

Cross-view Semantic Segmentation for Sensing Surroundings

Sensing surroundings is ubiquitous and effortless to humans: It takes a ...
research
01/05/2022

Multi-Robot Collaborative Perception with Graph Neural Networks

Multi-robot systems such as swarms of aerial robots are naturally suited...
research
04/11/2022

M^2BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation

In this paper, we propose M^2BEV, a unified framework that jointly perfo...
