VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention

03/18/2022
by Shengheng Deng, et al.

Detecting objects from LiDAR point clouds is of tremendous significance in autonomous driving. Despite good progress, accurate and reliable 3D detection has yet to be achieved because LiDAR point clouds are sparse and irregular. Among existing strategies, multi-view methods have shown great promise by leveraging the more comprehensive information from both the bird's eye view (BEV) and the range view (RV). These multi-view methods either refine the proposals predicted from a single view via fused features, or fuse the features without considering the global spatial context; consequently, their performance is limited. In this paper, we propose to adaptively fuse multi-view features in a global spatial context via Dual Cross-VIew SpaTial Attention (VISTA). The proposed VISTA is a novel plug-and-play fusion module in which the multi-layer perceptron widely adopted in standard attention modules is replaced with a convolutional one. Thanks to the learned attention mechanism, VISTA can produce high-quality fused features for the prediction of proposals. We decouple the classification and regression tasks in VISTA, and apply an additional constraint on attention variance that enables the attention module to focus on specific targets instead of generic points. We conduct thorough experiments on the nuScenes and Waymo benchmarks; the results confirm the efficacy of our designs. At the time of submission, our method achieves 63.0% in overall mAP and 69.8% in NDS on the nuScenes benchmark, outperforming all published methods by up to 24% in safety-crucial categories such as cyclist. The source code in PyTorch is available at https://github.com/Gorilla-Lab-SCUT/VISTA
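
The cross-view attention idea described in the abstract can be illustrated with a minimal, standard-library sketch. This is not the authors' implementation: the function names (`cross_view_attention`, `attention_variance`) and the toy inputs are hypothetical, the convolutional projections are omitted (features are used directly as queries, keys, and values), and the variance measure is only one plausible way to quantify how concentrated the attention is, since the abstract does not give the formula.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_view_attention(queries, keys, values):
    # Hypothetical helper: each query (e.g. a BEV cell feature) attends over
    # all keys/values taken from the other view (e.g. RV cell features).
    scale = 1.0 / math.sqrt(len(queries[0]))
    fused, weights = [], []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        row = softmax(scores)
        weights.append(row)
        # Weighted sum of the value vectors, channel by channel.
        fused.append([sum(w * v[c] for w, v in zip(row, values))
                      for c in range(len(values[0]))])
    return fused, weights

def attention_variance(weights):
    # Assumed measure: mean per-query variance of the attention weights.
    # A uniform row gives 0; a sharply peaked row gives a larger value.
    total = 0.0
    for row in weights:
        mean = sum(row) / len(row)
        total += sum((w - mean) ** 2 for w in row) / len(row)
    return total / len(weights)

# Toy features, made up purely for illustration.
bev_queries = [[1.0, 0.0], [0.0, 1.0]]
rv_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rv_values = [[10.0], [20.0], [30.0]]

fused, weights = cross_view_attention(bev_queries, rv_keys, rv_values)
print(fused, attention_variance(weights))
```

Under these assumptions, a training constraint that rewards higher attention variance would push each query toward a few specific locations in the other view rather than spreading its weight evenly over all points.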


