Multi-View Adaptive Fusion Network for 3D Object Detection

by Guojun Wang, et al.

3D object detection based on LiDAR-camera fusion is an emerging research theme for autonomous driving. However, it has proven surprisingly difficult to fuse the two modalities effectively without information loss and interference. To address this issue, we propose a single-stage multi-view fusion framework that takes LiDAR Bird's-Eye View, LiDAR Range View and Camera View images as inputs for 3D object detection. To fuse the multi-view features effectively, we propose an Attentive Pointwise Fusion (APF) module that uses attention mechanisms to estimate the importance of the three sources, achieving adaptive fusion of multi-view features in a pointwise manner. In addition, an Attentive Pointwise Weighting (APW) module is designed to help the network learn structure information and point feature importance through two extra tasks, foreground classification and center regression; the predicted foreground probability is used to reweight the point features. We design an end-to-end learnable network named MVAF-Net to integrate these two components. Our evaluations on the KITTI 3D object detection dataset demonstrate that the proposed APF and APW modules offer significant performance gains and that the proposed MVAF-Net achieves state-of-the-art performance on the KITTI benchmark.
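
The two modules described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the layer design, so the softmax over the three sources, the single linear scoring layers (`w_att`, `w_fg`), the sigmoid foreground head, and all function names are assumptions made for illustration. The center regression task is omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attentive_pointwise_fusion(bev, rv, cam, w_att):
    """Fuse three per-point feature sets with per-point attention weights.

    bev, rv, cam: arrays of shape (N, C), one feature row per point.
    w_att: hypothetical linear scoring layer of shape (3 * C, 3).
    """
    stacked = np.stack([bev, rv, cam], axis=1)        # (N, 3, C)
    concat = stacked.reshape(stacked.shape[0], -1)    # (N, 3 * C)
    weights = softmax(concat @ w_att, axis=1)         # (N, 3), rows sum to 1
    fused = (weights[:, :, None] * stacked).sum(axis=1)  # (N, C)
    return fused, weights

def attentive_pointwise_weighting(point_feats, w_fg):
    """Reweight point features by a predicted foreground probability.

    w_fg: hypothetical linear foreground-classification head of shape (C,).
    """
    fg_prob = 1.0 / (1.0 + np.exp(-(point_feats @ w_fg)))  # (N,), in (0, 1)
    return point_feats * fg_prob[:, None], fg_prob

# Toy inputs: 5 points with 4-dimensional features per view.
rng = np.random.default_rng(0)
N, C = 5, 4
bev, rv, cam = (rng.normal(size=(N, C)) for _ in range(3))
fused, weights = attentive_pointwise_fusion(bev, rv, cam, rng.normal(size=(3 * C, 3)))
reweighted, fg_prob = attentive_pointwise_weighting(fused, rng.normal(size=C))
```

In this sketch each point receives its own three-way weighting over the views, so a point that is poorly observed in one view can lean on the other two; the foreground probability then scales down features of points predicted to be background.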




3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection

In this paper, we propose a new deep architecture for fusing camera and ...

A Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation

3D object detection using LiDAR data is an indispensable component for a...

VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention

Detecting objects from LiDAR point clouds is of tremendous significance ...

X-view: Non-egocentric Multi-View 3D Object Detector

3D object detection algorithms for autonomous driving reason about 3D ob...

EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection

In this paper, we aim at addressing two critical issues in the 3D detect...

CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers

Bird's eye view (BEV) semantic segmentation plays a crucial role in spat...

CAP-Net: Correspondence-Aware Point-view Fusion Network for 3D Shape Analysis

Learning 3D representations by fusing point cloud and multi-view data ha...