VPFNet: Voxel-Pixel Fusion Network for Multi-class 3D Object Detection

11/01/2021
by Chia-Hung Wang, et al.

Many LiDAR-based methods have been reported to perform well when detecting large objects, when detecting a single object class, or under easy conditions. However, when detecting small objects or under hard conditions, their performance has not surpassed that of fusion-based methods, because they fail to leverage image semantics. To improve detection performance in complicated environments, this paper proposes a deep learning (DL)-embedded, fusion-based, multi-class 3D object detection network that accepts both LiDAR and camera sensor data streams, named Voxel-Pixel Fusion Network (VPFNet). A key novel component of this network is the Voxel-Pixel Fusion (VPF) layer, which exploits the geometric relation of a voxel-pixel pair and fuses the voxel features and the pixel features with proper mechanisms. In addition, several parameters are designed specifically to guide and enhance the fusion effect, taking into account the characteristics of a voxel-pixel pair. Finally, the proposed method is evaluated on the KITTI benchmark for the multi-class 3D object detection task under multiple difficulty levels, and is shown to outperform all state-of-the-art methods in mean average precision (mAP). Notably, the approach also ranks first on the KITTI leaderboard for the challenging pedestrian class.
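To make the idea of a voxel-pixel pair concrete, the following is a minimal sketch of one generic way to pair voxels with pixels: project each voxel center into the image with a camera projection matrix, then concatenate the voxel feature with the image feature found at that pixel. This is not the VPF layer itself. The abstract does not specify the fusion mechanisms or the guiding parameters, so plain concatenation is used here as a stand-in. All names (`project_voxels_to_pixels`, `fuse_voxel_pixel`, `P`) are hypothetical, and the voxel centers are assumed to be already expressed in the camera coordinate frame.

```python
import numpy as np


def project_voxels_to_pixels(voxel_centers, P):
    """Project N x 3 voxel centers (assumed camera frame) with a 3 x 4 matrix P.

    Returns N x 2 pixel coordinates (u, v) and a mask of points in front of the camera.
    """
    n = voxel_centers.shape[0]
    homogeneous = np.hstack([voxel_centers, np.ones((n, 1))])  # N x 4
    uvw = homogeneous @ P.T                                    # N x 3
    depth = uvw[:, 2]
    valid = depth > 1e-6                                       # drop points behind the camera
    uv = np.zeros((n, 2))
    uv[valid] = uvw[valid, :2] / depth[valid, None]            # perspective divide
    return uv, valid


def fuse_voxel_pixel(voxel_feats, voxel_centers, image_feats, P):
    """Concatenate each voxel feature with the image feature at its projected pixel.

    voxel_feats: N x Cv, image_feats: H x W x Cp. Voxels that project outside the
    image (or lie behind the camera) receive a zero pixel feature.
    """
    h, w, cp = image_feats.shape
    uv, valid = project_voxels_to_pixels(voxel_centers, P)
    cols = np.round(uv[:, 0]).astype(int)
    rows = np.round(uv[:, 1]).astype(int)
    inside = valid & (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)

    pixel_feats = np.zeros((voxel_feats.shape[0], cp))
    pixel_feats[inside] = image_feats[rows[inside], cols[inside]]
    # Stand-in fusion: simple concatenation, not the mechanism proposed in the paper.
    return np.concatenate([voxel_feats, pixel_feats], axis=1), inside
```

The returned mask indicates which voxels actually have a paired pixel, which a learned fusion layer could use to fall back on LiDAR-only features where no image evidence exists.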


