Structure Information is the Key: Self-Attention RoI Feature Extractor in 3D Object Detection

11/01/2021
by Diankun Zhang, et al.

Unlike 2D object detection, where all RoI features come from grid pixels, RoI feature extraction in 3D point cloud object detection is more diverse. In this paper, we first compare and analyze the differences in structure and performance between two state-of-the-art models, PV-RCNN and Voxel-RCNN. We find that the performance gap between the two models comes not from point information but from structural information. Voxel features contain more structural information because they are produced by quantizing the point cloud rather than downsampling it, so they retain essentially the complete information of the whole point cloud. In our experiments, this stronger structural information gives the detector higher performance even though voxel features lack accurate location information. We therefore propose that structural information is the key to 3D object detection. Based on this conclusion, we propose a Self-Attention RoI Feature Extractor (SARFE) to enhance the structural information of the features extracted from 3D proposals. SARFE is a plug-and-play module that can be easily added to existing 3D detectors. We evaluate SARFE on both the KITTI dataset and the Waymo Open dataset. With the newly introduced SARFE, we improve the performance of state-of-the-art 3D detectors by a large margin on the cyclist class of the KITTI dataset while keeping real-time capability.
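
The contrast between quantization and downsampling drawn in the abstract can be illustrated with a small toy sketch. This is not code from the paper: the function names `voxelize` and `random_downsample`, the voxel size, and the six sample points are all hypothetical, and uniform random sampling is only a stand-in for whatever downsampling a real detector uses. The point is that every input point contributes to some voxel, whereas downsampling discards points outright.

```python
import random
from collections import defaultdict


def voxelize(points, voxel_size):
    """Quantize points onto a regular grid (hypothetical helper).

    Every point falls into exactly one voxel, so no point is discarded;
    only its exact position is replaced by the voxel's centroid.
    """
    cells = defaultdict(list)
    for x, y, z in points:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        cells[key].append((x, y, z))
    # Store (centroid, point count) for each occupied voxel.
    return {
        key: (tuple(sum(axis) / len(pts) for axis in zip(*pts)), len(pts))
        for key, pts in cells.items()
    }


def random_downsample(points, num_keep, seed=0):
    """Keep a random subset of points (assumed stand-in for downsampling).

    Kept points retain exact coordinates, but the rest are dropped.
    """
    rng = random.Random(seed)
    return rng.sample(points, num_keep)


# Toy point cloud: three small clusters of two points each.
points = [
    (0.1, 0.1, 0.0), (0.2, 0.3, 0.1),
    (1.4, 0.2, 0.0), (1.6, 0.1, 0.2),
    (3.1, 2.2, 0.5), (3.3, 2.4, 0.4),
]

voxels = voxelize(points, voxel_size=1.0)
kept = random_downsample(points, num_keep=3)

# Number of original points represented by each scheme.
represented_by_voxels = sum(count for _, count in voxels.values())
represented_by_sampling = len(kept)
```

In this toy setting the voxel grid accounts for all six points at the cost of coarser positions, while the sampled subset keeps exact positions for only three of them, which mirrors the trade-off between structural and location information that the abstract describes.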

research
10/18/2022

Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection

Multi-modal 3D object detection has been an active research topic in aut...
research
07/07/2021

VIN: Voxel-based Implicit Network for Joint 3D Object Detection and Segmentation for Lidars

A unified neural network structure is presented for joint 3D object dete...
research
01/31/2021

PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection

3D object detection is receiving increasing attention from both industry...
research
09/20/2022

Rethinking Dimensionality Reduction in Grid-based 3D Object Detection

Bird's eye view (BEV) is widely adopted by most of the current point clo...
research
02/26/2023

Pillar R-CNN for Point Cloud 3D Object Detection

The performance of point cloud 3D object detection hinges on effectively...
research
03/13/2023

Uni3D: A Unified Baseline for Multi-dataset 3D Object Detection

Current 3D object detection models follow a single dataset-specific trai...
