Stereo RGB and Deeper LIDAR Based Network for 3D Object Detection

by Qingdong He, et al.

3D object detection has become an emerging task in autonomous driving scenarios. Previous works process 3D point clouds using either projection-based or voxel-based models. However, both approaches have drawbacks: voxel-based methods lack semantic information, while projection-based methods suffer substantial loss of spatial information when point clouds are projected to different views. In this paper, we propose the Stereo RGB and Deeper LIDAR (SRDL) framework, which utilizes semantic and spatial information simultaneously to improve the performance of the network for 3D object detection. Specifically, the network generates candidate boxes from stereo pairs and combines different region-wise features using a deep fusion scheme. The stereo strategy offers more information for prediction than in prior works. Then, several local and global feature extractors are stacked in the segmentation module to capture richer deep semantic geometric features from point clouds. After aligning the interior points with the fused features, the proposed network refines the prediction more accurately and encodes the whole box with a novel compact method. Experimental results on the challenging KITTI detection benchmark demonstrate the effectiveness of utilizing both stereo images and point clouds for 3D object detection.




FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection

Accurate detection of obstacles in 3D is an essential task for autonomou...

SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds

Accurate 3D object detection from point clouds has become a crucial comp...

Frustum Fusion: Pseudo-LiDAR and LiDAR Fusion for 3D Detection

Most autonomous vehicles are equipped with LiDAR sensors and stereo came...

PV-SSD: A Projection and Voxel-based Double Branch Single-Stage 3D Object Detector

LIDAR-based 3D object detection and classification is crucial for autono...

Similarity-Aware Fusion Network for 3D Semantic Segmentation

In this paper, we propose a similarity-aware fusion network (SAFNet) to ...

Stereo Superpixel Segmentation Via Decoupled Dynamic Spatial-Embedding Fusion Network

Stereo superpixel segmentation aims at grouping the discretizing pixels ...

Pedestrian Detection in 3D Point Clouds using Deep Neural Networks

Detecting pedestrians is a crucial task in autonomous driving systems to...
