False Positive Removal for 3D Vehicle Detection with Penetrated Point Classifier

by   Sungmin Woo, et al.
Yonsei University

Recently, researchers have been leveraging LiDAR point clouds for higher accuracy in 3D vehicle detection. Most state-of-the-art methods are deep learning based, but they are easily affected by the number of points generated on an object. This vulnerability leads to numerous false positive boxes at high recall positions, where objects are occasionally predicted from only a few points. To address this issue, we introduce the Penetrated Point Classifier (PPC), based on the underlying property of LiDAR that points cannot be generated behind vehicles. It determines whether a point exists behind the vehicle of the predicted box, and if it does, the box is classified as a false positive. Our straightforward yet unprecedented approach is evaluated on the KITTI dataset and improves the performance of PointRCNN, one of the state-of-the-art methods. The experimental results show that precision at the highest recall position is dramatically increased, by 15.46 percentage points and 14.63 percentage points on the moderate and hard difficulties of the car class, respectively.
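The core idea can be sketched in a few lines: since LiDAR cannot return points from behind an occluding vehicle, any point found in the "shadow" region behind a predicted box suggests the box is a false positive. The sketch below implements this check in bird's-eye view under simplifying assumptions (the box footprint is approximated by a circle, and all function and parameter names are illustrative, not from the paper's implementation):

```python
import numpy as np

def behind_box_mask(points, box_center, box_radius, margin=0.5):
    """Flag points lying behind a predicted box as seen from the LiDAR
    origin, in bird's-eye view. The box footprint is approximated by a
    circle of radius `box_radius` (illustrative simplification)."""
    az = np.arctan2(points[:, 1], points[:, 0])           # point azimuths
    rng = np.linalg.norm(points[:, :2], axis=1)           # point ranges
    box_az = np.arctan2(box_center[1], box_center[0])     # box azimuth
    box_rng = np.linalg.norm(box_center[:2])              # box range
    half_ang = np.arcsin(min(box_radius / box_rng, 1.0))  # angular half-width
    # wrap azimuth difference to (-pi, pi] before comparing
    in_shadow = np.abs(np.angle(np.exp(1j * (az - box_az)))) < half_ang
    farther = rng > box_rng + box_radius + margin
    return in_shadow & farther

def is_false_positive(points, box_center, box_radius):
    # A real vehicle occludes the region behind it, so any LiDAR
    # return found there indicates the box is likely a false positive.
    return bool(behind_box_mask(points, box_center, box_radius).any())
```

For example, with a box centered 10 m ahead of the sensor, a return at 15 m directly behind it would mark the box as a false positive, while returns in front of or beside the box would not.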



1 Introduction

3D object detection is a crucial task in autonomous driving, since vehicles must be provided with accurate scene information. Precisely recognizing other cars driving ahead or parked at the side of the road is essential for safe driving. To achieve this goal, recent works on 3D object detection utilize data from several types of sensors, such as a monocular camera, stereo cameras, and LiDAR. These sensors provide diverse and useful data, but each has its limitations.

RGB images from cameras contain appearance information that can be efficiently exploited by advanced 2D Convolutional Neural Network (CNN) techniques. However, they lack the depth information that is essential to 3D object detection. This drawback results in considerably low performance for monocular 3D object detection [naiden2019shift, Wang_2019_CVPR, Chen_2016_CVPR, Mousavian_2017_CVPR], which uses only a single image to detect objects in 3D space. To overcome this limitation, [you2019pseudo, li2019stereo, chen2020dsgn] used two images from a stereo camera to estimate the depth of the scene more precisely by sharing the weights of the image feature extractors. The depth estimated from stereo images is better than that from monocular images, but due to the discrete nature of pixels, it is still insufficient for real-world applications. Therefore, depth prediction from 2D images alone is not adequate for autonomous driving, where an accurate object location is required.
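The effect of pixel discretization on stereo depth can be quantified: with depth Z = fB/d (focal length f in pixels, baseline B, disparity d), a one-step disparity error translates to a depth error of roughly ΔZ ≈ Z²Δd/(fB), growing quadratically with range. The numbers below are illustrative (KITTI-like, not from the paper):

```python
# Depth from stereo disparity: Z = f * B / d. Because disparity d is
# quantized to (sub)pixel steps, the depth error for one disparity step
# grows roughly quadratically with depth:
#   dZ ~= Z**2 * delta_d / (f * B)
f_px = 721.0      # focal length in pixels (illustrative, KITTI-like)
baseline = 0.54   # stereo baseline in metres (illustrative, KITTI-like)
delta_d = 1.0     # one-pixel disparity quantization step

def depth_error(z):
    """Approximate depth uncertainty at range z for a 1 px disparity step."""
    return z ** 2 * delta_d / (f_px * baseline)

for z in (10.0, 30.0, 60.0):
    print(f"{z:5.1f} m -> +/- {depth_error(z):.2f} m")
```

At 10 m the uncertainty is a few tens of centimetres, but at 60 m it reaches several metres, which illustrates why image-only depth is insufficient for precise long-range localization.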

On the other hand, point clouds from LiDAR provide precise distance information, which makes it easy to capture the spatial structure of a scene. Thus, most high-performance 3D object detection methods are point-based. However, the LiDAR point cloud has critical weaknesses: it is sparse at long range, and its points are irregular and unordered. To handle these challenges, existing methods such as PointNet [Qi_2017_CVPR] and PointNet++ [qi2017pointnet++] directly process the raw point cloud regardless of the order of the points. Alternatively, point clouds are converted into specific forms that can be easily processed, by projecting the point cloud onto a corresponding image for a 2D CNN [ku2018joint, yang2018pixor, liang2018deep, chen2017multi] or by dividing it into voxels for a 3D CNN [zhou2018voxelnet, song2016deep, yan2018second].
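The order-invariance that PointNet-style methods rely on comes from applying a shared per-point network and then aggregating with a symmetric function such as max pooling. A minimal NumPy sketch of this principle (toy weights and function names are our own, not PointNet's actual architecture):

```python
import numpy as np

def shared_mlp(points, w1, w2):
    # The same weights are applied to every point independently,
    # so per-point features do not depend on point order.
    h = np.maximum(points @ w1, 0.0)  # ReLU
    return h @ w2

def global_feature(points, w1, w2):
    # Max pooling over the point axis is a symmetric function:
    # permuting the input points leaves the output unchanged.
    return shared_mlp(points, w1, w2).max(axis=0)
```

Feeding the same point set in any order produces an identical global feature, which is why such networks can consume raw, unordered point clouds directly.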