Temp-Frustum Net: 3D Object Detection with Temporal Fusion

by Emec Ercelik, et al.

3D object detection is a core component of automated driving systems. State-of-the-art methods fuse RGB imagery and LiDAR point cloud data frame-by-frame for 3D bounding box regression. However, frame-by-frame 3D object detection suffers from noise, field-of-view obstruction, and sparsity. We propose a novel Temporal Fusion Module (TFM) that uses information from previous time-steps to mitigate these problems. First, a state-of-the-art frustum network extracts point cloud features from raw RGB and LiDAR point cloud data frame-by-frame. Then, our TFM fuses these features with a recurrent neural network. As a result, 3D object detection becomes robust against single-frame failures and transient occlusions. Experiments on the KITTI object tracking dataset show the effectiveness of the proposed TFM, where we obtain  6 respectively, compared to frame-by-frame baselines. Furthermore, ablation studies confirm that the source of the improvement is temporal fusion and show the effects of different placements of the TFM in the object detection pipeline. Our code is open-source and available at https://gitlab.lrz.de/emec_ercelik/temp-frustnet.
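The two-stage idea in the abstract, per-frame feature extraction followed by recurrent fusion, can be illustrated with a minimal sketch. The abstract only states that a recurrent neural network is used, so everything here is an assumption for illustration and is not taken from the authors' code: the GRU-style cell, the class name `GRUFusion`, the feature and hidden dimensions, and the random, untrained weights. The per-frame feature vectors stand in for the output of the frustum network.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GRUFusion:
    """Hypothetical temporal fusion step: one GRU-style cell (assumed, untrained)."""

    def __init__(self, feat_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = feat_dim + hidden_dim
        scale = 1.0 / np.sqrt(in_dim)
        # Random weights for the update gate, reset gate, and candidate state.
        self.W_z = rng.normal(0.0, scale, (hidden_dim, in_dim))
        self.W_r = rng.normal(0.0, scale, (hidden_dim, in_dim))
        self.W_h = rng.normal(0.0, scale, (hidden_dim, in_dim))
        self.hidden_dim = hidden_dim

    def step(self, x, h):
        """Fuse the current frame's feature x with the running state h."""
        xh = np.concatenate([x, h])
        z = sigmoid(self.W_z @ xh)  # how much of the state to overwrite
        r = sigmoid(self.W_r @ xh)  # how much of the past to expose
        h_cand = np.tanh(self.W_h @ np.concatenate([x, r * h]))
        return (1.0 - z) * h + z * h_cand

    def fuse(self, frame_features):
        """Run over per-frame features (oldest first) and return the fused state."""
        h = np.zeros(self.hidden_dim)
        for x in frame_features:
            h = self.step(x, h)
        return h


# Four consecutive frames, each with a 16-dimensional placeholder feature vector.
features = np.random.default_rng(1).normal(size=(4, 16))
fused = GRUFusion(feat_dim=16, hidden_dim=32).fuse(features)
print(fused.shape)
```

In a detector, the fused state would replace or augment the current frame's feature before box regression, so that a noisy or occluded frame can lean on what earlier frames observed. Where in the pipeline this fusion happens is a design choice that the abstract says the ablation studies examine.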




An LSTM Approach to Temporal 3D Object Detection in LiDAR Point Clouds

Detecting objects in 3D LiDAR data is a core technology for autonomous d...

PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation

We present PointFusion, a generic 3D object detection method that levera...

MVX-Net: Multimodal VoxelNet for 3D Object Detection

Many recent works on 3D object detection have focused on designing neura...

MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection

Accurate and reliable 3D detection is vital for many applications includ...

4D-Net for Learned Multi-Modal Alignment

We present 4D-Net, a 3D object detection approach, which utilizes 3D Poi...

Multi-Echo LiDAR for 3D Object Detection

LiDAR sensors can be used to obtain a wide range of measurement signals ...

EPNet++: Cascade Bi-directional Fusion for Multi-Modal 3D Object Detection

Recently, fusing the LiDAR point cloud and camera image to improve the p...