MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection

by Xuesong Chen, et al.

Accurate and reliable 3D detection is vital for many applications, including autonomous driving and service robots. In this paper, we present a flexible and high-performance 3D detection framework, named MPPNet, for 3D temporal object detection with point cloud sequences. We propose a novel three-hierarchy framework with proxy points for multi-frame feature encoding and interactions to achieve better detection. The three hierarchies conduct per-frame feature encoding, short-clip feature fusion, and whole-sequence feature aggregation, respectively. To enable processing of long point cloud sequences with reasonable computational resources, intra-group feature mixing and inter-group feature attention are proposed to form the second and third feature encoding hierarchies, which are recurrently applied to aggregate multi-frame trajectory features. The proxy points not only act as consistent object representations for each frame, but also serve as couriers that facilitate feature interaction between frames. Experiments on the large Waymo Open Dataset show that our approach outperforms state-of-the-art methods by large margins when applied to both short (e.g., 4-frame) and long (e.g., 16-frame) point cloud sequences. Specifically, MPPNet achieves 74.21 vehicle, pedestrian and cyclist classes on the LEVEL 2 mAPH metric with 16-frame input.
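The grouping idea in the abstract can be illustrated with a small sketch. It is not the MPPNet implementation: the function names (`split_into_groups`, `mix_group`, `attend_across_groups`) are hypothetical, the learned intra-group mixing is replaced here by a plain average over the frames of a clip, and the inter-group attention is replaced by parameter-free, single-head scaled dot-product attention. Each frame is assumed to be already encoded as a fixed number of proxy-point feature vectors (the first hierarchy).

```python
import math

def split_into_groups(frames, group_size):
    """Partition per-frame features into consecutive short clips."""
    if group_size <= 0 or len(frames) % group_size != 0:
        raise ValueError("sequence length must be a multiple of group_size")
    return [frames[i:i + group_size] for i in range(0, len(frames), group_size)]

def mix_group(group):
    """Stand-in for intra-group mixing (assumption: a simple mean).

    group: list of frames; each frame is a list of proxy-point vectors.
    Returns one vector per proxy point, averaged over the clip's frames.
    """
    n_proxy, dim = len(group[0]), len(group[0][0])
    return [
        [sum(frame[p][d] for frame in group) / len(group) for d in range(dim)]
        for p in range(n_proxy)
    ]

def attend_across_groups(group_feats):
    """Stand-in for inter-group attention (assumption: no learned weights).

    For each proxy point, every group attends to the same proxy point in
    all groups using softmax over scaled dot products.
    """
    n_groups = len(group_feats)
    n_proxy, dim = len(group_feats[0]), len(group_feats[0][0])
    scale = 1.0 / math.sqrt(dim)
    updated_groups = []
    for g in range(n_groups):
        updated = []
        for p in range(n_proxy):
            query = group_feats[g][p]
            scores = [
                scale * sum(q * k for q, k in zip(query, group_feats[h][p]))
                for h in range(n_groups)
            ]
            top = max(scores)  # subtract the max for numerical stability
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            updated.append([
                sum(w * group_feats[h][p][d] for h, w in enumerate(weights)) / total
                for d in range(dim)
            ])
        updated_groups.append(updated)
    return updated_groups

# Toy input: 16 frames, 2 proxy points per frame, 3-dimensional features.
frames = [[[float(t), 0.0, 1.0], [1.0, float(t), 0.0]] for t in range(16)]
groups = split_into_groups(frames, 4)          # 4 short clips of 4 frames
clip_feats = [mix_group(g) for g in groups]    # second hierarchy (stand-in)
seq_feats = attend_across_groups(clip_feats)   # third hierarchy (stand-in)
```

Because attention runs over groups rather than over individual frames, the number of attention scores per proxy point in this toy setup is 4 × 4 instead of 16 × 16, which is the kind of saving the grouping is meant to provide for long sequences.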


TransPillars: Coarse-to-Fine Aggregation for Multi-Frame 3D Object Detection

3D object detection using point clouds has attracted increasing attentio...

TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses

3D multi-object tracking (MOT) is vital for many applications including ...

Temp-Frustum Net: 3D Object Detection with Temporal Fusion

3D object detection is a core component of automated driving systems. St...

Boosting Single-Frame 3D Object Detection by Simulating Multi-Frame Point Clouds

To boost a detector for single-frame 3D object detection, we present a n...

MoDAR: Using Motion Forecasting for 3D Object Detection in Point Cloud Sequences

Occluded and long-range objects are ubiquitous and challenging for 3D ob...

Multi-Frame to Single-Frame: Knowledge Distillation for 3D Object Detection

A common dilemma in 3D object detection for autonomous driving is that h...

MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box Priors

3D single object tracking has been a crucial problem for decades with nu...
