3D-MAN: 3D Multi-frame Attention Network for Object Detection

03/30/2021
by   Zetong Yang, et al.

3D object detection is an important module in autonomous driving and robotics. However, many existing methods perform 3D detection on single frames and do not fully utilize information from multiple frames. In this paper, we present 3D-MAN: a 3D multi-frame attention network that effectively aggregates features from multiple perspectives and achieves state-of-the-art performance on the Waymo Open Dataset. 3D-MAN first uses a novel fast single-frame detector to produce box proposals. The box proposals and their corresponding feature maps are then stored in a memory bank. We design a multi-view alignment and aggregation module, using attention networks, to extract and aggregate the temporal features stored in the memory bank. This effectively combines the features coming from different perspectives of the scene. We demonstrate the effectiveness of our approach on the large-scale, complex Waymo Open Dataset, achieving state-of-the-art results compared to published single-frame and multi-frame methods.
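
The memory bank and attention-based aggregation described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the names `MemoryBank` and `cross_attention`, the feature size, and the number of stored frames are all assumptions chosen for illustration. The sketch uses plain scaled dot-product attention and omits the single-frame detector, the learned projections, and the multi-view alignment step.

```python
from collections import deque

import numpy as np


class MemoryBank:
    """FIFO store of per-frame proposal features (hypothetical helper)."""

    def __init__(self, max_frames):
        # deque with maxlen drops the oldest frame automatically
        self.frames = deque(maxlen=max_frames)

    def add(self, features):
        # features: (num_proposals, feature_dim) for one frame
        self.frames.append(np.asarray(features, dtype=np.float64))

    def stacked(self):
        # All stored proposals as one (total_proposals, feature_dim) array
        return np.concatenate(list(self.frames), axis=0)


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, memory):
    """Scaled dot-product attention: current proposals attend to the memory."""
    d = queries.shape[-1]
    scores = queries @ memory.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ memory, weights


# Assumed toy sizes: 5 frames seen, 3 kept, 4 proposals per frame, 8-dim features
rng = np.random.default_rng(0)
bank = MemoryBank(max_frames=3)
for _ in range(5):
    bank.add(rng.normal(size=(4, 8)))

current = rng.normal(size=(2, 8))  # features of 2 current-frame proposals
fused, weights = cross_attention(current, bank.stacked())
```

Here each current-frame proposal receives a weighted combination of the features stored from the three most recent frames, with each row of `weights` summing to one.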

research
08/04/2022

TransPillars: Coarse-to-Fine Aggregation for Multi-Frame 3D Object Detection

3D object detection using point clouds has attracted increasing attentio...
research
09/30/2022

INT: Towards Infinite-frames 3D Detection with An Efficient Framework

It is natural to construct a multi-frame instead of a single-frame 3D de...
research
03/24/2020

RN-VID: A Feature Fusion Architecture for Video Object Detection

Consecutive frames in a video are highly redundant. Therefore, to perfor...
research
12/15/2022

DETR4D: Direct Multi-View 3D Object Detection with Sparse Attention

3D object detection with surround-view images is an essential task for a...
research
03/09/2023

MBPTrack: Improving 3D Point Cloud Tracking with Memory Networks and Box Priors

3D single object tracking has been a crucial problem for decades with nu...
research
09/30/2022

D-Align: Dual Query Co-attention Network for 3D Object Detection Based on Multi-frame Point Cloud Sequence

LiDAR sensors are widely used for 3D object detection in various mobile ...
research
07/06/2022

Context Sensing Attention Network for Video-based Person Re-identification

Video-based person re-identification (ReID) is challenging due to the pr...