M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers

04/24/2021
by Tianrui Guan, et al.

We present M3DeTR, a novel architecture for 3D object detection that combines different point cloud representations (raw, voxels, bird's-eye view) with different feature scales based on multi-scale feature pyramids. M3DeTR is the first approach that unifies multiple point cloud representations and feature scales while simultaneously modeling mutual relationships between point clouds using transformers. We perform extensive ablation experiments that highlight the benefits of fusing representations and scales and of modeling these relationships. Our method achieves state-of-the-art performance on the KITTI 3D object detection dataset and the Waymo Open Dataset. Results show that M3DeTR improves on the baseline significantly, by 1.48 on the Waymo Open Dataset. In particular, our approach ranks 1st on the well-known KITTI 3D Detection Benchmark for both the car and cyclist classes, and ranks 1st on the Waymo Open Dataset with single-frame point cloud input.
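
The fusion idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes each of N keypoints already has a D-channel feature from each of three representations (raw, voxel, bird's-eye view) at S scales, and uses plain single-head self-attention with random weights. The names `self_attention`, `fuse`, `rep_scale`, and `relation` are hypothetical and chosen only for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over (..., T, D) tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def fuse(raw, voxel, bev, params):
    """Fuse per-keypoint features of shape (N, S, D) from three representations.

    Stage 1 (assumed): attention across the 3*S representation/scale tokens
    of each keypoint. Stage 2 (assumed): attention across keypoints to model
    their mutual relationships.
    """
    tokens = np.concatenate([raw, voxel, bev], axis=1)        # (N, 3S, D)
    per_point = self_attention(tokens, *params["rep_scale"])  # (N, 3S, D)
    point_feat = per_point.reshape(per_point.shape[0], -1)    # (N, 3S*D)
    return self_attention(point_feat, *params["relation"])    # (N, D_out)


# Toy sizes: 5 keypoints, 2 scales, 4 channels, 8 output channels.
rng = np.random.default_rng(0)
N, S, D, D_OUT = 5, 2, 4, 8
raw, voxel, bev = (rng.normal(size=(N, S, D)) for _ in range(3))
params = {
    "rep_scale": [rng.normal(size=(D, D)) for _ in range(3)],
    "relation": [rng.normal(size=(3 * S * D, D_OUT)) for _ in range(3)],
}
fused = fuse(raw, voxel, bev, params)
print(fused.shape)
```

Because both stages are plain self-attention without positional encodings, reordering the keypoints simply reorders the fused output, which suits unordered point clouds.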

07/27/2021

DV-Det: Efficient 3D Point Cloud Object Detection with Dynamic Voxelization

In this work, we propose a novel two-stage framework for the efficient 3...
04/17/2019

3D Object Recognition with Ensemble Learning --- A Study of Point Cloud-Based Deep Learning Models

In this study, we present an analysis of model-based ensemble learning f...
12/10/2019

Pillar in Pillar: Multi-Scale and Dynamic Feature Extraction for 3D Object Detection in Point Clouds

Sparsity and varied density are two of the main obstacles for 3D detecti...
06/08/2020

Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection

Object detection from 3D point clouds remains a challenging task, though...
12/23/2020

Multi-Modality Cut and Paste for 3D Object Detection

Three-dimensional (3D) object detection is essential in autonomous drivi...
05/12/2022

MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection

Accurate and reliable 3D detection is vital for many applications includ...
12/31/2021

PiFeNet: Pillar-Feature Network for Real-Time 3D Pedestrian Detection from Point Cloud

We present PiFeNet, an efficient and accurate real-time 3D detector for ...

Code Repositories

M3DeTR

Code base for M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers

