MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer

03/21/2022
by Kuan-Chih Huang, et al.

Monocular 3D object detection is an important yet challenging task in autonomous driving. Some existing methods use depth information from an off-the-shelf depth estimator to assist 3D detection, but they incur additional computational cost and their performance is limited by inaccurate depth priors. To alleviate these issues, we propose MonoDTR, a novel end-to-end depth-aware transformer network for monocular 3D object detection. It consists mainly of two components: (1) the Depth-Aware Feature Enhancement (DFE) module, which implicitly learns depth-aware features with auxiliary supervision and requires no extra computation, and (2) the Depth-Aware Transformer (DTR) module, which globally integrates context- and depth-aware features. Moreover, unlike conventional pixel-wise positional encodings, we introduce a novel depth positional encoding (DPE) to inject depth positional hints into transformers. Our proposed depth-aware modules can be easily plugged into existing image-only monocular 3D object detectors to improve their performance. Extensive experiments on the KITTI dataset demonstrate that our approach outperforms previous state-of-the-art monocular methods and achieves real-time detection. Code is available at https://github.com/kuanchihhuang/MonoDTR
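
The idea of a depth positional encoding can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how DPE is computed, so the uniform depth binning, the depth range of 1 to 80 meters, the fixed sinusoidal codes, and all function names below are assumptions chosen only to show how a depth-indexed encoding differs from a pixel-indexed one.

```python
import math


def depth_to_bin(depth, d_min=1.0, d_max=80.0, num_bins=8):
    """Map a metric depth to a uniform bin index in [0, num_bins - 1].

    The range and bin count are illustrative assumptions.
    """
    clipped = min(max(depth, d_min), d_max)
    width = (d_max - d_min) / num_bins
    return min(int((clipped - d_min) / width), num_bins - 1)


def sinusoidal_code(index, dim):
    """Fixed sinusoidal code for an integer index (dim must be even).

    Sinusoids are a stand-in here; a learned embedding table per
    depth bin would be an equally valid choice.
    """
    code = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        code.append(math.sin(index * freq))
        code.append(math.cos(index * freq))
    return code


def add_depth_positional_encoding(features, depths, **bin_kwargs):
    """Add a depth-bin code to each pixel's feature vector.

    features: list of per-pixel feature vectors (flattened H*W, length C each)
    depths:   list of per-pixel depth estimates, same length as features
    """
    out = []
    for feat, depth in zip(features, depths):
        code = sinusoidal_code(depth_to_bin(depth, **bin_kwargs), len(feat))
        out.append([f + c for f, c in zip(feat, code)])
    return out


# Two pixels at the same depth receive the same code regardless of where
# they sit in the image; a pixel at a different depth receives another code.
feats = [[0.0] * 4, [0.0] * 4, [0.0] * 4]
encoded = add_depth_positional_encoding(feats, [10.0, 10.0, 50.0])
```

In this sketch the encoding depends only on the estimated depth of a pixel, not on its row or column, which is the contrast with pixel-wise positional encodings that the abstract draws.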

research
03/24/2022

MonoDETR: Depth-aware Transformer for Monocular 3D Object Detection

Monocular 3D object detection has long been a challenging task in autono...
research
02/01/2021

Ground-aware Monocular 3D Object Detection for Autonomous Driving

Estimating the 3D position and orientation of objects in the environment...
research
11/30/2022

Attention-based Depth Distillation with 3D-Aware Positional Encoding for Monocular 3D Object Detection

Monocular 3D object detection is a low-cost but challenging task, as it ...
research
03/30/2021

Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection

The objective of this paper is to learn context- and depth-aware feature...
research
09/27/2022

CrossDTR: Cross-view and Depth-guided Transformers for 3D Object Detection

To achieve accurate 3D object detection at a low cost for autonomous dri...
research
08/24/2023

Perspective-aware Convolution for Monocular 3D Object Detection

Monocular 3D object detection is a crucial and challenging task for auto...
research
07/21/2022

DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection

Modern neural networks use building blocks such as convolutions that are...
