Progressive Coordinate Transforms for Monocular 3D Object Detection

08/12/2021
by Li Wang, et al.

Recognizing and localizing objects in 3D space is a crucial ability for an AI agent to perceive its surrounding environment. While significant progress has been achieved with expensive LiDAR point clouds, 3D object detection from a single monocular image remains highly challenging. Existing alternatives either rely on heavy networks to fuse RGB and depth information or prove empirically ineffective at processing millions of pseudo-LiDAR points. On closer examination, we find that these limitations are rooted in inaccurate object localization. In this paper, we propose a novel and lightweight approach, dubbed Progressive Coordinate Transforms (PCT), to facilitate learning coordinate representations. Specifically, a localization boosting mechanism with a confidence-aware loss is introduced to progressively refine the localization prediction. In addition, semantic image representation is exploited to compensate for the usage of patch proposals. Despite being lightweight and simple, our strategy leads to substantial improvements on the KITTI and Waymo Open Dataset monocular 3D detection benchmarks. Moreover, PCT generalizes well to most coordinate-based 3D detection frameworks. The code is available at: https://github.com/amazon-research/progressive-coordinate-transforms .

research
08/11/2020

Rethinking Pseudo-LiDAR Representation

The recently proposed pseudo-LiDAR based 3D detectors greatly improve th...
research
06/29/2020

MoNet3D: Towards Accurate Monocular 3D Object Localization in Real Time

Monocular multi-object detection and localization in 3D space has been p...
research
05/01/2021

Lite-FPN for Keypoint-based Monocular 3D Object Detection

3D object detection with a single image is an essential and challenging ...
research
03/30/2021

Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection

The objective of this paper is to learn context- and depth-aware feature...
research
03/16/2022

WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection

Monocular 3D object detection is one of the most challenging tasks in 3D...
research
07/06/2021

Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting

As cameras are increasingly deployed in new application domains such as ...