MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization

11/26/2018
by   Zengyi Qin, et al.

Localizing objects in real 3D space plays a crucial role in scene understanding, but it is particularly challenging given only a single RGB image because geometric information is lost during image projection. We propose MonoGRNet for amodal 3D object localization from a monocular RGB image via geometric reasoning in both the observed 2D projection and the unobserved depth dimension. MonoGRNet is a single, unified network composed of four task-specific subnetworks, responsible for 2D object detection, instance depth estimation (IDE), 3D localization, and local corner regression. Unlike pixel-level depth estimation, which needs per-pixel annotations, we propose a novel IDE method that directly predicts the depth of the target 3D bounding box's center using sparse supervision. 3D localization is then achieved by estimating the position in the horizontal and vertical dimensions. Finally, MonoGRNet is jointly learned by optimizing the locations and poses of the 3D bounding boxes in the global context. We demonstrate that MonoGRNet achieves state-of-the-art performance on challenging datasets.
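To make the geometric step in the abstract concrete, here is a minimal sketch of how a predicted instance depth and a projected 2D center could be lifted to a 3D position. It assumes a standard pinhole camera model with known intrinsics; the function name `backproject_center` and the parameters `fx`, `fy`, `cx`, `cy` are illustrative choices, not names taken from the paper, and the network's actual formulation may differ.

```python
from typing import Tuple


def backproject_center(
    u: float,
    v: float,
    depth: float,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> Tuple[float, float, float]:
    """Lift a projected 2D center (u, v) to camera-frame 3D coordinates.

    Assumes a pinhole camera (an assumption of this sketch):
      u = fx * X / Z + cx,   v = fy * Y / Z + cy
    so, given the depth Z of the 3D box center,
      X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    """
    if depth <= 0:
        raise ValueError("depth must be positive for a point in front of the camera")
    if fx == 0 or fy == 0:
        raise ValueError("focal lengths must be non-zero")

    x = (u - cx) * depth / fx  # horizontal position
    y = (v - cy) * depth / fy  # vertical position
    return (x, y, depth)


if __name__ == "__main__":
    # Hypothetical intrinsics and predictions, chosen only for illustration.
    center_3d = backproject_center(
        u=700.0, v=200.0, depth=20.0, fx=700.0, fy=700.0, cx=600.0, cy=180.0
    )
    print(center_3d)
```

The sketch shows why a single depth value for the box center is enough to recover the horizontal and vertical coordinates once the 2D projection of that center is known: with fixed intrinsics, both follow linearly from the depth.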


research
04/18/2021

MonoGRNet: A General Framework for Monocular 3D Object Detection

Detecting and localizing objects in the real 3D space, which plays a cru...
research
05/14/2019

Monocular 3D Object Detection via Geometric Reasoning on Keypoints

Monocular 3D object detection is well-known to be a challenging vision t...
research
07/20/2020

Object-Aware Centroid Voting for Monocular 3D Object Detection

Monocular 3D object detection aims to detect objects in a 3D physical wo...
research
03/05/2021

IAFA: Instance-aware Feature Aggregation for 3D Object Detection from a Single Image

3D object detection from a single image is an important task in Autonomo...
research
03/16/2022

MonoJSG: Joint Semantic and Geometric Cost Volume for Monocular 3D Object Detection

Due to the inherent ill-posed nature of 2D-3D projection, monocular 3D o...
research
07/20/2022

Densely Constrained Depth Estimator for Monocular 3D Object Detection

Estimating accurate 3D locations of objects from monocular images is a c...
research
04/06/2022

"The Pedestrian next to the Lamppost" Adaptive Object Graphs for Better Instantaneous Mapping

Estimating a semantically segmented bird's-eye-view (BEV) map from a sin...
