MonoJSG: Joint Semantic and Geometric Cost Volume for Monocular 3D Object Detection

03/16/2022
by Qing Lian, et al.

Because 2D-3D projection is inherently ill-posed, monocular 3D object detection struggles to recover depth accurately. Although deep neural networks (DNNs) enable monocular depth sensing from high-level learned features, pixel-level cues are usually lost through the deep convolution mechanism. To benefit from both the powerful feature representation of DNNs and pixel-level geometric constraints, we reformulate monocular object depth estimation as a progressive refinement problem and propose a joint semantic and geometric cost volume to model the depth error. Specifically, we first use neural networks to learn the object position, dimensions, and dense normalized 3D object coordinates. Based on the object depth, the dense coordinate patch, together with the corresponding object features, is reprojected to the image space to build a cost volume from joint semantic and geometric errors. The final depth is obtained by feeding the cost volume to a refinement network, where the distribution of semantic and geometric error is regularized by direct depth supervision. By effectively mitigating depth error through this refinement framework, we achieve state-of-the-art results on both the KITTI and Waymo datasets.
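The geometric half of the cost volume described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function name `geometric_cost_volume`, its argument layout, the yaw-about-the-y-axis convention, and the coordinate range of [-0.5, 0.5] are all assumptions made for illustration. The semantic (feature) error is omitted, and a plain argmin over candidate depths would stand in for the learned refinement network.

```python
import numpy as np


def geometric_cost_volume(K, pixels, nocs, dims, yaw, center_px, depth_init, offsets):
    """Mean reprojection error (in pixels) for each candidate depth.

    All names and conventions here are illustrative assumptions.
    K:          (3, 3) camera intrinsics.
    pixels:     (N, 2) pixel locations (u, v) of the object patch.
    nocs:       (N, 3) normalized object coordinates, assumed in [-0.5, 0.5].
    dims:       (3,) object size along its local x, y, z axes.
    yaw:        rotation about the camera y axis, in radians.
    center_px:  (2,) projected object center in pixels.
    depth_init: initial object depth estimate.
    offsets:    (D,) candidate depth offsets around depth_init.
    """
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    # Scale normalized coordinates to metric size, then rotate into the camera frame.
    local = (nocs * dims) @ R.T
    # Viewing ray through the projected object center (z component is 1).
    ray = np.linalg.inv(K) @ np.array([center_px[0], center_px[1], 1.0])

    costs = []
    for dz in offsets:
        center = ray * (depth_init + dz)      # object center at this candidate depth
        proj = (local + center) @ K.T         # project the 3D points to the image
        uv = proj[:, :2] / proj[:, 2:3]
        # Geometric error: distance between reprojected and observed pixels.
        costs.append(np.linalg.norm(uv - pixels, axis=1).mean())
    return np.array(costs)
```

In this sketch, the candidate whose reprojected coordinates best match the observed pixels has the lowest cost. According to the abstract, the paper instead feeds the joint semantic and geometric volume to a refinement network trained with direct depth supervision.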


research
10/23/2019

Deep Classification Network for Monocular Depth Estimation

Monocular Depth Estimation is usually treated as a supervised and regres...
research
04/18/2021

MonoGRNet: A General Framework for Monocular 3D Object Detection

Detecting and localizing objects in the real 3D space, which plays a cru...
research
06/15/2022

MonoGround: Detecting Monocular 3D Objects from the Ground

Monocular 3D object detection has attracted great attention for its adva...
research
11/26/2018

MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization

Localizing objects in the real 3D space, which plays a crucial role in s...
research
06/04/2019

Triangulation Learning Network: from Monocular to Stereo 3D Object Detection

In this paper, we study the problem of 3D object detection from stereo i...
research
05/23/2019

Shift R-CNN: Deep Monocular 3D Object Detection with Closed-Form Geometric Constraints

We propose Shift R-CNN, a hybrid model for monocular 3D object detection...
research
08/04/2019

Unsupervised Learning of Depth and Deep Representation for Visual Odometry from Monocular Videos in a Metric Space

For ego-motion estimation, the feature representation of the scenes is c...
