PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points

by Siyuan Huang, et al.

Detecting 3D objects from a single RGB image is intrinsically ambiguous, thus requiring appropriate prior knowledge and intermediate representations as constraints to reduce the uncertainties and improve the consistency between the 2D image plane and the 3D world coordinates. To address this challenge, we propose to adopt perspective points as a new intermediate representation for 3D object detection, defined as the 2D projections of local Manhattan 3D keypoints to locate an object; these perspective points satisfy geometric constraints imposed by the perspective projection. We further devise PerspectiveNet, an end-to-end trainable model that simultaneously detects the 2D bounding box, 2D perspective points, and 3D object bounding box for each object from a single RGB image. PerspectiveNet yields three unique advantages: (i) 3D object bounding boxes are estimated based on perspective points, bridging the gap between 2D and 3D bounding boxes without the need for category-specific 3D shape priors. (ii) It predicts the perspective points by a template-based method, and a perspective loss is formulated to maintain the perspective constraints. (iii) It maintains the consistency between the 2D perspective points and 3D bounding boxes via a differentiable projective function. Experiments on the SUN RGB-D dataset show that the proposed method significantly outperforms existing RGB-based approaches for 3D object detection.
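
To make the notion of projecting 3D keypoints onto the image plane concrete, the following is a minimal sketch of a pinhole projection of the corners of a 3D bounding box. It is an illustration under stated assumptions, not the model described above: the choice of the eight box corners as keypoints, the box parametrization (center, size, yaw about the camera y-axis), the intrinsics matrix `K`, and the helper names `box_corners` and `project` are all hypothetical.

```python
import numpy as np


def box_corners(center, size, yaw):
    """Return the 8 corners (8x3) of a box in camera coordinates.

    Assumption: the box is parametrized by its center, its full extents
    along x, y, z, and a single rotation `yaw` about the camera y-axis.
    """
    signs = np.array(
        [[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
        dtype=float,
    )
    local = 0.5 * signs * np.asarray(size, dtype=float)
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return local @ rot.T + np.asarray(center, dtype=float)


def project(points, K):
    """Pinhole projection of Nx3 camera-frame points to Nx2 pixel coordinates."""
    uvw = points @ K.T
    return uvw[:, :2] / uvw[:, 2:3]  # divide by depth


# Hypothetical intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
corners_3d = box_corners(center=(0.0, 0.0, 4.0), size=(1.0, 1.0, 1.0), yaw=0.0)
corners_2d = project(corners_3d, K)
```

Every step here is a composition of matrix products and a division, so the same mapping written in an automatic-differentiation framework would be differentiable with respect to the box parameters, which is the property a consistency term between 2D points and 3D boxes relies on.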



From Points to Multi-Object 3D Reconstruction

We propose a method to detect and reconstruct multiple 3D objects from a...

Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation

Holistic 3D indoor scene understanding refers to jointly recovering the ...

Single Multi-feature detector for Amodal 3D Object Detection in RGB-D Images

This paper aims at fast and high-accuracy amodal 3D object detections in...

Deep Cuboid Detection: Beyond 2D Bounding Boxes

We present a Deep Cuboid Detector which takes a consumer-quality RGB ima...

Detecting Small, Densely Distributed Objects with Filter-Amplifier Networks and Loss Boosting

Detecting small, densely distributed objects is a significant challenge:...

Field-of-View IoU for Object Detection in 360° Images

360 cameras have gained popularity over the last few years. In this pape...

Localizing Firearm Carriers by Identifying Human-Object Pairs

Visual identification of gunmen in a crowd is a challenging problem, tha...
