Deep Supervision with Shape Concepts for Occlusion-Aware 3D Object Parsing

12/08/2016
by Chi Li, et al.

Monocular 3D object parsing is highly desirable in various scenarios including occlusion reasoning and holistic scene interpretation. We present a deep convolutional neural network (CNN) architecture that, given a single RGB image, localizes semantic parts in the 2D image and in 3D space while inferring their visibility states. Our key insight is to exploit domain knowledge to regularize the network by deeply supervising its hidden layers so that they sequentially infer intermediate concepts associated with the final task. To acquire training data in the desired quantities with ground-truth 3D shape and relevant concepts, we render 3D object CAD models to generate large-scale synthetic data and simulate challenging occlusion configurations between objects. We train the network only on synthetic data and demonstrate state-of-the-art performance on real-image benchmarks, including an extended version of KITTI, PASCAL VOC, PASCAL3D+ and IKEA, for 2D and 3D keypoint localization and instance segmentation. The empirical results substantiate the utility of our deep supervision scheme by demonstrating effective transfer of knowledge from synthetic data to real images, resulting in less overfitting compared to standard end-to-end training.
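The deep supervision idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that each supervised hidden layer emits a prediction for one intermediate concept and that the training objective is a weighted sum of per-concept losses. The concept names (`visibility`, `keypoints_3d`, `keypoints_2d`), their order, the unit weights, the toy numbers, and the use of mean squared error for every concept are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List


def mse(pred: List[float], target: List[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(pred) != len(target):
        raise ValueError("prediction and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def deep_supervision_loss(
    outputs: Dict[str, List[float]],
    targets: Dict[str, List[float]],
    weights: Dict[str, float],
) -> float:
    """Weighted sum of per-concept losses, one term per supervised layer.

    Every supervised layer contributes its own loss term, so gradient
    signal reaches the hidden layers directly instead of only through
    the final output.
    """
    return sum(weights[name] * mse(outputs[name], targets[name]) for name in targets)


# Hypothetical outputs of three supervised layers, listed shallow to deep.
outputs = {
    "visibility": [0.9, 0.2],         # assumed: per-part visibility scores
    "keypoints_3d": [0.1, 0.4, 0.8],  # assumed: flattened 3D coordinates
    "keypoints_2d": [0.5, 0.5],       # assumed: flattened 2D coordinates
}
# Ground truth for each concept, available because the data are rendered.
targets = {
    "visibility": [1.0, 0.0],
    "keypoints_3d": [0.0, 0.5, 1.0],
    "keypoints_2d": [0.5, 0.25],
}
# Assumed equal weighting of the concepts.
weights = {"visibility": 1.0, "keypoints_3d": 1.0, "keypoints_2d": 1.0}

total = deep_supervision_loss(outputs, targets, weights)
print(f"total deeply supervised loss: {total:.5f}")
```

In a real network the visibility term would more plausibly use a classification loss, and the per-concept weights would be tuned; the sketch only shows how losses attached to hidden layers combine into a single objective.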

research
01/08/2018

Deep Supervision with Intermediate Concepts

Recent data-driven approaches to scene interpretation predominantly pose...
research
04/03/2018

3D Interpreter Networks for Viewer-Centered Wireframe Modeling

Understanding 3D object structure from a single image is an important bu...
research
12/03/2020

Robust Instance Segmentation through Reasoning about Multi-Object Occlusion

Analyzing complex scenes with Deep Neural Networks is a challenging task...
research
12/13/2018

Scene Recomposition by Learning-based ICP

By moving a depth sensor around a room, we compute a 3D CAD model of the...
research
10/25/2017

Complete 3D Scene Parsing from Single RGBD Image

Inferring the location, shape, and class of each object in a single imag...
research
03/08/2021

Look, Evolve and Mold: Learning 3D Shape Manifold via Single-view Synthetic Data

With daily observation and prior knowledge, it is easy for us human to i...
research
04/14/2021

Weakly But Deeply Supervised Occlusion-Reasoned Parametric Layouts

We propose an end-to-end network that takes a single perspective RGB ima...
