Compositionally Generalizable 3D Structure Prediction

by Songfang Han, et al.

Single-image 3D shape reconstruction is an important and long-standing problem in computer vision. A plethora of existing works is constantly pushing the state-of-the-art performance in the deep learning era. However, there remains a much more difficult and largely under-explored issue: how to generalize the learned skills to novel, unseen object categories whose shape geometry distributions are very different. In this paper, we bring in the concept of compositional generalizability and propose a novel framework that factorizes the 3D shape reconstruction problem into proper sub-problems, each of which is tackled by a carefully designed neural sub-module with a generalizability guarantee. The intuition behind our formulation is that object parts (slates and cylindrical parts), their relationships (adjacency, equal-length, and parallelism), and shape substructures (T-junctions and a symmetric group of parts) are mostly shared across object categories, even though the object geometry may look very different (chairs and cabinets). Experiments on PartNet show that we achieve performance superior to that of baseline methods, which validates our problem factorization and network designs.




Learning to Group: A Bottom-Up Framework for 3D Part Discovery in Unseen Categories

We address the problem of discovering 3D parts for objects in unseen cat...

3D Reconstruction of Novel Object Shapes from Single Images

The key challenge in single image 3D shape reconstruction is to ensure t...

PartGlot: Learning Shape Part Segmentation from Language Reference Games

We introduce PartGlot, a neural framework and associated architectures f...

A Graph Theoretic Approach for Object Shape Representation in Compositional Hierarchies Using a Hybrid Generative-Descriptive Model

A graph theoretic approach is proposed for object shape representation i...

Object-Centric Photometric Bundle Adjustment with Deep Shape Prior

Reconstructing 3D shapes from a sequence of images has long been a probl...

Im2Avatar: Colorful 3D Reconstruction from a Single Image

Existing works on single-image 3D reconstruction mainly focus on shape r...

GMNet: Graph Matching Network for Large Scale Part Semantic Segmentation in the Wild

The semantic segmentation of parts of objects in the wild is a challengi...