Self-supervised Single-view 3D Reconstruction via Semantic Consistency

by Xueting Li, et al.

We learn, from a collection of 2D images and silhouettes, a self-supervised single-view 3D reconstruction model that predicts the 3D mesh shape, texture, and camera pose of a target object. The proposed method does not require 3D supervision, manually annotated keypoints, multi-view images of an object, or a prior 3D template. The key insight of our work is that objects can be represented as a collection of deformable parts, and each part is semantically coherent across different instances of the same category (e.g., wings on birds and wheels on cars). Therefore, by leveraging part segmentations learned in a self-supervised manner from a large collection of category-specific images, we can effectively enforce semantic consistency between the reconstructed meshes and the original images. This significantly reduces ambiguities in the joint prediction of an object's shape, camera pose, and texture. To the best of our knowledge, we are the first to attempt the single-view reconstruction problem without a category-specific template mesh or semantic keypoints. Our model can therefore generalize easily to object categories that lack such labels, e.g., horses and penguins. Through a variety of experiments on several categories of deformable and rigid objects, we demonstrate that our unsupervised method performs comparably to, if not better than, existing category-specific reconstruction methods learned with supervision.



Learning Category-Specific Mesh Reconstruction from Image Collections

We present a learning framework for recovering the 3D shape, camera, and...

Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency

Approaches for single-view reconstruction typically rely on viewpoint an...

Weak Multi-View Supervision for Surface Mapping Estimation

We propose a weakly-supervised multi-view learning approach to learn cat...

3D Magic Mirror: Clothing Reconstruction from a Single Image via a Causal Perspective

This research aims to study a self-supervised 3D clothing reconstruction...

GaussiGAN: Controllable Image Synthesis with 3D Gaussians from Unposed Silhouettes

We present an algorithm that learns a coarse 3D representation of object...

S3K: Self-Supervised Semantic Keypoints for Robotic Manipulation via Multi-View Consistency

A robot's ability to act is fundamentally constrained by what it can per...

Unsupervised Severely Deformed Mesh Reconstruction (DMR) from a Single-View Image

Much progress has been made in the supervised learning of 3D reconstruct...

Code Repositories


Self-supervised Single-view 3D Reconstruction
