
Self-Supervised Viewpoint Learning From Image Collections

by Siva Karthik Mustikovela, et al.
University of Heidelberg

Training deep neural networks to estimate the viewpoint of objects requires large labeled training datasets. However, manually labeling viewpoints is notoriously hard, error-prone, and time-consuming. On the other hand, it is relatively easy to mine many unlabeled images of an object category, e.g., cars or faces, from the internet. We ask whether such unlabeled collections of in-the-wild images can be used to train viewpoint estimation networks for general object categories purely via self-supervision. Self-supervision here means that the only true supervisory signal available to the network is the input image itself. We propose a novel learning framework that uses an analysis-by-synthesis paradigm to reconstruct images in a viewpoint-aware manner with a generative network, together with symmetry and adversarial constraints, to supervise our viewpoint estimation network. We show that our approach performs competitively with fully-supervised approaches for several object categories such as human faces, cars, buses, and trains. Our work opens up further research in self-supervised viewpoint learning and serves as a robust baseline for it. We open-source our code at
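
The symmetry constraint mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give its formulation, so everything below is an assumption made for illustration: the (azimuth, elevation, tilt) parameterization in radians, the sign convention under mirroring, the squared-error penalty, and the hypothetical names `toy_predict` and `symmetry_loss`. The idea shown is only that a viewpoint predicted for a horizontally mirrored image should agree with the mirrored viewpoint predicted for the original image.

```python
import numpy as np

def flip_horizontal(image):
    """Mirror an H x W x C image left to right."""
    return image[:, ::-1, :]

def mirrored_viewpoint(view):
    """Viewpoint expected for the mirrored image.
    Assumed convention: azimuth and tilt change sign, elevation is unchanged."""
    azimuth, elevation, tilt = view
    return np.array([-azimuth, elevation, -tilt])

def angle_diff(a, b):
    """Signed difference between angles in radians, wrapped to [-pi, pi)."""
    return (a - b + np.pi) % (2 * np.pi) - np.pi

def symmetry_loss(predict, image):
    """Mean squared disagreement between the prediction on the mirrored
    image and the mirrored prediction on the original image."""
    view = predict(image)
    view_flipped = predict(flip_horizontal(image))
    return float(np.mean(angle_diff(view_flipped, mirrored_viewpoint(view)) ** 2))

def toy_predict(image):
    """Hypothetical stand-in for a viewpoint network: azimuth is the
    brightness difference between the right and left image halves."""
    half = image.shape[1] // 2
    left = image[:, :half].mean()
    right = image[:, -half:].mean()
    return np.array([right - left, image.mean(), 0.0])

rng = np.random.default_rng(0)
image = rng.random((4, 6, 3))
print(symmetry_loss(toy_predict, image))
```

In this sketch a predictor that respects the mirroring convention incurs a loss near zero, while one that ignores it (for example, one that always returns the same nonzero azimuth) is penalized. In a training setup, a term of this kind would be one of several losses rather than the sole objective.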



Self-Supervised Object Detection via Generative Image Synthesis

We present SSOD, the first end-to-end analysis-by-synthesis framework wi...

SSR: Semi-supervised Soft Rasterizer for single-view 2D to 3D Reconstruction

Recent work has made significant progress in learning object meshes with...

VIBUS: Data-efficient 3D Scene Parsing with VIewpoint Bottleneck and Uncertainty-Spectrum Modeling

Recently, 3D scene parsing with deep learning approaches has been a hea...

Few-shot Geometry-Aware Keypoint Localization

Supervised keypoint localization methods rely on large manually labeled ...

Unsupervised Learning of Probably Symmetric Deformable 3D Objects from Images in the Wild

We propose a method to learn 3D deformable object categories from raw si...

ViewNeRF: Unsupervised Viewpoint Estimation Using Category-Level Neural Radiance Fields

We introduce ViewNeRF, a Neural Radiance Field-based viewpoint estimatio...

ViewNet: Unsupervised Viewpoint Estimation from Conditional Generation

Understanding the 3D world without supervision is currently a major chal...

Code Repositories


SPACE: Unsupervised Object-Oriented Scene Representation via Spatial Attention and Decomposition
