Self-supervised 3D Shape and Viewpoint Estimation from Single Images for Robotics

10/17/2019
by Oier Mees, et al.

We present a convolutional neural network for joint 3D shape prediction and viewpoint estimation from a single input image. During training, the network receives its learning signal from the silhouette of the object in the input image, a form of self-supervision. It requires no ground-truth 3D shapes or viewpoints. Because it relies on such a weak form of supervision, our approach can easily be applied to real-world data. We demonstrate that our method produces reasonable qualitative and quantitative results on natural images for both shape estimation and viewpoint prediction. Unlike previous approaches, our method does not require multiple views of the same object instance in the dataset, which significantly expands its applicability in practical robotics scenarios. We showcase this by using the hallucinated shapes to improve performance on the task of grasping real-world objects, both in simulation and with a PR2 robot.
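The idea of supervising a 3D shape with only a 2D silhouette can be illustrated with a small sketch. This is not the authors' implementation: it assumes a voxel occupancy grid as the shape representation, an axis-aligned orthographic projection as a stand-in for the predicted viewpoint, and a binary cross-entropy comparison against the object mask. The function names `project_silhouette` and `silhouette_loss` are hypothetical.

```python
import numpy as np


def project_silhouette(occupancy: np.ndarray, axis: int = 2) -> np.ndarray:
    """Soft orthographic projection of an occupancy grid (values in [0, 1]).

    Each pixel is the probability that at least one voxel along its ray
    is occupied. The `axis` argument is an assumed stand-in for a viewpoint.
    """
    return 1.0 - np.prod(1.0 - occupancy, axis=axis)


def silhouette_loss(pred_sil: np.ndarray, target_mask: np.ndarray,
                    eps: float = 1e-7) -> float:
    """Binary cross-entropy between a projected silhouette and an object mask."""
    p = np.clip(pred_sil, eps, 1.0 - eps)
    bce = target_mask * np.log(p) + (1.0 - target_mask) * np.log(1.0 - p)
    return float(-np.mean(bce))


# Toy data: a 2x2x2 cube centred in a 4x4x4 grid, and its 4x4 mask.
cube = np.zeros((4, 4, 4), dtype=np.float64)
cube[1:3, 1:3, 1:3] = 1.0
mask = np.zeros((4, 4), dtype=np.float64)
mask[1:3, 1:3] = 1.0

good_loss = silhouette_loss(project_silhouette(cube), mask)
empty_loss = silhouette_loss(project_silhouette(np.zeros_like(cube)), mask)
```

In this toy setup, the grid whose projection matches the mask receives a lower loss than an empty grid, which is the kind of signal a network could be trained on without 3D ground truth.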

research
03/22/2020

Self-Supervised 2D Image to 3D Shape Translation with Disentangled Representations

We present a framework to translate between 2D image views and 3D object...
research
10/18/2021

Learning multiplane images from single views with self-supervision

Generating static novel views from an already captured image is a hard t...
research
07/02/2017

Automatic Trimap Generation for Image Matting

Image matting is a longstanding problem in computational photography. Al...
research
05/10/2017

Learning 3D Object Categories by Looking Around Them

Traditional approaches for learning 3D object categories use either synt...
research
01/23/2020

Learning Object Placements For Relational Instructions by Hallucinating Scene Representations

Robots coexisting with humans in their environment and performing servic...
research
08/02/2023

HANDAL: A Dataset of Real-World Manipulable Object Categories with Pose Annotations, Affordances, and Reconstructions

We present the HANDAL dataset for category-level object pose estimation ...
research
12/01/2022

ViewNet: Unsupervised Viewpoint Estimation from Conditional Generation

Understanding the 3D world without supervision is currently a major chal...
