NeRF-Supervision: Learning Dense Object Descriptors from Neural Radiance Fields

03/03/2022
by Lin Yen-Chen, et al.

Thin, reflective objects such as forks and whisks are common in our daily lives, but they are particularly challenging for robot perception because it is hard to reconstruct them using commodity RGB-D cameras or multi-view stereo techniques. While traditional pipelines struggle with objects like these, Neural Radiance Fields (NeRFs) have recently been shown to be remarkably effective for performing view synthesis on objects with thin structures or reflective materials. In this paper we explore the use of NeRF as a new source of supervision for robust robot vision systems. In particular, we demonstrate that a NeRF representation of a scene can be used to train dense object descriptors. We use an optimized NeRF to extract dense correspondences between multiple views of an object, and then use these correspondences as training data for learning a view-invariant representation of the object. NeRF's usage of a density field allows us to reformulate the correspondence problem with a novel distribution-of-depths formulation, as opposed to the conventional approach of using a depth map. Dense correspondence models supervised with our method significantly outperform off-the-shelf learned descriptors by 106% (PCK@3px metric, more than doubling performance) and outperform our baseline supervised with multi-view stereo by 29%. The learned dense descriptors enable robots to perform accurate 6-degree-of-freedom (6-DoF) pick and place of thin and reflective objects.
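
The abstract does not spell out the distribution-of-depths formulation or the exact PCK definition, so the following is only a minimal sketch under stated assumptions. It uses the standard NeRF volume-rendering weights to turn the densities sampled along one camera ray into a discrete distribution over depth, and it computes PCK@k as the fraction of predicted pixels lying within k pixels of the ground truth. The function names `ray_depth_distribution` and `pck`, the handling of the last sample interval, and the inclusive threshold are all illustrative choices, not taken from the paper.

```python
import numpy as np


def ray_depth_distribution(sigmas, t_vals):
    """Convert densities sampled along one ray into a distribution over depth.

    sigmas: densities at each sample (hypothetical input from a trained NeRF).
    t_vals: increasing depths of those samples along the ray.
    """
    sigmas = np.asarray(sigmas, dtype=float)
    t_vals = np.asarray(t_vals, dtype=float)
    # Interval lengths; the last interval repeats the previous one (assumption).
    deltas = np.diff(t_vals, append=2.0 * t_vals[-1] - t_vals[-2])
    # Opacity of each interval.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance: chance the ray reaches sample i without being absorbed.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas[:-1])])
    weights = trans * alphas
    total = weights.sum()
    # Normalize so the weights can be treated as probabilities.
    return weights / total if total > 0 else weights


def pck(pred_px, true_px, threshold_px=3.0):
    """Fraction of predicted keypoints within threshold_px of the ground truth."""
    pred = np.asarray(pred_px, dtype=float)
    true = np.asarray(true_px, dtype=float)
    errors = np.linalg.norm(pred - true, axis=-1)
    return float(np.mean(errors <= threshold_px))


if __name__ == "__main__":
    t = np.array([1.0, 2.0, 3.0, 4.0])
    probs = ray_depth_distribution([0.0, 0.0, 50.0, 0.0], t)
    rng = np.random.default_rng(0)
    # Draw one depth from the distribution instead of using a single depth map value.
    sampled_depth = rng.choice(t, p=probs)
    print(probs, sampled_depth)
    print(pck([[0, 0], [10, 10]], [[1, 1], [20, 20]]))
```

A depth drawn this way could then be used to project a pixel from one view into another to form a training correspondence. Compared with a single depth-map value, a distribution keeps the ambiguity that arises on thin or reflective surfaces, where the density along a ray may have more than one mode.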
