Rapid Pose Label Generation through Sparse Representation of Unknown Objects

11/07/2020
by Rohan Pratap Singh, et al.

Deep Convolutional Neural Networks (CNNs) have been successfully deployed on robots for 6-DoF object pose estimation through visual perception. However, obtaining labeled data at the scale required for supervised training of CNNs is difficult, and more so when the object is novel and no 3D model is available. To this end, this work presents an approach for rapidly generating real-world, pose-annotated RGB-D data for unknown objects. Our method not only circumvents the need for a prior 3D object model (textured or otherwise) but also bypasses complicated setups of fiducial markers, turntables, and sensors. With the help of a human user, we first collect minimal labels of an ordered set of arbitrarily chosen keypoints over a set of RGB-D videos. Then, by solving an optimization problem, we combine these labels under a world frame to recover a sparse, keypoint-based representation of the object. From this sparse representation we derive a dense model and the pose labels for each image frame in the set of scenes. We show that the sparse model can also be used to scale efficiently to a large number of new scenes. We demonstrate the practicality of the generated labeled dataset by training a pipeline for 6-DoF object pose estimation and a pixel-wise segmentation network.
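To make the role of the sparse keypoint model concrete, the following is a minimal sketch of how a 6-DoF pose could be recovered once the same ordered keypoints are known both in the object frame and, via depth, in a camera frame. It uses the standard Kabsch (orthogonal Procrustes) least-squares alignment, which is an assumption chosen for illustration and not necessarily the optimization the authors solve. The function name `rigid_transform_from_keypoints` and the sample coordinates are hypothetical.

```python
import numpy as np


def rigid_transform_from_keypoints(model_pts, observed_pts):
    """Least-squares rigid fit: observed_pts ~= R @ model_pts + t.

    model_pts:    (N, 3) keypoints of the sparse model in the object frame.
    observed_pts: (N, 3) the same keypoints, same order, in the camera frame.
    Returns a 3x3 rotation matrix R and a translation vector t of length 3.
    """
    model_pts = np.asarray(model_pts, dtype=float)
    observed_pts = np.asarray(observed_pts, dtype=float)

    # Center both point sets on their centroids.
    mu_m = model_pts.mean(axis=0)
    mu_o = observed_pts.mean(axis=0)

    # 3x3 cross-covariance between the centered sets.
    H = (model_pts - mu_m).T @ (observed_pts - mu_o)

    # Kabsch: rotation from the SVD, with a sign fix to exclude reflections.
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_o - R @ mu_m
    return R, t


# Hypothetical sparse model: four non-coplanar keypoints (meters).
model = np.array([
    [0.00, 0.00, 0.00],
    [0.10, 0.00, 0.00],
    [0.00, 0.05, 0.00],
    [0.00, 0.00, 0.08],
])

# Synthetic ground-truth pose: 30 degrees about z plus a translation.
theta = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([0.2, -0.1, 0.6])

# Keypoints as they would appear in the camera frame.
observed = model @ R_true.T + t_true

R_est, t_est = rigid_transform_from_keypoints(model, observed)
```

With noise-free correspondences, `R_est` and `t_est` match the synthetic pose up to floating-point error. With real, hand-labeled keypoints the fit is only a least-squares estimate, and at least three non-collinear keypoints are needed for the rotation to be determined.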

