Robobarista: Learning to Manipulate Novel Objects via Deep Multimodal Embedding

01/12/2016
by   Jaeyong Sung, et al.

There is a large variety of objects and appliances in human environments, such as stoves, coffee dispensers, and juice extractors. It is challenging for a roboticist to program a robot for each of these object types and for each of their instantiations. In this work, we present a novel approach to manipulation planning based on the idea that many household objects share similarly-operated object parts. We formulate manipulation planning as a structured prediction problem and learn to transfer manipulation strategies across different objects by embedding point-cloud, natural language, and manipulation trajectory data into a shared embedding space using a deep neural network. To learn semantically meaningful spaces throughout our network, we introduce a method for pre-training its lower layers for multimodal feature embedding and a method for fine-tuning this embedding space using a loss-based margin. To collect a large number of manipulation demonstrations for different objects, we develop a new crowd-sourcing platform called Robobarista. We test our model on our dataset consisting of 116 objects and appliances with 249 parts along with 250 language instructions, for which there are 1225 crowd-sourced manipulation demonstrations. We further show that our robot with our model can even prepare a cup of latte with appliances it has never seen before.
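The abstract does not spell out the form of the loss-based margin. As a rough illustration only, the sketch below assumes a hinge-style objective in which each non-matching trajectory must score below the matching one by a margin that grows with a task loss between the two trajectories. The function name, the `alpha` scale, the dot-product similarity, and the toy vectors are all assumptions made for this example, not the authors' formulation.

```python
def dot(u, v):
    """Dot product of two equal-length vectors, used here as the similarity score."""
    return sum(a * b for a, b in zip(u, v))


def loss_scaled_hinge(anchor, positive, negatives, task_losses, alpha=1.0):
    """Hinge loss whose margin for each negative scales with a task loss.

    anchor      -- embedding of a (point-cloud, language) pair (hypothetical)
    positive    -- embedding of a matching trajectory
    negatives   -- embeddings of other trajectories
    task_losses -- dissimilarity of each negative trajectory to the matching
                   one; assumed to lie in [0, 1]
    alpha       -- assumed scale factor applied to the margin
    """
    pos_sim = dot(anchor, positive)
    total = 0.0
    for neg, delta in zip(negatives, task_losses):
        margin = alpha * delta  # very different trajectories need a larger gap
        total += max(0.0, margin + dot(anchor, neg) - pos_sim)
    return total


# Toy embeddings: the first negative is nearly the same trajectory (small task
# loss), the second is very different (large task loss).
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.8, 0.2], [0.0, 1.0]]
task_losses = [0.05, 1.0]
example_loss = loss_scaled_hinge(anchor, positive, negatives, task_losses)
```

Under this assumption, a near-duplicate trajectory is penalized only if it outscores the matching one by almost nothing, while a very different trajectory is pushed further away in the shared space.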


