Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks

10/10/2022
by   Albert Yu, et al.
14

Demonstrations and natural language instructions are two common ways to specify and teach robots novel tasks. However, for many complex tasks, a demonstration or language instruction alone contains ambiguities, preventing tasks from being specified clearly. In such cases, a combination of both a demonstration and an instruction more concisely and effectively conveys the task to the robot than either modality alone. To instantiate this problem setting, we train a single multi-task policy on a few hundred challenging robotic pick-and-place tasks and propose DeL-TaCo (Joint Demo-Language Task Conditioning), a method for conditioning a robotic policy on task embeddings comprised of two components: a visual demonstration and a language instruction. By allowing these two modalities to mutually disambiguate and clarify each other during novel task specification, DeL-TaCo (1) substantially decreases the teacher effort needed to specify a new task and (2) achieves better generalization performance on novel objects and instructions over previous task-conditioning methods. To our knowledge, this is the first work to show that simultaneously conditioning a multi-task robotic manipulation policy on both demonstration and language embeddings improves sample efficiency and generalization over conditioning on either modality alone. See additional materials at https://sites.google.com/view/del-taco-learning

READ FULL TEXT

page 2

page 5

page 6

page 15

page 16

page 17

research
11/19/2018

Guiding Policies with Language via Meta-Learning

Behavioral skills or policies for autonomous agents are conventionally l...
research
06/30/2023

Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control

Our goal is for robots to follow natural language instructions like "put...
research
05/18/2023

Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model

Foundation models have made significant strides in various applications,...
research
05/26/2023

Demo2Code: From Summarizing Demonstrations to Synthesizing Code via Extended Chain-of-Thought

Language instructions and demonstrations are two natural ways for users ...
research
08/04/2023

Forget Demonstrations, Focus on Learning from Textual Instructions

This work studies a challenging yet more realistic setting for zero-shot...
research
08/22/2023

ROSGPT_Vision: Commanding Robots Using Only Language Models' Prompts

In this paper, we argue that the next generation of robots can be comman...
research
07/17/2020

Verbal Focus-of-Attention System for Learning-from-Demonstration

The Learning-from-Demonstration (LfD) framework aims to map human demons...

Please sign up or login with your details

Forgot password? Click here to reset