Instruction-driven history-aware policies for robotic manipulations

09/11/2022
by Pierre-Louis Guhur, et al.

In human environments, robots are expected to accomplish a variety of manipulation tasks given simple natural language instructions. Yet, robotic manipulation is extremely challenging as it requires fine-grained motor control, long-term memory as well as generalization to previously unseen tasks and environments. To address these challenges, we propose a unified transformer-based approach that takes into account multiple inputs. In particular, our transformer architecture integrates (i) natural language instructions and (ii) multi-view scene observations while (iii) keeping track of the full history of observations and actions. Such an approach enables learning dependencies between history and instructions and improves manipulation precision using multiple views. We evaluate our method on the challenging RLBench benchmark and on a real-world robot. Notably, our approach scales to 74 diverse RLBench tasks and outperforms the state of the art. We also address instruction-conditioned tasks and demonstrate excellent generalization to previously unseen variations.
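To make the three inputs concrete, here is a minimal sketch of how instruction tokens, multi-view observation tokens, and the history of past actions could be laid out in one sequence with a history-aware attention mask. It is an illustration under stated assumptions, not the architecture from the paper: the `Token` class, the `build_sequence` and `attention_mask` helpers, the token ordering, and the masking rule (each step attends to the instruction and to all steps up to and including itself) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    kind: str  # "instr", "obs", or "prev_action" (hypothetical labels)
    step: int  # time step; -1 marks instruction tokens
    view: int  # camera index; -1 when not applicable


def build_sequence(num_instr: int, num_steps: int, num_views: int) -> List[Token]:
    """Lay out instruction tokens, then per-step observation and action tokens."""
    tokens = [Token("instr", -1, -1) for _ in range(num_instr)]
    for t in range(num_steps):
        # One observation token per camera view at this step (assumption).
        for v in range(num_views):
            tokens.append(Token("obs", t, v))
        # The action taken at the previous step joins the history from step 1 on.
        if t > 0:
            tokens.append(Token("prev_action", t, -1))
    return tokens


def attention_mask(tokens: List[Token]) -> List[List[bool]]:
    """mask[i][j] is True when token i may attend to token j.

    Assumed rule: a token sees the instruction (step -1) and every token
    from its own or an earlier step, but nothing from later steps.
    """
    return [[tj.step <= ti.step for tj in tokens] for ti in tokens]


if __name__ == "__main__":
    seq = build_sequence(num_instr=3, num_steps=2, num_views=2)
    mask = attention_mask(seq)
    for tok, row in zip(seq, mask):
        print(f"{tok.kind:12s} step={tok.step:2d}", "".join("x" if m else "." for m in row))
```

Under this assumed rule, every observation can attend to the instruction and to all earlier observations and actions, which is one simple way to expose dependencies between history and instructions to a transformer.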


