PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining

03/15/2023
by Garrett Thomas, et al.

A rich representation is key to general robotic manipulation, but existing model architectures require large amounts of data to learn one. Unfortunately, ideal robotic manipulation training data, which comes in the form of expert visuomotor demonstrations for a variety of annotated tasks, is scarce. In this work we propose PLEX, a transformer-based architecture that learns from task-agnostic visuomotor trajectories accompanied by a much larger amount of task-conditioned object manipulation videos, a type of robotics-relevant data available in quantity. The key insight behind PLEX is that trajectories containing both observations and actions help induce a latent feature space and train a robot to execute task-agnostic manipulation routines, while a diverse set of video-only demonstrations can efficiently teach the robot how to plan in this feature space for a wide variety of tasks. In contrast to most works on robotic manipulation pretraining, PLEX learns a generalizable sensorimotor multi-task policy, not just an observational representation. We also show that using relative positional encoding in PLEX's transformers further increases its data efficiency when learning from human-collected demonstrations. Experiments showcase PLEX's generalization on the Meta-World-v2 benchmark and establish state-of-the-art performance in challenging Robosuite environments.
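The paper itself is not reproduced here, but the abstract's central idea lends itself to a minimal PyTorch sketch: a task-conditioned planner trained on video-only demonstrations to predict future latent observations, and an executor trained on visuomotor trajectories to realize latent targets as actions. All module names, dimensions, and losses below are illustrative assumptions for exposition, not the authors' implementation.

    # Illustrative sketch of a PLEX-style two-component policy (not the authors' code).
    # Assumptions: a "planner" learns task-conditioned dynamics in a latent observation
    # space from video-only data; an "executor" learns actions from paired trajectories.
    import torch
    import torch.nn as nn

    LATENT_DIM, ACTION_DIM = 256, 7

    class Planner(nn.Module):
        """Predicts the next latent observation from past latents and a task embedding.
        Trainable from video-only demonstrations (no actions required)."""
        def __init__(self):
            super().__init__()
            layer = nn.TransformerEncoderLayer(d_model=LATENT_DIM, nhead=8,
                                               batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=4)
            self.head = nn.Linear(LATENT_DIM, LATENT_DIM)

        def forward(self, latents, task_emb):
            # Prepend the task embedding as a conditioning token.
            x = torch.cat([task_emb.unsqueeze(1), latents], dim=1)
            h = self.encoder(x)
            return self.head(h[:, -1])  # predicted next latent observation

    class Executor(nn.Module):
        """Maps the current latent and the planner's target latent to an action.
        Trainable from task-agnostic visuomotor trajectories."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(2 * LATENT_DIM, 512), nn.ReLU(),
                nn.Linear(512, ACTION_DIM))

        def forward(self, latent, target_latent):
            return self.net(torch.cat([latent, target_latent], dim=-1))

    planner, executor = Planner(), Executor()
    latents = torch.randn(4, 10, LATENT_DIM)   # placeholder encoded video frames
    task_emb = torch.randn(4, LATENT_DIM)      # placeholder task conditioning
    actions = torch.randn(4, ACTION_DIM)       # placeholder expert actions

    # Video-only pretraining: regress the planner toward the next observed latent.
    plan_loss = nn.functional.mse_loss(planner(latents[:, :-1], task_emb),
                                       latents[:, -1])
    # Trajectory training: behavior-clone the executor on observation-action pairs.
    act_loss = nn.functional.mse_loss(executor(latents[:, -2], latents[:, -1]),
                                      actions)

The point of the split is data efficiency: the planner consumes the plentiful video-only data, while the scarce action-labeled trajectories only need to teach the comparatively task-agnostic executor.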
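The abstract also credits relative positional encoding with improved data efficiency on human-collected demonstrations. As a point of reference, the sketch below shows one common realization of relative positions, a T5-style learned bias added to attention logits; the clipping scheme and hyperparameters are assumptions and may differ from PLEX's exact formulation.

    # Minimal sketch of a relative positional bias for self-attention (T5-style).
    # Illustrative only; not necessarily the variant used in PLEX.
    import torch
    import torch.nn as nn

    class RelativeBias(nn.Module):
        def __init__(self, num_heads, max_dist=32):
            super().__init__()
            self.max_dist = max_dist
            # One learned bias per head per clipped relative distance.
            self.bias = nn.Embedding(2 * max_dist + 1, num_heads)

        def forward(self, seq_len):
            pos = torch.arange(seq_len)
            rel = (pos[None, :] - pos[:, None]).clamp(-self.max_dist, self.max_dist)
            # Shape (num_heads, seq_len, seq_len), added to the attention logits:
            # logits = q @ k.transpose(-2, -1) / d**0.5 + bias
            return self.bias(rel + self.max_dist).permute(2, 0, 1)

Because the bias depends only on the distance between timesteps, not their absolute indices, the same learned pattern transfers across trajectories of different lengths, which plausibly explains the data-efficiency gain on variable-length human demonstrations.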

