Efficient Learning of High Level Plans from Play

03/16/2023
by   Núria Armengol Urpí, et al.

Real-world robotic manipulation tasks remain an elusive challenge, since they require both fine-grained environment interaction and the ability to plan for long-horizon goals. Although deep reinforcement learning (RL) methods have shown encouraging results when planning end-to-end in high-dimensional environments, they remain fundamentally limited by poor sample efficiency due to inefficient exploration, and by the complexity of credit assignment over long horizons. In this work, we present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL to achieve long-horizon complex manipulation tasks. We leverage task-agnostic play data to learn a discrete behavioral prior over object-centric primitives, modeling their feasibility given the current context. We then design a high-level goal-conditioned policy which (1) uses primitives as building blocks to scaffold complex long-horizon tasks and (2) leverages the behavioral prior to accelerate learning. We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks and learns policies that can be easily transferred to physical hardware.
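
To make the idea of a discrete behavioral prior more concrete, the following is a minimal sketch of one way such a prior could restrict a high-level policy's choice among primitives. It is not taken from the paper: the thresholding rule, the epsilon-greedy selection, and the names `feasible_primitives`, `select_primitive`, `threshold`, and `epsilon` are all assumptions made for illustration. The prior probabilities and Q-values are passed in as plain lists, standing in for the outputs of learned models.

```python
import random


def feasible_primitives(prior_probs, threshold=0.05):
    """Indices of primitives whose prior probability exceeds `threshold`.

    `prior_probs` stands in for the output of a learned behavioral prior
    in the current context (hypothetical interface, not the paper's API).
    """
    feasible = [i for i, p in enumerate(prior_probs) if p > threshold]
    if feasible:
        return feasible
    # Fall back to the single most likely primitive so the set is never empty.
    return [max(range(len(prior_probs)), key=lambda i: prior_probs[i])]


def select_primitive(q_values, prior_probs, epsilon=0.1, threshold=0.05,
                     rng=random):
    """Epsilon-greedy choice restricted to primitives the prior deems feasible.

    `q_values` stands in for goal-conditioned value estimates, one per
    primitive (an assumption made for this sketch).
    """
    feasible = feasible_primitives(prior_probs, threshold)
    if rng.random() < epsilon:
        # Explore, but only among feasible primitives.
        return rng.choice(feasible)
    # Exploit: the best-valued primitive among the feasible ones.
    return max(feasible, key=lambda i: q_values[i])


# Primitive 0 has the highest value estimate but is ruled out by the prior.
q_values = [5.0, 1.0, 3.0]
prior_probs = [0.01, 0.60, 0.39]
choice = select_primitive(q_values, prior_probs, epsilon=0.0)
```

In this sketch, both exploration and exploitation are confined to the primitives the prior considers plausible, which is one way a prior could reduce wasted exploration. How the actual framework combines the prior with the policy may differ.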

research
11/18/2021

Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning

Operating in the real-world often requires agents to learn about a compl...
research
05/22/2023

FurnitureBench: Reproducible Real-World Benchmark for Long-Horizon Complex Manipulation

Reinforcement learning (RL), imitation learning (IL), and task and motio...
research
02/23/2022

Learning Multi-step Robotic Manipulation Policies from Visual Observation of Scene and Q-value Predictions of Previous Action

In this work, we focus on multi-step manipulation tasks that involve lon...
research
11/15/2021

Learning to Execute: Efficient Learning of Universal Plan-Conditioned Policies in Robotics

Applications of Reinforcement Learning (RL) in robotics are often limite...
research
02/26/2019

Autonomous Identification and Goal-Directed Invocation of Event-Predictive Behavioral Primitives

Voluntary behavior of humans appears to be composed of small, elementary...
research
10/13/2020

Broadly-Exploring, Local-Policy Trees for Long-Horizon Task Planning

Long-horizon planning in realistic environments requires the ability to ...
research
04/05/2022

Learning Pneumatic Non-Prehensile Manipulation with a Mobile Blower

We investigate pneumatic non-prehensile manipulation (i.e., blowing) as ...
