Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration

11/09/2022
by Alexandre Chenu, et al.

Deep reinforcement learning has been successfully applied to robotic control. However, the corresponding algorithms struggle on problems where the agent is rewarded only after completing a complex task. In this context, demonstrations can significantly speed up learning, but they can be costly to acquire. In this paper, we propose to leverage a sequential bias to learn control policies for complex robotic tasks from a single demonstration. Our method learns a goal-conditioned policy that drives the system between successive low-dimensional goals. This sequential goal-reaching approach raises a problem of compatibility between successive goals: the state in which the agent reaches one goal must still allow it to achieve the following goals. To tackle this problem, we present a new algorithm called DCIL-II. We show that DCIL-II solves challenging simulated tasks with unprecedented sample efficiency, including humanoid locomotion and stand-up, as well as fast running with a simulated Cassie robot. By leveraging sequentiality, our method is a step towards solving complex robotic tasks with minimal specification effort, a key feature for the next generation of autonomous robots.
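The idea of reaching successive low-dimensional goals taken from one demonstration can be illustrated with a minimal sketch. This is not DCIL-II: the abstract does not specify how goals are chosen or how the policy is trained, so the function names (`extract_goals`, `next_goal_index`), the distance-based spacing rule, and the tolerance test are all assumptions made for illustration only.

```python
import math
from typing import List, Sequence

Point = Sequence[float]  # a low-dimensional goal, e.g. an (x, y) position


def extract_goals(demo: List[Point], spacing: float) -> List[Point]:
    """Pick successive goals along one demonstration.

    Hypothetical rule: emit a goal each time the path length travelled
    since the previous goal reaches `spacing`.
    """
    goals: List[Point] = []
    travelled = 0.0
    last_added = -1
    for i in range(1, len(demo)):
        travelled += math.dist(demo[i - 1], demo[i])
        if travelled >= spacing:
            goals.append(demo[i])
            last_added = i
            travelled = 0.0
    # Always finish on the last demonstrated state so the full task is covered.
    if demo and last_added != len(demo) - 1:
        goals.append(demo[-1])
    return goals


def next_goal_index(state: Point, goals: List[Point], index: int,
                    tolerance: float) -> int:
    """Switch to the following goal once the current one is reached.

    Assumed reach test: Euclidean distance within `tolerance`.
    """
    if index < len(goals) and math.dist(state, goals[index]) <= tolerance:
        return index + 1
    return index


# A straight-line demonstration projected to 2D positions.
demo = [(float(x), 0.0) for x in range(6)]
goals = extract_goals(demo, spacing=2.0)
```

A goal-conditioned policy would be queried with `goals[index]` at every step, and `next_goal_index` would advance `index` as goals are reached. The sketch only checks the low-dimensional goal, so it ignores the compatibility problem described above: two states can match the same goal yet differ in other dimensions, such as velocity, in ways that make the next goal unreachable.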
