
Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration

by Alexandre Chenu, et al.

Deep Reinforcement Learning has been successfully applied to learn robotic control. However, the corresponding algorithms struggle when applied to problems where the agent is only rewarded after achieving a complex task. In this context, demonstrations can significantly speed up the learning process, but they can be costly to acquire. In this paper, we propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration. To do so, our method learns a goal-conditioned policy to control a system between successive low-dimensional goals. This sequential goal-reaching approach raises a problem of compatibility between successive goals: we need to ensure that the state resulting from reaching a goal is compatible with the achievement of the following goals. To tackle this problem, we present a new algorithm called DCIL-II. We show that DCIL-II solves challenging simulated tasks, such as humanoid locomotion and stand-up as well as fast running with a simulated Cassie robot, with unprecedented sample efficiency. By leveraging sequentiality, our method is a step towards solving complex robotic tasks with minimal specification effort, a key feature for the next generation of autonomous robots.
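
The sequential goal-reaching idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DCIL-II implementation: the function names (`extract_goals`, `rollout_goal_sequence`), the fixed-spacing rule for choosing goals along the demonstration, and the `project`, `policy`, and `step` callables are all assumptions made for illustration. In particular, the sketch does not address the compatibility between successive goals, which is the problem DCIL-II is introduced to tackle.

```python
import numpy as np


def extract_goals(demo_states, project, spacing):
    """Pick low-dimensional goals along a single demonstration.

    Assumption: a new goal is emitted whenever the projected state has moved
    at least `spacing` away from the previous goal; the projected final
    state is always the last goal.
    """
    goals = []
    last = project(demo_states[0])
    for state in demo_states[1:]:
        g = project(state)
        if np.linalg.norm(g - last) >= spacing:
            goals.append(g)
            last = g
    final = project(demo_states[-1])
    if not goals or not np.allclose(goals[-1], final):
        goals.append(final)
    return goals


def rollout_goal_sequence(start_state, goals, policy, step, project,
                          tol, max_steps_per_goal):
    """Chase the goals one after another with a goal-conditioned policy.

    Returns the final state and the number of goals reached. The rollout
    stops at the first goal that is not reached within the step budget.
    """
    state = start_state
    reached = 0
    for goal in goals:
        for _ in range(max_steps_per_goal):
            if np.linalg.norm(project(state) - goal) <= tol:
                break
            state = step(state, policy(state, goal))
        else:
            # Budget exhausted: accept only if the last step landed in range.
            if np.linalg.norm(project(state) - goal) > tol:
                return state, reached
        reached += 1
    return state, reached
```

Here `policy(state, goal)` stands in for the learned goal-conditioned policy and `step(state, action)` for the simulator. The state handed over when one goal is reached becomes the starting state for the next goal, which is why successive goals need to be compatible.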


