
Automatic Curricula via Expert Demonstrations

by Siyu Dai, et al.

We propose Automatic Curricula via Expert Demonstrations (ACED), a reinforcement learning (RL) approach that combines ideas from imitation learning and curriculum learning to solve challenging robotic manipulation tasks with sparse reward functions. Curriculum learning solves complicated RL tasks by introducing a sequence of auxiliary tasks of increasing difficulty, yet automatically designing effective and generalizable curricula remains a challenging research problem. ACED extracts curricula from a small number of expert demonstration trajectories by dividing the demonstrations into sections and initializing training episodes at states sampled from different sections. By moving the reset states from the end of the demonstrations toward the beginning as the learning agent improves, ACED not only learns challenging manipulation tasks with unseen initializations and goals, but also discovers novel solutions that are distinct from the demonstrations. In addition, ACED can be naturally combined with other imitation learning methods to use expert demonstrations more efficiently: we show that combining ACED with behavior cloning allows pick-and-place tasks to be learned with as few as 1 demonstration and block stacking tasks to be learned with 20 demonstrations.
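The reset-state curriculum described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `ReverseDemoCurriculum`, the equal-length split into sections, and the rule for moving to an earlier section (a success rate over a sliding window of recent episodes reaching a threshold) are all assumptions made for illustration, since the abstract only says that reset states move from the end to the beginning of the demonstrations as the agent improves.

```python
import random
from typing import Any, List, Sequence


class ReverseDemoCurriculum:
    """Hypothetical sampler of reset states that walks backwards through demos.

    Each demonstration is a sequence of states. It is split into
    `num_sections` roughly equal sections; training starts from the last
    section (closest to the goal) and moves toward the first one.
    """

    def __init__(self, demos: Sequence[Sequence[Any]], num_sections: int = 5,
                 success_threshold: float = 0.8, window: int = 20, seed: int = 0):
        if num_sections < 1 or not demos or any(len(d) == 0 for d in demos):
            raise ValueError("need num_sections >= 1 and non-empty demonstrations")
        self.demos: List[List[Any]] = [list(d) for d in demos]
        self.num_sections = num_sections
        self.success_threshold = success_threshold  # assumed advancement rule
        self.window = window                        # assumed evaluation window
        self.stage = num_sections - 1               # start at the final section
        self._results: List[bool] = []
        self._rng = random.Random(seed)

    def sample_reset_state(self) -> Any:
        """Return a state drawn uniformly from the current section of a random demo."""
        demo = self._rng.choice(self.demos)
        start = self.stage * len(demo) // self.num_sections
        end = (self.stage + 1) * len(demo) // self.num_sections
        end = min(max(end, start + 1), len(demo))   # keep the section non-empty
        return demo[self._rng.randrange(start, end)]

    def record_episode(self, success: bool) -> None:
        """Log an episode outcome and move one section earlier if the agent is doing well."""
        self._results.append(bool(success))
        recent = self._results[-self.window:]
        rate = sum(recent) / self.window
        if len(recent) == self.window and rate >= self.success_threshold and self.stage > 0:
            self.stage -= 1
            self._results.clear()                   # re-evaluate on the harder stage
```

In a training loop under these assumptions, each episode would begin by restoring the simulator to `sample_reset_state()`, and `record_episode(...)` would be called with the sparse success signal once the episode ends. How the reset state is restored in the environment, and what happens after the earliest section is mastered, are left out because the abstract does not specify them.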


Robust Imitation of a Few Demonstrations with a Backwards Model

Behavior cloning of expert demonstrations can speed up learning optimal ...

Third-Person Imitation Learning

Reinforcement learning (RL) makes it possible to train agents capable of...

Combining learned skills and reinforcement learning for robotic manipulations

Manipulation tasks such as preparing a meal or assembling furniture rema...

Self-Imitation Learning by Planning

Imitation learning (IL) enables robots to acquire skills quickly by tran...

Action Priors for Large Action Spaces in Robotics

In robotics, it is often not possible to learn useful policies using pur...

Adaptive Curriculum Generation from Demonstrations for Sim-to-Real Visuomotor Control

We propose Adaptive Curriculum Generation from Demonstrations (ACGD) for...

Automated curriculum generation for Policy Gradients from Demonstrations

In this paper, we present a technique that improves the process of train...