Overcoming Exploration in Reinforcement Learning with Demonstrations

09/28/2017
by Ashvin Nair, et al.

Exploration in environments with sparse rewards has been a persistent problem in reinforcement learning (RL). Many tasks are natural to specify with a sparse reward, and manually shaping a reward function can result in suboptimal performance. However, finding a non-zero reward is exponentially more difficult with increasing task horizon or action dimensionality. This puts many real-world tasks out of practical reach of RL methods. In this work, we use demonstrations to overcome the exploration problem and successfully learn to perform long-horizon, multi-step robotics tasks with continuous control such as stacking blocks with a robot arm. Our method, which builds on top of Deep Deterministic Policy Gradients and Hindsight Experience Replay, provides an order of magnitude of speedup over RL on simulated robotics tasks. It is simple to implement and makes only the additional assumption that we can collect a small set of demonstrations. Furthermore, our method is able to solve tasks not solvable by either RL or behavior cloning alone, and often ends up outperforming the demonstrator policy.
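
To make the exploration claim concrete, here is a minimal toy sketch; it is not the paper's environment or method. It assumes a hypothetical task in which the sparse reward is earned only if one specific action out of `n_actions` is chosen at every one of `horizon` steps. Under that assumption a uniform-random policy succeeds with probability `(1 / n_actions) ** horizon`, which shrinks exponentially as the horizon grows. The function names and the task itself are illustrative assumptions.

```python
import random


def analytic_success_rate(horizon, n_actions):
    """Closed-form success probability of a uniform-random policy
    in the toy task (assumption: one correct action per step)."""
    return (1.0 / n_actions) ** horizon


def random_success_rate(horizon, n_actions, episodes=20000, seed=0):
    """Monte Carlo estimate of the same quantity.

    Toy assumption: an episode earns the sparse reward only if
    action 0 is picked at every step; otherwise the reward is 0.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(episodes):
        if all(rng.randrange(n_actions) == 0 for _ in range(horizon)):
            hits += 1
    return hits / episodes


if __name__ == "__main__":
    # Compare the closed form with the sampled estimate as the horizon grows.
    for horizon in (1, 3, 5, 8):
        exact = analytic_success_rate(horizon, n_actions=4)
        sampled = random_success_rate(horizon, n_actions=4)
        print(f"horizon={horizon}  exact={exact:.6f}  sampled={sampled:.6f}")
```

In this toy setting, longer horizons quickly leave random exploration with almost no rewarded episodes to learn from, which is the gap the abstract says demonstrations are used to close.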
