Active Deep Q-learning with Demonstration

12/06/2018
by Si-An Chen, et al.

Recent research has shown that although Reinforcement Learning (RL) can benefit from expert demonstration, obtaining enough demonstration usually takes considerable effort. That effort makes it impractical to train decent RL agents with expert demonstration. In this work, we propose Active Reinforcement Learning with Demonstration (ARLD), a new framework that reduces the demonstration effort in RL by allowing the agent to actively query for demonstration during training. Under this framework, we propose Active Deep Q-Network (Active DQN), a novel query strategy that adapts to the dynamically changing state distribution during RL training by estimating the uncertainty of recent states. Active DQN then uses the expert demonstration data by optimizing a supervised max-margin loss in addition to the temporal difference loss of usual DQN training. We propose two methods of estimating the uncertainty, based on two state-of-the-art DQN models: the divergence of bootstrapped DQN and the variance of noisy DQN. The empirical results validate that both methods not only learn faster than passive expert demonstration methods given the same amount of demonstration, but also reach a super-expert level of performance across four different tasks.
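To make the ingredients named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the "divergence of bootstrapped DQN" can be approximated by a Jensen-Shannon divergence across the softmax policies of the bootstrap heads, that the max-margin loss takes the form used in Deep Q-learning from Demonstrations, and that a state is queried when its uncertainty exceeds a percentile of recently seen uncertainties. The names `head_divergence`, `max_margin_loss`, and `RecentUncertaintyQuery`, as well as the margin, window, and percentile values, are hypothetical.

```python
import numpy as np
from collections import deque


def softmax(q, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = (q - q.max(axis=-1, keepdims=True)) / temperature
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def head_divergence(q_heads):
    """Assumed uncertainty measure: Jensen-Shannon divergence across heads.

    q_heads has shape (K, A): Q-values of K bootstrap heads for A actions.
    Returns entropy(mean policy) - mean(entropy of each head's policy),
    which is zero when all heads agree.
    """
    p = softmax(np.asarray(q_heads, dtype=float))

    def entropy(d):
        return -np.sum(d * np.log(d + 1e-12), axis=-1)

    return float(entropy(p.mean(axis=0)) - entropy(p).mean())


def max_margin_loss(q_values, expert_action, margin=0.8):
    """Supervised max-margin loss: max_a [Q(s,a) + l(a_E,a)] - Q(s,a_E).

    l(a_E, a) is `margin` for non-expert actions and 0 for the expert action.
    The margin value is an illustrative assumption.
    """
    q = np.asarray(q_values, dtype=float)
    penalty = np.full(q.shape, margin)
    penalty[expert_action] = 0.0
    return float(np.max(q + penalty) - q[expert_action])


class RecentUncertaintyQuery:
    """Hypothetical query rule: ask the expert when the current uncertainty
    exceeds a percentile of the uncertainties of recently visited states."""

    def __init__(self, window=100, percentile=90.0):
        self.recent = deque(maxlen=window)
        self.percentile = percentile

    def should_query(self, uncertainty):
        ask = bool(self.recent) and uncertainty > np.percentile(
            list(self.recent), self.percentile
        )
        self.recent.append(uncertainty)
        return bool(ask)
```

In a training loop, the agent would compute `head_divergence` for the current state, call `should_query` to decide whether to request an expert action, and add `max_margin_loss` to the temporal difference loss for states that carry a demonstration. Because the threshold is derived from a sliding window, it tracks the changing state distribution rather than relying on a fixed cutoff.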

research
12/11/2018

Efficient Model-Free Reinforcement Learning Using Gaussian Process

Efficient Reinforcement Learning usually takes advantage of demonstratio...
research
09/27/2021

Efficiently Training On-Policy Actor-Critic Networks in Robotic Deep Reinforcement Learning with Demonstration-like Sampled Exploration

In complex environments with high dimension, training a reinforcement le...
research
12/21/2018

Pre-training with Non-expert Human Demonstration for Deep Reinforcement Learning

Deep reinforcement learning (deep RL) has achieved superior performance ...
research
03/01/2018

Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling

Recent advances in the field of inverse reinforcement learning (IRL) hav...
research
04/12/2017

Deep Q-learning from Demonstrations

Deep reinforcement learning (RL) has achieved several high profile succe...
research
10/09/2021

Human-Aware Robot Navigation via Reinforcement Learning with Hindsight Experience Replay and Curriculum Learning

In recent years, the growing demand for more intelligent service robots ...
research
08/18/2020

Residual Learning from Demonstration

Contacts and friction are inherent to nearly all robotic manipulation ta...
