Efficiently Training On-Policy Actor-Critic Networks in Robotic Deep Reinforcement Learning with Demonstration-like Sampled Exploration

09/27/2021
by   Zhaorun Chen, et al.

In complex, high-dimensional environments, training a reinforcement learning (RL) model from scratch often requires lengthy and tedious collection of agent-environment interactions. Leveraging expert demonstrations to guide the RL agent can instead boost sample efficiency and improve final convergence. To better integrate expert priors with on-policy RL models, we propose a generic framework for Learning from Demonstration (LfD) based on actor-critic algorithms. Technically, we first employ K-Means clustering to evaluate the similarity of sampled exploration to demonstration data. We then leverage the demonstrations by modifying the gradient update strategy to increase the likelihood of actions in similar frames. We conduct experiments on 4 standard benchmark environments in MuJoCo and 2 self-designed robotic environments. Results show that, under certain conditions, our algorithm can improve sample efficiency by 20 [...] on-policy algorithms, RL models can accelerate convergence and obtain better final mean episode rewards, especially in complex robotic contexts where interactions are expensive.
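The two steps named in the abstract (clustering demonstration data, then favoring actions taken in demonstration-like frames) might be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the similarity criterion or the exact gradient modification, so the distance threshold, the farthest-point initialization, the `demo_bonus` weight, and all function names and toy data below are assumptions.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's K-Means on a list of equal-length float tuples.

    Assumption: deterministic farthest-point initialization, chosen only
    to keep this toy example reproducible.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
    return centroids

def demo_like_mask(sampled_states, centroids, threshold):
    """True for each sampled state within `threshold` of some demo centroid.

    Assumption: Euclidean distance to the nearest centroid is the
    similarity measure; the abstract does not state the criterion.
    """
    return [
        min(math.dist(s, c) for c in centroids) <= threshold
        for s in sampled_states
    ]

def surrogate_loss(log_probs, advantages, mask, demo_bonus=0.1):
    """Policy-gradient surrogate with an extra log-likelihood term.

    Assumption: demonstration-like samples get `demo_bonus` added to their
    advantage, so minimizing the loss raises the likelihood of their actions.
    """
    terms = [
        -(adv + (demo_bonus if m else 0.0)) * lp
        for lp, adv, m in zip(log_probs, advantages, mask)
    ]
    return sum(terms) / len(terms)

# Hypothetical toy data: two groups of demonstration states in 2-D.
demo_states = [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.1),
               (5.0, 5.0), (5.1, 4.9), (4.9, 5.1)]
centroids = kmeans(demo_states, k=2)

# Hypothetical on-policy samples: two near the demonstrations, one far away.
sampled = [(0.05, 0.0), (5.0, 5.05), (2.5, 2.5)]
mask = demo_like_mask(sampled, centroids, threshold=0.5)
loss = surrogate_loss([-1.0, -1.0, -1.0], [1.0, 1.0, 1.0], mask)
```

In an actual actor-critic setup the log-probabilities and advantages would come from the policy and critic networks, and the loss would be differentiated by an autodiff library; the pure-Python version here only shows how the similarity mask could enter the update.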


research
12/21/2018

Pre-training with Non-expert Human Demonstration for Deep Reinforcement Learning

Deep reinforcement learning (deep RL) has achieved superior performance ...
research
02/24/2018

Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration

Reinforcement learning (RL) agents improve through trial-and-error, but ...
research
12/14/2022

Efficient Exploration in Resource-Restricted Reinforcement Learning

In many real-world applications of reinforcement learning (RL), performi...
research
09/09/2019

AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers

The exploration mechanism used by a Deep Reinforcement Learning (RL) age...
research
09/17/2021

Efficient State Representation Learning for Dynamic Robotic Scenarios

While the rapid progress of deep learning fuels end-to-end reinforcement...
research
12/06/2018

Active Deep Q-learning with Demonstration

Recent research has shown that although Reinforcement Learning (RL) can ...
research
05/04/2023

Simple Noisy Environment Augmentation for Reinforcement Learning

Data augmentation is a widely used technique for improving model perform...
