Self-Imitation Learning from Demonstrations

03/21/2022
by Georgiy Pshikhachev, et al.

Despite numerous breakthroughs achieved with Reinforcement Learning (RL), solving environments with sparse rewards remains a challenging task that requires sophisticated exploration. Learning from Demonstrations (LfD) remedies this issue by guiding the agent's exploration towards states experienced by an expert. Naturally, the benefits of this approach hinge on the quality of the demonstrations, which are rarely optimal in realistic scenarios. Modern LfD algorithms require meticulous tuning of the hyperparameters that control the influence of demonstrations and, as we show in the paper, struggle to learn from suboptimal demonstrations. To address these issues, we extend Self-Imitation Learning (SIL), a recent RL algorithm that exploits the agent's past good experience, to the LfD setup by initializing its replay buffer with demonstrations. We denote our algorithm as SIL from Demonstrations (SILfD). We empirically show that SILfD can learn from demonstrations that are noisy or far from optimal and can automatically adjust the influence of demonstrations throughout training without additional hyperparameters or handcrafted schedules. We also find SILfD superior to the existing state-of-the-art LfD algorithms in sparse environments, especially when demonstrations are highly suboptimal.
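
The core idea, seeding the SIL replay buffer with demonstrations, can be illustrated with a minimal sketch. It is not the authors' implementation: the names `discounted_returns`, `ReplayBuffer`, `sil_loss`, and `value_coef` are hypothetical, and the loss is assumed to follow the clipped-advantage objective of the original SIL formulation, in which a stored transition contributes only when its return exceeds the current value estimate.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.99):
    """Return-to-go R_t = sum_k gamma^k * r_{t+k} for a single episode."""
    returns = np.zeros(len(rewards), dtype=np.float64)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


class ReplayBuffer:
    """Stores (state, action, return) triples; hypothetical helper class."""

    def __init__(self):
        self.states, self.actions, self.returns = [], [], []

    def add_episode(self, states, actions, rewards, gamma=0.99):
        # Demonstrations and the agent's own episodes use the same path,
        # so no separate demonstration weight is needed in this sketch.
        self.states.extend(states)
        self.actions.extend(actions)
        self.returns.extend(discounted_returns(rewards, gamma).tolist())

    def __len__(self):
        return len(self.returns)


def sil_loss(log_probs, values, returns, value_coef=0.01):
    """Assumed SIL objective: imitate only where the return beats V(s)."""
    log_probs = np.asarray(log_probs, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    returns = np.asarray(returns, dtype=np.float64)
    # Clipped advantage: zero once the value estimate reaches the return.
    advantage = np.maximum(returns - values, 0.0)
    policy_loss = -(log_probs * advantage).mean()
    value_loss = 0.5 * (advantage ** 2).mean()
    return policy_loss + value_coef * value_loss
```

Under this reading, a demonstration whose return falls below the agent's current value estimate produces a zero clipped advantage and stops affecting the update, which is one plausible way the influence of demonstrations could fade without a handcrafted schedule.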

research
07/25/2021

Reinforced Imitation Learning by Free Energy Principle

Reinforcement Learning (RL) requires a large amount of exploration espec...
research
06/17/2020

Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations

Currently, deep reinforcement learning (RL) shows impressive results in ...
research
10/18/2022

CEIP: Combining Explicit and Implicit Priors for Reinforcement Learning with Demonstrations

Although reinforcement learning has found widespread use in dense reward...
research
11/16/2021

Improving Learning from Demonstrations by Learning from Experience

How to make imitation learning more general when demonstrations are rela...
research
01/11/2022

Reward Relabelling for combined Reinforcement and Imitation Learning on sparse-reward tasks

During recent years, deep reinforcement learning (DRL) has made successf...
research
01/30/2019

Go-Explore: a New Approach for Hard-Exploration Problems

A grand challenge in reinforcement learning is intelligent exploration, ...
research
10/26/2022

D-Shape: Demonstration-Shaped Reinforcement Learning via Goal Conditioning

While combining imitation learning (IL) and reinforcement learning (RL) ...
