Hierarchical Deep Q-Network with Forgetting from Imperfect Demonstrations in Minecraft

12/18/2019
by Alexey Skrynnik, et al.

We present hierarchical Deep Q-Network with Forgetting (HDQF), which took first place in the MineRL competition. HDQF learns from imperfect demonstrations by exploiting the hierarchical structure of expert trajectories, extracting an effective sequence of meta-actions and subgoals. We introduce a structured, task-dependent replay buffer and a forgetting technique that allow the HDQF agent to gradually erase poor-quality expert data from the buffer. In this paper we present the details of the HDQF algorithm and report experimental results in the Minecraft domain.
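To make the replay buffer and forgetting mechanism concrete, below is a minimal Python sketch, assuming a two-pool buffer that mixes expert and agent transitions and periodically erases expert data. The class name ForgettingReplayBuffer, the fixed sampling ratio, and the oldest-first forgetting rule are illustrative assumptions for this sketch, not details taken from the paper.

```python
import random
from collections import deque

class ForgettingReplayBuffer:
    """Two-pool replay buffer that mixes expert and agent transitions and
    gradually forgets expert data. Illustrative sketch only; the paper's
    buffer structure and its criterion for 'poor-quality' data may differ."""

    def __init__(self, capacity, expert_fraction=0.5, forget_rate=0.01):
        self.agent_data = deque(maxlen=capacity)
        self.expert_data = deque(maxlen=capacity)
        self.expert_fraction = expert_fraction  # share of expert samples per batch
        self.forget_rate = forget_rate          # fraction of expert data dropped per call

    def add_expert(self, transition):
        self.expert_data.append(transition)

    def add_agent(self, transition):
        self.agent_data.append(transition)

    def forget(self):
        # Drop the oldest expert transitions as a simple stand-in for
        # "poor quality"; over repeated calls the sampling mix shifts
        # toward the agent's own experience.
        for _ in range(int(len(self.expert_data) * self.forget_rate)):
            self.expert_data.popleft()

    def sample(self, batch_size):
        # Draw a fixed fraction from the expert pool and fill the rest
        # from agent experience, capped by what each pool contains.
        n_expert = min(int(batch_size * self.expert_fraction), len(self.expert_data))
        n_agent = min(batch_size - n_expert, len(self.agent_data))
        return random.sample(self.expert_data, n_expert) + \
               random.sample(self.agent_data, n_agent)
```

Calling forget() on a schedule (for example, once per training epoch) reproduces the gradual shift from demonstration data to self-generated experience described in the abstract.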


