Hierarchical Deep Q-Network with Forgetting from Imperfect Demonstrations in Minecraft

12/18/2019
by   Alexey Skrynnik, et al.

We present Hierarchical Deep Q-Network with Forgetting (HDQF), which took first place in the MineRL competition. HDQF learns from imperfect demonstrations and exploits the hierarchical structure of expert trajectories, extracting an effective sequence of meta-actions and subgoals. We introduce a structured, task-dependent replay buffer and a forgetting technique that together allow the HDQF agent to gradually erase poor-quality expert data from the buffer. In this paper, we present the details of the HDQF algorithm and give experimental results in the Minecraft domain.
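To make the idea of a task-dependent buffer with forgetting more concrete, the following is a minimal Python sketch, not the HDQF implementation. The class name `ForgettingReplayBuffer`, the linear forgetting schedule, the `forget_steps` parameter, and the subtask names are all assumptions made for illustration; the abstract does not specify how expert data is scheduled for removal.

```python
import random
from collections import deque


class ForgettingReplayBuffer:
    """Replay buffer for one subtask that gradually drops expert data.

    Hypothetical sketch: the linear schedule and all names are assumptions,
    not details taken from the paper.
    """

    def __init__(self, expert_transitions, agent_capacity=10000, forget_steps=1000):
        self.expert = list(expert_transitions)
        self.initial_expert_size = len(self.expert)
        self.agent = deque(maxlen=agent_capacity)  # oldest agent data is evicted
        self.forget_steps = forget_steps           # steps until no expert data remains
        self.steps = 0

    def add(self, transition):
        """Store one agent transition and forget a share of the expert data."""
        self.agent.append(transition)
        self.steps += 1
        # Fraction of expert data to keep, shrinking linearly from 1 to 0.
        remaining = 1.0 - min(self.steps / self.forget_steps, 1.0)
        keep = int(round(self.initial_expert_size * remaining))
        del self.expert[keep:]

    def sample(self, batch_size, rng=random):
        """Draw a uniform batch from the remaining expert and agent data."""
        pool = self.expert + list(self.agent)
        return rng.sample(pool, min(batch_size, len(pool)))


# One buffer per subtask (the subtask names here are made up).
buffers = {
    name: ForgettingReplayBuffer(
        [(name, "expert", i) for i in range(100)], forget_steps=50
    )
    for name in ("log", "planks")
}
```

In this sketch each subtask owns its own buffer, so a meta-controller could train the policy for one subgoal without mixing in transitions from another. Early batches are dominated by demonstrations, while later batches consist only of the agent's own experience.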
