Learning Montezuma's Revenge from a Single Demonstration

12/08/2018
by Tim Salimans, et al.

We propose a new method for learning from a single demonstration to solve hard exploration tasks such as the Atari game Montezuma's Revenge. Instead of imitating human demonstrations, as proposed in other recent works, our approach maximizes rewards directly. Our agent is trained using off-the-shelf reinforcement learning, but starts every episode by resetting to a state from a demonstration. By starting from such demonstration states, the agent needs much less exploration to learn a game than when it starts every episode from the beginning of the game. We analyze reinforcement learning for tasks with sparse rewards in a simple toy environment, where we show that the run-time of standard RL methods scales exponentially in the number of states between rewards. Our method reduces this to quadratic scaling, opening up many tasks that were previously infeasible. We then apply our method to Montezuma's Revenge, for which we present a trained agent achieving a high score of 74,500, better than any previously published result.
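
The scaling claim in the abstract can be illustrated with a small sketch. It is not the paper's own toy environment or analysis; it assumes a hypothetical chain task in which the agent must pick the one correct action in each of `n` states, any wrong action ends the episode without reward, and the demonstration is used only to choose which state an episode starts from. The names `random_exploration_expected_episodes`, `backward_curriculum`, and `correct_actions` are invented for illustration.

```python
import random


def random_exploration_expected_episodes(n, num_actions):
    """Expected episodes for a uniformly random policy starting from state 0.

    Each episode succeeds with probability num_actions ** -n, so the expected
    number of episodes before the first reward grows exponentially in n.
    """
    return num_actions ** n


def backward_curriculum(correct_actions, num_actions, rng):
    """Learn the chain by starting episodes from demonstration states, last first.

    `correct_actions` is hidden inside the (assumed) environment: the agent only
    observes whether an episode ended with a reward. Returns the learned policy
    and the total number of environment steps used.
    """
    n = len(correct_actions)
    policy = [None] * n  # learned action for each state of the chain
    total_steps = 0
    # Move the episode start backward along the demonstration.
    for start in range(n - 1, -1, -1):
        untried = list(range(num_actions))
        rng.shuffle(untried)
        while policy[start] is None:
            action = untried.pop()  # explore only at the start state
            total_steps += 1
            if action != correct_actions[start]:
                continue  # wrong action: episode ends without reward
            # Correct action: the already-learned policy carries the agent
            # through the remaining states to the reward.
            total_steps += n - 1 - start
            policy[start] = action
    return policy, total_steps


if __name__ == "__main__":
    rng = random.Random(0)
    n, num_actions = 20, 4
    correct_actions = [rng.randrange(num_actions) for _ in range(n)]
    policy, steps = backward_curriculum(correct_actions, num_actions, rng)
    print("steps with demonstration resets:", steps)
    print("expected episodes from scratch:",
          random_exploration_expected_episodes(n, num_actions))
```

Under these assumptions the reset-based learner uses at most `n * num_actions + n * (n - 1) / 2` steps, which is quadratic in `n`, while exploration from the start of the chain needs on the order of `num_actions ** n` episodes.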

research
07/18/2018

Backplay: "Man muss immer umkehren"

A long-standing problem in model free reinforcement learning (RL) is tha...
research
12/03/2022

Reinforcement learning with Demonstrations from Mismatched Task under Sparse Reward

Reinforcement learning often suffers from the sparse reward issue in real...
research
07/01/2022

Lifelong Inverse Reinforcement Learning

Methods for learning from demonstration (LfD) have shown success in acqu...
research
07/07/2020

Guided Exploration with Proximal Policy Optimization using a Single Demonstration

Solving sparse reward tasks through exploration is one of the major chal...
research
07/25/2020

Human Preference Scaling with Demonstrations For Deep Reinforcement Learning

The current reward learning from human preferences could be used for res...
research
09/21/2021

Long-Term Exploration in Persistent MDPs

Exploration is an essential part of reinforcement learning, which restri...
research
02/14/2020

Never Give Up: Learning Directed Exploration Strategies

We propose a reinforcement learning agent to solve hard exploration game...
