Model-Based Reinforcement Learning for Atari

03/01/2019
by   Łukasz Kaiser, et al.

Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with orders of magnitude fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games and achieve competitive results with only 100K interactions between the agent and the environment (400K frames), which corresponds to about two hours of real-time play.
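The interaction budget quoted above can be checked with a short back-of-the-envelope calculation. The sketch below assumes a frame skip of 4 (implied by 100K interactions equalling 400K frames) and an emulator rate of 60 frames per second; the latter is an assumption that the abstract does not state, and the function name is purely illustrative.

```python
# Rough check of the "about two hours" figure from the abstract.
FRAME_SKIP = 4          # frames per agent interaction, implied by 100K -> 400K
FRAMES_PER_SECOND = 60  # assumption: typical Atari emulation rate, not stated in the abstract


def real_time_hours(interactions, frame_skip=FRAME_SKIP, fps=FRAMES_PER_SECOND):
    """Convert a number of agent-environment interactions into hours of play."""
    frames = interactions * frame_skip
    return frames / fps / 3600


frames = 100_000 * FRAME_SKIP
hours = real_time_hours(100_000)
print(f"{frames} frames, about {hours:.2f} hours of play")
# prints "400000 frames, about 1.85 hours of play"
```

Under these assumptions the budget comes to a little under two hours, which is consistent with the figure quoted in the abstract.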

