Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation

02/12/2018
by   Dane Corneil, et al.
0

Modern reinforcement learning algorithms reach super-human performance in many board and video games, but they are sample inefficient, i.e. they typically require significantly more playing experience than humans to reach an equal performance level. To improve sample efficiency, an agent may build a model of the environment and use planning methods to update its policy. In this article we introduce VaST (Variational State Tabulation), which maps an environment with a high-dimensional state space (e.g. the space of visual inputs) to an abstract tabular environment. Prioritized sweeping with small backups, a highly efficient planning method, can then be used to update state-action values. We show how VaST can rapidly learn to maximize reward in tasks like 3D navigation and efficiently adapt to sudden changes in rewards or transition probabilities.

READ FULL TEXT

page 2

page 5

page 6

page 13

page 14

research
05/31/2018

Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update

We propose Episodic Backward Update - a new algorithm to boost the perfo...
research
12/02/2021

Maximum Entropy Model-based Reinforcement Learning

Recent advances in reinforcement learning have demonstrated its ability ...
research
04/30/2023

Posterior Sampling for Deep Reinforcement Learning

Despite remarkable successes, deep reinforcement learning algorithms rem...
research
06/15/2018

Improving width-based planning with compact policies

Optimal action selection in decision problems characterized by sparse, d...
research
06/05/2018

The Effect of Planning Shape on Dyna-style Planning in High-dimensional State Spaces

Dyna is an architecture for reinforcement learning agents that interleav...
research
07/12/2018

The Bottleneck Simulator: A Model-based Deep Reinforcement Learning Approach

Deep reinforcement learning has recently shown many impressive successes...
research
03/11/2021

Generalizable Episodic Memory for Deep Reinforcement Learning

Episodic memory-based methods can rapidly latch onto past successful str...

Please sign up or login with your details

Forgot password? Click here to reset