Deeper & Sparser Exploration

02/07/2019
by Divya Grover, et al.

We address the problem of efficient exploration by proposing a new meta-algorithm in the context of model-based online planning for Bayesian Reinforcement Learning (BRL). We beat the state of the art while remaining computationally faster, in some cases by two orders of magnitude. To our knowledge, this is the first optimism-free BRL algorithm to beat all previous state-of-the-art methods in tabular RL. The main novelty is the use of a candidate policy generator to produce long-term options in the belief tree, which allows us to build much sparser and deeper trees. We present results on many standard environments and empirically demonstrate its performance.
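To make the mechanism in the abstract concrete, here is a minimal, self-contained toy sketch (not the paper's code) of the general idea: instead of expanding the belief tree over every primitive action at every step, the planner branches only over a few candidate policies executed as multi-step options, so each node has few children but the lookahead is deep. The chain environment, tabular Dirichlet belief, hand-coded candidate policies, and all hyperparameters below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Toy sketch, assuming a tabular Dirichlet belief and two hand-coded
# candidate policies standing in for a learned policy generator.
rng = np.random.default_rng(0)
N_STATES, N_ACTIONS = 5, 2

def true_step(s, a):
    """Ground-truth chain MDP (unknown to the agent): action 1 tends to move right."""
    p_right = 0.9 if a == 1 else 0.1
    s_next = min(s + 1, N_STATES - 1) if rng.random() < p_right else max(s - 1, 0)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return s_next, reward

class DirichletBelief:
    """Posterior over transition probabilities, one Dirichlet per (state, action)."""
    def __init__(self):
        self.counts = np.ones((N_STATES, N_ACTIONS, N_STATES))
    def update(self, s, a, s_next):
        self.counts[s, a, s_next] += 1
    def sample_mdp(self):
        # Sample one transition distribution per (s, a) from the posterior.
        return np.array([[rng.dirichlet(self.counts[s, a])
                          for a in range(N_ACTIONS)] for s in range(N_STATES)])

def rollout(P, s, policy, horizon, gamma=0.95):
    """Simulate a candidate policy for `horizon` steps in a sampled model;
    reward 1 whenever the rightmost state is occupied after a transition."""
    ret, disc = 0.0, 1.0
    for _ in range(horizon):
        a = policy(s)
        s = rng.choice(N_STATES, p=P[s, a])
        ret += disc * (1.0 if s == N_STATES - 1 else 0.0)
        disc *= gamma
    return ret

def choose_option(belief, s, candidate_policies, horizon=15, n_samples=20):
    """Sparse expansion: one branch per candidate policy, each evaluated deeply
    under models sampled from the belief."""
    values = []
    for policy in candidate_policies:
        samples = [rollout(belief.sample_mdp(), s, policy, horizon)
                   for _ in range(n_samples)]
        values.append(np.mean(samples))
    return candidate_policies[int(np.argmax(values))]

# Two fixed candidate policies as stand-ins for a candidate policy generator.
candidates = [lambda s: 0, lambda s: 1]

belief, s = DirichletBelief(), 0
for t in range(50):
    option = choose_option(belief, s, candidates)
    for _ in range(5):                      # execute the chosen option for several steps
        a = option(s)
        s_next, r = true_step(s, a)
        belief.update(s, a, s_next)
        s = s_next
```

With only two option-branches per planning call, the effective branching factor is much smaller than branching over primitive actions at every depth, which is the intuition behind sparser and deeper trees in the abstract.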


