Planning to the Information Horizon of BAMDPs via Epistemic State Abstraction

10/30/2022
by Dilip Arumugam et al.

The Bayes-Adaptive Markov Decision Process (BAMDP) formalism pursues the Bayes-optimal solution to the exploration-exploitation trade-off in reinforcement learning. As the computation of exact solutions to Bayesian reinforcement-learning problems is intractable, much of the literature has focused on developing suitable approximation algorithms. In this work, before diving into algorithm design, we first define, under mild structural assumptions, a complexity measure for BAMDP planning. As efficient exploration in BAMDPs hinges upon the judicious acquisition of information, our complexity measure highlights the worst-case difficulty of gathering information and exhausting epistemic uncertainty. To illustrate its significance, we establish a computationally intractable, exact planning algorithm that leverages this measure to achieve more efficient planning. We then conclude by introducing a specific form of state abstraction with the potential to reduce BAMDP complexity and to give rise to a computationally tractable, approximate planning algorithm.
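For context, the BAMDP casts Bayesian reinforcement learning as planning over hyper-states that pair the environment state with the agent's interaction history (equivalently, the posterior beliefs that history induces). A minimal sketch of the standard BAMDP Bellman optimality equation, in illustrative notation rather than the paper's own, is:

V^{*}(s, h) = \max_{a \in \mathcal{A}} \Big( \mathbb{E}\big[ R(s, a) \mid h \big] + \gamma \sum_{s'} \mathbb{E}\big[ \mathcal{T}(s' \mid s, a) \mid h \big] \, V^{*}\big(s', \langle h, a, s' \rangle \big) \Big)

Here h is the history observed so far, both expectations are taken under the posterior over MDPs induced by h, and ⟨h, a, s'⟩ denotes that history extended by the new transition. Because the hyper-state grows with every step, exact planning is intractable; the difficulty of gathering enough information for the posterior to concentrate is what the paper's worst-case complexity measure is described as capturing, and the proposed epistemic state abstraction aims to aggregate such hyper-states to make approximate planning tractable.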

Related research

POMDP-lite for Robust Robot Planning under Uncertainty (02/16/2016)
The partially observable Markov decision process (POMDP) provides a prin...

Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search (05/14/2012)
Bayesian model-based reinforcement learning is a formally elegant approa...

Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning (11/13/2019)
We present an algorithm, HOMER, for exploration and reinforcement learni...

Planning with Abstract Learned Models While Learning Transferable Subtasks (12/16/2019)
We introduce an algorithm for model-based hierarchical reinforcement lea...

Langevin DQN (02/17/2020)
Algorithms that tackle deep exploration – an important challenge in rein...

Better Optimism By Bayes: Adaptive Planning with Rich Models (02/09/2014)
The computational costs of inference and planning have confined Bayesian...

Receding Horizon Curiosity (10/08/2019)
Sample-efficient exploration is crucial not only for discovering rewardi...
