Better Optimism By Bayes: Adaptive Planning with Rich Models

02/09/2014
by Arthur Guez, et al.

The computational costs of inference and planning have confined Bayesian model-based reinforcement learning to one of two dismal fates: powerful Bayes-adaptive planning but only for simplistic models, or powerful, Bayesian non-parametric models but using simple, myopic planning strategies such as Thompson sampling. We ask whether it is feasible and truly beneficial to combine rich probabilistic models with a closer approximation to fully Bayesian planning. First, we use a collection of counterexamples to show formal problems with the over-optimism inherent in Thompson sampling. Then we leverage state-of-the-art techniques in efficient Bayes-adaptive planning and non-parametric Bayesian methods to perform qualitatively better than both existing conventional algorithms and Thompson sampling on two contextual bandit-like problems.
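To make the contrast concrete, here is a minimal sketch of Thompson sampling on a toy Bernoulli bandit. This setting is an illustrative assumption and is not one of the paper's problems; the function name `thompson_sampling`, the uniform Beta(1, 1) priors, and the arm probabilities are all hypothetical. At each step the agent draws one model from the posterior and acts greedily with respect to that single draw, which is the sense in which the strategy is myopic: it never plans for the value of the information an action might reveal.

```python
import random


def thompson_sampling(true_probs, n_steps, seed=0):
    """Run Thompson sampling on a Bernoulli bandit (illustrative toy setting).

    true_probs: hypothetical success probability of each arm, hidden from the agent.
    Returns the number of pulls per arm and the total reward collected.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Assumed prior: Beta(1, 1), i.e. uniform, on every arm's success probability.
    alpha = [1] * n_arms
    beta = [1] * n_arms
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(n_steps):
        # Draw one plausible success probability per arm from the current posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to that single sampled model.
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate Beta-Bernoulli posterior update for the pulled arm.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return pulls, total_reward


if __name__ == "__main__":
    pulls, total = thompson_sampling([0.1, 0.9], n_steps=2000)
    print("pulls per arm:", pulls)
    print("total reward:", total)
```

A Bayes-adaptive planner would instead choose actions by looking ahead over how the posterior itself evolves, which is the closer approximation to fully Bayesian planning that the abstract refers to.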
