Bayesian learning of noisy Markov decision processes

11/26/2012
by Sumeetpal S. Singh, et al.

We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking, a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how the latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. The sampler includes a parameter-expansion step, which is shown to be essential for its good convergence properties. As an illustration, the method is applied to learning a human controller.
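The abstract describes the approach only at a high level. The sketch below is not the paper's method; it is a minimal illustration of the general idea of Bayesian inverse reinforcement learning by MCMC on a toy problem: a small MDP, a softmax ("noisy-optimal") action likelihood around the Q-values, and a plain random-walk Metropolis sampler over the reward parameters. The transition matrix, the softmax likelihood, the value-iteration Q-solver, and all names are assumptions made for illustration; in particular, the paper's purpose-built sampler and its parameter-expansion step are not reproduced here.

```python
# Illustrative sketch only (assumptions throughout): Bayesian inference of a
# reward vector theta for a small MDP from state/action data, using a softmax
# noisy-controller likelihood and random-walk Metropolis. Not the paper's sampler.
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions, gamma = 4, 2, 0.9
# Random transition kernel P[s, a, s'] (assumed known here).
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

def q_values(theta, n_iter=200):
    """Q-values under reward r(s) = theta[s], computed by value iteration."""
    q = np.zeros((n_states, n_actions))
    for _ in range(n_iter):
        v = q.max(axis=1)
        q = theta[:, None] + gamma * P @ v
    return q

def log_likelihood(theta, states, actions, beta=5.0):
    """Softmax (noisy-optimal) action model around the Q-values."""
    q = beta * q_values(theta)
    logp = q - np.logaddexp.reduce(q, axis=1, keepdims=True)
    return logp[states, actions].sum()

# Simulate state/action data from a "true" reward.
theta_true = np.array([0.0, 1.0, -1.0, 0.5])
states = rng.integers(n_states, size=200)
q_true = 5.0 * q_values(theta_true)
probs = np.exp(q_true - np.logaddexp.reduce(q_true, axis=1, keepdims=True))
actions = np.array([rng.choice(n_actions, p=probs[s]) for s in states])

# Random-walk Metropolis over theta with a standard-normal prior.
theta = np.zeros(n_states)
log_post = log_likelihood(theta, states, actions) - 0.5 * theta @ theta
samples = []
for _ in range(2000):
    prop = theta + 0.1 * rng.standard_normal(n_states)
    lp = log_likelihood(prop, states, actions) - 0.5 * prop @ prop
    if np.log(rng.random()) < lp - log_post:   # Metropolis accept/reject
        theta, log_post = prop, lp
    samples.append(theta.copy())

print("posterior mean reward:", np.mean(samples[500:], axis=0))
```

In a sketch like this the posterior over rewards is only identified up to the noise level and an additive constant, which is one reason a carefully designed sampler (such as the paper's, with its parameter-expansion step) matters for mixing in practice.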


