A Minimum Relative Entropy Controller for Undiscounted Markov Decision Processes

02/07/2010
by Pedro A. Ortega, et al.

Adaptive control problems are notoriously difficult to solve even in the presence of plant-specific controllers. One way to bypass the intractable computation of the optimal policy is to restate the adaptive control problem as the minimization of the relative entropy of a controller that ignores the true plant dynamics from an informed controller. The solution is given by the Bayesian control rule, a set of equations characterizing a stochastic adaptive controller for the class of possible plant dynamics. Here, the Bayesian control rule is applied to derive BCR-MDP, a controller to solve undiscounted Markov decision processes with finite state and action spaces and unknown dynamics. In particular, we derive a non-parametric conjugate prior distribution over the policy space that encapsulates the agent's whole relevant history, and we present a Gibbs sampler to draw random policies from this distribution. Preliminary results show that BCR-MDP successfully avoids sub-optimal limit cycles due to its built-in mechanism to balance exploration versus exploitation.
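The sampling idea behind a stochastic adaptive controller of this kind can be illustrated on a much simpler problem than the one the abstract addresses. The following is a minimal sketch, not the BCR-MDP algorithm: it assumes a finite set of candidate plants (hypothetical two-armed Bernoulli bandits) instead of an MDP with a non-parametric prior and a Gibbs sampler. At each step it samples one candidate plant from the posterior, acts as that plant's tailored controller would, and updates the posterior using only the likelihood of the observed reward. The names `bcr_step`, `run`, `hypotheses`, and `env_probs` are illustrative assumptions.

```python
import random


def bcr_step(posterior, hypotheses, env_probs, rng):
    """One act-observe-update step over a finite set of candidate plants.

    hypotheses: list of tuples; hypotheses[m][a] is the reward probability
                that candidate plant m assigns to action a (strictly in (0, 1)).
    env_probs:  reward probabilities of the true plant (unknown to the agent).
    """
    # Sample a candidate plant in proportion to its posterior weight.
    m = rng.choices(range(len(hypotheses)), weights=posterior)[0]
    # Act as the controller tailored to that plant would: pick its best arm.
    probs = hypotheses[m]
    action = max(range(len(probs)), key=lambda a: probs[a])
    # The true plant produces a Bernoulli reward.
    reward = 1 if rng.random() < env_probs[action] else 0
    # Update the posterior with the observation likelihood only; the action
    # is treated as given and contributes no evidence about the plant.
    likelihoods = [h[action] if reward else 1.0 - h[action] for h in hypotheses]
    unnormalized = [p * l for p, l in zip(posterior, likelihoods)]
    total = sum(unnormalized)
    return action, reward, [u / total for u in unnormalized]


def run(hypotheses, env_probs, steps, seed=0):
    """Run the loop from a uniform prior and return the final posterior."""
    rng = random.Random(seed)
    posterior = [1.0 / len(hypotheses)] * len(hypotheses)
    for _ in range(steps):
        _, _, posterior = bcr_step(posterior, hypotheses, env_probs, rng)
    return posterior
```

Calling `run([(0.9, 0.1), (0.1, 0.9)], env_probs=(0.9, 0.1), steps=100)` concentrates the posterior on the first candidate. Because actions are drawn through the posterior rather than chosen greedily, uncertain candidates keep being tried until the evidence rules them out, which is the kind of exploration/exploitation balance the abstract refers to.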


05/03/2017

Answer Set Programming for Non-Stationary Markov Decision Processes

Non-stationary domains, where unforeseen changes happen, present a chall...
02/16/2010

Convergence of Bayesian Control Rule

Recently, new approaches to adaptive control have sought to reformulate ...
10/08/2020

Adaptive Shielding under Uncertainty

This paper targets control problems that exhibit specific safety and per...
12/31/2021

Robust Entropy-regularized Markov Decision Processes

Stochastic and soft optimal policies resulting from entropy-regularized ...
12/31/2020

Robust Asymmetric Learning in POMDPs

Policies for partially observed Markov decision processes can be efficie...
10/25/2018

Stochastic Control with Stale Information--Part I: Fully Observable Systems

In this study, we adopt age of information as a measure of the staleness...
12/07/2017

Remarks on Bayesian Control Charts

There is a considerable amount of ongoing research on the use of Bayesia...