Generalised Entropy MDPs and Minimax Regret

12/10/2014

by Emmanouil G. Androulakis, et al.

Bayesian methods suffer from the problem of how to specify prior beliefs. One interesting idea is to consider worst-case priors, which requires solving a stochastic zero-sum game. In this paper, we extend well-known results from bandit theory to discover minimax-Bayes policies, and we discuss when they are practical.
