Generalised Entropy MDPs and Minimax Regret

12/10/2014
by Emmanouil G. Androulakis, et al.

Bayesian methods face the problem of how to specify prior beliefs. One interesting idea is to consider worst-case priors, which requires solving a stochastic zero-sum game. In this paper, we extend well-known results from bandit theory to derive minimax-Bayes policies and discuss when they are practical.
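The worst-case-prior idea can be illustrated on the simplest possible decision problem. In a finite zero-sum game between nature (who picks the prior) and the decision maker (who picks the Bayes-optimal action), the least favourable prior is the one that maximises the Bayes risk. The sketch below, with an entirely illustrative loss matrix (the numbers and the grid-search approach are assumptions, not the paper's method), finds that prior for a two-state problem:

```python
import numpy as np

# Illustrative loss matrix L[a, s]: rows are actions, columns are the
# two states of nature. The numbers are made up for demonstration.
L = np.array([[0.0, 1.0],
              [1.0, 0.0],
              [0.4, 0.5]])

def bayes_risk(p):
    """Bayes risk of the prior (p, 1-p): expected loss of the best action."""
    prior = np.array([p, 1.0 - p])
    return (L @ prior).min()  # decision maker best-responds to the prior

# Nature's worst-case (least favourable) prior maximises the Bayes risk;
# its value is the value of the zero-sum game. A fine grid suffices in 1-D.
grid = np.linspace(0.0, 1.0, 10001)
risks = np.array([bayes_risk(p) for p in grid])
p_star = grid[risks.argmax()]
print(p_star, risks.max())  # ~0.4545, ~0.4545 (= 5/11 for this matrix)
```

For this matrix the maximin sits where actions 1 and 2 have equal expected loss, p = 5/11; in larger games the same computation becomes a linear program rather than a grid search.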


Related research

- On Worst-case Regret of Linear Thompson Sampling (06/11/2020): In this paper, we consider the worst-case regret of Linear Thompson Samp...
- An Information-Theoretic Approach to Minimax Regret in Partial Monitoring (02/01/2019): We prove a new minimax theorem connecting the worst-case Bayesian regret...
- Expected Worst Case Regret via Stochastic Sequential Covering (09/09/2022): We study the problem of sequential prediction and online minimax regret ...
- Leveraging vague prior information in general models via iteratively constructed Gamma-minimax estimators (12/10/2020): Gamma-minimax estimation is an approach to incorporate prior information...
- Learning Minimax Estimators via Online Learning (06/19/2020): We consider the problem of designing minimax estimators for estimating t...
- Teaching decision theory proof strategies using a crowdsourcing problem (05/19/2019): Teaching how to derive minimax decision rules can be challenging because...
- Gaussian One-Armed Bandit and Optimization of Batch Data Processing (01/25/2019): We consider the minimax setup for Gaussian one-armed bandit problem, i.e...
