An Information-Theoretic Approach to Minimax Regret in Partial Monitoring

02/01/2019
by Tor Lattimore, et al.

We prove a new minimax theorem connecting the worst-case Bayesian regret and the minimax regret under partial monitoring, with no assumptions on the space of signals or on the decisions of the adversary. We then generalise the information-theoretic tools of Russo and Van Roy (2016) for proving Bayesian regret bounds and combine them with the minimax theorem to derive minimax regret bounds for various partial monitoring settings. The highlight is a clean analysis of 'non-degenerate easy' and 'hard' finite partial monitoring, with new regret bounds that are independent of arbitrarily large game-dependent constants. The power of the generalised machinery is further demonstrated by proving that the minimax regret for k-armed adversarial bandits over n rounds is at most √(2kn), improving on existing results by a factor of 2. Finally, we provide a simple analysis of the cops and robbers game, which also improves the best known constants.
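As a rough numerical illustration of the bandit bound quoted above, the following sketch evaluates √(2kn) for a given number of arms k and horizon n. The function names are hypothetical, and the "previous" bound is only an assumption obtained by reading "a factor of 2" literally as 2·√(2kn); the abstract does not state the earlier bound explicitly.

```python
import math


def new_bound(k: int, n: int) -> float:
    """Upper bound sqrt(2kn) on the minimax regret, as quoted in the abstract."""
    return math.sqrt(2 * k * n)


def previous_bound(k: int, n: int) -> float:
    """Assumed earlier bound: twice the new one (literal reading of 'factor of 2')."""
    return 2 * new_bound(k, n)


if __name__ == "__main__":
    k, n = 10, 10_000  # illustrative values, not taken from the paper
    print(f"new bound:      {new_bound(k, n):.1f}")
    print(f"assumed former: {previous_bound(k, n):.1f}")
```

Both quantities grow as the square root of kn, so the improvement described is in the leading constant only, not in the rate.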

