Exploration by Optimisation in Partial Monitoring

07/12/2019
by Tor Lattimore, et al.

We provide a simple and efficient algorithm for adversarial k-action d-outcome non-degenerate locally observable partial monitoring games for which the n-round minimax regret is bounded by 3(d+1) k^{3/2} √(8n log(k)), matching the best known information-theoretic upper bounds.

research
02/01/2019

An Information-Theoretic Approach to Minimax Regret in Partial Monitoring

We prove a new minimax theorem connecting the worst-case Bayesian regret...
research
05/28/2019

Connections Between Mirror Descent, Thompson Sampling and the Information Ratio

The information-theoretic analysis by Russo and Van Roy (2014) in combin...
research
02/10/2011

Toward a Classification of Finite Partial-Monitoring Games

Partial-monitoring games constitute a mathematical framework for sequent...
research
02/22/2022

Minimax Regret for Partial Monitoring: Infinite Outcomes and Rustichini's Regret

We show that a version of the generalised information ratio of Lattimore...
research
07/29/2022

Best-of-Both-Worlds Algorithms for Partial Monitoring

This paper considers the partial monitoring problem with k-actions and d...
research
05/23/2018

Cleaning up the neighborhood: A full classification for adversarial partial monitoring

Partial monitoring is a generalization of the well-known multi-armed ban...
research
09/02/2022

A PDE approach for regret bounds under partial monitoring

In this paper, we study a learning problem in which a forecaster only ob...
