Mirror Descent and the Information Ratio

09/25/2020
by Tor Lattimore, et al.

We establish a connection between the stability of mirror descent and the information ratio by Russo and Van Roy [2014]. Our analysis shows that mirror descent with suitable loss estimators and exploratory distributions enjoys the same bound on the adversarial regret as the bounds on the Bayesian regret for information-directed sampling. Along the way, we develop the theory for information-directed sampling and provide an efficient algorithm for adversarial bandits for which the regret upper bound matches exactly the best known information-theoretic upper bound.
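
For orientation, the kind of algorithm the abstract refers to can be illustrated with Exp3, a textbook instance of mirror descent for K-armed adversarial bandits: the regularizer is the negative entropy and the loss estimator is importance-weighted. The sketch below is not the algorithm developed in the paper. The function name `exp3`, the loss-table input format, and the learning-rate tuning in the usage note are assumptions made for illustration only.

```python
import math
import random


def exp3(losses, learning_rate, seed=0):
    """Illustrative Exp3 run on a fixed loss table (hypothetical helper).

    losses[t][a] is the loss in [0, 1] of arm a in round t.
    Returns the total loss incurred and the final sampling distribution.
    """
    rng = random.Random(seed)
    k = len(losses[0])
    cum_est = [0.0] * k  # cumulative importance-weighted loss estimates

    def distribution():
        # Mirror descent with negative entropy has the closed form
        # p_a proportional to exp(-eta * estimated cumulative loss of a).
        # Subtracting the minimum keeps the exponentials numerically stable.
        base = min(cum_est)
        weights = [math.exp(-learning_rate * (est - base)) for est in cum_est]
        norm = sum(weights)
        return [w / norm for w in weights]

    total = 0.0
    for round_losses in losses:
        probs = distribution()
        arm = rng.choices(range(k), weights=probs)[0]
        loss = round_losses[arm]  # bandit feedback: only this loss is seen
        total += loss
        # Importance weighting makes the estimate unbiased for every arm.
        cum_est[arm] += loss / probs[arm]
    return total, distribution()


if __name__ == "__main__":
    n, k = 2000, 3
    table = [[0.0, 1.0, 1.0] for _ in range(n)]  # arm 0 is always best
    eta = math.sqrt(2 * math.log(k) / (n * k))  # a common Exp3 tuning
    total, probs = exp3(table, eta, seed=1)
    print(total, probs)
```

In this toy run, arm 0 never incurs loss, so its estimated cumulative loss stays at zero and it always keeps the largest sampling probability. The information-theoretic side of the connection (the information ratio and information-directed sampling) is not represented in this sketch.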


05/28/2019

Connections Between Mirror Descent, Thompson Sampling and the Information Ratio

The information-theoretic analysis by Russo and Van Roy (2014) in combin...
05/27/2022

Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits

We study the Bayesian regret of the renowned Thompson Sampling algorithm...
11/11/2020

Asymptotically Optimal Information-Directed Sampling

We introduce a computationally efficient algorithm for finite stochastic...
06/09/2022

Regret Bounds for Information-Directed Reinforcement Learning

Information-directed sampling (IDS) has revealed its potential as a data...
05/29/2021

Information Directed Sampling for Sparse Linear Bandits

Stochastic sparse linear bandits offer a practical model for high-dimens...
01/29/2018

Information Directed Sampling and Bandits with Heteroscedastic Noise

In the stochastic bandit problem, the goal is to maximize an unknown fun...
09/30/2020

An Upper Bound for Wiretap Multi-way Channels

A general model for wiretap multi-way channels is introduced that includ...