Lifting the Information Ratio: An Information-Theoretic Analysis of Thompson Sampling for Contextual Bandits

05/27/2022
by Gergely Neu, et al.

We study the Bayesian regret of the renowned Thompson Sampling algorithm in contextual bandits with binary losses and adversarially selected contexts. We adapt the information-theoretic perspective of Russo and Van Roy [2016] to the contextual setting by introducing a new concept of information ratio based on the mutual information between the unknown model parameter and the observed loss. This allows us to bound the regret in terms of the entropy of the prior distribution through a remarkably simple proof, with no structural assumptions on the likelihood or the prior. The extension to priors with infinite entropy requires only a Lipschitz assumption on the log-likelihood. An interesting special case is that of logistic bandits with d-dimensional parameters, K actions, and Lipschitz logits, for which we provide an O(√(dKT)) regret upper bound that does not depend on the smallest slope of the sigmoid link function.
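For intuition, the Russo and Van Roy [2016] analysis bounds the Bayesian regret by roughly √(Γ̄ · H(A*) · T), where Γ̄ upper-bounds the per-round information ratio and H(A*) is the entropy of the optimal action; the lifted ratio of this paper instead measures the information that the observed loss reveals about the parameter itself, which is why the entropy of the prior over the parameter appears in the bound. To make the setting concrete, here is a minimal sketch of Thompson Sampling for the logistic contextual bandit special case mentioned above. It uses a finite grid of candidate parameters so that the Bayesian posterior update is exact; the problem sizes (d, K, T, N) and the Gaussian contexts are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative problem sizes (assumptions, not from the paper):
d, K, T, N = 5, 10, 1000, 200  # parameter dim, actions, rounds, prior atoms

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Discrete prior over N candidate parameters theta in R^d,
# represented by log posterior weights (uniform at the start).
thetas = rng.normal(size=(N, d))
log_post = np.zeros(N)

# The true parameter is drawn from the prior (Bayesian setting).
theta_star = thetas[rng.integers(N)]

for t in range(T):
    # Context for this round: one feature vector per action.
    # (Random here; the paper allows adversarially selected contexts.)
    X = rng.normal(size=(K, d))

    # Thompson Sampling: draw a parameter from the current posterior...
    w = np.exp(log_post - log_post.max())
    theta = thetas[rng.choice(N, p=w / w.sum())]

    # ...and play the action with the smallest predicted expected loss.
    a = np.argmin(sigmoid(X @ theta))

    # Observe a binary loss whose mean is the sigmoid of <x_a, theta*>.
    y = rng.binomial(1, sigmoid(X[a] @ theta_star))

    # Exact Bayesian update of the discrete posterior (Bernoulli likelihood).
    p_all = sigmoid(X[a] @ thetas.T)  # predicted loss means for all atoms
    log_post += np.where(y == 1, np.log(p_all), np.log1p(-p_all))
```

Under the paper's result, the cumulative regret of this kind of sampler scales as O(√(dKT)) in the Lipschitz-logit regime, notably without the 1/κ dependence on the smallest slope of the sigmoid that appears in earlier frequentist analyses of logistic bandits.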


