Thompson Sampling for 1-Dimensional Exponential Family Bandits

07/12/2013
by Nathaniel Korda, et al.

Thompson Sampling has been demonstrated in many complex bandit models; however, the theoretical guarantees available for the parametric multi-armed bandit are still limited to the Bernoulli case. Here we extend them by proving asymptotic optimality of the algorithm using the Jeffreys prior for 1-dimensional exponential family bandits. Our proof builds on previous work, but also makes extensive use of closed forms for Kullback-Leibler divergence and Fisher information (and thus Jeffreys prior) available in an exponential family. This allows us to give a finite-time exponential concentration inequality for posterior distributions on exponential families that may be of interest in its own right. Moreover, our analysis covers some distributions for which no optimistic algorithm has yet been proposed, including heavy-tailed exponential families.
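To make the setting concrete, here is a minimal sketch of Thompson Sampling with the Jeffreys prior in the simplest exponential family, the Bernoulli case, where the Jeffreys prior is Beta(1/2, 1/2) and the posterior after s successes and f failures is Beta(s + 1/2, f + 1/2). This is an illustration under stated assumptions, not the paper's general algorithm: the function name `thompson_sampling_bernoulli` and its parameters (`means`, `horizon`, `seed`) are hypothetical, and other exponential families would need their own Jeffreys posterior.

```python
import random


def thompson_sampling_bernoulli(means, horizon, seed=0):
    """Simulate Thompson Sampling on Bernoulli arms with a Jeffreys prior.

    means   -- true success probabilities of the arms (assumed, for simulation)
    horizon -- number of rounds to play
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k

    for _ in range(horizon):
        # Draw one sample per arm from its posterior.
        # Jeffreys prior for a Bernoulli parameter is Beta(1/2, 1/2),
        # so the posterior is Beta(successes + 1/2, failures + 1/2).
        samples = [
            rng.betavariate(successes[a] + 0.5, failures[a] + 0.5)
            for a in range(k)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(k), key=samples.__getitem__)

        # Observe a simulated Bernoulli reward and update the counts.
        reward = 1 if rng.random() < means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    print(thompson_sampling_bernoulli([0.2, 0.8], horizon=2000))
```

With a large gap between the arms, most pulls should concentrate on the better arm as the posteriors sharpen, which is the behaviour the asymptotic optimality result describes.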

research
06/07/2022

Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits

We study the regret of Thompson sampling (TS) algorithms for exponential...
research
05/24/2017

Boundary Crossing Probabilities for General Exponential Families

We consider parametric exponential families of dimension K on the real l...
research
01/18/2022

Bregman Deviations of Generic Exponential Families

We revisit the method of mixture technique, also known as the Laplace me...
research
12/02/2021

Indexed Minimum Empirical Divergence for Unimodal Bandits

We consider a multi-armed bandit problem specified by a set of one-dimen...
research
07/08/2019

Thompson Sampling on Symmetric α-Stable Bandits

Thompson Sampling provides an efficient technique to introduce prior kno...
research
02/03/2023

Optimality of Thompson Sampling with Noninformative Priors for Pareto Bandits

In the stochastic multi-armed bandit problem, a randomized probability m...
research
06/11/2020

Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits

We investigate stochastic combinatorial multi-armed bandit with semi-ban...
