Diffusion Approximations for Thompson Sampling

05/19/2021
by Lin Fan, et al.

We study the behavior of Thompson sampling from the perspective of weak convergence. In the regime where the gaps between arm means scale as 1/√n with the time horizon n, we show that the dynamics of Thompson sampling evolve according to discrete versions of SDEs and random ODEs, and that, as n → ∞, these dynamics converge weakly to solutions of the corresponding SDEs and random ODEs. (Recently, Wager and Xu (arXiv:2101.09855) independently proposed this regime and developed similar SDE and random ODE approximations.) Our weak convergence theory covers both the classical multi-armed and linear bandit settings. It can be used, for instance, to gain insight into the characteristics of the regret distribution when there is information sharing among arms, as well as into the effects of variance estimation, model mis-specification and batched updates in bandit learning. Our theory is developed from first principles and can also be adapted to analyze other sampling-based bandit algorithms.
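To make the 1/√n gap regime concrete, the following is a minimal simulation sketch, not the authors' construction. It assumes a two-armed bandit with Gaussian rewards of known standard deviation sigma, an independent N(0, 1) prior on each arm mean, and a mean gap of delta/√n. The function name `rescaled_regret` and the parameters `delta`, `sigma` and `seed` are hypothetical and chosen only for illustration. Under these assumptions the regret divided by √n equals delta times the fraction of pulls spent on the inferior arm, so it always lies in [0, delta].

```python
import math
import random


def rescaled_regret(n, delta, seed, sigma=1.0):
    """Run Gaussian Thompson sampling for n rounds; return regret / sqrt(n).

    Assumptions (illustrative, not from the paper): two arms, known reward
    standard deviation sigma, independent N(0, 1) priors on the arm means,
    and arm means (delta / sqrt(n), 0).
    """
    rng = random.Random(seed)
    means = [delta / math.sqrt(n), 0.0]
    counts = [0, 0]       # number of pulls of each arm
    sums = [0.0, 0.0]     # sum of observed rewards for each arm
    for _ in range(n):
        draws = []
        for k in range(2):
            # Conjugate normal update: posterior precision and mean of arm k.
            precision = 1.0 + counts[k] / sigma**2
            post_mean = (sums[k] / sigma**2) / precision
            draws.append(rng.gauss(post_mean, 1.0 / math.sqrt(precision)))
        # Play the arm whose posterior draw is largest.
        arm = 0 if draws[0] >= draws[1] else 1
        reward = rng.gauss(means[arm], sigma)
        counts[arm] += 1
        sums[arm] += reward
    # Regret = (delta / sqrt(n)) * counts[1]; dividing by sqrt(n) gives:
    return delta * counts[1] / n


# Independent replications give an empirical picture of the regret distribution.
samples = [rescaled_regret(n=2000, delta=1.0, seed=s) for s in range(20)]
print(min(samples), max(samples))
```

Repeating the run over many seeds and increasing n is one way to inspect empirically how the distribution of the rescaled regret behaves in this regime.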


