
Asymptotic Convergence of Thompson Sampling

11/08/2020
by Cem Kalkanli et al.

Thompson sampling has been shown to be an effective policy across a variety of online learning tasks. Many works have analyzed the finite-time performance of Thompson sampling and proved that it achieves sub-linear regret under a broad range of probabilistic settings. However, its asymptotic behavior remains largely underexplored. In this paper, we prove an asymptotic convergence result for Thompson sampling under the assumption of sub-linear Bayesian regret, and show that the actions of a Thompson sampling agent provide a strongly consistent estimator of the optimal action. Our results rely on the martingale structure inherent in Thompson sampling.
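To make the consistency claim concrete, the following is a minimal sketch (not from the paper) of Thompson sampling in a two-armed Bernoulli bandit with Beta(1, 1) priors; the arm means and horizon are hypothetical. The paper's result says the agent's actions form a strongly consistent estimator of the optimal action, so the empirical frequency of the optimal arm should approach one as the horizon grows:

```python
import random

def thompson_sampling(true_means, horizon, seed=0):
    """Bernoulli Thompson sampling with independent Beta(1, 1) priors.

    Returns the sequence of arms chosen over the horizon.
    """
    rng = random.Random(seed)
    k = len(true_means)
    alphas = [1] * k  # Beta posterior successes + 1
    betas = [1] * k   # Beta posterior failures + 1
    actions = []
    for _ in range(horizon):
        # Draw one posterior sample per arm and play the argmax
        samples = [rng.betavariate(alphas[i], betas[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        # Observe a Bernoulli reward and update that arm's posterior
        reward = 1 if rng.random() < true_means[arm] else 0
        alphas[arm] += reward
        betas[arm] += 1 - reward
        actions.append(arm)
    return actions

# Hypothetical instance: arm 1 (mean 0.7) is optimal
actions = thompson_sampling([0.3, 0.7], horizon=5000)
late = actions[-1000:]
frac_optimal = sum(1 for a in late if a == 1) / len(late)
```

In a typical run, `frac_optimal` is close to 1, illustrating (but of course not proving) the convergence of the action frequencies to the optimal action.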

