Randomized Exploration in Generalized Linear Bandits

06/21/2019
by Branislav Kveton, et al.

We study two randomized algorithms for generalized linear bandits, GLM-TSL and GLM-FPL. GLM-TSL samples a generalized linear model (GLM) from the Laplace approximation to the posterior distribution. GLM-FPL, a new algorithm proposed in this work, fits a GLM to a randomly perturbed history of past rewards. We prove a Õ(d√n + d²) upper bound on the n-round regret of GLM-TSL, where d is the number of features. This is the first regret bound for a Thompson sampling-like algorithm in GLM bandits where the leading term is Õ(d√n). We apply both GLM-TSL and GLM-FPL to logistic and neural network bandits, and show that they perform well empirically. In more complex models, GLM-FPL is significantly faster. Our results showcase the role of randomization, beyond posterior sampling, in exploration.
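The abstract describes both algorithms only at a high level. As a minimal sketch, the Python snippet below shows what one round of each could look like in a logistic bandit; it is an illustration based on this description, not the paper's pseudocode. The Newton solver, the regularizer `lam`, the scale parameter `a`, and the use of Gaussian reward perturbations in GLM-FPL are assumptions made here for concreteness.

```python
# Sketch (not the authors' implementation) of one round of GLM-TSL and GLM-FPL
# for a logistic bandit. The regularizer `lam`, the scale `a`, and the Gaussian
# perturbations are illustrative assumptions, not values from the paper.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_map(X, y, lam=1.0, n_iter=25):
    """L2-regularized logistic regression fit via Newton's method.
    Returns the estimate and the Hessian of the negative log posterior at it."""
    d = X.shape[1]
    theta = np.zeros(d)
    for _ in range(n_iter):
        p = sigmoid(X @ theta)
        grad = X.T @ (p - y) + lam * theta
        H = X.T @ (X * (p * (1 - p))[:, None]) + lam * np.eye(d)
        theta -= np.linalg.solve(H, grad)
    p = sigmoid(X @ theta)
    H = X.T @ (X * (p * (1 - p))[:, None]) + lam * np.eye(d)
    return theta, H

def glm_tsl_choose(arms, X, y, a=1.0, rng=None):
    """GLM-TSL: sample a GLM from the Laplace approximation to the posterior,
    then act greedily with respect to the sampled model."""
    rng = np.random.default_rng() if rng is None else rng
    theta_hat, H = fit_logistic_map(X, y)
    cov = a**2 * np.linalg.inv(H)              # Laplace-approximate posterior covariance
    theta_tilde = rng.multivariate_normal(theta_hat, cov)
    return int(np.argmax(arms @ theta_tilde))

def glm_fpl_choose(arms, X, y, a=1.0, rng=None):
    """GLM-FPL: refit the GLM to a randomly perturbed history of past rewards,
    then act greedily with respect to the perturbed fit."""
    rng = np.random.default_rng() if rng is None else rng
    y_perturbed = y + a * rng.standard_normal(len(y))   # perturb observed rewards
    theta_tilde, _ = fit_logistic_map(X, y_perturbed)
    return int(np.argmax(arms @ theta_tilde))

# Toy usage: 5 arms in R^3 and a short simulated history of Bernoulli rewards.
rng = np.random.default_rng(0)
arms = rng.standard_normal((5, 3))
X = arms[rng.integers(0, 5, size=50)]                           # past contexts
y = rng.binomial(1, sigmoid(X @ np.array([1.0, -0.5, 0.3])))    # past rewards
print(glm_tsl_choose(arms, X, y, rng=rng), glm_fpl_choose(arms, X, y, rng=rng))
```

The contrast the abstract draws shows up directly in the sketch: GLM-TSL needs the Hessian of the fitted model to form and sample from the Laplace approximation, while GLM-FPL only needs to refit the GLM to perturbed rewards, which is why it scales more easily to complex models such as neural networks.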

Related research

03/21/2019 · Perturbed-History Exploration in Stochastic Linear Bandits
We propose a new online algorithm for minimizing the cumulative regret i...

02/16/2021 · The Randomized Elliptical Potential Lemma with an Application to Linear Thompson Sampling
In this note, we introduce a randomized version of the well-known ellipt...

06/15/2021 · Thompson Sampling for Unimodal Bandits
In this paper, we propose a Thompson Sampling algorithm for unimodal ban...

06/22/2022 · Langevin Monte Carlo for Contextual Bandits
We study the efficiency of Thompson sampling for contextual bandits. Exi...

02/25/2019 · Improved Algorithm on Online Clustering of Bandits
We generalize the setting of online clustering of bandits by allowing no...

03/09/2019 · Linear Bandits with Feature Feedback
This paper explores a new form of the linear bandit problem in which the...

11/20/2016 · Linear Thompson Sampling Revisited
We derive an alternative proof for the regret of Thompson sampling in...
