Perturbed-History Exploration in Stochastic Linear Bandits

03/21/2019
by   Branislav Kveton, et al.

We propose a new online algorithm for minimizing the cumulative regret in stochastic linear bandits. The key idea is to build a perturbed history, which mixes the history of observed rewards with a pseudo-history of randomly generated i.i.d. pseudo-rewards. Our algorithm, perturbed-history exploration in a linear bandit (LinPHE), estimates a linear model from its perturbed history and pulls the arm with the highest value under that model. We prove a Õ(d√n) gap-free bound on the expected n-round regret of LinPHE, where d is the number of features. Our analysis relies on novel concentration and anti-concentration bounds on the weighted sum of Bernoulli random variables. To show the generality of our design, we extend LinPHE to a logistic reward model. We evaluate both algorithms empirically and show that they are practical.
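The round structure described above (perturb the reward history, refit the linear model, act greedily) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's reference implementation: the function names (`linphe`, `pull`), the round-robin warm-up, and the ridge regularizer are choices of this sketch, and the pseudo-rewards are taken to be i.i.d. Bernoulli with mean 1/2, matching the Bernoulli perturbations referenced in the analysis.

```python
import numpy as np

def linphe(arms, pull, n_rounds, a=1, reg=1.0, seed=0):
    """Minimal sketch of perturbed-history exploration (LinPHE-style).

    arms:     (K, d) array of arm feature vectors.
    pull:     function arm_index -> stochastic reward in [0, 1].
    a:        number of Bernoulli(1/2) pseudo-rewards mixed in per observation.
    reg:      ridge-regularization strength (a choice of this sketch).
    Returns the total reward collected over n_rounds.
    """
    rng = np.random.default_rng(seed)
    K, d = arms.shape
    hist_x, hist_r = [], []          # observed (feature, reward) history
    total = 0.0
    for t in range(n_rounds):
        if t < K:
            i = t                    # warm-up: pull each arm once
                                     # (an assumption of this sketch)
        else:
            X = np.asarray(hist_x)   # (t, d)
            r = np.asarray(hist_r)   # (t,)
            # Perturb the history: each observation receives `a` i.i.d.
            # Bernoulli(1/2) pseudo-rewards added to its observed reward.
            z = rng.binomial(a, 0.5, size=r.shape).astype(float)
            # Regularized least squares on the perturbed history; the
            # (a + 1) factor reflects each feature vector appearing once
            # for the real reward and `a` times for the pseudo-rewards.
            G = (a + 1) * X.T @ X + reg * np.eye(d)
            theta = np.linalg.solve(G, X.T @ (r + z))
            i = int(np.argmax(arms @ theta))   # greedy w.r.t. the fit
        rew = pull(i)
        hist_x.append(arms[i])
        hist_r.append(rew)
        total += rew
    return total
```

Because the pseudo-rewards are resampled every round, the fitted model is randomly optimistic often enough to keep exploring, without maintaining an explicit confidence set.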


Related research

- Perturbed-History Exploration in Stochastic Multi-Armed Bandits (02/26/2019)
- Randomized Exploration in Generalized Linear Bandits (06/21/2019)
- Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits (11/13/2018)
- Residual Bootstrap Exploration for Stochastic Linear Bandit (02/23/2022)
- CORe: Capitalizing On Rewards in Bandit Exploration (03/07/2021)
- Semiparametric Contextual Bandits (03/12/2018)
- Allocating Divisible Resources on Arms with Unknown and Random Rewards (06/28/2023)
