Batched Multi-Armed Bandits with Optimal Regret

10/11/2019
by Hossein Esfandiari, et al.

We present a simple and efficient algorithm for the batched stochastic multi-armed bandit problem. We prove a bound for its expected regret that improves over the best-known regret bound, for any number of batches. In particular, our algorithm achieves the optimal expected regret by using only a logarithmic number of batches.

research
02/06/2013

Bounded regret in stochastic multi-armed bandits

We study the stochastic multi-armed bandit problem when one knows the va...
research
06/17/2020

Stochastic Bandits with Linear Constraints

We study a constrained contextual linear bandit setting, where the goal ...
research
05/05/2014

Generalized Risk-Aversion in Stochastic Multi-Armed Bandits

We consider the problem of minimizing the regret in stochastic multi-arm...
research
04/03/2019

Batched Multi-armed Bandits Problem

In this paper, we study the multi-armed bandit problem in the batched se...
research
06/05/2021

Differentially Private Multi-Armed Bandits in the Shuffle Model

We give an (ε,δ)-differentially private algorithm for the multi-armed ba...
research
07/23/2013

Modeling Human Decision-making in Generalized Gaussian Multi-armed Bandits

We present a formal model of human decision-making in explore-exploit ta...
research
06/02/2015

Optimal Regret Analysis of Thompson Sampling in Stochastic Multi-armed Bandit Problem with Multiple Plays

We discuss a multiple-play multi-armed bandit (MAB) problem in which sev...
