Graph regret bounds for Thompson Sampling and UCB

05/23/2019
by Thodoris Lykouris, et al.

We study the stochastic multi-armed bandit problem with the graph-based feedback structure introduced by Mannor and Shamir. We analyze the performance of the two most prominent stochastic bandit algorithms, Thompson Sampling and Upper Confidence Bound (UCB), in this graph-based feedback setting. We show that both algorithms achieve regret guarantees that combine the graph structure with the gaps between the means of the arm distributions. Surprisingly, this holds even though neither algorithm explicitly uses the graph structure to select arms. Toward this result we introduce a "layering technique" that highlights the commonalities between the two algorithms.
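To make the setting concrete, here is a minimal sketch of both algorithms under graph feedback, assuming Bernoulli rewards. This is not the authors' code: the function names (`run_graph_bandit`, `ts_select`, `ucb_select`) and the Beta-posterior / UCB1 choices are illustrative. The key point from the abstract is visible in the structure: the feedback graph is used only to update statistics after a pull, never in the arm-selection rule itself.

```python
import numpy as np

def run_graph_bandit(select, means, neighbors, horizon, rng):
    """Play `horizon` rounds of a Bernoulli bandit with graph feedback.

    Pulling arm i also reveals a reward sample for every arm in
    neighbors[i] (the side-observation model of Mannor and Shamir).
    `select` chooses an arm from per-arm observation statistics only;
    it never sees the graph.
    """
    k = len(means)
    succ = np.zeros(k)  # observed successes per arm
    n = np.zeros(k)     # observation counts per arm (pulls + side observations)
    regret, best = 0.0, max(means)
    for t in range(1, horizon + 1):
        i = select(succ, n, t, rng)
        regret += best - means[i]
        # Graph feedback: observe the played arm and all of its neighbors.
        for j in {i} | set(neighbors[i]):
            succ[j] += rng.random() < means[j]
            n[j] += 1
    return regret

def ts_select(succ, n, t, rng):
    # Thompson Sampling: sample from each arm's Beta(1+succ, 1+fail)
    # posterior and play the argmax.
    return int(np.argmax(rng.beta(1 + succ, 1 + n - succ)))

def ucb_select(succ, n, t, rng):
    # UCB1 on observation counts; never-observed arms get an infinite index.
    with np.errstate(divide="ignore", invalid="ignore"):
        index = succ / n + np.sqrt(2 * np.log(t) / n)
    return int(np.argmax(np.where(n > 0, index, np.inf)))

# Example: two arms that reveal each other (full side observation).
rng = np.random.default_rng(0)
nbrs = [[1], [0]]
r_ts = run_graph_bandit(ts_select, [0.8, 0.2], nbrs, 500, rng)
r_ucb = run_graph_bandit(ucb_select, [0.8, 0.2], nbrs, 500, rng)
```

Both selection rules are the standard, graph-agnostic ones; only the posterior/count updates benefit from the extra observations, which is precisely the regime the paper analyzes.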

Related research

09/07/2018  Thompson Sampling for Combinatorial Multi-armed Bandit with Probabilistically Triggered Arms
    We analyze the regret of combinatorial Thompson sampling (CTS) for the c...

11/17/2016  Unimodal Thompson Sampling for Graph-Structured Arms
    We study, to the best of our knowledge, the first Bayesian algorithm for...

01/16/2017  Thompson Sampling For Stochastic Bandits with Graph Feedback
    We present a novel extension of Thompson Sampling for stochastic sequent...

05/19/2021  Diffusion Approximations for Thompson Sampling
    We study the behavior of Thompson sampling from the perspective of weak ...

11/08/2017  Information Directed Sampling for Stochastic Bandits with Graph Feedback
    We consider stochastic multi-armed bandit problems with graph feedback, ...

05/04/2018  Beyond the Click-Through Rate: Web Link Selection with Multi-level Feedback
    The web link selection problem is to select a small subset of web links ...

04/03/2019  Internal versus external balancing in the evaluation of graph-based number types
    Number types for exact computation are usually based on directed acyclic...
