Thompson Sampling For Stochastic Bandits with Graph Feedback

01/16/2017
by Aristide C. Y. Tossou, et al.

We present a novel extension of Thompson Sampling for stochastic sequential decision problems with graph feedback, which applies even when the graph structure itself is unknown or changing. We provide theoretical guarantees on the Bayesian regret of the algorithm, linking its performance to the underlying properties of the graph. Thompson Sampling has the advantage of being applicable without the need to construct complicated upper confidence bounds for each problem. We illustrate its performance through extensive experiments on real and simulated networks with graph feedback. More specifically, we tested our algorithms on power law, planted partition and Erdős-Rényi graphs, as well as on graphs derived from Facebook and Flixster data. These experiments all show that our algorithms clearly outperform related methods that employ upper confidence bounds, even when the latter use more information about the graph.
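To make the setting concrete, the following is a minimal sketch of Thompson Sampling with graph feedback, not the authors' exact algorithm. It assumes Bernoulli rewards, independent Beta(1, 1) priors, and a fixed, known neighbor list; the abstract states none of these, and the function name `thompson_sampling_graph` and its parameters are hypothetical. The point it illustrates is that playing one arm also reveals the rewards of its neighbors, so posteriors update faster than under plain bandit feedback.

```python
import random


def thompson_sampling_graph(means, neighbors, horizon, seed=0):
    """Illustrative Beta-Bernoulli Thompson Sampling with graph feedback.

    Assumptions (not taken from the paper): Bernoulli rewards with the given
    true `means`, Beta(1, 1) priors, and a fixed graph where `neighbors[i]`
    lists the arms whose rewards are revealed when arm i is played.
    """
    rng = random.Random(seed)
    k = len(means)
    alpha = [1.0] * k  # 1 + number of observed successes per arm
    beta = [1.0] * k   # 1 + number of observed failures per arm
    pulls = [0] * k

    for _ in range(horizon):
        # Draw one plausible mean per arm from its current posterior.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        # Play the arm whose sampled mean is largest.
        arm = max(range(k), key=lambda i: samples[i])
        pulls[arm] += 1

        # Graph feedback: observe the played arm and all of its neighbors.
        for j in sorted({arm, *neighbors[arm]}):
            reward = 1 if rng.random() < means[j] else 0
            alpha[j] += reward
            beta[j] += 1 - reward

    return pulls, alpha, beta


if __name__ == "__main__":
    # Hypothetical 3-arm path graph: 0 - 1 - 2.
    true_means = [0.2, 0.5, 0.8]
    graph = {0: [1], 1: [0, 2], 2: [1]}
    pulls, alpha, beta = thompson_sampling_graph(true_means, graph, horizon=2000)
    print("pulls per arm:", pulls)
```

Because the arm choice depends only on posterior samples, the graph is needed only to decide which extra rewards are observed after each play. No graph-dependent confidence bound has to be constructed.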


