Understanding Bandits with Graph Feedback

05/29/2021
by Houshuang Chen, et al.

The bandit problem with graph feedback, proposed in [Mannor and Shamir, NeurIPS 2011], is modeled by a directed graph G=(V,E), where V is the collection of bandit arms; once an arm is triggered, all its incident arms are observed. A fundamental question is how the structure of the graph affects the min-max regret. We propose the notions of the fractional weak domination number δ^* and the k-packing independence number, which capture the upper bound and the lower bound for the regret, respectively. We show that the two notions are inherently connected by aligning them with the linear program of the weakly dominating set and its dual, the fractional vertex packing set, respectively. Based on this connection, we use the strong duality theorem to prove a general regret upper bound O((δ^* log |V|)^{1/3} T^{2/3}) and a lower bound Ω((δ^*/α)^{1/3} T^{2/3}), where α is the integrality gap of the dual linear program. Therefore, our bounds are tight up to a (log |V|)^{1/3} factor on graphs with bounded integrality gap for the vertex packing problem, including trees and graphs with bounded degree. Moreover, we show that for several special families of graphs, we can remove the (log |V|)^{1/3} factor and establish optimal regret.
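To make the gap between the two bounds concrete, the following minimal Python sketch evaluates both expressions with the hidden constants of the O(·) and Ω(·) notation dropped. The function names and the sample values are illustrative assumptions, not taken from the paper; only the two formulas come from the abstract.

```python
import math


def regret_upper_bound(delta_star: float, num_arms: int, horizon: int) -> float:
    """Upper-bound expression (delta^* log|V|)^(1/3) * T^(2/3).

    The constant hidden by the O(.) notation is dropped (assumption: 1).
    """
    return (delta_star * math.log(num_arms)) ** (1 / 3) * horizon ** (2 / 3)


def regret_lower_bound(delta_star: float, integrality_gap: float, horizon: int) -> float:
    """Lower-bound expression (delta^* / alpha)^(1/3) * T^(2/3).

    The constant hidden by the Omega(.) notation is dropped (assumption: 1).
    """
    return (delta_star / integrality_gap) ** (1 / 3) * horizon ** (2 / 3)


def bound_gap(num_arms: int, integrality_gap: float) -> float:
    """Ratio of the two expressions: (alpha * log|V|)^(1/3)."""
    return (integrality_gap * math.log(num_arms)) ** (1 / 3)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    delta_star, num_arms, horizon, alpha = 3.0, 100, 10_000, 2.0
    upper = regret_upper_bound(delta_star, num_arms, horizon)
    lower = regret_lower_bound(delta_star, alpha, horizon)
    print(f"upper expression: {upper:.1f}")
    print(f"lower expression: {lower:.1f}")
    print(f"ratio:            {upper / lower:.3f}")
    print(f"(alpha log|V|)^(1/3): {bound_gap(num_arms, alpha):.3f}")
```

The ratio of the two expressions depends on neither T nor δ^*; it equals (α log |V|)^{1/3}, which is why a bounded integrality gap α leaves only the (log |V|)^{1/3} factor between the bounds.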


