
Bandits with many optimal arms
We consider a stochastic bandit problem with a possibly infinite number ...

Regret Lower Bound and Optimal Algorithm in Dueling Bandit Problem
We study the K-armed dueling bandit problem, a variation of the standard...

Causal Bandits with Propagating Inference
Bandit is a framework for designing sequential experiments. In each expe...

Tight Regret Bounds for Noisy Optimization of a Brownian Motion
We consider the problem of Bayesian optimization of a one-dimensional Br...

Bandit Multiclass Linear Classification for the Group Linear Separable Case
We consider the online multiclass linear classification under the bandit...

Restless dependent bandits with fading memory
We study the stochastic multi-armed bandit problem in the case when the ...

2D Fractional Cascading on Axis-aligned Planar Subdivisions
Fractional cascading is one of the influential techniques in data struct...
Understanding Bandits with Graph Feedback
The bandit problem with graph feedback, proposed in [Mannor and Shamir, NeurIPS 2011], is modeled by a directed graph G=(V,E), where V is the collection of bandit arms; once an arm is triggered, all its incident arms are observed. A fundamental question is how the structure of the graph affects the min-max regret. We propose the notions of the fractional weak domination number δ^* and the k-packing independence number, which capture the upper and lower bounds on the regret, respectively. We show that the two notions are inherently connected by aligning them with the linear program of the weakly dominating set and its dual, the fractional vertex packing set, respectively. Based on this connection, we use the strong duality theorem to prove a general regret upper bound of O((δ^* log |V|)^{1/3} T^{2/3}) and a lower bound of Ω((δ^*/α)^{1/3} T^{2/3}), where α is the integrality gap of the dual linear program. Our bounds are therefore tight up to a (log |V|)^{1/3} factor on graphs with bounded integrality gap for the vertex packing problem, including trees and graphs with bounded degree. Moreover, we show that for several special families of graphs, we can remove the (log |V|)^{1/3} factor and establish optimal regret.
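
To make the gap between the two bounds concrete, the following is a small arithmetic sketch, not code from the paper. It evaluates the two rates stated in the abstract with the constants hidden by O(·) and Ω(·) dropped, so only the ratio between them is meaningful. The function names and the sample values for δ^*, α, |V| and T are hypothetical, chosen purely for illustration.

```python
import math


def regret_upper_rate(delta_star: float, num_arms: int, horizon: int) -> float:
    """Rate (delta^* log|V|)^(1/3) * T^(2/3); hidden constants are dropped."""
    return (delta_star * math.log(num_arms)) ** (1 / 3) * horizon ** (2 / 3)


def regret_lower_rate(delta_star: float, alpha: float, horizon: int) -> float:
    """Rate (delta^* / alpha)^(1/3) * T^(2/3); hidden constants are dropped."""
    return (delta_star / alpha) ** (1 / 3) * horizon ** (2 / 3)


def gap_factor(num_arms: int, alpha: float) -> float:
    """Ratio of the two rates: (alpha * log|V|)^(1/3), independent of T and delta^*."""
    return (alpha * math.log(num_arms)) ** (1 / 3)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    delta_star, alpha, num_arms, horizon = 4.0, 2.0, 100, 10**6
    upper = regret_upper_rate(delta_star, num_arms, horizon)
    lower = regret_lower_rate(delta_star, alpha, horizon)
    print(f"upper rate: {upper:.1f}")
    print(f"lower rate: {lower:.1f}")
    print(f"ratio:      {upper / lower:.3f}")
    print(f"gap factor: {gap_factor(num_arms, alpha):.3f}")
```

The ratio of the two rates depends only on α and |V|, which is why a bounded integrality gap leaves just the (log |V|)^{1/3} factor between the upper and lower bounds.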