Pure Exploration in Multi-armed Bandits with Graph Side Information

08/02/2021
by Parth K. Thaker, et al.

We study pure exploration in multi-armed bandits with graph side information. In particular, we consider the best-arm (and near-best-arm) identification problem in the fixed-confidence setting under the assumption that the arm rewards are smooth with respect to a given arbitrary graph. This captures a range of real-world pure-exploration scenarios in which one often has information about the similarity of the options or actions under consideration. We propose a novel algorithm, GRUB (GRaph based UcB), for this problem and provide a theoretical characterization of its performance that elicits the benefit of the graph side information. We complement our theory with experimental results showing that capitalizing on available graph side information yields significant improvements over pure-exploration methods that are unable to use this information.
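To make the smoothness assumption concrete, here is a minimal sketch of how graph side information can sharpen mean estimates: if arm rewards vary slowly over a graph, one can estimate all arm means jointly via Laplacian-regularized least squares, pooling samples across connected arms. This is an illustrative assumption-laden sketch, not the GRUB algorithm itself; the function names, the regularization weight `rho`, and the toy graph are all hypothetical.

```python
import numpy as np

def laplacian(adj):
    """Combinatorial graph Laplacian L = D - A from an adjacency matrix."""
    return np.diag(adj.sum(axis=1)) - adj

def estimate_means(counts, reward_sums, adj, rho=1.0):
    """Laplacian-regularized estimate of arm means (illustrative sketch).

    Solves  min_mu  sum_i n_i * (mu_i - xbar_i)^2  +  rho * mu^T L mu,
    whose normal equations are  (N + rho * L) mu = reward_sums,
    with N = diag(counts). The mu^T L mu penalty encourages estimates
    that are smooth over the graph, letting arms borrow strength from
    their neighbours -- even arms that were never pulled.
    """
    L = laplacian(adj)
    N = np.diag(counts.astype(float))
    return np.linalg.solve(N + rho * L, reward_sums)

# Toy example: 3 arms on a path graph 0 -- 1 -- 2; arm 2 is never
# pulled, yet the graph still assigns it a finite estimate.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
counts = np.array([10, 10, 0])
sums = np.array([10.0, 5.0, 0.0])   # empirical means: 1.0, 0.5, (none)
mu = estimate_means(counts, sums, adj, rho=2.0)
```

Note that without the graph term the system would be singular for the unpulled arm; with it, connectivity alone makes the problem well-posed, which is the kind of benefit the abstract attributes to exploiting graph side information.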

