Asynchronous Parallel Empirical Variance Guided Algorithms for the Thresholding Bandit Problem

04/15/2017
by Jie Zhong, et al.

This paper considers the multi-armed thresholding bandit problem, recently proposed by Locatelli et al. [2016]: identify all arms whose expected rewards are above a predefined threshold using as few pulls (or rounds) as possible. Although the algorithm in Locatelli et al. [2016] achieves the optimal round complexity in a certain sense, several issues remain unresolved. This paper proposes an asynchronous parallel thresholding algorithm and a parameter-free variant to improve both efficiency and applicability. On one hand, the two proposed algorithms use the empirical variance to guide the pull decision at each round and significantly improve on the round complexity of the "optimal" algorithm when all arms have bounded higher-order moments; both algorithms are provably optimal. On the other hand, most bandit algorithms assume that each reward is observed immediately after the pull, or that no new decision is made before all outstanding rewards are observed. Our asynchronous parallel algorithms choose the next pull even when rewards from earlier pulls have not yet been observed, which removes this unrealistic assumption and significantly speeds up the identification process. Our theoretical analysis justifies the effectiveness and efficiency of the proposed asynchronous parallel algorithms.
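To make the variance-guided pull rule concrete, below is a minimal sketch of a thresholding bandit loop in the style of the APT index of Locatelli et al. [2016], augmented with an empirical-variance term. The function name ev_guided_thresholding, the exact form of the variance-adjusted index, and all constants are illustrative assumptions, not the paper's actual criterion.

    import math
    import random

    def ev_guided_thresholding(arms, tau, eps, budget):
        """Illustrative sketch (NOT the paper's exact algorithm): a
        thresholding bandit loop whose pull decision is guided by the
        empirical variance of each arm.

        arms   -- list of zero-argument callables, each returning one reward
        tau    -- threshold separating "above" from "below" arms
        eps    -- precision parameter around the threshold
        budget -- total number of pulls allowed
        """
        K = len(arms)
        # Pull each arm twice so the empirical variance is well defined.
        samples = [[arm(), arm()] for arm in arms]
        pulls = 2 * K

        def index(i):
            n = len(samples[i])
            mean = sum(samples[i]) / n
            var = sum((x - mean) ** 2 for x in samples[i]) / (n - 1)
            gap = abs(mean - tau) + eps          # APT-style empirical gap
            # Variance-adjusted resolution: high-variance arms look less
            # resolved and therefore get pulled more often (assumed form).
            return math.sqrt(n) * gap / (math.sqrt(var) + eps)

        while pulls < budget:
            i = min(range(K), key=index)         # pull the least-resolved arm
            samples[i].append(arms[i]())
            pulls += 1

        means = [sum(s) / len(s) for s in samples]
        return {i for i in range(K) if means[i] >= tau}

    # Toy usage: three Gaussian arms against threshold tau = 0.5.
    if __name__ == "__main__":
        rng = random.Random(0)
        arms = [lambda m=m: rng.gauss(m, 0.3) for m in (0.2, 0.45, 0.8)]
        print(ev_guided_thresholding(arms, tau=0.5, eps=0.05, budget=3000))

In the asynchronous parallel setting described in the abstract, the same index would simply be recomputed from whatever rewards have arrived so far, so the next pull can be chosen while earlier pulls are still outstanding; the bookkeeping for delayed rewards is omitted here for brevity.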


