Thresholding Bandit with Optimal Aggregate Regret

05/27/2019
by Chao Tao, et al.

We consider the thresholding bandit problem, whose goal is to identify the arms whose mean rewards are above a given threshold θ, using a fixed budget of T trials. We introduce LSA, a new, simple, anytime algorithm that aims to minimize the aggregate regret (that is, the expected number of misclassified arms). We prove that our algorithm is instance-wise asymptotically optimal. We also provide comprehensive empirical results demonstrating the algorithm's superior performance over existing algorithms in a variety of scenarios.
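
To make the problem setup and the aggregate-regret objective concrete, here is a minimal sketch. It does not implement LSA, whose details are not given in this abstract. It uses a plain round-robin (uniform sampling) baseline with Bernoulli rewards, both of which are assumptions made for illustration, and all function names are hypothetical. An arm whose mean equals θ exactly is treated as above the threshold, which is also an assumption.

```python
import random


def aggregate_regret(means, theta, predicted_above):
    """Count arms placed on the wrong side of the threshold (one realization)."""
    return sum((mu >= theta) != pred for mu, pred in zip(means, predicted_above))


def uniform_sampling_classify(means, theta, budget, rng):
    """Baseline (NOT LSA): pull arms round-robin, then threshold empirical means.

    Rewards are assumed to be Bernoulli with the given means.
    """
    k = len(means)
    pulls = [0] * k
    totals = [0.0] * k
    for t in range(budget):
        arm = t % k
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    # Arms that were never pulled are classified as below the threshold.
    return [pulls[i] > 0 and totals[i] / pulls[i] >= theta for i in range(k)]


def estimate_expected_regret(means, theta, budget, runs, seed=0):
    """Monte Carlo estimate of the expected number of misclassified arms."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        predicted = uniform_sampling_classify(means, theta, budget, rng)
        total += aggregate_regret(means, theta, predicted)
    return total / runs


if __name__ == "__main__":
    # Hypothetical instance: five arms, threshold 0.5, budget of 500 pulls.
    arm_means = [0.1, 0.35, 0.45, 0.55, 0.9]
    print(estimate_expected_regret(arm_means, theta=0.5, budget=500, runs=200))
```

Averaging the misclassification count over many independent runs approximates the expectation that the aggregate regret refers to. An adaptive algorithm would aim to spend the same budget T less evenly than this baseline does.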
