An optimal algorithm for the Thresholding Bandit Problem

05/27/2016
by Andrea Locatelli et al.

We study a specific combinatorial pure exploration stochastic bandit problem in which the learner aims to identify the set of arms whose means exceed a given threshold, up to a given precision, within a fixed time horizon. We propose a parameter-free algorithm based on an original heuristic, and prove that it is optimal for this problem by deriving matching upper and lower bounds. To the best of our knowledge, this is the first non-trivial fixed-budget pure exploration setting for which optimal strategies are constructed.
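The heuristic can be sketched as follows: at each round, pull the arm whose empirical mean is hardest to classify against the threshold, as measured by an index combining the distance to the threshold and the number of pulls. This is a minimal illustrative sketch, not the paper's exact specification; the function name `apt` and the callable-arm interface are assumptions made here for illustration.

```python
import math

def apt(arms, tau, eps, budget):
    """Sketch of an anytime parameter-free thresholding heuristic.

    arms:   list of 0-argument callables, each returning a noisy reward
    tau:    threshold on the arm means
    eps:    precision parameter
    budget: total number of pulls (the fixed time horizon T)
    Returns the set of arm indices whose empirical mean is >= tau.
    """
    K = len(arms)
    counts = [0] * K
    sums = [0.0] * K
    # Initialisation: pull each arm once.
    for i in range(K):
        sums[i] += arms[i]()
        counts[i] += 1
    for _ in range(budget - K):
        # Index B_i = sqrt(T_i) * (|mu_hat_i - tau| + eps);
        # pull the minimiser, i.e. the arm that currently looks
        # hardest to classify relative to the threshold.
        i = min(range(K),
                key=lambda j: math.sqrt(counts[j]) *
                              (abs(sums[j] / counts[j] - tau) + eps))
        sums[i] += arms[i]()
        counts[i] += 1
    # Output the arms whose empirical mean clears the threshold.
    return {i for i in range(K) if sums[i] / counts[i] >= tau}
```

Note that the pulling rule uses no confidence-interval parameters and never needs the horizon in its index, which is what makes the strategy parameter-free in the sense described above.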


Related research

- 05/29/2016: Tight (Lower) Bounds for the Fixed Budget Best Arm Identification Bandit Problem. We consider the problem of best arm identification with a fixed budget T...
- 02/09/2022: Finding Optimal Arms in Non-stochastic Combinatorial Bandits with Semi-bandit Feedback and Finite Budget. We consider the combinatorial bandits problem with semi-bandit feedback ...
- 06/18/2021: Problem Dependent View on Structured Thresholding Bandit Problems. We investigate the problem dependent regime in the stochastic Thresholdi...
- 06/23/2019: Making the Cut: A Bandit-based Approach to Tiered Interviewing. Given a huge set of applicants, how should a firm allocate sequential re...
- 12/02/2020: Instance-Sensitive Algorithms for Pure Exploration in Multinomial Logit Bandit. Motivated by real-world applications such as fast fashion retailing and ...
- 10/29/2021: Collaborative Pure Exploration in Kernel Bandit. In this paper, we formulate a Collaborative Pure Exploration in Kernel B...
- 10/14/2019: Thresholding Bandit Problem with Both Duels and Pulls. The Thresholding Bandit Problem (TBP) aims to find the set of arms with ...
