On Elimination Strategies for Bandit Fixed-Confidence Identification

05/22/2022
by Andrea Tirinzoni, et al.

Elimination algorithms for bandit identification, which prune the plausible correct answers sequentially until only one remains, are computationally convenient since they reduce the problem size over time. However, existing elimination strategies are often not fully adaptive (they update their sampling rule infrequently) and are not easy to extend to combinatorial settings, where the set of answers is exponentially large in the problem dimension. On the other hand, most existing fully adaptive strategies for general identification problems are computationally demanding since they repeatedly test the correctness of every answer, without ever reducing the problem size. We show that adaptive methods can be modified to use elimination in both their stopping and sampling rules, hence obtaining the best of these two worlds: the algorithms (1) remain fully adaptive, (2) suffer a sample complexity that is never worse than that of their non-elimination counterparts, and (3) provably eliminate certain wrong answers early. We confirm these benefits experimentally, where elimination significantly reduces the computational cost of adaptive methods on common tasks like best-arm identification in linear bandits.
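To illustrate what "elimination" means in this setting, below is a minimal, hypothetical Python sketch of a classical successive-elimination loop for fixed-confidence best-arm identification with Gaussian rewards: only arms still in the active set are sampled and tested, so the problem size shrinks over time. The function name, confidence width, and parameters are illustrative assumptions, not the adaptive elimination-based algorithms proposed in the paper.

```python
import numpy as np


def successive_elimination(means, delta=0.05, sigma=1.0, seed=0):
    """Toy successive-elimination loop for fixed-confidence best-arm
    identification with Gaussian rewards (illustrative sketch, not the
    paper's algorithm). An arm is eliminated once its upper confidence
    bound falls below the lower confidence bound of the empirical leader,
    so the set of plausible answers shrinks over time."""
    rng = np.random.default_rng(seed)
    K = len(means)
    active = list(range(K))
    counts = np.zeros(K)
    sums = np.zeros(K)
    total_pulls = 0

    while len(active) > 1:
        # Sampling rule: pull every arm that is still a plausible answer.
        for a in active:
            sums[a] += means[a] + sigma * rng.standard_normal()
            counts[a] += 1
            total_pulls += 1

        mu_hat = sums[active] / counts[active]
        # Anytime confidence width from a crude union bound over arms and rounds.
        width = sigma * np.sqrt(
            2.0 * np.log(4.0 * K * counts[active] ** 2 / delta) / counts[active]
        )
        leader = np.argmax(mu_hat)
        # Elimination / stopping rule: drop arms whose UCB is below the
        # leader's LCB; stop when a single arm remains.
        keep = mu_hat + width >= mu_hat[leader] - width[leader]
        active = [a for a, k in zip(active, keep) if k]

    return active[0], total_pulls


if __name__ == "__main__":
    arm, pulls = successive_elimination([0.5, 0.4, 0.3, 0.1], delta=0.05)
    print(f"identified arm {arm} after {pulls} pulls")
```

In contrast to this round-robin scheme, the fully adaptive strategies discussed in the paper choose arms based on the current estimates at every step; the paper's contribution is to combine that adaptivity with elimination in both the stopping and sampling rules.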


research
06/09/2022

Choosing Answers in ε-Best-Answer Identification for Linear Bandits

In pure-exploration problems, information is gathered sequentially to an...
research
03/16/2023

On the Existence of a Complexity in Fixed Budget Bandit Identification

In fixed budget bandit identification, an algorithm sequentially observe...
research
06/11/2020

Best-Arm Identification for Quantile Bandits with Privacy

We study the best-arm identification problem in multi-armed bandits with...
research
06/12/2021

Guaranteed Fixed-Confidence Best Arm Identification in Multi-Armed Bandit

We consider the problem of finding, through adaptive sampling, which of ...
research
06/09/2017

Monte-Carlo Tree Search by Best Arm Identification

Recent advances in bandit tools and techniques for sequential learning a...
research
06/03/2019

MaxGap Bandit: Adaptive Algorithms for Approximate Ranking

This paper studies the problem of adaptively sampling from K distributio...
research
07/13/2023

Nested Elimination: A Simple Algorithm for Best-Item Identification from Choice-Based Feedback

We study the problem of best-item identification from choice-based feedb...
