Non-Asymptotic Analysis of a UCB-based Top Two Algorithm

10/11/2022
by Marc Jourdan, et al.

A Top Two sampling rule for bandit identification selects the next arm to sample from two candidate arms, a leader and a challenger. Owing to their simplicity and good empirical performance, Top Two methods have received increased attention in recent years. For fixed-confidence best arm identification, theoretical guarantees for Top Two methods have so far been obtained only in the asymptotic regime, when the error level vanishes. We derive the first non-asymptotic upper bound on the expected sample complexity of a Top Two algorithm, which holds for any error level. Our analysis highlights properties that are sufficient for a regret minimization algorithm to be used as the leader. These properties are satisfied by the UCB algorithm, and our proposed UCB-based Top Two algorithm simultaneously enjoys non-asymptotic guarantees and competitive empirical performance.
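To make the leader/challenger structure concrete, here is a minimal sketch of one step of a Top Two sampling rule with a UCB leader. It is an illustration under stated assumptions, not the algorithm analyzed in the paper: the abstract does not specify the exploration bonus, the challenger, or how the choice between the two candidates is made. The sketch assumes unit-variance Gaussian arms, a standard `sqrt(2 log t / n)` bonus, a challenger that minimizes an empirical transportation cost (one choice found in the Top Two literature), and a randomized choice with a hypothetical parameter `beta`. All function names are illustrative.

```python
import math
import random


def ucb_leader(means, counts, t):
    """Leader: the arm with the largest upper confidence bound.

    Assumption: a standard sqrt(2 log t / n) bonus; the paper's exact
    bonus is not given in the abstract.
    """
    def index(i):
        return means[i] + math.sqrt(2.0 * math.log(max(t, 2)) / counts[i])
    return max(range(len(means)), key=index)


def tc_challenger(means, counts, leader):
    """Challenger: the arm that is hardest to separate from the leader.

    Assumption: unit-variance Gaussian arms, for which the empirical
    transportation cost is (mu_leader - mu_j)_+ / sqrt(1/N_leader + 1/N_j).
    """
    def cost(j):
        gap = max(means[leader] - means[j], 0.0)
        return gap / math.sqrt(1.0 / counts[leader] + 1.0 / counts[j])
    return min((j for j in range(len(means)) if j != leader), key=cost)


def top_two_step(means, counts, t, beta=0.5, rng=random):
    """Return the next arm to sample: the leader with probability beta,
    otherwise the challenger (beta is a hypothetical tuning parameter)."""
    leader = ucb_leader(means, counts, t)
    challenger = tc_challenger(means, counts, leader)
    return leader if rng.random() < beta else challenger
```

Each arm must have been sampled at least once before calling `top_two_step`, since both the bonus and the transportation cost divide by the sample counts. A full identification procedure would also need a stopping rule and a recommendation rule, which this sketch leaves out.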


