Best Arm Identification with a Fixed Budget under a Small Gap

01/12/2022
by Masahiro Kato, et al.

We consider the fixed-budget best arm identification problem in multi-armed bandits. One of the main interests in this field is to derive a tight lower bound on the probability of misidentifying the best arm and to develop a strategy whose performance guarantee matches that lower bound. However, this has long been an open problem when the optimal allocation ratio of arm draws is unknown. In this paper, we provide an answer for the case in which the gap between the expected rewards is small. First, we derive a tight problem-dependent lower bound, which characterizes the optimal allocation ratio as a function of the gap between the expected rewards and the Fisher information of the bandit model. Then, we propose the "RS-AIPW" strategy, which consists of a randomized sampling (RS) rule using the estimated optimal allocation ratio and a recommendation rule using the augmented inverse probability weighting (AIPW) estimator. Our proposed strategy is optimal in the sense that its performance guarantee achieves the derived lower bound under a small gap. In the course of the analysis, we present a novel large deviation bound for martingales.
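To make the two ingredients of the strategy concrete, the following is a minimal sketch of an RS-AIPW-style loop for a two-armed Gaussian bandit. It is an illustration under assumptions, not the paper's exact algorithm: the instance (means, standard deviations, budget), the decaying forced-exploration schedule, and the use of standard-deviation-proportional sampling as the estimated allocation ratio are all choices made here for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-armed Gaussian instance (values chosen for illustration only).
true_means = np.array([0.55, 0.45])   # modest gap between expected rewards
true_sds = np.array([1.0, 0.5])
T = 10_000                            # fixed budget

n = np.zeros(2)        # draw counts per arm
s = np.zeros(2)        # reward sums per arm
ss = np.zeros(2)       # squared-reward sums per arm
aipw = np.zeros(2)     # running sums of per-round AIPW terms

for t in range(T):
    # Plug-in estimates of each arm's mean and standard deviation so far.
    mean_hat = np.where(n > 0, s / np.maximum(n, 1), 0.0)
    var_hat = np.where(n > 1, ss / np.maximum(n, 1) - mean_hat**2, 1.0)
    sd_hat = np.sqrt(np.maximum(var_hat, 1e-6))

    # RS rule: sample arms with probability proportional to the estimated
    # standard deviations (a plug-in estimate of the target allocation ratio),
    # mixed with uniform exploration that decays over time for stability.
    w = sd_hat / sd_hat.sum()
    eps = min(1.0, 2.0 / max(t, 1))
    w = (1 - eps) * w + eps * 0.5

    a = rng.choice(2, p=w)
    y = rng.normal(true_means[a], true_sds[a])

    # AIPW term: plug-in mean for every arm, plus the inverse-propensity-
    # weighted residual for the arm actually drawn.
    aipw += mean_hat
    aipw[a] += (y - mean_hat[a]) / w[a]

    n[a] += 1
    s[a] += y
    ss[a] += y * y

# Recommendation rule: report the arm with the largest AIPW mean estimate.
best_arm = int(np.argmax(aipw / T))
print(best_arm)
```

The AIPW construction is what makes the recommendation amenable to a martingale analysis: each per-round term is unbiased for the mean-reward vector conditional on the past, even though the sampling probabilities `w` are themselves estimated from data.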


