Asymptotically Minimax Optimal Fixed-Budget Best Arm Identification for Expected Simple Regret Minimization

02/06/2023
by Masahiro Kato, et al.

We investigate fixed-budget best arm identification (BAI) for expected simple regret minimization. In each round of an adaptive experiment, a decision maker draws one of multiple treatment arms based on past observations and then observes the outcome of the chosen arm. After the experiment, the decision maker recommends the treatment arm with the highest estimated expected outcome. We evaluate this decision in terms of the expected simple regret, the difference between the expected outcomes of the best and the recommended treatment arms. Because of the inherent uncertainty, we evaluate the regret under the minimax criterion. For distributions with fixed variances (location-shift models), such as Gaussian distributions, we derive asymptotic lower bounds for the worst-case expected simple regret. We then show that the Random Sampling (RS)-Augmented Inverse Probability Weighting (AIPW) strategy proposed by Kato et al. (2022) is asymptotically minimax optimal, in the sense that the leading factor of its worst-case expected simple regret asymptotically matches our worst-case lower bound. Our result indicates that, for location-shift models, the optimal RS-AIPW strategy draws treatment arms with different probabilities that depend on their variances. This contrasts with the result of Bubeck et al. (2011), which shows that drawing each treatment arm in equal proportion is minimax optimal in a bounded-outcome setting.
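The setting described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the RS-AIPW strategy as specified by the authors: the variances are treated as known, arms are drawn with probabilities proportional to their standard deviations (an illustrative choice, not taken from the abstract), and the function names `run_experiment` and `expected_simple_regret` are hypothetical. It shows an AIPW-style estimate of each arm's mean and a Monte Carlo estimate of the expected simple regret.

```python
import numpy as np


def run_experiment(means, stds, budget, rng):
    """Run one fixed-budget experiment and return the recommended arm index."""
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    k = len(means)
    # Assumption: known variances, draw probability proportional to std. dev.
    probs = stds / stds.sum()
    sums = np.zeros(k)    # running sum of observed outcomes per arm
    counts = np.zeros(k)  # number of draws per arm
    aipw = np.zeros(k)    # running sum of AIPW scores per arm
    for _ in range(budget):
        # Plug-in mean built from past observations only (0 before any draw).
        mu_hat = np.divide(sums, counts, out=np.zeros(k), where=counts > 0)
        arm = rng.choice(k, p=probs)
        y = rng.normal(means[arm], stds[arm])
        # AIPW score: plug-in mean plus inverse-probability-weighted residual.
        score = mu_hat.copy()
        score[arm] += (y - mu_hat[arm]) / probs[arm]
        aipw += score
        sums[arm] += y
        counts[arm] += 1
    # Recommend the arm with the highest estimated expected outcome.
    return int(np.argmax(aipw / budget))


def expected_simple_regret(means, stds, budget, n_runs=2000, seed=0):
    """Monte Carlo estimate of E[best mean - mean of the recommended arm]."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    picks = [run_experiment(means, stds, budget, rng) for _ in range(n_runs)]
    return float(np.mean(means.max() - means[picks]))


if __name__ == "__main__":
    # Two Gaussian arms with a small gap and unequal variances.
    print(expected_simple_regret([0.0, 0.2], [1.0, 2.0], budget=200))
```

Replacing `probs` with a uniform vector gives the equal-allocation baseline, so the two sampling rules can be compared on the same instance.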

research
06/11/2020

On Worst-case Regret of Linear Thompson Sampling

In this paper, we consider the worst-case regret of Linear Thompson Samp...
research
09/15/2022

Semiparametric Best Arm Identification with Contextual Information

We study best-arm identification with a fixed budget and contextual (cov...
research
02/10/2022

Bayes Optimal Algorithm is Suboptimal in Frequentist Best Arm Identification

We consider the fixed-budget best arm identification problem with Normal...
research
02/09/2017

Efficient Policy Learning

We consider the problem of using observational data to learn treatment a...
research
01/12/2022

Best Arm Identification with a Fixed Budget under a Small Gap

We consider the fixed-budget best arm identification problem in the mult...
research
06/17/2020

The Influence of Shape Constraints on the Thresholding Bandit Problem

We investigate the stochastic Thresholding Bandit problem (TBP) under se...
research
05/29/2017

Improving the Expected Improvement Algorithm

The expected improvement (EI) algorithm is a popular strategy for inform...
