Bayes Optimal Algorithm is Suboptimal in Frequentist Best Arm Identification

02/10/2022
by Junpei Komiyama, et al.

We consider the fixed-budget best arm identification problem with Normal rewards. In this problem, the forecaster is given K arms (treatments) and T time steps. The forecaster attempts to find the best arm, that is, the arm with the largest mean, via an adaptive experiment conducted with an algorithm. The performance of the algorithm is measured by the simple regret, which captures the quality of the estimated best arm. It is known that the frequentist simple regret can be exponentially small in T for any fixed parameters, whereas the Bayesian simple regret is Θ(T^-1) over a continuous prior distribution. This paper shows that the Bayes optimal algorithm, which minimizes the Bayesian simple regret, does not have an exponentially small simple regret for some parameters. This finding contrasts with the many results indicating the asymptotic equivalence of Bayesian and frequentist algorithms in fixed sampling regimes. While the Bayes optimal algorithm is described in terms of a recursive equation that is virtually impossible to compute exactly, we pave the way to an analysis by introducing a key quantity that we call the expected Bellman improvement.
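To make the problem setup concrete, the following is a minimal sketch of fixed-budget best arm identification with Normal rewards. It is not the Bayes optimal algorithm studied in the paper: it uses plain round-robin (uniform) allocation as a stand-in baseline, and the function names, the unit reward variance, and the Monte Carlo settings are all assumptions made for illustration. Here the simple regret is taken to be the gap between the largest mean and the mean of the recommended arm.

```python
import random


def uniform_allocation_bai(means, budget, rng, sigma=1.0):
    """Pull arms round-robin for `budget` steps, then recommend the arm
    with the largest empirical mean. (Hypothetical baseline, not the
    Bayes optimal algorithm.)"""
    k = len(means)
    sums = [0.0] * k
    counts = [0] * k
    for t in range(budget):
        arm = t % k
        # Normal reward with mean means[arm]; sigma = 1 is an assumption.
        sums[arm] += rng.gauss(means[arm], sigma)
        counts[arm] += 1
    # Arms that were never pulled cannot be recommended.
    estimates = [s / c if c > 0 else float("-inf") for s, c in zip(sums, counts)]
    return max(range(k), key=lambda i: estimates[i])


def simple_regret(means, recommended):
    """Gap between the best mean and the mean of the recommended arm."""
    return max(means) - means[recommended]


def average_simple_regret(means, budget, runs=2000, seed=0):
    """Monte Carlo estimate of the frequentist simple regret for fixed means."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        total += simple_regret(means, uniform_allocation_bai(means, budget, rng))
    return total / runs


if __name__ == "__main__":
    means = [0.0, 1.0]  # K = 2 arms with fixed (frequentist) parameters
    for budget in (4, 20, 200):
        print(budget, average_simple_regret(means, budget))
```

For fixed means, the estimated regret of this baseline should shrink quickly as the budget T grows, which is the frequentist regime the abstract contrasts with the Bayesian simple regret averaged over a prior.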

research
11/18/2021

Optimal Simple Regret in Bayesian Best Arm Identification

We consider Bayesian best arm identification in the multi-armed bandit p...
research
02/06/2023

Asymptotically Minimax Optimal Fixed-Budget Best Arm Identification for Expected Simple Regret Minimization

We investigate fixed-budget best arm identification (BAI) for expected s...
research
06/09/2022

Globally Optimal Algorithms for Fixed-Budget Best Arm Identification

We consider the fixed-budget best arm identification problem where the g...
research
05/29/2017

Improving the Expected Improvement Algorithm

The expected improvement (EI) algorithm is a popular strategy for inform...
research
10/08/2022

Empirical Bayesian Selection for Value Maximization

We study the common problem of selecting the best m units from a set of ...
research
01/10/2017

Identifying Best Interventions through Online Importance Sampling

Motivated by applications in computational advertising and systems biolo...
research
10/09/2018

Bridging the gap between regret minimization and best arm identification, with application to A/B tests

State of the art online learning procedures focus either on selecting th...