Optimal best arm selection for general distributions

08/24/2019
by   Shubhada Agrawal, et al.

Given a finite set of unknown distributions, or arms, that can be sampled from, we consider the problem of identifying the one with the largest mean using a delta-correct algorithm (an adaptive, sequential algorithm that restricts the probability of error to a specified delta) that has minimum sample complexity. Lower bounds for delta-correct algorithms are well known. Further, delta-correct algorithms that match the lower bound asymptotically as delta reduces to zero have been developed in the literature when the arm distributions are restricted to a single-parameter exponential family. In this paper, we first observe a negative result: some restrictions are essential, as otherwise, under a delta-correct algorithm, distributions with unbounded support would require an infinite number of samples in expectation. We then propose a delta-correct algorithm that matches the lower bound as delta reduces to zero under the mild restriction that a known bound exists on the expectation of a non-negative, increasing convex function (for example, the squared moment) of the underlying random variables. We also propose batch processing and identify optimal batch sizes to substantially speed up the proposed algorithm. This best arm selection problem is a well-studied classic problem in the simulation community. It has many learning applications, including in recommendation systems and product selection.
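
The two conditions named in the abstract, delta-correctness and the moment restriction, can be written out as a rough sketch. All notation below (K arms with distributions mu_1, ..., mu_K, the mean m(.), the stopping time tau_delta, the recommended arm a-hat, the function f, and the bound B) is assumed here for illustration and may differ from the notation used in the paper itself.

```latex
% Assumed notation (not taken from the paper):
%   \mu = (\mu_1, \dots, \mu_K)   arm distributions
%   m(\mu_i)                      mean of arm i
%   \tau_\delta                   (random) number of samples at stopping
%   \hat{a}_{\tau_\delta}         arm recommended at stopping
%   f                             non-negative, increasing, convex function
%   B                             known bound

% Delta-correctness: the probability of stopping with a wrong arm is at most \delta.
\[
  \mathbb{P}_{\mu}\!\left( \tau_\delta < \infty,\;
    \hat{a}_{\tau_\delta} \neq \arg\max_{1 \le i \le K} m(\mu_i) \right) \le \delta .
\]

% Moment restriction: each arm distribution lies in a class with a known bound.
% Applying f to |X| is one possible reading of the abstract's wording.
\[
  \mathcal{L} = \left\{ \eta \;:\; \mathbb{E}_{X \sim \eta}\bigl[ f(|X|) \bigr] \le B \right\},
  \qquad \text{for example } f(x) = x^{2}.
\]
```

With f(x) = x^2, the restriction amounts to a known bound on the second moment of every arm, which allows unbounded support while ruling out the heavy-tailed cases behind the negative result.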


