Best Arm Identification for Contaminated Bandits

02/26/2018
by Jason Altschuler, et al.

We propose the Contaminated Best Arm Identification variant of the Multi-Armed Bandit problem, in which every arm pull has some probability ε of generating a sample from an arbitrary contamination distribution instead of the true underlying distribution. We would still like to guarantee that we can identify the best (or approximately best) true distribution with high probability, and to provide guarantees on how good that arm's underlying distribution is. It is simple to see that in this contamination model there are no consistent estimators for statistics (e.g. the median) of the underlying distribution: even with infinitely many samples, they can be estimated only up to some unavoidable bias. We give novel, tight finite-sample complexity bounds for estimating the first two robust moments (median and median absolute deviation) with high probability. We then show how to use these bounds algorithmically for our problem by adapting Best Arm Identification algorithms from the classical Multi-Armed Bandit literature. We present matching upper and lower bounds (up to a small logarithmic factor) on these algorithms' sample complexity. These results suggest an inherent robustness of classical Best Arm Identification algorithms.
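The contamination model and the two robust statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it pulls every arm the same number of times and picks the largest empirical median, rather than adapting a classical elimination-style procedure. The function names, the Gaussian arms, the constant contamination value of -100, and the choice of applying the same contamination to every arm are all assumptions made for illustration.

```python
import random
import statistics


def contaminated_pull(true_sampler, contam_sampler, eps, rng):
    # With probability eps the pull is replaced by a contamination sample;
    # otherwise it comes from the arm's true distribution.
    if rng.random() < eps:
        return contam_sampler(rng)
    return true_sampler(rng)


def median_and_mad(samples):
    # Empirical median and median absolute deviation (MAD) of the samples.
    med = statistics.median(samples)
    mad = statistics.median([abs(x - med) for x in samples])
    return med, mad


def best_arm_by_median(arms, contam_sampler, eps, n_pulls, rng):
    # Uniform sampling (an illustrative simplification): pull each arm
    # n_pulls times, then return the index with the largest empirical
    # median together with the per-arm (median, MAD) pairs.
    stats = []
    for true_sampler in arms:
        samples = [
            contaminated_pull(true_sampler, contam_sampler, eps, rng)
            for _ in range(n_pulls)
        ]
        stats.append(median_and_mad(samples))
    best = max(range(len(arms)), key=lambda i: stats[i][0])
    return best, stats


def gaussian_arm(mean):
    # Hypothetical true distribution: unit-variance Gaussian with this mean.
    return lambda rng: rng.gauss(mean, 1.0)


rng = random.Random(0)
arms = [gaussian_arm(0.0), gaussian_arm(0.5), gaussian_arm(1.0)]
# Hypothetical contamination: a constant, far-away value on every arm.
best, stats = best_arm_by_median(
    arms, lambda r: -100.0, eps=0.1, n_pulls=2000, rng=rng
)
print("selected arm:", best)
print("per-arm (median, MAD):", stats)
```

In a run like this, each empirical median should sit somewhat below the arm's true median, since the contamination shifts the mixture's median, and that offset does not vanish as the number of pulls grows. This is the unavoidable bias the abstract refers to.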

research
06/17/2013

On Finding the Largest Mean Among Many

Sampling from distributions to find the one with the largest mean arises...
research
11/01/2022

Beyond the Best: Estimating Distribution Functionals in Infinite-Armed Bandits

In the infinite-armed bandit problem, each arm's average reward is sampl...
research
06/06/2022

Robust Pareto Set Identification with Contaminated Bandit Feedback

We consider the Pareto set identification (PSI) problem in multi-objecti...
research
05/04/2021

Optimal Algorithms for Range Searching over Multi-Armed Bandits

This paper studies a multi-armed bandit (MAB) version of the range-searc...
research
03/14/2023

Best arm identification in rare events

We consider the best arm identification problem in the stochastic multi-...
research
06/14/2022

On the Finite-Time Performance of the Knowledge Gradient Algorithm

The knowledge gradient (KG) algorithm is a popular and effective algorit...
research
06/24/2019

Sequential estimation of quantiles with applications to A/B-testing and best-arm identification

Consider the problem of sequentially estimating quantiles of any distrib...
