Quantile Bandits for Best Arms Identification with Concentration Inequalities

10/22/2020
by Mengyan Zhang, et al.

We consider a variant of the best arm identification task in stochastic multi-armed bandits. Motivated by risk-averse decision-making problems in fields such as medicine, biology, and finance, our goal is to identify a set of m arms with the highest τ-quantile values under a fixed budget. We propose the Quantile Successive Accepts and Rejects algorithm (Q-SAR), the first quantile-based algorithm for fixed-budget multiple-arm identification. We prove two-sided asymmetric concentration inequalities for the order statistics and quantiles of random variables that have a non-decreasing hazard rate, which may be of independent interest. Using these concentration inequalities, we upper bound the probability of arm misidentification for the bandit task. We show illustrative experiments for best arm identification.
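
To make the task concrete, here is a minimal Python sketch of a successive accepts-and-rejects loop driven by empirical τ-quantiles. It is not the authors' Q-SAR: the abstract does not specify phase lengths or the gap rule, so both are assumptions borrowed from the standard mean-based Successive Accepts and Rejects procedure, with the empirical quantile substituted for the empirical mean. The names `empirical_quantile` and `quantile_sar_sketch` are hypothetical.

```python
import math
import random


def empirical_quantile(samples, tau):
    """Return the ceil(tau * n)-th order statistic of the samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(tau * len(ordered)))
    return ordered[rank - 1]


def quantile_sar_sketch(arms, m, tau, budget):
    """Select m arms with the highest empirical tau-quantiles.

    arms: list of zero-argument callables, each returning one reward sample.
    Assumes 1 <= m <= len(arms) and a budget large enough that every arm
    is pulled at least once in the first phase.
    """
    K = len(arms)
    # Phase-length schedule assumed from standard (mean-based) SAR.
    log_bar = 0.5 + sum(1.0 / i for i in range(2, K + 1))
    active = list(range(K))
    accepted = []
    samples = [[] for _ in range(K)]
    slots = m        # number of arms still to be accepted
    n_prev = 0       # pulls per active arm so far
    for phase in range(1, K):
        if slots == 0:
            break                      # everything left is rejected
        if slots == len(active):
            accepted.extend(active)    # everything left is accepted
            active, slots = [], 0
            break
        n_k = math.ceil((budget - K) / (log_bar * (K + 1 - phase)))
        for i in active:
            for _ in range(n_k - n_prev):
                samples[i].append(arms[i]())
        n_prev = n_k
        q = {i: empirical_quantile(samples[i], tau) for i in active}
        ranked = sorted(active, key=lambda i: q[i], reverse=True)
        worst_in = q[ranked[slots - 1]]   # weakest arm inside the top `slots`
        best_out = q[ranked[slots]]       # strongest arm outside the top `slots`
        gaps = {
            arm: (q[arm] - best_out) if pos < slots else (worst_in - q[arm])
            for pos, arm in enumerate(ranked)
        }
        # Remove the arm whose status is least ambiguous (largest gap).
        chosen = max(ranked, key=lambda a: gaps[a])
        active.remove(chosen)
        if ranked.index(chosen) < slots:
            accepted.append(chosen)
            slots -= 1
    if slots > 0:
        accepted.extend(active)        # last remaining arm fills the final slot
    return accepted


# Illustrative arms (hypothetical): uniform rewards on [loc, loc + 1).
rng = random.Random(0)
demo_arms = [lambda loc=loc: loc + rng.random() for loc in range(5)]
selected = quantile_sar_sketch(demo_arms, m=2, tau=0.5, budget=200)
```

The demo arms have disjoint supports, so the ordering of their medians is unambiguous; realistic reward distributions overlap, which is where the misidentification bound mentioned above matters.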

research
05/29/2016

Tight (Lower) Bounds for the Fixed Budget Best Arm Identification Bandit Problem

We consider the problem of best arm identification with a fixed budget T...
research
01/23/2020

Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting

We design and analyze CascadeBAI, an algorithm for finding the best set ...
research
01/04/2019

Risk-aware Multi-armed Bandits Using Conditional Value-at-Risk

Traditional multi-armed bandit problems are geared towards finding the a...
research
03/21/2022

Efficient Algorithms for Extreme Bandits

In this paper, we contribute to the Extreme Bandit problem, a variant of...
research
06/24/2019

Sequential estimation of quantiles with applications to A/B-testing and best-arm identification

Consider the problem of sequentially estimating quantiles of any distrib...
research
11/27/2022

Constrained Pure Exploration Multi-Armed Bandits with a Fixed Budget

We consider a constrained, pure exploration, stochastic multi-armed band...
research
07/27/2023

A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity

We investigate the fixed-budget best-arm identification (BAI) problem fo...
