Approximate Top-m Arm Identification with Heterogeneous Reward Variances

04/11/2022
by Ruida Zhou, et al.

We study the effect of reward variance heterogeneity in the approximate top-$m$ arm identification setting. In this setting, the reward for the $i$-th arm follows a $\sigma_i^2$-sub-Gaussian distribution, and the agent needs to incorporate this knowledge to minimize the expected number of arm pulls required to identify, out of the $n$ arms, the $m$ arms with the largest means within error $\epsilon$, with probability at least $1-\delta$. We show that the worst-case sample complexity of this problem is
$$\Theta\left( \sum_{i=1}^{n} \frac{\sigma_i^2}{\epsilon^2} \ln\frac{1}{\delta} + \sum_{i \in G^m} \frac{\sigma_i^2}{\epsilon^2} \ln(m) + \sum_{j \in G^l} \frac{\sigma_j^2}{\epsilon^2} \mathrm{Ent}\left(\sigma^2_{G^r}\right) \right),$$
where $G^m$, $G^l$, $G^r$ are certain specific subsets of the overall arm set $\{1, 2, \ldots, n\}$, and $\mathrm{Ent}(\cdot)$ is an entropy-like function that measures the heterogeneity of the variance proxies. The upper bound on the complexity is obtained using a divide-and-conquer style algorithm, while the matching lower bound relies on the study of a dual formulation.
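To make the setting concrete, the following is a minimal sketch of a naive variance-aware baseline; it is not the divide-and-conquer algorithm from the paper. It assumes Gaussian rewards (which are $\sigma_i^2$-sub-Gaussian) and reads "within error $\epsilon$" as: every returned arm has mean at least the $m$-th largest mean minus $\epsilon$. The function names `naive_pulls` and `naive_top_m` are hypothetical. Each arm is pulled about $8\sigma_i^2/\epsilon^2 \cdot \ln(2n/\delta)$ times, so that by the sub-Gaussian tail bound and a union bound every empirical mean is within $\epsilon/2$ of its true mean with probability at least $1-\delta$.

```python
import math
import random


def naive_pulls(sigma2, eps, delta):
    """Pull counts per arm for the naive baseline (hypothetical helper).

    For a sigma^2-sub-Gaussian arm, t pulls give
    P(|empirical mean - mean| > eps/2) <= 2 * exp(-t * eps^2 / (8 * sigma^2)).
    Requiring this to be at most delta / n yields the count below.
    """
    n = len(sigma2)
    return [math.ceil(8.0 * s2 / eps ** 2 * math.log(2.0 * n / delta))
            for s2 in sigma2]


def naive_top_m(means, sigma2, m, eps, delta, rng):
    """Estimate every arm to accuracy eps/2, then return the m best estimates.

    Assumption: rewards are simulated as Gaussian with variance sigma2[i].
    Returns the sorted indices of the chosen arms and the total pull count.
    """
    pulls = naive_pulls(sigma2, eps, delta)
    estimates = []
    for mu, s2, t in zip(means, sigma2, pulls):
        total = sum(rng.gauss(mu, math.sqrt(s2)) for _ in range(t))
        estimates.append(total / t)
    ranked = sorted(range(len(means)), key=lambda i: estimates[i], reverse=True)
    return sorted(ranked[:m]), sum(pulls)


if __name__ == "__main__":
    rng = random.Random(0)
    arms, total = naive_top_m(
        means=[1.0, 0.9, 0.2, 0.1],
        sigma2=[1.0, 0.25, 4.0, 0.5],
        m=2, eps=0.2, delta=0.05, rng=rng,
    )
    print(arms, total)
```

The total cost of this baseline is on the order of $\sum_i \sigma_i^2/\epsilon^2 \cdot \ln(n/\delta)$: noisier arms receive proportionally more pulls, but every arm pays a $\ln n$ factor, whereas in the bound quoted above the extra logarithmic and entropy-like factors apply only to the subsets $G^m$ and $G^l$.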


research
08/22/2016

Towards Instance Optimal Bounds for Best Arm Identification

In the classical best arm identification (Best-1-Arm) problem, we are gi...
research
10/14/2022

Federated Best Arm Identification with Heterogeneous Clients

We study best arm identification in a federated multi-armed bandit setti...
research
04/08/2023

Best Arm Identification with Fairness Constraints on Subpopulations

We formulate, analyze and solve the problem of best arm identification w...
research
03/23/2021

Bandits with many optimal arms

We consider a stochastic bandit problem with a possibly infinite number ...
research
09/21/2020

Robust Outlier Arm Identification

We study the problem of Robust Outlier Arm Identification (ROAI), where ...
research
06/03/2019

MaxGap Bandit: Adaptive Algorithms for Approximate Ranking

This paper studies the problem of adaptively sampling from K distributio...
research
04/20/2020

Collaborative Top Distribution Identifications with Limited Interaction

We consider the following problem in this paper: given a set of n distri...
