ICQ: A Quantization Scheme for Best-Arm Identification Over Bit-Constrained Channels

04/30/2023
by   Fathima Zarin Faizal, et al.
0

We study the problem of best-arm identification in a distributed variant of the multi-armed bandit setting, with a central learner and multiple agents. Each agent is associated with an arm of the bandit, generating stochastic rewards following an unknown distribution. Further, each agent can communicate the observed rewards with the learner over a bit-constrained channel. We propose a novel quantization scheme called Inflating Confidence for Quantization (ICQ) that can be applied to existing confidence-bound based learning algorithms such as Successive Elimination. We analyze the performance of ICQ applied to Successive Elimination and show that the overall algorithm, named ICQ-SE, has the order-optimal sample complexity as that of the (unquantized) SE algorithm. Moreover, it requires only an exponentially sparse frequency of communication between the learner and the agents, thus requiring considerably fewer bits than existing quantization schemes to successfully identify the best arm. We validate the performance improvement offered by ICQ with other quantization methods through numerical experiments.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/11/2020

Best-Arm Identification for Quantile Bandits with Privacy

We study the best-arm identification problem in multi-armed bandits with...
research
11/14/2021

Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination

This paper investigates the problem of best arm identification in contam...
research
08/19/2022

Almost Cost-Free Communication in Federated Best Arm Identification

We study the problem of best arm identification in a federated learning ...
research
10/27/2011

The multi-armed bandit problem with covariates

We consider a multi-armed bandit problem in a setting where each arm pro...
research
09/07/2016

Random Shuffling and Resets for the Non-stationary Stochastic Bandit Problem

We consider a non-stationary formulation of the stochastic multi-armed b...
research
04/07/2012

UCB Algorithm for Exponential Distributions

We introduce in this paper a new algorithm for Multi-Armed Bandit (MAB) ...
research
11/11/2021

Solving Multi-Arm Bandit Using a Few Bits of Communication

The multi-armed bandit (MAB) problem is an active learning framework tha...

Please sign up or login with your details

Forgot password? Click here to reset