The Pareto Frontier of Instance-Dependent Guarantees in Multi-Player Multi-Armed Bandits with no Communication

02/19/2022
by Allen Liu, et al.

We study the stochastic multi-player multi-armed bandit problem. In this problem, m players cooperate to maximize their total reward from K > m arms. However, the players cannot communicate and are penalized (e.g., receive no reward) if they pull the same arm at the same time. We ask whether it is possible to obtain optimal instance-dependent regret Õ(1/Δ), where Δ is the gap between the m-th and (m+1)-st best arms. Such guarantees were recently achieved in a model that allows the players to communicate implicitly through intentional collisions. We show that with no communication at all, such guarantees are, surprisingly, not achievable: obtaining the optimal Õ(1/Δ) regret for some regimes of Δ necessarily implies strictly sub-optimal regret in other regimes. Our main result is a complete characterization of the Pareto-optimal instance-dependent trade-offs that are possible with no communication. Our algorithm generalizes that of Bubeck, Budzinski, and the second author and enjoys the same strong no-collision property, while our lower bound is based on a topological obstruction and holds even under full information.
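To make the setup concrete, here is a minimal simulation sketch of the collision model described above. Everything in it is an illustrative assumption rather than the paper's construction: the class name MultiPlayerBanditEnv is hypothetical, rewards are taken to be Bernoulli, and the players' strategies are omitted entirely. It only shows how collisions zero out rewards and how the gap Δ between the m-th and (m+1)-st best arms is defined.

```python
import numpy as np

# Illustrative sketch of the no-communication collision model (NOT the
# paper's algorithm): m players pick arms independently each round, and
# any arm pulled by two or more players yields zero reward to all of them.
class MultiPlayerBanditEnv:
    def __init__(self, means, m, rng=None):
        self.means = np.asarray(means)   # Bernoulli mean of each of the K arms (assumed reward model)
        self.m = m                       # number of players
        self.rng = rng or np.random.default_rng()

    def gap(self):
        # Delta = difference between the m-th and (m+1)-st best arm means
        sorted_means = np.sort(self.means)[::-1]
        return sorted_means[self.m - 1] - sorted_means[self.m]

    def step(self, pulls):
        # pulls[i] = arm chosen by player i this round
        pulls = np.asarray(pulls)
        counts = np.bincount(pulls, minlength=len(self.means))
        rewards = np.zeros(self.m)
        for i, arm in enumerate(pulls):
            if counts[arm] == 1:  # no collision on this arm
                rewards[i] = self.rng.binomial(1, self.means[arm])
        return rewards  # colliding players receive 0

# Example: 3 players, 5 arms; the optimal assignment occupies the top m = 3 arms.
env = MultiPlayerBanditEnv(means=[0.9, 0.8, 0.7, 0.5, 0.2], m=3)
print("instance gap Delta =", env.gap())               # 0.7 - 0.5 = 0.2
print("round without collisions:", env.step([0, 1, 2]))
print("round with a collision on arm 0:", env.step([0, 0, 2]))
```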

Related research

09/28/2019 - An Optimal Algorithm in Multiplayer Multi-Armed Bandits
The paper addresses the Multiplayer Multi-Armed Bandit (MMAB) problem, w...

02/04/2019 - New Algorithms for Multiplayer Bandits when Arm Means Vary Among Players
We study multiplayer stochastic multi-armed bandit problems in which the...

11/08/2021 - An Instance-Dependent Analysis for the Cooperative Multi-Player Multi-Armed Bandit
We study the problem of information sharing and cooperation in Multi-Pla...

11/08/2020 - Cooperative and Stochastic Multi-Player Multi-Armed Bandit: Optimal Regret With Neither Communication Nor Collisions
We consider the cooperative multi-player version of the stochastic multi...

11/05/2018 - Multi-armed Bandits with Compensation
We propose and study the known-compensation multi-arm bandit (KCMAB) pro...

09/17/2018 - Multi-Player Bandits: A Trekking Approach
We study stochastic multi-armed bandits with many players. The players d...

12/09/2015 - Multi-Player Bandits -- a Musical Chairs Approach
We consider a variant of the stochastic multi-armed bandit problem, wher...
