Asymptotic Optimality for Decentralised Bandits

09/20/2021
by Conor Newton et al.

We consider a large number of agents collaborating on a multi-armed bandit problem with a large number of arms. The goal is to minimise the regret of each agent in a communication-constrained setting. We present a decentralised algorithm which builds upon and improves the Gossip-Insert-Eliminate method of Chawla et al. (arXiv:2001.05452). We provide a theoretical analysis of the regret incurred, showing that our algorithm is asymptotically optimal. In fact, our regret guarantee matches the asymptotically optimal rate achievable in the full-communication setting. Finally, we present empirical results which support our conclusions.
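To make the setting concrete, the following Python sketch simulates the general flavour of gossip-based collaborative bandits: each agent runs UCB over a small active arm set and periodically recommends its empirically best arm to a uniformly random agent, which inserts it and drops its worst-looking arm. This is not the authors' algorithm, nor the exact Gossip-Insert-Eliminate protocol of Chawla et al.; the number of arms and agents, the horizon, the gossip period, and the arm means are all illustrative assumptions, intended only to show the kind of communication-constrained interaction the abstract describes.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem instance (not taken from the paper): K Bernoulli arms.
K, N, T = 16, 8, 5000             # arms, agents, horizon (illustrative values)
means = rng.uniform(0.1, 0.9, K)  # true arm means, unknown to the agents
best = means.max()

# Each agent starts with a small random subset of arms and runs UCB on it.
active = [set(rng.choice(K, size=4, replace=False)) for _ in range(N)]
counts = np.zeros((N, K))
sums = np.zeros((N, K))
regret = np.zeros(N)

for t in range(1, T + 1):
    for i in range(N):
        arms = list(active[i])
        unplayed = [a for a in arms if counts[i, a] == 0]
        if unplayed:                        # play every active arm once first
            a = unplayed[0]
        else:                               # then pick the arm with the largest UCB index
            ucb = sums[i, arms] / counts[i, arms] \
                + np.sqrt(2.0 * np.log(t) / counts[i, arms])
            a = arms[int(np.argmax(ucb))]
        r = float(rng.random() < means[a])  # Bernoulli reward
        counts[i, a] += 1
        sums[i, a] += r
        regret[i] += best - means[a]

    # Communication-constrained gossip: every 100 rounds each agent sends its
    # empirically best arm to one uniformly random agent, who inserts it and,
    # if its active set has grown too large, discards its worst-looking arm.
    if t % 100 == 0:
        for i in range(N):
            arms = list(active[i])
            emp = sums[i, arms] / np.maximum(counts[i, arms], 1)
            rec = arms[int(np.argmax(emp))]
            j = int(rng.integers(N))
            active[j].add(rec)
            if len(active[j]) > 5:
                arms_j = list(active[j])
                emp_j = sums[j, arms_j] / np.maximum(counts[j, arms_j], 1)
                worst = arms_j[int(np.argmin(emp_j))]
                if worst != rec:
                    active[j].discard(worst)

print("per-agent cumulative regret:", np.round(regret, 1))

Running the sketch shows each agent's cumulative regret staying modest even though agents only exchange a single arm index every 100 rounds, which is the qualitative behaviour the communication-constrained setting is after.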


