Game of Thrones: Fully Distributed Learning for Multi-Player Bandits

10/26/2018
by Ilai Bistritz, et al.

We consider a multi-armed bandit game where N players compete for M arms over T turns. Each player has different expected rewards for the arms, and the instantaneous rewards are independent and identically distributed or Markovian. When two or more players choose the same arm, they all receive zero reward. Performance is measured by the expected sum of regrets, compared to the optimal assignment of arms to players. We assume that each player knows only her own actions and the reward she received each turn. Players cannot observe the actions of other players, and no communication between players is possible. We present a distributed algorithm and prove that it achieves an expected sum of regrets of near-O(log^2 T). This is the first algorithm to achieve a near order-optimal regret in this fully distributed scenario. All other works have assumed either that all players have the same vector of expected rewards or that communication between players is possible.
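To make the setting concrete, here is a minimal Python sketch of the collision model and the benchmark the regret is measured against. The function names, the Bernoulli reward assumption, and the brute-force computation of the optimal assignment are our own illustrative choices, not taken from the paper:

```python
import itertools
import random

def optimal_assignment_value(mu):
    # Benchmark: best one-to-one assignment of players to arms,
    # maximizing the sum of expected rewards. Brute force over
    # permutations; fine for small N and M.
    n = len(mu)
    m = len(mu[0])
    best = 0.0
    for arms in itertools.permutations(range(m), n):
        best = max(best, sum(mu[i][a] for i, a in enumerate(arms)))
    return best

def play_round(mu, choices, rng):
    # Collision model from the abstract: if two or more players pick
    # the same arm, they all get zero reward. Otherwise player i draws
    # a Bernoulli reward with mean mu[i][arm] (an assumed reward law).
    counts = {}
    for a in choices:
        counts[a] = counts.get(a, 0) + 1
    return [
        (1 if rng.random() < mu[i][a] else 0) if counts[a] == 1 else 0
        for i, a in enumerate(choices)
    ]

# Example: N = 2 players, M = 3 arms, heterogeneous expected rewards.
mu = [[0.9, 0.2, 0.5],
      [0.8, 0.7, 0.1]]
# Best pairing puts player 0 on arm 0 and player 1 on arm 1
# (value 0.9 + 0.7), even though both players prefer arm 0 alone.
best = optimal_assignment_value(mu)
```

Note that the benchmark is a matching, not a per-player best arm: because collisions yield zero reward, the optimal assignment may send a player to her second-best arm, which is what makes the fully distributed problem a coordination game.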

Related research

- Distributed Multi-Player Bandits - a Game of Thrones Approach (10/26/2018): We consider a multi-armed bandit game where N players compete for K arms...
- My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits (02/23/2020): Consider N cooperative but non-communicating players where each plays on...
- Multiplayer Bandit Learning, from Competition to Cooperation (08/03/2019): The stochastic multi-armed bandit problem is a classic model illustratin...
- On Regret-Optimal Learning in Decentralized Multi-player Multi-armed Bandits (05/04/2015): We consider the problem of learning in single-player and multiplayer mul...
- Decentralized Stochastic Multi-Player Multi-Armed Walking Bandits (12/12/2022): Multi-player multi-armed bandit is an increasingly relevant decision-mak...
- Multi-Player Bandits: The Adversarial Case (02/21/2019): We consider a setting where multiple players sequentially choose among a...
- Online Learning for Cooperative Multi-Player Multi-Armed Bandits (09/07/2021): We introduce a framework for decentralized online learning for multi-arm...
