Distributed Bandits: Probabilistic Communication on d-regular Graphs

11/16/2020
by Udari Madhushani, et al.

We study the decentralized multi-agent multi-armed bandit problem in which agents communicate probabilistically over a network defined by a d-regular graph. Every edge in the graph carries a probabilistic weight p, so each communication link fails with probability 1-p. At each time step, each agent chooses an arm and receives a numerical reward associated with that arm. After each choice, each agent observes the last obtained reward of each of its neighbors with probability p. We propose a new Upper Confidence Bound (UCB) based algorithm and analyze how agent-based strategies contribute to minimizing group regret in this probabilistic communication setting. We provide theoretical guarantees that our algorithm outperforms state-of-the-art algorithms. We illustrate our results and validate the theoretical claims using numerical simulations.
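
The setting described in the abstract can be sketched as a small simulation. This is an illustration only, not the algorithm proposed in the paper: it uses a plain UCB1 index over each agent's own and overheard observations. Several details are assumptions that the abstract does not state: the ring-lattice construction of the d-regular graph, unit-variance Gaussian rewards, independent sampling of each neighbor observation, and that an agent also learns which arm the observed neighbor pulled. The names `circulant_neighbors`, `ucb_arm`, and `simulate` are hypothetical.

```python
import math
import random


def circulant_neighbors(n_agents, d):
    """Neighbor lists of a d-regular ring lattice (assumed topology)."""
    if d % 2 != 0 or not 0 <= d < n_agents:
        raise ValueError("this sketch needs even d with 0 <= d < n_agents")
    half = d // 2
    return [
        [(i + s) % n_agents for s in range(-half, half + 1) if s != 0]
        for i in range(n_agents)
    ]


def ucb_arm(counts, sums, t):
    """Plain UCB1 choice: try unobserved arms first, then maximize the index."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    index = [
        sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=index.__getitem__)


def simulate(means, n_agents=10, d=4, p=0.5, horizon=1000, seed=0):
    """Return the expected group regret accumulated over `horizon` steps."""
    rng = random.Random(seed)
    neighbors = circulant_neighbors(n_agents, d)
    n_arms = len(means)
    # Per-agent observation counts and reward sums for every arm.
    counts = [[0] * n_arms for _ in range(n_agents)]
    sums = [[0.0] * n_arms for _ in range(n_agents)]
    best = max(means)
    group_regret = 0.0
    for t in range(1, horizon + 1):
        # Every agent picks an arm from its current statistics.
        choices = [ucb_arm(counts[i], sums[i], t) for i in range(n_agents)]
        # Assumption: Gaussian rewards with unit variance.
        rewards = [rng.gauss(means[a], 1.0) for a in choices]
        group_regret += sum(best - means[a] for a in choices)
        for i in range(n_agents):
            # An agent always records its own reward.
            counts[i][choices[i]] += 1
            sums[i][choices[i]] += rewards[i]
            # Each neighbor's latest reward is observed with probability p
            # (sampled independently per observation, an assumption).
            for j in neighbors[i]:
                if rng.random() < p:
                    counts[i][choices[j]] += 1
                    sums[i][choices[j]] += rewards[j]
    return group_regret
```

Calling `simulate([0.2, 0.5, 0.8], p=0.0)` and `simulate([0.2, 0.5, 0.8], p=1.0)` compares the no-communication and full-communication extremes under these assumptions. Averaging over several seeds gives a steadier comparison than a single run.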

research
10/20/2020

Bayesian Algorithms for Decentralized Stochastic Bandits

We study a decentralized cooperative multi-agent multi-armed bandit prob...
research
10/10/2018

Decentralized Cooperative Stochastic Multi-armed Bandits

We study a decentralized cooperative stochastic multi-armed bandit probl...
research
09/02/2020

Heterogeneous Explore-Exploit Strategies on Multi-Star Networks

We investigate the benefits of heterogeneity in multi-agent explore-expl...
research
02/10/2021

Multi-Agent Multi-Armed Bandits with Limited Communication

We consider the problem where N agents collaboratively interact with an ...
research
03/09/2023

Communication-Efficient Collaborative Heterogeneous Bandits in Networks

The multi-agent multi-armed bandit problem has been studied extensively ...
research
03/06/2020

A Farewell to Arms: Sequential Reward Maximization on a Budget with a Giving Up Option

We consider a sequential decision-making problem where an agent can take...
research
06/06/2018

Finding the Bandit in a Graph: Sequential Search-and-Stop

We consider the problem where an agent wants to find a hidden object tha...
