One More Step Towards Reality: Cooperative Bandits with Imperfect Communication

11/24/2021
by Udari Madhushani, et al.

The cooperative bandit problem is becoming increasingly relevant due to its applications in large-scale decision-making. However, most research on this problem focuses exclusively on the setting with perfect communication, whereas in most real-world distributed settings communication takes place over stochastic networks, with arbitrary corruptions and delays. In this paper, we study cooperative bandit learning under three typical real-world communication scenarios: (a) message-passing over stochastic time-varying networks, (b) instantaneous reward-sharing over a network with random delays, and (c) message-passing with adversarially corrupted rewards, including Byzantine communication. For each of these environments, we propose decentralized algorithms that achieve competitive performance, along with near-optimal guarantees on the incurred group regret. Furthermore, in the setting with perfect communication, we present an improved delayed-update algorithm that outperforms the existing state of the art on various network topologies. Finally, we present tight network-dependent minimax lower bounds on the group regret. Our proposed algorithms are straightforward to implement and obtain competitive empirical performance.
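To make scenario (b) concrete, the following is a minimal simulation sketch and not the algorithm proposed in the paper. It substitutes textbook UCB1 for the authors' method and assumes a complete communication graph, Bernoulli rewards, and message delays drawn uniformly from 0 to `max_delay` rounds. The function name `simulate_delayed_coop_ucb` and all of its parameters are hypothetical, chosen only for illustration. Group regret is computed as the sum, over all agents and rounds, of the gap between the best arm's mean and the mean of the arm pulled.

```python
import math
import random


def simulate_delayed_coop_ucb(means, n_agents=4, horizon=2000, max_delay=5, seed=0):
    """Textbook UCB1 run by several agents that share rewards with random delays.

    Illustrative assumptions (not from the paper): complete graph,
    Bernoulli rewards, uniform integer delays in [0, max_delay].
    Returns the group regret summed over all agents and rounds.
    """
    rng = random.Random(seed)
    k = len(means)
    best = max(means)
    # Per-agent statistics: own pulls plus messages that have already arrived.
    counts = [[0] * k for _ in range(n_agents)]
    sums = [[0.0] * k for _ in range(n_agents)]
    # in_flight[t] lists (receiver, arm, reward) messages that arrive at round t.
    in_flight = {}
    regret = 0.0
    for t in range(1, horizon + 1):
        # Deliver every message scheduled for this round before anyone acts.
        for receiver, arm, reward in in_flight.pop(t, []):
            counts[receiver][arm] += 1
            sums[receiver][arm] += reward
        for v in range(n_agents):
            untried = [a for a in range(k) if counts[v][a] == 0]
            if untried:
                arm = untried[0]  # play each arm once before using the index
            else:
                arm = max(
                    range(k),
                    key=lambda a: sums[v][a] / counts[v][a]
                    + math.sqrt(2 * math.log(t) / counts[v][a]),
                )
            reward = 1.0 if rng.random() < means[arm] else 0.0
            counts[v][arm] += 1
            sums[v][arm] += reward
            regret += best - means[arm]
            # Broadcast the observation; it reaches each neighbor after a random delay.
            for u in range(n_agents):
                if u != v:
                    arrival = t + 1 + rng.randint(0, max_delay)
                    in_flight.setdefault(arrival, []).append((u, arm, reward))
    return regret
```

Calling `simulate_delayed_coop_ucb([0.2, 0.8], horizon=500)` returns the group regret of one seeded run. Varying `max_delay` gives a rough feel for how delayed information affects the group, under the assumptions stated above.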
