Robust Multi-Agent Bandits Over Undirected Graphs

02/28/2022
by Daniel Vial, et al.

We consider a multi-agent multi-armed bandit setting in which n honest agents collaborate over a network to minimize regret, while m malicious agents can disrupt learning arbitrarily. Assuming the network is the complete graph, existing algorithms incur O((m + K/n) log(T)/Δ) regret in this setting, where K is the number of arms and Δ is the arm gap. For m ≪ K, this improves over the single-agent baseline regret of O(K log(T)/Δ). In this work, we show the situation is murkier beyond the case of the complete graph. In particular, we prove that if the state-of-the-art algorithm is used on the undirected line graph, honest agents can suffer (nearly) linear regret until time is doubly exponential in K and n. In light of this negative result, we propose a new algorithm for which the i-th agent has regret O((d_mal(i) + K/n) log(T)/Δ) on any connected undirected graph, where d_mal(i) is the number of i's neighbors who are malicious. Thus, we generalize existing regret bounds beyond the complete graph (where d_mal(i) = m) and show that the effect of malicious agents is entirely local, in the sense that only the d_mal(i) malicious agents directly connected to agent i affect its long-term regret.
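As a rough numerical illustration of the locality claim, the sketch below evaluates the per-agent rate (d_mal(i) + K/n) log(T)/Δ (constants dropped) on a small line graph. The graph, the placement of the malicious agents, and all numerical values are hypothetical choices for illustration, not parameters from the paper.

```python
import math

# Illustrative (made-up) parameters: n honest agents, m malicious agents,
# K arms, arm gap Delta, horizon T.
n, m = 10, 2
K = 100
Delta = 0.1
T = 10**6

total = n + m
malicious = set(range(n, total))  # hypothetically place the malicious agents at one end

# Undirected line graph over all n + m agents: agent j neighbors j - 1 and j + 1.
neighbors = {j: {k for k in (j - 1, j + 1) if 0 <= k < total} for j in range(total)}

for i in range(n):  # honest agents only
    d_mal = len(neighbors[i] & malicious)  # number of i's neighbors that are malicious
    rate = (d_mal + K / n) * math.log(T) / Delta  # per-agent regret rate, up to constants
    print(f"agent {i}: d_mal(i) = {d_mal}, regret rate ~ {rate:.0f}")

# On the complete graph every honest agent has d_mal(i) = m, so all honest agents
# pay the (m + K/n) log(T)/Delta rate; on this line graph only the single honest
# agent adjacent to a malicious one pays the extra d_mal(i) term.
```

Running the sketch shows agents 0 through 8 with d_mal(i) = 0 and only agent 9 (the one bordering a malicious agent) with d_mal(i) = 1, matching the claim that malicious agents only affect the long-term regret of their direct neighbors.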


