Robust Multi-Agent Multi-Armed Bandits

07/07/2020
by Daniel Vial, et al.

There has been recent interest in collaborative multi-agent bandits, where groups of agents share recommendations to decrease per-agent regret. However, these works assume that every agent always recommends its own best-arm estimate to other agents, which is unrealistic in envisioned applications (for example, machine faults in distributed computing or spam in social recommendation systems). Hence, we generalize the setting to include both honest agents, who recommend their best-arm estimates, and malicious agents, who recommend arbitrary arms. We show that even with a single malicious agent, existing collaboration-based algorithms fail to improve regret guarantees over a single-agent baseline. We propose a scheme in which honest agents learn who is malicious and dynamically reduce communication with them, i.e., "blacklist" them. We show that collaboration does decrease regret for this algorithm when the number of malicious agents is small compared to the number of arms, and, crucially, without any assumptions on the malicious agents' behavior. Thus, our algorithm is robust against any malicious recommendation strategy.
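
The blacklisting idea can be illustrated with a small Python sketch. This is a toy illustration under stated assumptions, not the paper's algorithm: the abstract does not specify the blacklisting rule, so the rule used here (block a recommender for a fixed number of phases if the arm it recommended did not become the receiver's most played arm) is an assumption, as are the function names `recommend`, `update_blacklist`, and `eligible_recommenders` and the parameter `block_len`.

```python
import random


def recommend(agent_id, malicious, best_arm_estimate, n_arms, rng):
    """Honest agents send their best-arm estimate; malicious agents send an
    arbitrary arm (drawn uniformly at random here, as one possible strategy)."""
    if agent_id in malicious:
        return rng.randrange(n_arms)
    return best_arm_estimate


def update_blacklist(blacklist, recommender, recommended_arm,
                     most_played_arm, phase, block_len):
    """Assumed rule: if the recommended arm did not end up as the receiver's
    most played arm, block the recommender until phase + block_len."""
    if recommended_arm != most_played_arm:
        blacklist[recommender] = phase + block_len
    return blacklist


def eligible_recommenders(agent, n_agents, blacklist, phase):
    """Agents that `agent` may still ask in `phase`, i.e. every other agent
    whose block (if any) has expired."""
    return [j for j in range(n_agents)
            if j != agent and blacklist.get(j, 0) <= phase]


if __name__ == "__main__":
    rng = random.Random(0)
    n_agents, n_arms = 4, 5
    malicious = {2}  # hypothetical: agent 2 is the only malicious agent
    blacklist = {}   # agent 0's view: recommender id -> first phase it may be asked again

    # Agent 2 recommends some arbitrary arm to agent 0.
    arm = recommend(2, malicious, best_arm_estimate=0, n_arms=n_arms, rng=rng)
    # Suppose arm 0 turned out to be agent 0's most played arm in phase 2.
    update_blacklist(blacklist, recommender=2, recommended_arm=arm,
                     most_played_arm=0, phase=2, block_len=3)
    print(blacklist)
    print(eligible_recommenders(0, n_agents, blacklist, phase=3))
```

Under this assumed rule an honest agent can also be blocked temporarily, which is why the sketch uses an expiring block rather than a permanent ban; the choice of block length is left open here.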

research
02/28/2022

Robust Multi-Agent Bandits Over Undirected Graphs

We consider a multi-agent multi-armed bandit setting in which n honest a...
research
05/30/2023

Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

The study of collaborative multi-agent bandits has attracted significant...
research
01/15/2020

The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits

We consider a decentralized multi-agent Multi Armed Bandit (MAB) setup c...
research
04/21/2021

Searching with Opponent-Awareness

We propose Searching with Opponent-Awareness (SOA), an approach to lever...
research
11/23/2022

Incentive-Aware Recommender Systems in Two-Sided Markets

Online platforms in the Internet Economy commonly incorporate recommende...
research
02/07/2023

Universally Robust Information Aggregation for Binary Decisions

We study an information aggregation setting in which a decision maker ma...
research
07/16/2022

Collaborative Best Arm Identification with Limited Communication on Non-IID Data

In this paper, we study the tradeoffs between time-speedup and the numbe...
