Collaborative Regret Minimization in Multi-Armed Bandits

01/26/2023
by Nikolai Karpov, et al.

In this paper, we study the collaborative learning model, which concerns the tradeoff between parallelism and communication overhead in multi-agent reinforcement learning. For a fundamental problem in bandit theory, regret minimization in multi-armed bandits, we present the first and almost tight tradeoffs between the number of rounds of communication between the agents and the regret of the collaborative learning process.

research
05/30/2023

Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

The study of collaborative multi-agent bandits has attracted significant...
research
09/15/2021

Estimation of Warfarin Dosage with Reinforcement Learning

In this paper, it has attempted to use Reinforcement learning to model t...
research
02/29/2016

Collaborative Learning of Stochastic Bandits over a Social Network

We consider a collaborative online learning paradigm, wherein a group of...
research
10/15/2018

Regret vs. Bandwidth Trade-off for Recommendation Systems

We consider recommendation systems that need to operate under wireless b...
research
04/21/2021

Searching with Opponent-Awareness

We propose Searching with Opponent-Awareness (SOA), an approach to lever...
research
05/26/2023

A Framework for Incentivized Collaborative Learning

Collaborations among various entities, such as companies, research labs,...
research
06/08/2021

Cooperative Stochastic Multi-agent Multi-armed Bandits Robust to Adversarial Corruptions

We study the problem of stochastic bandits with adversarial corruptions ...
