Kernel Methods for Cooperative Multi-Agent Contextual Bandits

08/14/2020
by Abhimanyu Dubey, et al.

Cooperative multi-agent decision making involves a group of agents cooperatively solving learning problems while communicating over a network with delays. In this paper, we consider the kernelised contextual bandit problem, where the reward obtained by an agent is an arbitrary linear function of the contexts' images in the related reproducing kernel Hilbert space (RKHS), and a group of agents must cooperate to collectively solve their unique decision problems. For this problem, we propose Coop-KernelUCB, an algorithm that provides near-optimal bounds on the per-agent regret, and is both computationally and communicatively efficient. For special cases of the cooperative problem, we also provide variants of Coop-KernelUCB that provide optimal per-agent regret. In addition, our algorithm generalizes several existing results in the multi-agent bandit setting. Finally, on a series of both synthetic and real-world multi-agent network benchmarks, we demonstrate that our algorithm significantly outperforms existing benchmarks.
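To make the reward model above concrete, here is a minimal single-agent sketch of a kernelised upper-confidence-bound score. It is not Coop-KernelUCB itself: it has no network, delays, or message passing, and the RBF kernel, the function names `rbf_kernel` and `kernel_ucb_scores`, and the parameters `lam`, `beta`, and `gamma` are all assumptions chosen for illustration. The mean term is a kernel ridge regression estimate, which is linear in the RKHS image of the context, and the bonus term grows for contexts far from those already observed.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def kernel_ucb_scores(X_hist, y_hist, X_cand, lam=1.0, beta=1.0, gamma=1.0):
    """Illustrative UCB score for each candidate context (hypothetical helper).

    X_hist: (n, d) past contexts, y_hist: (n,) past rewards,
    X_cand: (m, d) candidate contexts for the current round.
    """
    # Regularised Gram matrix of the observed contexts.
    K = rbf_kernel(X_hist, X_hist, gamma) + lam * np.eye(len(X_hist))
    # Kernel evaluations between history and candidates, shape (n, m).
    k_cand = rbf_kernel(X_hist, X_cand, gamma)
    # Columns are K^{-1} k(x) for each candidate x.
    K_inv_k = np.linalg.solve(K, k_cand)
    # Kernel ridge regression mean estimate for each candidate.
    mean = K_inv_k.T @ y_hist
    # For the RBF kernel, k(x, x) = 1 for every x.
    var = np.maximum(1.0 - np.sum(k_cand * K_inv_k, axis=0), 0.0)
    # Optimism: mean estimate plus an exploration bonus scaled by beta.
    return mean + beta * np.sqrt(var)


X_hist = np.array([[0.0], [1.0]])
y_hist = np.array([0.0, 1.0])
X_cand = np.array([[0.0], [1.0], [5.0]])
scores = kernel_ucb_scores(X_hist, y_hist, X_cand, beta=10.0)
chosen = int(np.argmax(scores))  # index of the candidate to play
```

With a large `beta` the score favours the unexplored context at 5.0; with `beta=0.0` it reduces to the regression estimate alone. A cooperative variant would additionally have to decide which observations from neighbouring agents enter `X_hist` and `y_hist`, which this sketch does not attempt.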


