Collaborative Learning and Personalization in Multi-Agent Stochastic Linear Bandits

06/15/2021 ∙ by Avishek Ghosh, et al.

We consider the problem of minimizing regret in an N-agent heterogeneous stochastic linear bandits framework, where the agents (users) are similar but not all identical. We model user heterogeneity using two ideas that are widely used in practice: (i) a clustering framework, where users are partitioned into groups, with users in the same group being identical to each other but different across groups, and (ii) a personalization framework, where no two users are necessarily identical, but each user's parameters are close to the population average. In the clustered setup, we propose a novel algorithm based on successive refinement of cluster identities and regret minimization. We show that, for any agent, the regret scales as $\mathcal{O}(\sqrt{T/N})$ if the agent is in a 'well separated' cluster, and as $\mathcal{O}(T^{1/2+\varepsilon}/N^{1/2-\varepsilon})$ if its cluster is not well separated, where $\varepsilon$ is positive and arbitrarily close to 0. Our algorithm is adaptive to the cluster separation and is parameter-free: it does not need to know the number of clusters, the separation, or the cluster sizes, yet the regret guarantee adapts to the inherent complexity. In the personalization framework, we introduce a natural algorithm in which the personal bandit instances are initialized with estimates of the global average model. We show that an agent $i$ whose parameter deviates from the population average by $\epsilon_i$ attains a regret scaling of $O(\epsilon_i\sqrt{T})$. This demonstrates that if the user representations are close (small $\epsilon_i$), the resulting regret is low, and vice versa. The results are empirically validated, and we observe superior performance of our adaptive algorithms over non-adaptive baselines.
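To get a feel for how the three regret scalings quoted in the abstract compare, the following is a minimal numerical sketch. It evaluates only the leading-order expressions and ignores constants, dimension dependence, and logarithmic factors, so the numbers are indicative of growth rates rather than actual regret. The function names, the chosen values of T, N, eps, and eps_i, and the $\sqrt{T}$ no-collaboration reference point are illustrative assumptions and do not come from the paper.

```python
import math


def well_separated_rate(T: int, N: int) -> float:
    """sqrt(T / N): scaling quoted for an agent in a well-separated cluster."""
    return math.sqrt(T / N)


def poorly_separated_rate(T: int, N: int, eps: float) -> float:
    """T^(1/2 + eps) / N^(1/2 - eps): scaling quoted when the cluster is not well separated."""
    return T ** (0.5 + eps) / N ** (0.5 - eps)


def personalized_rate(eps_i: float, T: int) -> float:
    """eps_i * sqrt(T): scaling quoted for an agent at distance eps_i from the population average."""
    return eps_i * math.sqrt(T)


def single_agent_rate(T: int) -> float:
    """sqrt(T): reference point for one agent learning alone (assumption, not from the abstract)."""
    return math.sqrt(T)


if __name__ == "__main__":
    T, N = 10_000, 100  # hypothetical horizon and number of agents
    print("single agent        :", single_agent_rate(T))
    print("well separated      :", well_separated_rate(T, N))
    for eps in (0.01, 0.05, 0.1):  # hypothetical values of the small exponent
        print(f"not separated e={eps}:", poorly_separated_rate(T, N, eps))
    for eps_i in (0.01, 0.1, 1.0):  # hypothetical deviations from the average
        print(f"personalized e_i={eps_i}:", personalized_rate(eps_i, T))
```

Setting eps to 0 in `poorly_separated_rate` recovers the well-separated expression, which matches the abstract's description of the second bound as a slightly weaker version of the first.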


research
∙ 09/15/2023

Clustered Multi-Agent Linear Bandits

We address in this paper a particular instance of the multi-agent linear...
research
∙ 04/03/2023

Learning Personalized Models with Clustered System Identification

We address the problem of learning linear system models from observing m...
research
∙ 05/30/2023

Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

The study of collaborative multi-agent bandits has attracted significant...
research
∙ 01/17/2023

Optimal Algorithms for Latent Bandits with Cluster Structure

We consider the problem of latent bandits with cluster structure where t...
research
∙ 10/04/2019

Social Learning in Multi Agent Multi Armed Bandits

In this paper, we introduce a distributed version of the classical stoch...
research
∙ 06/09/2021

Parameter and Feature Selection in Stochastic Linear Bandits

We study two model selection settings in stochastic linear bandits (LB)....
research
∙ 06/13/2012

Learning and Solving Many-Player Games through a Cluster-Based Representation

In addressing the challenge of exponential scaling with the number of ag...
