Federated Multi-armed Bandits with Personalization

02/25/2021
by Chengshuai Shi, et al.

A general framework of personalized federated multi-armed bandits (PF-MAB) is proposed, which is a new bandit paradigm analogous to the federated learning (FL) framework in supervised learning and enjoys the features of FL with personalization. Under the PF-MAB framework, a mixed bandit learning problem that flexibly balances generalization and personalization is studied. A lower bound analysis for the mixed model is presented. We then propose the Personalized Federated Upper Confidence Bound (PF-UCB) algorithm, in which the exploration length is chosen carefully to balance learning the local model against supplying global information for the mixed learning objective. Theoretical analysis proves that PF-UCB achieves an O(log(T)) regret regardless of the degree of personalization, with an instance dependency similar to that of the lower bound. Experiments using both synthetic and real-world datasets corroborate the theoretical analysis and demonstrate the effectiveness of the proposed algorithm.
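To make the idea of a mixed objective concrete, the following is a minimal sketch and not the paper's definition. It assumes that each client's mixed reward for an arm is a convex combination of its own local mean and the average of that arm's mean across all clients, with a hypothetical weight `alpha` in [0, 1] standing in for the degree of personalization. The function names and the example means are illustrative assumptions.

```python
from typing import List

def mixed_means(local_means: List[List[float]], alpha: float) -> List[List[float]]:
    """Blend each client's local arm means with the cross-client average.

    ASSUMPTION (not stated in the abstract): the mixed mean is
    alpha * local + (1 - alpha) * global_average, with alpha in [0, 1].
    local_means[m][k] is the mean reward of arm k at client m.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    num_clients = len(local_means)
    num_arms = len(local_means[0])
    # Global mean of each arm: average of that arm's local means over clients.
    global_means = [
        sum(row[k] for row in local_means) / num_clients for k in range(num_arms)
    ]
    return [
        [alpha * row[k] + (1.0 - alpha) * global_means[k] for k in range(num_arms)]
        for row in local_means
    ]

def best_arms(local_means: List[List[float]], alpha: float) -> List[int]:
    """Index of the arm with the highest mixed mean, one entry per client."""
    return [
        max(range(len(row)), key=row.__getitem__)
        for row in mixed_means(local_means, alpha)
    ]

# Hypothetical instance: 2 clients, 3 arms.
LOCAL = [
    [0.9, 0.2, 0.6],  # client 0 prefers arm 0 locally
    [0.1, 0.8, 0.7],  # client 1 prefers arm 1 locally
]

if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, best_arms(LOCAL, alpha))
```

Under this assumed model, `alpha = 0` makes every client target the same globally best arm, `alpha = 1` reduces to independent local bandits, and intermediate values can give different clients different optimal arms.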

01/28/2021

Federated Multi-Armed Bandits

Federated multi-armed bandits (FMAB) is a new bandit paradigm that paral...

10/04/2021

Asynchronous Upper Confidence Bound Algorithms for Federated Linear Bandits

Linear contextual bandit is a popular online learning problem. It has be...

07/12/2019

Laplacian-regularized graph bandits: Algorithms and theoretical analysis

We study contextual multi-armed bandit problems in the case of multiple ...

01/22/2023

Doubly Adversarial Federated Bandits

We study a new non-stochastic federated multi-armed bandit problem with ...

10/27/2021

Federated Linear Contextual Bandits

This paper presents a novel federated linear contextual bandits model, w...

06/24/2021

Personalized Federated Learning with Clustered Generalization

We study the recent emerging personalized federated learning (PFL) that ...

07/06/2018

Combinatorial Bandits for Incentivizing Agents with Dynamic Preferences

The design of personalized incentives or recommendations to improve user...