Federated Reinforcement Learning with Environment Heterogeneity

04/06/2022
by Hao Jin, et al.

We study a Federated Reinforcement Learning (FedRL) problem in which n agents collaboratively learn a single policy without sharing the trajectories they collect during agent-environment interaction. We focus on the constraint of environment heterogeneity: the n environments corresponding to these n agents have different state transitions. To obtain a value function or a policy function that optimizes the overall performance across all environments, we propose two federated RL algorithms. We prove that these algorithms converge to suboptimal solutions, and that the degree of suboptimality depends on how heterogeneous the n environments are. Moreover, we propose a heuristic that achieves personalization by embedding the n environments into n vectors. The personalization heuristic not only improves training but also allows for better generalization to new environments.
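The abstract does not spell out the two algorithms, so the following is only a generic sketch of the setting, not the authors' method. It assumes tabular environments that share states, actions, and rewards but differ in their transition kernels, and a server that periodically averages the agents' Q-tables. For brevity each agent applies exact expected Bellman updates from its own kernel instead of sampled trajectories. All function names and parameters are hypothetical.

```python
import random


def make_env(num_states, num_actions, rng):
    """Build a random transition kernel: P[s][a] is a distribution over next states."""
    kernel = []
    for _ in range(num_states):
        per_action = []
        for _ in range(num_actions):
            weights = [rng.random() for _ in range(num_states)]
            total = sum(weights)
            per_action.append([w / total for w in weights])
        kernel.append(per_action)
    return kernel


def local_update(Q, P, R, gamma, lr):
    """One damped Bellman-optimality step using only the agent's own kernel P."""
    new_Q = [row[:] for row in Q]
    for s in range(len(Q)):
        for a in range(len(Q[s])):
            expected_next = sum(p * max(Q[s2]) for s2, p in enumerate(P[s][a]))
            target = R[s][a] + gamma * expected_next
            new_Q[s][a] = (1 - lr) * Q[s][a] + lr * target
    return new_Q


def average_tables(tables):
    """Server step: entrywise mean of the agents' Q-tables."""
    n = len(tables)
    return [
        [sum(t[s][a] for t in tables) / n for a in range(len(tables[0][s]))]
        for s in range(len(tables[0]))
    ]


def federated_q_averaging(envs, R, gamma=0.9, lr=0.5, rounds=200, local_steps=5):
    """Alternate local updates in each environment with server-side averaging.

    Only Q-tables cross the agent/server boundary; kernels stay local.
    """
    global_Q = [[0.0] * len(R[0]) for _ in range(len(R))]
    for _ in range(rounds):
        local_tables = []
        for P in envs:
            Q = [row[:] for row in global_Q]  # start from the shared table
            for _ in range(local_steps):
                Q = local_update(Q, P, R, gamma, lr)
            local_tables.append(Q)
        global_Q = average_tables(local_tables)
    return global_Q


# Example: 4 agents, each in its own 3-state, 2-action environment.
rng = random.Random(0)
envs = [make_env(3, 2, rng) for _ in range(4)]
R = [[rng.random() for _ in range(2)] for _ in range(3)]
shared_Q = federated_q_averaging(envs, R)
```

In this toy setup the averaged table is a compromise across environments: when all kernels are identical it matches what a single agent would compute, and the more the kernels differ, the further it can sit from any one environment's optimal Q-table.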


