Federated Reinforcement Distillation with Proxy Experience Memory

07/15/2019
by Han Cha, et al.

In distributed reinforcement learning (RL), agents commonly exchange their experience memories and thereby collectively train their local models. The experience memory, however, contains all the states the host agent has observed and the policies it followed in them, so sharing it may violate the agent's privacy. To avoid this problem, we propose a privacy-preserving distributed RL framework, termed federated reinforcement distillation (FRD). The key idea is to exchange a proxy experience memory comprising a pre-arranged set of states and time-averaged policies, thereby preserving the privacy of the actual experiences. Based on an advantage actor-critic RL architecture, we numerically evaluate the effectiveness of FRD and investigate how its performance is affected by the proxy memory structure and by different memory exchanging rules.
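As a rough illustration of the proxy experience memory idea, the sketch below maps each observed state to a pre-arranged proxy state and averages the policies recorded for it over time. The abstract does not say how states are assigned to proxy states, so the nearest-neighbor rule used here is an assumption, and the names `nearest_proxy_state` and `build_proxy_memory` are hypothetical; this is not the authors' implementation.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

State = Tuple[float, ...]
Policy = List[float]  # action probabilities for one state


def nearest_proxy_state(state: Sequence[float], proxy_states: List[State]) -> int:
    """Return the index of the proxy state closest to `state`.

    Assumption: squared Euclidean distance decides the assignment.
    """
    def sq_dist(proxy: State) -> float:
        return sum((s - q) ** 2 for s, q in zip(state, proxy))

    return min(range(len(proxy_states)), key=lambda i: sq_dist(proxy_states[i]))


def build_proxy_memory(
    experiences: List[Tuple[State, Policy]],
    proxy_states: List[State],
) -> Dict[int, Policy]:
    """Average the policies of all experiences assigned to each proxy state.

    Only proxy-state indices and averaged policies are returned, so the
    raw observed states never leave the agent.
    """
    sums: Dict[int, Policy] = {}
    counts: Dict[int, int] = defaultdict(int)
    for state, policy in experiences:
        idx = nearest_proxy_state(state, proxy_states)
        if idx not in sums:
            sums[idx] = [0.0] * len(policy)
        for action, prob in enumerate(policy):
            sums[idx][action] += prob
        counts[idx] += 1
    return {idx: [v / counts[idx] for v in total] for idx, total in sums.items()}


# Toy data (made up for illustration): 1-D states, two actions.
proxy_states = [(0.0,), (1.0,)]
experiences = [
    ((0.1,), [1.0, 0.0]),
    ((0.2,), [0.5, 0.5]),
    ((0.9,), [0.25, 0.75]),
]
memory = build_proxy_memory(experiences, proxy_states)
print(memory)
```

In this toy run, the first two experiences fall on proxy state 0 and the third on proxy state 1, so the exchanged memory holds one averaged policy per proxy state rather than the three raw observations.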

research
05/13/2020

Proxy Experience Replay: Federated Distillation for Distributed Reinforcement Learning

Traditional distributed deep reinforcement learning (RL) commonly relies...
research
10/06/2021

Cooperative Multi-Agent Actor-Critic for Privacy-Preserving Load Scheduling in a Residential Microgrid

As a scalable data-driven approach, multi-agent reinforcement learning (...
research
01/24/2019

Federated Reinforcement Learning

In reinforcement learning, building policies of high-quality is challeng...
research
06/09/2023

Large Language Model Is Semi-Parametric Reinforcement Learning Agent

Inspired by the insights in cognitive science with respect to human memo...
research
09/17/2023

Using Reinforcement Learning to Simplify Mealtime Insulin Dosing for People with Type 1 Diabetes: In-Silico Experiments

People with type 1 diabetes (T1D) struggle to calculate the optimal insu...
research
11/03/2022

Synthesis of separation processes with reinforcement learning

This paper shows the implementation of reinforcement learning (RL) in co...
