Proxy Experience Replay: Federated Distillation for Distributed Reinforcement Learning

05/13/2020
by Han Cha, et al.

Traditional distributed deep reinforcement learning (RL) commonly relies on exchanging the experience replay memory (RM) of each agent. Since the RM contains the full history of state observations and action policies, exchanging it can incur significant communication overhead and compromise the privacy of each agent. Alternatively, this article presents a communication-efficient and privacy-preserving distributed RL framework, coined federated reinforcement distillation (FRD). In FRD, each agent exchanges its proxy experience replay memory (ProxRM), in which policies are locally averaged over proxy states that cluster the actual states. To provide FRD design insights, we present ablation studies on the impact of ProxRM structures, neural network architectures, and communication intervals. Furthermore, we propose an improved version of FRD, coined mixup augmented FRD (MixFRD), in which the ProxRM is interpolated using the mixup data augmentation algorithm. Simulations in a Cartpole environment validate the effectiveness of MixFRD in reducing the variance of mission completion time and communication cost, compared to the benchmark schemes: vanilla FRD, federated reinforcement learning (FRL), and policy distillation (PD).
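
The sketch below illustrates, in broad strokes, the two ideas the abstract describes: building a ProxRM by clustering actual states into proxy states and averaging each agent's policy outputs per cluster, and interpolating ProxRM entries with mixup as in MixFRD. It is a minimal illustration under our own assumptions; the function names, the use of k-means for clustering, and parameters such as alpha and num_proxy_states are placeholders, not the authors' implementation.

    # Illustrative sketch only: clustering method, names, and parameters are assumptions.
    import numpy as np
    from sklearn.cluster import KMeans

    def build_proxrm(states, policy_probs, num_proxy_states=8):
        """Cluster actual states into proxy states and average the
        per-state policy distributions within each cluster."""
        states = np.asarray(states, dtype=np.float64)
        policy_probs = np.asarray(policy_probs, dtype=np.float64)

        kmeans = KMeans(n_clusters=num_proxy_states, n_init=10).fit(states)
        proxy_states = kmeans.cluster_centers_

        avg_policies = np.zeros((num_proxy_states, policy_probs.shape[1]))
        for k in range(num_proxy_states):
            members = policy_probs[kmeans.labels_ == k]
            if len(members) > 0:
                avg_policies[k] = members.mean(axis=0)
        # The (proxy state, averaged policy) table is what an agent would exchange
        # in place of its raw replay memory.
        return proxy_states, avg_policies

    def mixup_proxrm(proxy_states, avg_policies, alpha=0.4, num_samples=16, rng=None):
        """Interpolate ProxRM entries via mixup: convex combinations of
        random pairs of (proxy state, averaged policy) entries."""
        rng = rng or np.random.default_rng()
        i = rng.integers(0, len(proxy_states), size=num_samples)
        j = rng.integers(0, len(proxy_states), size=num_samples)
        lam = rng.beta(alpha, alpha, size=(num_samples, 1))
        mixed_states = lam * proxy_states[i] + (1 - lam) * proxy_states[j]
        mixed_policies = lam * avg_policies[i] + (1 - lam) * avg_policies[j]
        return mixed_states, mixed_policies

In this reading, only the small proxy table (and, for MixFRD, its mixup-interpolated version) crosses the network, which is how the framework reduces communication cost and avoids sharing raw state-action trajectories.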
