Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation

12/09/2020
by Chenyang Zhao, et al.

In reinforcement learning, domain randomisation is an increasingly popular technique for learning more general policies that are robust to domain shifts at deployment. However, naively aggregating information from randomised domains can produce high-variance gradient estimates and an unstable learning process. To address this issue, we present a peer-to-peer online distillation strategy for RL, termed P2PDRL, in which multiple workers are each assigned to a different environment and exchange knowledge through mutual regularisation based on the Kullback-Leibler divergence. Our experiments on continuous control tasks show that P2PDRL enables robust learning across a wider randomisation distribution than baselines, and more robust generalisation to new environments at test time.
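The mutual KL regularisation described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function names, the use of discrete (softmax) policies, and the simple averaging of pairwise KL terms over peers are all assumptions made for clarity.

```python
import numpy as np

def softmax(logits):
    """Convert action logits to a probability distribution per state."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q):
    """KL(p || q) per state, for discrete action distributions."""
    return np.sum(p * (np.log(p) - np.log(q)), axis=-1)

def peer_regularisation(policies, states):
    """Hypothetical peer-to-peer distillation term.

    Each worker i evaluates every peer's policy on its OWN states and
    is penalised by the average KL divergence to those peers, which
    pulls the workers' policies towards mutual agreement.
    """
    n = len(policies)
    losses = []
    for i in range(n):
        p_i = softmax(policies[i](states[i]))
        peer_kls = [kl(p_i, softmax(policies[j](states[i]))).mean()
                    for j in range(n) if j != i]
        losses.append(float(np.mean(peer_kls)))
    return losses

# Illustrative usage: three workers with linear policies over a
# 4-dimensional state and 3 actions, each with its own batch of states.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)) for _ in range(3)]
policies = [lambda s, w=w: s @ w for w in weights]
states = [rng.normal(size=(5, 4)) for _ in range(3)]
print(peer_regularisation(policies, states))
```

In training, each worker would add its entry of this list (scaled by a coefficient) to its ordinary RL objective, so that policies trained on different randomised domains stay close to one another.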

Related research:

02/06/2020 - Transfer Heterogeneous Knowledge Among Peer-to-Peer Teammates: A Model Distillation Approach
  Peer-to-peer knowledge transfer in distributed environments has emerged ...

05/20/2018 - Learning to Teach in Cooperative Multiagent Reinforcement Learning
  We present a framework and algorithm for peer-to-peer teaching in cooper...

06/07/2020 - Peer Collaborative Learning for Online Knowledge Distillation
  Traditional knowledge distillation uses a two-stage training strategy to...

07/13/2017 - Distral: Robust Multitask Reinforcement Learning
  Most deep reinforcement learning algorithms are data inefficient in comp...

02/01/2020 - Periodic Intra-Ensemble Knowledge Distillation for Reinforcement Learning
  Off-policy ensemble reinforcement learning (RL) methods have demonstrate...

11/19/2019 - Attention Privileged Reinforcement Learning For Domain Transfer
  Applying reinforcement learning (RL) to physical systems presents notabl...

10/01/2020 - Student-Initiated Action Advising via Advice Novelty
  Action advising is a knowledge exchange mechanism between peers, namely ...