Distral: Robust Multitask Reinforcement Learning

07/13/2017
by   Yee Whye Teh, et al.
0

Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network parameters, where efficiency may be improved through transfer across related tasks. In practice, however, this is not usually observed, because gradients from different tasks can interfere negatively, making learning unstable and sometimes even less data efficient. Another issue is the different reward schemes between tasks, which can easily lead to one task dominating the learning of a shared model. We propose a new approach for joint training of multiple tasks, which we refer to as Distral (Distill & transfer learning). Instead of sharing parameters between the different workers, we propose to share a "distilled" policy that captures common behaviour across tasks. Each worker is trained to solve its own task while constrained to stay close to the shared policy, while the shared policy is trained by distillation to be the centroid of all task policies. Both aspects of the learning process are derived by optimizing a joint objective function. We show that our approach supports efficient transfer on complex 3D environments, outperforming several related methods. Moreover, the proposed learning process is more robust and more stable---attributes that are critical in deep reinforcement learning.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/27/2018

DiGrad: Multi-Task Reinforcement Learning with Shared Actions

Most reinforcement learning algorithms are inefficient for learning mult...
research
09/03/2019

Generalization in Transfer Learning

Agents trained with deep reinforcement learning algorithms are capable o...
research
11/06/2016

Modular Multitask Reinforcement Learning with Policy Sketches

We describe a framework for multitask deep reinforcement learning guided...
research
10/07/2022

Improving Robustness of Deep Reinforcement Learning Agents: Environment Attack based on the Critic Network

To improve policy robustness of deep reinforcement learning agents, a li...
research
12/09/2020

Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation

In reinforcement learning, domain randomisation is an increasingly popul...
research
01/07/2018

Sample-Efficient Reinforcement Learning through Transfer and Architectural Priors

Recent work in deep reinforcement learning has allowed algorithms to lea...
research
02/06/2019

Distilling Policy Distillation

The transfer of knowledge from one policy to another is an important too...

Please sign up or login with your details

Forgot password? Click here to reset