Real-time Policy Distillation in Deep Reinforcement Learning

12/29/2019
by Yuxiang Sun, et al.

Policy distillation in deep reinforcement learning provides an effective way to transfer a control policy from a large network to a smaller, untrained network without significant performance degradation. However, policy distillation remains underexplored in deep reinforcement learning, and existing approaches are computationally inefficient, leading to long distillation times. In addition, the effectiveness of distillation is still limited by the student model's capacity. We propose a new distillation mechanism, called real-time policy distillation, in which training the teacher model and distilling its policy into the student model occur simultaneously, so that the teacher's latest policy is transferred to the student in real time. This cuts the distillation time to half the original or less, and makes it possible for extremely small student models to learn expert-level skills. We evaluated the proposed algorithm in the Atari 2600 domain. The results show that our approach achieves full distillation in most games, even at compression ratios up to 1.7
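The core training signal in policy distillation is a temperature-softened KL divergence between the teacher's and the student's action distributions; in the real-time variant described above, this loss is minimized while the teacher itself is still learning, rather than after teacher training finishes. A minimal sketch of that loss (function names and the temperature value are illustrative, not taken from the paper):

```python
import numpy as np

def softmax(x, tau=1.0):
    """Temperature-scaled softmax over the last axis (numerically stable)."""
    z = x / tau - np.max(x / tau, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(teacher_q, student_q, tau=0.01):
    """KL(teacher || student) between softened action distributions.

    teacher_q, student_q: arrays of shape (batch, n_actions) holding
    the Q-values (or logits) of the teacher and student networks.
    A low temperature sharpens the teacher's policy toward its argmax.
    """
    p = softmax(teacher_q, tau)   # teacher's softened policy
    q = softmax(student_q, tau)   # student's softened policy
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return kl.mean()
```

In a train-then-distill pipeline this loss is applied to a frozen teacher; in real-time distillation, each interleaved update trains the teacher on its RL objective and immediately regresses the student toward the teacher's current (still-improving) policy on the same batch of states.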


Related research

- Defending Adversarial Attacks without Adversarial Attacks in Deep Reinforcement Learning (08/14/2020)
  Many recent studies in deep reinforcement learning (DRL) have proposed t...
- Distillation Strategies for Proximal Policy Optimization (01/23/2019)
  Vision-based deep reinforcement learning (RL), similar to deep learning,...
- On Neural Consolidation for Transfer in Reinforcement Learning (10/05/2022)
  Although transfer learning is considered to be a milestone in deep reinf...
- Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control (02/13/2018)
  Deep reinforcement learning has demonstrated increasing capabilities for...
- Distilling Policy Distillation (02/06/2019)
  The transfer of knowledge from one policy to another is an important too...
- N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning (09/18/2017)
  While bigger and deeper neural network architectures continue to advance...
