N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning

by   Anubhav Ashok, et al.

While bigger and deeper neural network architectures continue to advance the state-of-the-art for many computer vision tasks, real-world adoption of these networks is impeded by hardware and speed constraints. Conventional model compression methods attempt to address this problem by modifying the architecture manually or using pre-defined heuristics. Since the space of all reduced architectures is very large, modifying the architecture of a deep neural network in this way is a difficult task. In this paper, we tackle this issue by introducing a principled method for learning reduced network architectures in a data-driven way using reinforcement learning. Our approach takes a larger `teacher' network as input and outputs a compressed `student' network derived from the `teacher' network. In the first stage of our method, a recurrent policy network aggressively removes layers from the large `teacher' model. In the second stage, another recurrent policy network carefully reduces the size of each remaining layer. The resulting network is then evaluated to obtain a reward -- a score based on the accuracy and compression of the network. Our approach uses this reward signal with policy gradients to train the policies to find a locally optimal student network. Our experiments show that we can achieve compression rates of more than 10x for models such as ResNet-34 while maintaining similar performance to the input `teacher' network. We also present a valuable transfer learning result which shows that policies which are pre-trained on smaller `teacher' networks can be used to rapidly speed up training on larger `teacher' networks.


page 1

page 2

page 3

page 4


DECORE: Deep Compression with Reinforcement Learning

Deep learning has become an increasingly popular and powerful option for...

Real-time Policy Distillation in Deep Reinforcement Learning

Policy distillation in deep reinforcement learning provides an effective...

Feature Matters: A Stage-by-Stage Approach for Knowledge Transfer

Convolutional Neural Networks (CNNs) become deeper and deeper in recent ...

Guarded Policy Optimization with Imperfect Online Demonstrations

The Teacher-Student Framework (TSF) is a reinforcement learning setting ...

Don't Forget Your Teacher: A Corrective Reinforcement Learning Framework

Although reinforcement learning (RL) can provide reliable solutions in m...

Student-Teacher Curriculum Learning via Reinforcement Learning: Predicting Hospital Inpatient Admission Location

Accurate and reliable prediction of hospital admission location is impor...

Network Recasting: A Universal Method for Network Architecture Transformation

This paper proposes network recasting as a general method for network ar...

Please sign up or login with your details

Forgot password? Click here to reset