Like What You Like: Knowledge Distill via Neuron Selectivity Transfer

07/05/2017
by Zehao Huang, et al.

Although deep neural networks have demonstrated extraordinary power in various applications, their superior performance comes at the expense of high storage and computational costs. Consequently, the acceleration and compression of neural networks have attracted much attention recently. Knowledge Transfer (KT), which aims to train a smaller student network by transferring knowledge from a larger teacher model, is one of the most popular solutions. In this paper, we propose a novel knowledge transfer method that treats the task as a distribution matching problem. In particular, we match the distributions of neuron selectivity patterns between the teacher and student networks. To achieve this goal, we devise a new KT loss function that minimizes the Maximum Mean Discrepancy (MMD) between these distributions. Combined with the original training loss, our method significantly improves the performance of student networks. We validate the effectiveness of our method on several datasets, and further combine it with other KT methods to explore the best possible results.
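For concreteness, below is a minimal PyTorch-style sketch of an MMD-based transfer loss in the spirit of the abstract. It assumes the teacher and student feature maps share spatial dimensions; the function name nst_mmd_loss, the degree-2 polynomial kernel, and the per-batch averaging are illustrative assumptions, not the authors' reference implementation.

    import torch
    import torch.nn.functional as F

    def nst_mmd_loss(f_teacher, f_student):
        # Squared MMD between the sets of neuron selectivity patterns of a
        # teacher layer and a student layer. A sketch only: the kernel
        # choice and normalization follow one plausible reading of the idea.
        #   f_teacher: (B, C_t, H, W) teacher feature maps
        #   f_student: (B, C_s, H, W) student feature maps (same H, W)
        B = f_teacher.size(0)
        # Each channel's flattened activation map is one "selectivity
        # pattern"; L2-normalize so MMD compares shapes, not magnitudes.
        ft = F.normalize(f_teacher.reshape(B, f_teacher.size(1), -1), dim=2)
        fs = F.normalize(f_student.reshape(B, f_student.size(1), -1), dim=2)
        # Degree-2 polynomial kernel k(x, y) = (x . y)^2, batched.
        k_tt = torch.bmm(ft, ft.transpose(1, 2)).pow(2).mean(dim=(1, 2))
        k_ss = torch.bmm(fs, fs.transpose(1, 2)).pow(2).mean(dim=(1, 2))
        k_ts = torch.bmm(ft, fs.transpose(1, 2)).pow(2).mean(dim=(1, 2))
        # Biased MMD^2 estimator: E[k(t,t')] + E[k(s,s')] - 2 E[k(t,s)].
        return (k_tt + k_ss - 2.0 * k_ts).mean()

In training, this term would be added to the student's original task loss with a weighting factor (here lam, a hypothetical hyperparameter), e.g. loss = F.cross_entropy(logits, labels) + lam * nst_mmd_loss(t_feat.detach(), s_feat), keeping the teacher frozen.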

