Follow Your Path: a Progressive Method for Knowledge Distillation

07/20/2021
by Wenxian Shi, et al.

Deep neural networks often have a huge number of parameters, which poses challenges for deployment in application scenarios with limited memory and computational capacity. Knowledge distillation is one approach to deriving compact models from bigger ones. However, it has been observed that supervision from a converged heavy teacher model strongly constrains the learning of a compact student network and can drive the optimization toward poor local optima. In this paper, we propose ProKT, a new model-agnostic method that projects the supervision signals of a teacher model into the student's parameter space. The projection is implemented by decomposing the training objective into local intermediate targets with an approximate mirror descent technique. The proposed method is less sensitive to quirks during optimization, which can result in a better local optimum. Experiments on both image and text datasets show that ProKT consistently achieves superior performance compared to other existing knowledge distillation methods.
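The core idea of decomposing distillation into local intermediate targets can be illustrated with a toy sketch. The code below is not the authors' implementation: `intermediate_target`, the interpolation schedule, and the NumPy softmax/KL helpers are illustrative assumptions. It only shows how interpolating between the student's current prediction and the teacher's output yields a sequence of progressively harder targets, each closer to the student's reach than the raw teacher distribution.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a 1-D logit vector."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

def kl(p, q, eps=1e-12):
    """KL divergence KL(p || q) between two probability vectors."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * (np.log(p) - np.log(q))))

def intermediate_target(student_probs, teacher_probs, alpha):
    """Local target: interpolate between the student's current prediction
    and the teacher's, so each training step chases a nearby target
    rather than the distant, fully converged teacher (alpha=1 recovers
    plain distillation against the teacher)."""
    return (1.0 - alpha) * student_probs + alpha * teacher_probs

# Toy example: a student whose logits disagree with the teacher's.
teacher_p = softmax(np.array([4.0, 1.0, 0.5]))
student_p = softmax(np.array([0.5, 2.0, 1.0]))

# A schedule of progressively harder targets.
for alpha in (0.25, 0.5, 1.0):
    target = intermediate_target(student_p, teacher_p, alpha)
    print(f"alpha={alpha}: KL(target || student) = {kl(target, student_p):.4f}")
```

As `alpha` grows toward 1 the target moves from the student's own prediction toward the teacher's, so the divergence the student must close at each step increases gradually instead of all at once.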


