Modeling Teacher-Student Techniques in Deep Neural Networks for Knowledge Distillation

12/31/2019
by Sajjad Abbasi, et al.

Knowledge distillation (KD) is a new method for transferring the knowledge of one structure under training to another. The typical application of KD is training a small model (the student) on soft labels produced by a complex model (the teacher). Because of the novel idea behind KD, its notion has recently been used in different methods, such as compression and processes that aim to enhance model accuracy. Although different techniques have been proposed in the area of KD, a model that generalizes KD techniques is lacking. In this paper, various studies in the scope of KD are investigated and analyzed to build a general model for KD. All the methods and techniques in KD can be summarized through the proposed model. By utilizing the proposed model, different methods in KD can be better investigated and explored. The advantages and disadvantages of different approaches in KD can be better understood, and developing new strategies for KD becomes possible. Using the proposed model, different KD methods are represented in an abstract view.
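The soft-label training mentioned in the abstract can be illustrated with a minimal sketch. It assumes the widely used temperature-scaled softmax formulation, which the abstract itself does not specify; the function names `softmax` and `distillation_loss`, the `temperature` parameter, and the example logits are all hypothetical choices made for illustration, not the authors' general model.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature yields softer labels."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the teacher's soft labels and the student's output.

    Assumption: both models are softened with the same temperature.
    """
    soft_labels = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(soft_labels, student_probs))


# Hypothetical logits for a three-class problem.
teacher_logits = [3.0, 1.0, 0.2]
untrained_student = [0.0, 0.0, 0.0]

loss_untrained = distillation_loss(untrained_student, teacher_logits)
loss_matched = distillation_loss(teacher_logits, teacher_logits)
print(loss_untrained > loss_matched)
```

In this sketch the loss is smallest when the student reproduces the teacher's softened distribution, which is the sense in which the student learns from the teacher's soft labels. Practical setups often combine such a term with a loss on the ground-truth labels.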


research
01/21/2021

Collaborative Teacher-Student Learning via Multiple Knowledge Transfer

Knowledge distillation (KD), as an efficient and effective model compres...
research
10/08/2019

Knowledge Distillation from Internal Representations

Knowledge distillation is typically conducted by training a small model ...
research
11/05/2021

Oracle Teacher: Towards Better Knowledge Distillation

Knowledge distillation (KD), best known as an effective method for model...
research
07/03/2020

Knowledge Distillation Beyond Model Compression

Knowledge distillation (KD) is commonly deemed as an effective model com...
research
03/06/2023

KDSM: An uplift modeling framework based on knowledge distillation and sample matching

Uplift modeling aims to estimate the treatment effect on individuals, wi...
research
05/02/2020

Heterogeneous Knowledge Distillation using Information Flow Modeling

Knowledge Distillation (KD) methods are capable of transferring the know...
research
01/31/2018

Model compression for faster structural separation of macromolecules captured by Cellular Electron Cryo-Tomography

Electron Cryo-Tomography (ECT) enables 3D visualization of macromolecule...
