Hint-dynamic Knowledge Distillation

11/30/2022
by Yiyang Liu, et al.

Knowledge Distillation (KD) transfers knowledge from a high-capacity teacher model to promote a smaller student model. Existing efforts guide the distillation by matching prediction logits, feature embeddings, etc., while how to efficiently utilize them in conjunction remains less explored. In this paper, we propose Hint-dynamic Knowledge Distillation, dubbed HKD, which excavates the knowledge from the teacher's hints in a dynamic scheme. The guidance effect of the knowledge hints usually varies across instances and learning stages, which motivates us to adaptively customize a specific hint-learning manner for each instance. Specifically, a meta-weight network is introduced to generate instance-wise weight coefficients for the knowledge hints, conditioned on the student model's dynamic learning progress. We further present a weight ensembling strategy that exploits historical statistics to eliminate the potential bias of the coefficient estimation. Experiments on the standard benchmarks CIFAR-100 and Tiny-ImageNet demonstrate that the proposed HKD effectively boosts knowledge distillation.
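The instance-wise weighting and the weight ensembling can be pictured with a small sketch. Below is a minimal PyTorch-style illustration, assuming the meta-weight network is an MLP over per-instance loss signals and that the "historical statistics" are realized as an exponential moving average; the names, inputs, and mixing scheme here are illustrative assumptions, not the authors' exact formulation.

```python
# Minimal sketch of hint-dynamic weighting (assumed design, not the paper's exact method).
import torch
import torch.nn as nn


class MetaWeightNet(nn.Module):
    """Maps per-instance learning signals to weights over K knowledge hints."""

    def __init__(self, in_dim: int, num_hints: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(inplace=True),
            nn.Linear(hidden, num_hints),
        )

    def forward(self, signals: torch.Tensor) -> torch.Tensor:
        # signals: (batch, in_dim), e.g. per-instance task loss and hint losses
        return torch.softmax(self.mlp(signals), dim=-1)  # (batch, num_hints)


def hkd_style_loss(ce_loss, hint_losses, meta_net, ema_weights, momentum=0.9):
    """Combine hint losses with instance-wise weights, smoothed by an EMA.

    ce_loss:     (batch,) per-instance task loss of the student
    hint_losses: (batch, K) per-instance distillation losses (logits, features, ...)
    ema_weights: (K,) running average of past weight estimates
    """
    # Signals describing the student's current learning progress on each instance.
    signals = torch.cat([ce_loss.unsqueeze(1), hint_losses], dim=1).detach()
    inst_w = meta_net(signals)  # (batch, K) instance-wise hint weights

    # "Weight ensembling": blend current estimates with historical statistics
    # to reduce estimation bias (an EMA is one simple realization of this idea).
    ema_weights = momentum * ema_weights + (1 - momentum) * inst_w.mean(0).detach()
    weights = 0.5 * inst_w + 0.5 * ema_weights.unsqueeze(0)

    kd_loss = (weights * hint_losses).sum(dim=1).mean()
    return ce_loss.mean() + kd_loss, ema_weights
```

In this sketch the meta-weight network sees detached loss signals only, so it adapts the hint weights to the student's progress without backpropagating through the student's task loss; the EMA term plays the role of the historical statistics used to stabilize the coefficient estimates.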

