ResKD: Residual-Guided Knowledge Distillation

06/08/2020
by Xuewei Li, et al.

Knowledge distillation has emerged as a promising technique for compressing neural networks. Because of the capacity gap between a heavy teacher and a lightweight student, a significant performance gap remains between them. In this paper, we look at knowledge distillation in a fresh light, using the knowledge gap between the teacher and the student as guidance to train an even lighter-weight student, called the res-student. The combination of the normal student and the res-student then forms a new student. This residual-guided process can be repeated. Experimental results show that we achieve competitive results on the CIFAR-10/100, Tiny-ImageNet, and ImageNet datasets.
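The abstract only describes the mechanism at a high level. Below is a minimal sketch of what one residual-guided training step could look like, assuming the res-student is trained so that its logits, added to those of an already-trained, frozen student, match the teacher's soft targets. The function name `reskd_step`, the weighting `alpha`, and the temperature `T` are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def reskd_step(teacher, student, res_student, x, y, optimizer, alpha=0.5, T=4.0):
    """One hypothetical res-student training step (sketch, not the paper's exact loss).

    The res-student is meant to capture the residual knowledge between the
    teacher and the frozen student, so that student + res-student acts as
    the new, stronger student.
    """
    with torch.no_grad():
        t_logits = teacher(x)   # frozen teacher
        s_logits = student(x)   # frozen, already-trained student

    r_logits = res_student(x)        # only the res-student receives gradients
    combined = s_logits + r_logits   # new "student" = student + res-student

    # Distillation term: combined logits should match the teacher's soft targets.
    kd = F.kl_div(
        F.log_softmax(combined / T, dim=1),
        F.softmax(t_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

    # Standard supervised term on the combined prediction.
    ce = F.cross_entropy(combined, y)

    loss = alpha * kd + (1.0 - alpha) * ce
    optimizer.zero_grad()   # optimizer holds only res_student's parameters
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under this reading, repeating the process simply means treating the combined student + res-student as the new frozen student and training a further, smaller res-student against the remaining gap to the teacher.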

Related research

- Improving Knowledge Distillation Via Transferring Learning Ability (04/24/2023): Existing knowledge distillation methods generally use a teacher-student ...
- Recurrent knowledge distillation (05/18/2018): Knowledge distillation compacts deep networks by letting a small student...
- Channel Distillation: Channel-Wise Attention for Knowledge Distillation (06/02/2020): Knowledge distillation is to transfer the knowledge from the data learne...
- Similarity-Preserving Knowledge Distillation (07/23/2019): Knowledge distillation is a widely applicable technique for training a s...
- Extracurricular Learning: Knowledge Transfer Beyond Empirical Distribution (06/30/2020): Knowledge distillation has been used to transfer knowledge learned by a ...
- Hint-dynamic Knowledge Distillation (11/30/2022): Knowledge Distillation (KD) transfers the knowledge from a high-capacity...
- Residual Knowledge Distillation (02/21/2020): Knowledge distillation (KD) is one of the most potent ways for model com...
