ResKD: Residual-Guided Knowledge Distillation

06/08/2020
by Xuewei Li, et al.

Knowledge distillation has emerged as a promising technique for compressing neural networks. Due to the capacity gap between a heavy teacher and a lightweight student, there is often a significant performance gap between them. In this paper, we see knowledge distillation in a fresh light, using the knowledge gap between a teacher and a student as guidance to train a lighter-weight student called a res-student. The combination of a normal student and a res-student forms a new student. This residual-guided process can be repeated. Experimental results show that we achieve competitive results on the CIFAR-10/100, Tiny-ImageNet, and ImageNet datasets.
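The abstract only outlines the idea, so the following is a minimal PyTorch sketch of one plausible reading, not the authors' implementation: a small res-student is trained so that its outputs, added to those of a frozen, already-trained student, better match the teacher's soft targets. All names (`CombinedStudent`, `res_distill_step`, the temperature `T`) are hypothetical, and the residual is taken at the logit level purely for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CombinedStudent(nn.Module):
    """Hypothetical combination of a frozen student and a trainable res-student."""

    def __init__(self, student: nn.Module, res_student: nn.Module):
        super().__init__()
        self.student = student          # previously trained student, kept fixed
        self.res_student = res_student  # lighter res-student that learns the gap
        for p in self.student.parameters():
            p.requires_grad = False

    def forward(self, x):
        # New student = old student + res-student (logits are summed).
        return self.student(x) + self.res_student(x)


def res_distill_step(teacher, combined, x, optimizer, T=4.0):
    """One illustrative training step: only the res-student's parameters are
    updated, so it learns to close the remaining teacher-student gap."""
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = combined(x)
    # Standard soft-target distillation loss (KL divergence with temperature T).
    loss = F.kl_div(
        F.log_softmax(s_logits / T, dim=1),
        F.softmax(t_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under this reading, the "repeated" residual-guided process would treat the combined model as the new student, freeze it, and train yet another (even lighter) res-student against whatever gap to the teacher remains.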
