Student Network Learning via Evolutionary Knowledge Distillation

03/23/2021
by Kangkai Zhang, et al.

Knowledge distillation provides an effective way to transfer knowledge via teacher-student learning. Most existing distillation approaches use a fixed, pre-trained model as the teacher to supervise the learning of the student network, which usually introduces a large capability gap between the teacher and student networks during learning. Recent research has observed that a small teacher-student capability gap can facilitate knowledge transfer. Inspired by this observation, we propose an evolutionary knowledge distillation approach to improve the effectiveness of transferring teacher knowledge. Instead of a fixed pre-trained teacher, an evolutionary teacher is learned online and continuously transfers intermediate knowledge to supervise student network learning on the fly. To enhance the representation and mimicking of intermediate knowledge, several simple guided modules are introduced between corresponding teacher-student blocks. In this way, the student simultaneously obtains rich internal knowledge and captures its growth process, leading to effective student network learning. Extensive experiments demonstrate the effectiveness of our approach, as well as its good adaptability in low-resolution and few-sample visual recognition scenarios.
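As a rough illustration of the idea in the abstract, the following minimal sketch shows one way a student's per-sample loss could combine supervision from a teacher that is being trained at the same time. It is not the authors' implementation: the abstract does not give the loss formulation, so the soft-label KL term, the plain mean-squared error used here as a stand-in for the guided modules, and the names `student_loss`, `temperature`, `alpha`, and `beta` are all assumptions made for illustration. The teacher's logits and block features are treated as constants taken from the teacher's current (still evolving) state.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of the predicted distribution against a hard label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def feature_mimic_loss(teacher_feat, student_feat):
    """Mean squared error between the features of one block pair.

    Assumption: a plain MSE stands in for the paper's guided modules.
    """
    diffs = [(t - s) ** 2 for t, s in zip(teacher_feat, student_feat)]
    return sum(diffs) / len(diffs)


def student_loss(student_logits, teacher_logits, student_feats, teacher_feats,
                 label, temperature=4.0, alpha=1.0, beta=1.0):
    """Hypothetical student objective for one sample.

    teacher_logits / teacher_feats come from the teacher's current state,
    so the targets change as the teacher itself keeps learning.
    alpha, beta, and temperature are illustrative hyperparameters.
    """
    ce = cross_entropy(student_logits, label)
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    kd = temperature ** 2 * kl_divergence(soft_teacher, soft_student)
    mimic = sum(feature_mimic_loss(t, s)
                for t, s in zip(teacher_feats, student_feats))
    return ce + alpha * kd + beta * mimic
```

In an online setup along these lines, the teacher would take its own gradient step on the task loss at each iteration, and `student_loss` would then be evaluated against the teacher's freshly updated outputs, so the student follows the teacher's intermediate states rather than only its final one.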

research
02/28/2021

Distilling Knowledge via Intermediate Classifier Heads

The crux of knowledge distillation – as a transfer-learning approach – i...
research
12/26/2022

Prototype-guided Cross-task Knowledge Distillation for Large-scale Models

Recently, large-scale pre-trained models have shown their advantages in ...
research
02/19/2023

HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers

Knowledge distillation has been shown to be a powerful model compression...
research
05/25/2023

Triplet Knowledge Distillation

In Knowledge Distillation, the teacher is generally much larger than the...
research
10/23/2022

Respecting Transfer Gap in Knowledge Distillation

Knowledge distillation (KD) is essentially a process of transferring a t...
research
05/16/2021

Undistillable: Making A Nasty Teacher That CANNOT teach students

Knowledge Distillation (KD) is a widely used technique to transfer knowl...
research
03/10/2021

Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones

Recently, research efforts have been concentrated on revealing how pre-t...
