On the Efficacy of Knowledge Distillation

10/03/2019
by Jang Hyun Cho, et al.

In this paper, we present a thorough evaluation of the efficacy of knowledge distillation and its dependence on student and teacher architectures. Starting with the observation that more accurate models do not necessarily make better teachers, we attempt to tease apart the factors that affect knowledge distillation performance. Crucially, we find that larger models often do not make better teachers. We show that this is a consequence of mismatched capacity: small students are unable to mimic large teachers. We also find that typical ways of circumventing this mismatch (such as performing a sequence of knowledge distillation steps) are ineffective. Finally, we show that the effect can be mitigated by stopping the teacher's training early. Our results generalize across datasets and models.
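For context, the knowledge distillation evaluated in the paper is the standard temperature-based formulation introduced by Hinton et al.: the student is trained on a weighted combination of the usual cross-entropy against the ground-truth labels and a KL term that matches the student's temperature-softened predictions to the teacher's. The PyTorch sketch below is a minimal illustration of that loss, not code from the paper; the temperature T and mixing weight alpha are placeholder hyperparameters.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Hard-label term: ordinary cross-entropy on the student's raw logits.
    ce = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL divergence between temperature-scaled distributions.
    # The T*T factor keeps gradient magnitudes comparable across temperatures.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # alpha balances imitating the teacher against fitting the hard labels.
    return alpha * kd + (1.0 - alpha) * ce

# Example usage with random logits: batch of 8 examples, 10 classes.
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)

A higher temperature spreads more probability mass onto the teacher's non-target classes, which is the extra signal the student is meant to absorb; the capacity-mismatch finding above concerns the student's ability to match these softened teacher distributions.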


Related research

06/10/2021  Does Knowledge Distillation Really Work?
Knowledge distillation is a popular technique for training a small stude...

01/28/2023  Supervision Complexity and its Role in Knowledge Distillation
Despite the popularity and efficacy of knowledge distillation, there is ...

10/21/2022  Distilling the Undistillable: Learning from a Nasty Teacher
The inadvertent stealing of private/sensitive information using Knowledg...

07/03/2020  Knowledge Distillation Beyond Model Compression
Knowledge distillation (KD) is commonly deemed as an effective model com...

03/10/2023  Robust Knowledge Distillation from RNN-T Models With Noisy Training Labels Using Full-Sum Loss
This work studies knowledge distillation (KD) and addresses its constrai...

10/12/2022  Efficient Knowledge Distillation from Model Checkpoints
Knowledge distillation is an effective approach to learn compact models ...

09/30/2021  Born Again Neural Rankers
We introduce Born Again neural Rankers (BAR) in the Learning to Rank (LT...
