What Knowledge Gets Distilled in Knowledge Distillation?

05/31/2022
by   Utkarsh Ojha, et al.

Knowledge distillation aims to transfer useful information from a teacher network to a student network, with the primary goal of improving the student's performance on the task at hand. Over the years, there has been a deluge of novel techniques and use cases for knowledge distillation. Yet, despite the various improvements, there seems to be a glaring gap in the community's fundamental understanding of the process. Specifically, what is the knowledge that gets distilled in knowledge distillation? In other words, in what ways does the student become similar to the teacher? Does it start to localize objects in the same way? Is it fooled by the same adversarial samples? Do its data-invariance properties become similar? Our work presents a comprehensive study that attempts to answer these questions and more. Using image classification as a case study and three state-of-the-art knowledge distillation techniques, our results show that knowledge distillation methods can indeed indirectly distill other kinds of properties beyond improving task performance. By exploring these questions, we hope our work provides a clearer picture of what happens during knowledge distillation.
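For context, the baseline that most distillation techniques build on is the classic logit-matching objective of Hinton et al. (2015). The sketch below is a minimal, illustrative implementation of that standard loss, not of the specific methods evaluated in the paper; the function name, temperature T, and mixing weight alpha are assumptions chosen for clarity.

```python
# Minimal sketch of the classic knowledge-distillation objective
# (Hinton et al., 2015): the student minimizes a weighted sum of the
# usual cross-entropy loss and a KL term that matches its softened
# logits to the teacher's. Names (T, alpha) are illustrative.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Hard-label cross-entropy on the ground-truth classes.
    ce = F.cross_entropy(student_logits, labels)
    # Soft-label term: KL divergence between temperature-softened
    # teacher and student distributions, scaled by T^2 to keep
    # gradient magnitudes comparable across temperatures.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * kd + (1.0 - alpha) * ce

# Usage: the teacher runs in eval mode with no gradients; only the student updates.
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
```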


Related research

03/09/2023 · Learning the Wrong Lessons: Inserting Trojans During Knowledge Distillation
In recent years, knowledge distillation has become a cornerstone of effi...

10/18/2022 · On effects of Knowledge Distillation on Transfer Learning
Knowledge distillation is a popular machine learning technique that aims...

11/08/2022 · Understanding the Role of Mixup in Knowledge Distillation: An Empirical Study
Mixup is a popular data augmentation technique based on creating new sam...

09/30/2022 · Towards a Unified View of Affinity-Based Knowledge Distillation
Knowledge transfer between artificial neural networks has become an impo...

03/14/2023 · Feature-Rich Audio Model Inversion for Data-Free Knowledge Distillation Towards General Sound Classification
Data-Free Knowledge Distillation (DFKD) has recently attracted growing a...

04/13/2021 · Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation
In this work, we address the issues of missing modalities that have aris...

10/15/2020 · Spherical Knowledge Distillation
Knowledge distillation aims at obtaining a small but effective deep mode...
