
Not All Knowledge Is Created Equal

06/02/2021
by Ziyun Li, et al.

Mutual knowledge distillation (MKD) improves a model by distilling knowledge from another model. However, not all knowledge is certain and correct, especially under adverse conditions. For example, label noise usually leads to less reliable models due to undesired memorisation [1, 2], and wrong knowledge misleads learning rather than helping it. This problem can be addressed from two angles: (i) improving the reliability of the model the knowledge comes from (i.e., the knowledge source's reliability); and (ii) selecting reliable knowledge for distillation. Making a model more reliable is widely studied in the literature, whereas selective MKD has received little attention. We therefore focus on selective MKD and highlight its importance in this work. Concretely, we design a generic MKD framework, Confident knowledge selection followed by Mutual Distillation (CMD). The key component of CMD is a generic knowledge selection formulation in which the selection threshold is either static (CMD-S) or progressive (CMD-P). In addition, CMD covers two special cases, zero knowledge and all knowledge, yielding a unified MKD framework. We empirically find that CMD-P outperforms CMD-S, mainly because a model's knowledge improves and becomes more confident as training progresses. Extensive experiments demonstrate the effectiveness of CMD and thoroughly justify its design. For example, CMD-P achieves new state-of-the-art results in robustness against label noise.
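To make the selection idea concrete, the sketch below illustrates one plausible reading of confident knowledge selection for mutual distillation. It is a minimal illustration, not the paper's exact formulation: the function names (`confidence_mask`, `progressive_threshold`, `selective_distill_loss`), the softmax-confidence criterion, the linear threshold schedule, and the temperature value are all assumptions.

```python
# Minimal sketch of confident knowledge selection for mutual distillation.
# Assumptions (not taken from the paper): softmax max-probability as the confidence
# measure, a linear schedule for the progressive threshold, and KL distillation.
import torch
import torch.nn.functional as F

def confidence_mask(peer_logits: torch.Tensor, threshold: float) -> torch.Tensor:
    """Keep only samples whose peer prediction confidence exceeds the threshold."""
    peer_conf = F.softmax(peer_logits, dim=1).max(dim=1).values
    return peer_conf >= threshold  # boolean mask over the batch

def progressive_threshold(epoch: int, total_epochs: int,
                          t_start: float = 0.0, t_end: float = 0.9) -> float:
    """CMD-P style schedule (assumed linear): start permissive and grow stricter
    as the peer's knowledge becomes more confident during training."""
    return t_start + (t_end - t_start) * epoch / max(total_epochs - 1, 1)

def selective_distill_loss(student_logits: torch.Tensor, peer_logits: torch.Tensor,
                           threshold: float, T: float = 4.0) -> torch.Tensor:
    """Distillation loss computed only on confidently-predicted samples.
    threshold = 0 recovers 'all knowledge'; threshold > 1 recovers 'zero knowledge'."""
    mask = confidence_mask(peer_logits, threshold)
    if mask.sum() == 0:
        return student_logits.new_zeros(())
    log_p_student = F.log_softmax(student_logits[mask] / T, dim=1)
    p_peer = F.softmax(peer_logits[mask].detach() / T, dim=1)
    return F.kl_div(log_p_student, p_peer, reduction="batchmean") * (T * T)
```

In mutual distillation each peer would compute this selective loss against the other's predictions in addition to its own supervised loss; a static threshold corresponds to CMD-S, while feeding in `progressive_threshold(epoch, total_epochs)` corresponds to CMD-P.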
