Regularizing Class-wise Predictions via Self-knowledge Distillation

03/31/2020
by Sukmin Yun, et al.

Deep neural networks with millions of parameters may suffer from poor generalization due to overfitting. To mitigate the issue, we propose a new regularization method that penalizes inconsistency between the predictive distributions of similar samples. In particular, we distill the predictive distribution of one sample into that of another sample with the same label during training. This regularizes the dark knowledge (i.e., the knowledge on wrong predictions) of a single network (i.e., self-knowledge distillation) by forcing it to produce more meaningful and consistent predictions in a class-wise manner. Consequently, it mitigates overconfident predictions and reduces intra-class variations. Our experimental results on various image classification tasks demonstrate that this simple yet powerful method can significantly improve not only the generalization ability but also the calibration performance of modern convolutional neural networks.
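As a concrete illustration, a minimal PyTorch sketch of such a class-wise consistency regularizer is given below. It assumes each mini-batch supplies, for every sample, a second sample drawn from the same class (here called x_pair); the temperature and weighting factor lam shown here are illustrative defaults, not values prescribed by the paper.

```python
import torch.nn.functional as F


def class_wise_kd_loss(logits, logits_pair, temperature=4.0):
    """KL divergence between the softened predictions of two samples that
    share the same ground-truth label. The paired sample's prediction is
    detached so it acts as the 'teacher' of this self-distillation term."""
    teacher = F.softmax(logits_pair.detach() / temperature, dim=1)
    student = F.log_softmax(logits / temperature, dim=1)
    # Rescale by T^2, as is customary for temperature-scaled distillation losses.
    return F.kl_div(student, teacher, reduction="batchmean") * (temperature ** 2)


def training_loss(model, x, y, x_pair, lam=1.0, temperature=4.0):
    # x and x_pair hold different samples drawn from the same classes y.
    logits = model(x)
    logits_pair = model(x_pair)
    ce = F.cross_entropy(logits, y)  # standard supervised loss
    reg = class_wise_kd_loss(logits, logits_pair, temperature)
    return ce + lam * reg
```

In practice, the paired samples can be obtained with a batch sampler that draws at least two examples from each selected class, so that every sample in the batch has a same-label partner.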


Related research

Embracing the Dark Knowledge: Domain Generalization Using Regularized Knowledge Distillation (07/06/2021)
Though convolutional neural networks are widely used in different tasks,...

Self-Knowledge Distillation via Dropout (08/11/2022)
To boost the performance, deep neural networks require deeper or wider n...

Extending Label Smoothing Regularization with Self-Knowledge Distillation (09/11/2020)
Inspired by the strong correlation between the Label Smoothing Regulariz...

Self-Knowledge Distillation: A Simple Way for Better Generalization (06/22/2020)
The generalization capability of deep neural networks has been substanti...

Efficient training of lightweight neural networks using Online Self-Acquired Knowledge Distillation (08/26/2021)
Knowledge Distillation has been established as a highly promising approa...

One-Class Knowledge Distillation for Spoofing Speech Detection (09/15/2023)
The detection of spoofing speech generated by unseen algorithms remains ...

Collaborative Learning for Deep Neural Networks (05/30/2018)
We introduce collaborative learning in which multiple classifier heads o...
