Blind Knowledge Distillation for Robust Image Classification

11/21/2022
by Timo Kaiser, et al.

Optimizing neural networks with noisy labels is a challenging task, especially if the label set contains real-world noise. Networks tend to generalize to reasonable patterns in the early training stages and to overfit to specific details of noisy samples in later ones. We introduce Blind Knowledge Distillation, a novel teacher-student approach for learning with noisy labels that masks the ground-truth-related teacher output to filter out potentially corrupted knowledge and to estimate the tipping point from generalization to overfitting. Based on this, we estimate the noise in the training data with Otsu's algorithm. Using this estimate, we train the network with a modified, sample-weighted cross-entropy loss function. Our experiments show that Blind Knowledge Distillation detects overfitting effectively during training and improves the detection of clean and noisy labels on the recently published CIFAR-N dataset. Code is available on GitHub.
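For intuition, here is a minimal PyTorch-style sketch of the three ingredients named in the abstract: masking the ground-truth class in the teacher output, splitting per-sample scores into clean and noisy groups with Otsu's algorithm, and a per-sample weighted cross-entropy loss. The helper names and the use of the per-sample student loss as the noise score are illustrative assumptions, not the authors' implementation; refer to the linked GitHub repository for the exact method.

```python
# Hedged sketch only: function names, the noise score, and the way the pieces
# are combined are assumptions for illustration, not the paper's exact recipe.
import torch
import torch.nn.functional as F


def blind_teacher_probs(teacher_logits, labels):
    """Mask the ground-truth class in the teacher output ("blind" distillation)
    and renormalize over the remaining classes."""
    masked = teacher_logits.clone()
    masked.scatter_(1, labels.unsqueeze(1), float("-inf"))  # remove label-related knowledge
    return F.softmax(masked, dim=1)


def otsu_threshold(scores, bins=256):
    """Otsu's algorithm on a 1-D tensor of per-sample scores: pick the cut that
    maximizes the between-class variance of the two resulting groups."""
    lo, hi = scores.min().item(), scores.max().item()
    hist = torch.histc(scores, bins=bins, min=lo, max=hi)
    p = hist / hist.sum()
    centers = torch.linspace(lo, hi, bins)      # approximate bin centers
    w0 = torch.cumsum(p, dim=0)                 # weight of the lower ("clean") group
    w1 = 1.0 - w0                               # weight of the upper ("noisy") group
    cum_mean = torch.cumsum(p * centers, dim=0)
    mu0 = cum_mean / w0.clamp_min(1e-12)
    mu1 = (cum_mean[-1] - cum_mean) / w1.clamp_min(1e-12)
    between_var = w0 * w1 * (mu0 - mu1) ** 2
    return centers[int(torch.argmax(between_var))]


def weighted_cross_entropy(student_logits, labels, weights):
    """Per-sample weighted cross-entropy: samples flagged as likely noisy
    contribute less (here: nothing) to the gradient."""
    per_sample = F.cross_entropy(student_logits, labels, reduction="none")
    return (weights * per_sample).sum() / weights.sum().clamp_min(1e-12)


def training_step(student, teacher, images, labels):
    """Illustrative training step: score each sample (here simply by its current
    loss on the annotated label), split clean/noisy via Otsu, down-weight the
    suspected noisy samples. In the full method, the label-blind teacher output
    additionally drives the noise estimate and the overfitting detection."""
    student_logits = student(images)
    with torch.no_grad():
        label_blind = blind_teacher_probs(teacher(images), labels)
        scores = F.cross_entropy(student_logits, labels, reduction="none")
        weights = (scores <= otsu_threshold(scores)).float()   # 1 = treated as clean
    loss = weighted_cross_entropy(student_logits, labels, weights)
    return loss, label_blind
```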


Related research

06/02/2020 · Channel Distillation: Channel-Wise Attention for Knowledge Distillation
Knowledge distillation is to transfer the knowledge from the data learne...

01/30/2023 · Understanding Self-Distillation in the Presence of Label Noise
Self-distillation (SD) is the process of first training a teacher model ...

07/27/2022 · NICEST: Noisy Label Correction and Training for Robust Scene Graph Generation
Nearly all existing scene graph generation (SGG) models have overlooked ...

05/26/2023 · ABC-KD: Attention-Based-Compression Knowledge Distillation for Deep Learning-Based Noise Suppression
Noise suppression (NS) models have been widely applied to enhance speech...

05/27/2021 · Training Classifiers that are Universally Robust to All Label Noise Levels
For classification tasks, deep neural networks are prone to overfitting ...

09/03/2022 · Noise-Robust Bidirectional Learning with Dynamic Sample Reweighting
Deep neural networks trained with standard cross-entropy loss are more p...

10/02/2019 · Distillation ≈ Early Stopping? Harvesting Dark Knowledge Utilizing Anisotropic Information Retrieval For Overparameterized Neural Network
Distillation is a method to transfer knowledge from one model to another...
