Robustness and Diversity Seeking Data-Free Knowledge Distillation

11/07/2020
by Pengchao Han, et al.

Knowledge distillation (KD) has enabled remarkable progress in model compression and knowledge transfer. However, KD requires a large volume of original data, or their representation statistics, which are usually unavailable in practice. Data-free KD has recently been proposed to resolve this problem, wherein teacher and student models are fed by a synthetic sample generator trained from the teacher. Nonetheless, existing data-free KD methods rely on fine-tuning of loss weights to balance multiple losses, and they ignore the diversity of generated samples, resulting in limited accuracy and robustness. To overcome this challenge, we propose robustness- and diversity-seeking data-free KD (RDSKD) in this paper. The generator loss function is crafted to produce samples with high authenticity, class diversity, and inter-sample diversity. Without real data, the objectives of seeking high sample authenticity and class diversity often conflict with each other, causing frequent loss fluctuations. We mitigate this by exponentially penalizing loss increments. Experiments on the MNIST, CIFAR-10, and SVHN datasets show that RDSKD achieves higher accuracy and greater robustness across different hyperparameter settings than other data-free KD methods such as DAFL, MSKD, ZSKD, and DeepInversion.
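The abstract names three generator objectives (sample authenticity, class diversity, inter-sample diversity) plus an exponential penalty on loss increments. The PyTorch sketch below is only a minimal illustration of how such a composite generator loss could be assembled; the function name generator_loss, the weights alpha, beta, gamma, and the particular exponential-penalty form are illustrative assumptions, not the authors' exact RDSKD formulation.

```python
import torch
import torch.nn.functional as F

def generator_loss(teacher_logits, features, prev_loss=None,
                   alpha=1.0, beta=1.0, gamma=0.1):
    """Illustrative composite loss for a data-free KD sample generator.

    Combines three common data-free KD terms:
      - authenticity: the teacher should be confident on synthetic samples
        (low per-sample prediction entropy),
      - class diversity: predictions over the batch should cover many classes
        (high entropy of the batch-averaged softmax),
      - inter-sample diversity: generated features should not collapse onto
        each other (low average pairwise cosine similarity).
    """
    probs = F.softmax(teacher_logits, dim=1)

    # Authenticity: minimize per-sample prediction entropy.
    authenticity = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()

    # Class diversity: maximize entropy of the mean prediction over the batch
    # (written as negative entropy so that minimizing the loss increases it).
    mean_probs = probs.mean(dim=0)
    class_diversity = (mean_probs * torch.log(mean_probs + 1e-8)).sum()

    # Inter-sample diversity: penalize average pairwise cosine similarity
    # between distinct generated samples.
    feats = F.normalize(features.flatten(1), dim=1)
    sim = feats @ feats.t()
    n = sim.size(0)
    inter_sample = (sim.sum() - sim.diagonal().sum()) / (n * (n - 1))

    loss = alpha * authenticity + beta * class_diversity + gamma * inter_sample

    # Hypothetical robustness heuristic: exponentially penalize any increase
    # over the previous iteration's loss to damp fluctuations caused by the
    # conflicting authenticity/diversity objectives.
    if prev_loss is not None:
        loss = loss * torch.exp(torch.clamp(loss.detach() - prev_loss, min=0.0))
    return loss
```

In a typical data-free KD loop, this loss would train the generator (noise in, synthetic images out) against a frozen teacher, while the student is trained separately to match the teacher's outputs on those synthetic samples.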


Related research

- Conditional Generative Data-Free Knowledge Distillation based on Attention Transfer (12/31/2021): Knowledge distillation has made remarkable achievements in model compres...
- Data-Free Network Quantization With Adversarial Knowledge Distillation (05/08/2020): Network quantization is an essential procedure in deep learning for deve...
- Data-Free Adversarial Knowledge Distillation for Graph Neural Networks (05/08/2022): Graph neural networks (GNNs) have been widely used in modeling graph str...
- Generative Adversarial Simulator (11/23/2020): Knowledge distillation between machine learning models has opened many n...
- Data-Free Adversarial Distillation (12/23/2019): Knowledge Distillation (KD) has made remarkable progress in the last few...
- Preventing Catastrophic Forgetting and Distribution Mismatch in Knowledge Distillation via Synthetic Data (08/11/2021): With the increasing popularity of deep learning on edge devices, compres...
- Dual Student Networks for Data-Free Model Stealing (09/18/2023): Existing data-free model stealing methods use a generator to produce sam...
