CDFKD-MFS: Collaborative Data-free Knowledge Distillation via Multi-level Feature Sharing

05/24/2022
by Zhiwei Hao, et al.

Recently, compressing powerful deep neural networks (DNNs) and deploying them on resource-limited edge devices to provide intelligent services have become attractive tasks. Although knowledge distillation (KD) is a feasible solution for compression, its reliance on the original training dataset raises privacy concerns. In addition, it is common to integrate multiple pretrained models to achieve satisfactory performance. Compressing multiple models into a single tiny model is challenging, especially when the original data are unavailable. To tackle this challenge, we propose a framework termed collaborative data-free knowledge distillation via multi-level feature sharing (CDFKD-MFS), which consists of a multi-header student module, an asymmetric adversarial data-free KD module, and an attention-based aggregation module. In this framework, the student model equipped with a multi-level feature-sharing structure learns from multiple teacher models and is trained together with a generator in an asymmetric adversarial manner. When some real samples are available, the attention module adaptively aggregates predictions of the student headers, which further improves performance. We conduct extensive experiments on three popular computer vision datasets. Compared with the most competitive alternative, the accuracy of the proposed framework is 1.18% higher on the CIFAR-100 dataset, 1.67% higher on the Caltech-101 dataset, and 2.99% higher on the mini-ImageNet dataset.
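
The training loop described in the abstract can be pictured with the PyTorch sketch below. This is an illustration only, not the authors' implementation: MultiHeadStudent, distillation_step, and all hyperparameters are assumed names, and a standard temperature-scaled KL min-max objective stands in for the paper's asymmetric adversarial loss and multi-level feature sharing.

# Minimal sketch of a multi-header student distilled from several teachers
# with a generator, under the assumptions stated above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadStudent(nn.Module):
    """Shared backbone with one prediction header per teacher (assumed structure)."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int, num_teachers: int):
        super().__init__()
        self.backbone = backbone  # shared feature extractor, assumed to output (B, feat_dim)
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, num_classes) for _ in range(num_teachers)]
        )

    def forward(self, x):
        feat = self.backbone(x)
        return [head(feat) for head in self.heads]  # one logit set per teacher

def distillation_step(generator, student, teachers, opt_g, opt_s,
                      z_dim=100, batch=64, T=4.0):
    """One adversarial data-free KD round: the generator seeks synthetic samples
    on which the student disagrees with the teachers; the student then matches
    the teachers on freshly generated samples."""
    device = next(student.parameters()).device

    # Generator update: maximize student-teacher divergence on synthetic data.
    z = torch.randn(batch, z_dim, device=device)
    fake = generator(z)
    s_logits = student(fake)
    g_loss = 0.0
    for s_out, teacher in zip(s_logits, teachers):
        with torch.no_grad():
            t_out = teacher(fake)
        g_loss = g_loss - F.kl_div(
            F.log_softmax(s_out / T, dim=1), F.softmax(t_out / T, dim=1),
            reduction="batchmean")
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

    # Student update: minimize divergence from each teacher on new samples.
    z = torch.randn(batch, z_dim, device=device)
    fake = generator(z).detach()
    s_logits = student(fake)
    s_loss = 0.0
    for s_out, teacher in zip(s_logits, teachers):
        with torch.no_grad():
            t_out = teacher(fake)
        s_loss = s_loss + F.kl_div(
            F.log_softmax(s_out / T, dim=1), F.softmax(t_out / T, dim=1),
            reduction="batchmean")
    opt_s.zero_grad(); s_loss.backward(); opt_s.step()

The attention-based aggregation step mentioned in the abstract, which fuses the per-header predictions when a few real samples are available, is omitted from this sketch.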


Related research

Adaptive Multi-Teacher Multi-level Knowledge Distillation (03/06/2021)
Knowledge distillation (KD) is an effective learning paradigm for improv...

SKDBERT: Compressing BERT via Stochastic Knowledge Distillation (11/26/2022)
In this paper, we propose Stochastic Knowledge Distillation (SKD) to obt...

Data-Free Network Quantization With Adversarial Knowledge Distillation (05/08/2020)
Network quantization is an essential procedure in deep learning for deve...

BD-KD: Balancing the Divergences for Online Knowledge Distillation (12/25/2022)
Knowledge distillation (KD) has gained a lot of attention in the field o...

Effectiveness of Arbitrary Transfer Sets for Data-free Knowledge Distillation (11/18/2020)
Knowledge Distillation is an effective method to transfer the learning a...

Data-Free Adversarial Distillation (12/23/2019)
Knowledge Distillation (KD) has made remarkable progress in the last few...

DeGAN: Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier (12/27/2019)
In this era of digital information explosion, an abundance of data from ...
