Knowledge Distillation with Feature Maps for Image Classification

12/03/2018
by Wei-Chun Chen, et al.

The model reduction problem, which eases the computational cost and latency of complex deep learning architectures, has received increasing attention owing to its importance in model deployment. One promising method is knowledge distillation (KD), which trains a fast-to-execute student model to mimic a large teacher network. In this paper, we propose a method called KDFM (Knowledge Distillation with Feature Maps), which improves the effectiveness of KD by learning the feature maps of the teacher network. The two major techniques used in KDFM are a shared classifier and a generative adversarial network. Experimental results show that KDFM can use a four-layer CNN to mimic DenseNet-40 and MobileNet to mimic DenseNet-100. Both student networks lose less than 1% accuracy compared to their teacher models on the CIFAR-100 dataset. The student networks are 2-6 times faster than their teacher models at inference, and the model size of MobileNet is less than half that of DenseNet-100.
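At its core, a feature-map-based distillation objective combines the usual soft-target term with a term that matches intermediate features between student and teacher. The sketch below is illustrative only, not the authors' exact KDFM loss (which additionally uses a shared classifier and an adversarial discriminator); the temperature `T` and weights `alpha`, `beta` are assumed hyperparameters.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax used to produce soft targets.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, student_feat, teacher_feat,
            T=4.0, alpha=0.9, beta=0.1):
    # Soft-target term: KL divergence between softened teacher and
    # student output distributions (scaled by T^2, as in standard KD).
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)),
                axis=-1).mean()
    # Feature-matching term: mean squared error between intermediate
    # feature maps of student and teacher (assumed same shape here).
    fm = np.mean((student_feat - teacher_feat) ** 2)
    return alpha * (T ** 2) * kl + beta * fm
```

When the student exactly reproduces the teacher's logits and features, both terms vanish and the loss is zero; any mismatch in either outputs or features increases it.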


