PURSUhInT: In Search of Informative Hint Points Based on Layer Clustering for Knowledge Distillation

02/26/2021
by   Reyhan Kevser Keser, et al.

We propose a novel knowledge distillation methodology for compressing deep neural networks. One of the most efficient methods for knowledge distillation is hint distillation, where the student model is injected with information (hints) from several different layers of the teacher model. Although the selection of hint points can drastically alter the compression performance, there is no systematic approach to selecting them other than brute-force hyper-parameter search. We propose a clustering-based hint selection methodology, in which the layers of the teacher model are clustered with respect to several metrics and the cluster centers are used as hint points. The proposed approach is validated on the CIFAR-100 dataset with a ResNet-110 network as the teacher model. Our results show that the hint points selected by our algorithm yield superior compression performance compared to state-of-the-art knowledge distillation algorithms on the same student models and datasets.
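As a concrete illustration of the clustering-based selection described above, the sketch below clusters per-layer activation descriptors with k-means and picks the layer nearest to each cluster center as a hint point. This is a minimal sketch under our own assumptions (per-channel mean descriptors, zero-padding to a common length, Euclidean k-means), not the authors' implementation; the descriptor, distance, and number of hints are placeholders.

```python
# Minimal sketch of clustering-based hint-point selection (illustrative only).
# Each teacher layer is summarized by a fixed-size descriptor computed on a
# probe batch; descriptors are clustered, and the layer closest to each
# cluster center is chosen as a hint point.

import numpy as np
from sklearn.cluster import KMeans


def layer_descriptors(activations):
    """Reduce each layer's activation tensor (N, C, H, W) to a per-channel
    mean descriptor. This choice is an assumption, not the paper's metric."""
    return [act.mean(axis=(0, 2, 3)) for act in activations]


def select_hint_layers(activations, n_hints=3):
    """Cluster layer descriptors and return the index of the layer closest
    to each cluster center (one hint point per cluster)."""
    descs = layer_descriptors(activations)
    # Layers may have different channel counts; zero-pad descriptors to a
    # common length so they can be stacked into one matrix.
    dim = max(d.shape[0] for d in descs)
    X = np.stack([np.pad(d, (0, dim - d.shape[0])) for d in descs])
    km = KMeans(n_clusters=n_hints, n_init=10, random_state=0).fit(X)
    hints = [
        int(np.argmin(np.linalg.norm(X - center, axis=1)))
        for center in km.cluster_centers_
    ]
    return sorted(set(hints))


if __name__ == "__main__":
    # Toy example: 12 fake "layers" with varying channel counts.
    rng = np.random.default_rng(0)
    fake_acts = [rng.normal(size=(8, 16 * (1 + i // 4), 8, 8)) for i in range(12)]
    print(select_hint_layers(fake_acts, n_hints=3))
```

In a full pipeline, the descriptor and distance would follow the similarity metrics used in the paper, and the returned layer indices would feed a FitNets-style hint loss between student and teacher intermediate features during distillation.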


Related research

05/18/2018
Recurrent knowledge distillation
Knowledge distillation compacts deep networks by letting a small student...

11/15/2022
Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
DETR is a novel end-to-end transformer architecture object detector, whi...

10/19/2021
Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation
Knowledge Distillation is becoming one of the primary trends among neura...

09/04/2019
Empirical Analysis of Knowledge Distillation Technique for Optimization of Quantized Deep Neural Networks
Knowledge distillation (KD) is a very popular method for model size redu...

05/27/2023
Knowledge Distillation Performs Partial Variance Reduction
Knowledge distillation is a popular approach for enhancing the performan...

09/30/2020
Pea-KD: Parameter-efficient and Accurate Knowledge Distillation
How can we efficiently compress a model while maintaining its performanc...

08/29/2021
Lipschitz Continuity Guided Knowledge Distillation
Knowledge distillation has become one of the most important model compre...
