QUEST: Quantized embedding space for transferring knowledge

12/03/2019
by Himalaya Jain, et al.

Knowledge distillation refers to the process of training a compact student network to achieve better accuracy by learning from a high-capacity teacher network. Most existing knowledge distillation methods direct the student to follow the teacher by matching the teacher's output, its feature maps, or their distribution. In this work, we propose a novel way to achieve this goal: distilling the knowledge through a quantized space. In our method, the teacher's feature maps are quantized to represent the main visual concepts they encompass. The student is then asked to predict this quantized representation, and that prediction task is how the student learns from the teacher. Despite its simplicity, we show that our approach improves the state of the art in knowledge distillation. To that end, we provide an extensive evaluation across several network architectures and the most commonly used benchmark datasets.
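To make the idea concrete, below is a minimal PyTorch sketch of distillation through a quantized space. It assumes a codebook of codewords learned offline (e.g., by k-means over teacher feature vectors) and a small learnable projection head mapping student features to the teacher's feature dimension; the names (quantize_features, quest_loss, proj, tau) and the soft-assignment formulation are illustrative assumptions, not the authors' exact implementation.

    import torch
    import torch.nn.functional as F

    def quantize_features(feats, codebook, tau=1.0):
        # feats:    (B, C, H, W) feature maps
        # codebook: (K, C) codewords, e.g., k-means centroids of teacher features
        # returns:  (B*H*W, K) soft assignment of each spatial feature to the codewords
        B, C, H, W = feats.shape
        flat = feats.permute(0, 2, 3, 1).reshape(-1, C)   # one C-dim vector per spatial location
        d2 = torch.cdist(flat, codebook) ** 2             # squared distances to each codeword
        return F.softmax(-d2 / tau, dim=1)                # closer codewords get higher probability

    def quest_loss(student_feats, teacher_feats, codebook, proj, tau=1.0):
        # The student is asked to predict the teacher's quantized representation.
        # proj: a learnable head (e.g., a 1x1 conv) mapping student features to the teacher's channel dimension.
        with torch.no_grad():
            target = quantize_features(teacher_feats, codebook, tau)     # teacher's codeword assignments (fixed target)
        pred = quantize_features(proj(student_feats), codebook, tau)     # student's predicted assignments
        # cross-entropy between the two assignment distributions, averaged over spatial locations
        return -(target * torch.log(pred + 1e-8)).sum(dim=1).mean()

In practice, the codebook would be built once from teacher features sampled over the training set, and quest_loss would be added to the student's usual classification loss during training.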


