On the Efficiency of Subclass Knowledge Distillation in Classification Tasks

09/12/2021
by   Ahmad Sajedi, et al.

This work introduces a novel knowledge distillation framework for classification tasks where information on existing subclasses is available and taken into consideration. In classification tasks with a small number of classes or binary detection (two classes), the amount of information transferred from the teacher to the student network is restricted, limiting the utility of knowledge distillation. Performance can be improved by leveraging information about possible subclasses within the available classes of the classification task. To that end, we propose the Subclass Knowledge Distillation (SKD) framework, which transfers subclass prediction knowledge from a large teacher model into a smaller student one. Through SKD, additional meaningful information that is not present in the teacher's class logits but exists in the subclasses (e.g., similarities inside classes) is conveyed to the student, boosting its performance. Mathematically, we measure how many extra bits of information the teacher can provide to the student via the SKD framework. The developed framework is evaluated in a clinical application, namely colorectal polyp binary classification. In this application, clinician-provided annotations are used to define subclasses based on the variability of the annotation labels in a curriculum style of learning. A lightweight, low-complexity student trained with the proposed framework achieves an F1-score of 85.05%, outperforming the same student trained both without and with conventional knowledge distillation. These results show that the extra subclass knowledge (i.e., 0.4656 label bits per training sample in our experiment) provides more information about the teacher's generalization, and therefore SKD can exploit this additional information to increase student performance.
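To make the idea concrete, below is a minimal PyTorch sketch of an SKD-style loss, assuming the student and teacher both output logits over subclasses and that each subclass maps to a known parent class. The function name `skd_loss`, the subclass-to-class mapping, the temperature, and the loss weighting `alpha` are illustrative placeholders, not the paper's exact formulation or settings.

```python
# Hedged sketch of a subclass knowledge distillation (SKD)-style loss.
# Assumptions (not from the paper): 4 subclasses, 2 parent classes,
# temperature 4.0, equal weighting between soft and hard terms.

import torch
import torch.nn.functional as F

NUM_SUBCLASSES = 4                               # e.g., 2 subclasses per class (assumed)
SUBCLASS_TO_CLASS = torch.tensor([0, 0, 1, 1])   # parent class of each subclass (assumed)


def skd_loss(student_sub_logits, teacher_sub_logits, class_labels,
             temperature=4.0, alpha=0.5):
    """Soft subclass-matching term plus a hard class-level term.

    student_sub_logits, teacher_sub_logits: (batch, NUM_SUBCLASSES) tensors.
    class_labels: (batch,) tensor of parent-class indices.
    """
    # Soft targets over subclasses from the teacher (temperature-scaled),
    # matched by the student with a KL-divergence term.
    soft_targets = F.softmax(teacher_sub_logits / temperature, dim=1)
    log_student = F.log_softmax(student_sub_logits / temperature, dim=1)
    distill = F.kl_div(log_student, soft_targets,
                       reduction="batchmean") * temperature ** 2

    # Hard class-level loss: sum subclass probabilities into class probabilities
    # so the student can still be supervised (and evaluated) on the binary task.
    sub_probs = F.softmax(student_sub_logits, dim=1)
    mapping = SUBCLASS_TO_CLASS.to(sub_probs.device)
    class_probs = torch.zeros(
        sub_probs.size(0), int(SUBCLASS_TO_CLASS.max()) + 1,
        device=sub_probs.device,
    ).index_add_(1, mapping, sub_probs)
    hard = F.nll_loss(torch.log(class_probs + 1e-12), class_labels)

    return alpha * distill + (1.0 - alpha) * hard
```

Aggregating subclass probabilities back into class probabilities is one simple way to keep the student's final predictions on the original binary task while its logits carry the finer-grained subclass structure distilled from the teacher; the actual paper may combine these terms differently.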
