Dynamic Rectification Knowledge Distillation

01/27/2022
by Fahad Rahman Amik, et al.

Knowledge distillation is a technique that aims to utilize dark knowledge to compress and transfer information from a vast, well-trained neural network (teacher model) to a smaller, less capable neural network (student model) with improved inference efficiency. Distilling knowledge in this way has gained popularity because such cumbersome models are prohibitively complex to deploy on edge computing devices. The teacher models used to teach smaller student models are typically cumbersome themselves and expensive to train. To eliminate the need for a cumbersome teacher model entirely, we propose a simple yet effective knowledge distillation framework that we term Dynamic Rectification Knowledge Distillation (DR-KD). Our method turns the student into its own teacher, and if this self-teacher makes a wrong prediction while distilling information, the error is rectified before the knowledge is distilled. Specifically, the teacher targets are dynamically adjusted using the ground-truth labels while distilling the knowledge gained from traditional training. The proposed DR-KD performs remarkably well in the absence of a sophisticated, cumbersome teacher model and achieves performance comparable to existing state-of-the-art teacher-free knowledge distillation frameworks while relying only on a low-cost, dynamically rectified teacher. Our approach is general and can be applied to the training of any deep neural network for classification or object recognition. DR-KD improves test accuracy by 2.65% on Tiny ImageNet over prominent baseline models, which is significantly better than any other knowledge distillation approach, while requiring no additional training cost.
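The rectification idea can be illustrated with a short sketch. The snippet below is a minimal, hedged interpretation of the description above, assuming a PyTorch setup: the self-teacher's softened predictions serve as distillation targets, and whenever the predicted class disagrees with the ground truth, the probability mass assigned to the predicted class and the true class is swapped before the distillation loss is computed. The function name dr_kd_loss, the swap-based rectification, and the hyperparameters (temperature T, mixing weight alpha) are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn.functional as F

def dr_kd_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    """Illustrative DR-KD-style loss: rectify the self-teacher's soft targets
    with the ground truth before distilling. Names and values are assumptions."""
    # Softened teacher distribution; detached so no gradient flows through the self-teacher.
    teacher_probs = F.softmax(teacher_logits.detach() / T, dim=1)

    # Rectification: wherever the teacher's top-1 prediction is wrong, swap the
    # probabilities of the predicted class and the true class so the true class
    # receives the largest share of the soft target.
    pred = teacher_probs.argmax(dim=1)
    wrong = pred != targets
    rectified = teacher_probs.clone()
    rows = torch.arange(teacher_probs.size(0), device=teacher_probs.device)[wrong]
    rectified[rows, targets[wrong]] = teacher_probs[rows, pred[wrong]]
    rectified[rows, pred[wrong]] = teacher_probs[rows, targets[wrong]]

    # Distillation term: KL divergence between the softened student output and
    # the rectified targets, scaled by T^2 as is standard in distillation.
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  rectified, reduction="batchmean") * (T * T)

    # Standard cross-entropy on the hard labels.
    ce = F.cross_entropy(student_logits, targets)

    return alpha * kd + (1.0 - alpha) * ce

In a self-distillation setting, teacher_logits could simply come from a detached forward pass or an earlier checkpoint of the same network; the essential point is that the rectified soft targets never place their peak on an incorrect class.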

