Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation

03/15/2021
by   Mingi Ji, et al.

Knowledge distillation is a method of transferring knowledge from a pretrained, complex teacher model to a student model, so that a smaller network can replace the large teacher network at deployment. To remove the need to train a large teacher model, recent literature has introduced self-knowledge distillation, which trains a student network progressively to distill its own knowledge without a pretrained teacher. Self-knowledge distillation methods largely divide into data-augmentation-based and auxiliary-network-based approaches. The data augmentation approach, however, loses local information during augmentation, which hinders its applicability to vision tasks such as semantic segmentation. Moreover, these self-knowledge distillation approaches do not exploit refined feature maps, which are prevalent in the object detection and semantic segmentation communities. This paper proposes a novel self-knowledge distillation method, Feature Refinement via Self-Knowledge Distillation (FRSKD), which utilizes an auxiliary self-teacher network to transfer refined knowledge to the classifier network. FRSKD can use both soft-label and feature-map distillation for self-knowledge distillation. Therefore, FRSKD can be applied to classification as well as to semantic segmentation, which emphasizes preserving local information. We demonstrate the effectiveness of FRSKD through its performance improvements across diverse tasks and benchmark datasets. The implemented code is available at https://github.com/MingiJi/FRSKD.
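To make the training objective described above concrete, the sketch below combines the three terms the abstract mentions: a standard cross-entropy loss, a soft-label distillation loss from the auxiliary self-teacher's predictions, and a feature-map distillation loss against the self-teacher's refined feature maps. This is a minimal illustrative sketch in PyTorch, not the authors' implementation (see the linked repository for that); the function name, weighting coefficients, and the choice of L2 feature matching are assumptions.

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(student_logits, teacher_logits,
                           student_feats, teacher_feats,
                           labels, temperature=3.0,
                           alpha=1.0, beta=100.0):
    """Illustrative combined loss for feature-refining self-distillation.

    student_logits / student_feats: outputs of the classifier network.
    teacher_logits / teacher_feats: outputs of the auxiliary self-teacher
    network, which aggregates the classifier's feature maps into refined ones.
    alpha / beta: assumed weighting coefficients for the two distillation terms.
    """
    # Standard cross-entropy on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-label distillation: KL divergence between temperature-scaled
    # distributions; the self-teacher's logits are detached so gradients
    # only update the classifier through this term.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits.detach() / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2

    # Feature-map distillation: match each classifier feature map to the
    # corresponding refined feature map from the self-teacher (L2 distance
    # here; the paper's exact feature loss may differ).
    feat = sum(
        F.mse_loss(s, t.detach()) for s, t in zip(student_feats, teacher_feats)
    ) / len(student_feats)

    return ce + alpha * soft + beta * feat
```

In a training loop, one forward pass would produce the classifier's logits and intermediate feature maps, the self-teacher would consume those feature maps to produce refined features and its own logits, and this loss would be backpropagated through both networks.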


