MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation

08/27/2020
by Benlin Liu, et al.

Knowledge Distillation (KD) has been one of the most popular methods to learn a compact model. However, it still suffers from the high demand in time and computational resources caused by the sequential training pipeline. Furthermore, the soft targets from deeper models do not often serve as good cues for the shallower models due to the gap of compatibility. In this work, we consider these two problems at the same time. Specifically, we propose that better soft targets with higher compatibility can be generated by using a label generator to fuse the feature maps from deeper stages in a top-down manner, and we employ the meta-learning technique to optimize this label generator. Utilizing the soft targets learned from the intermediate feature maps of the model, we can achieve better self-boosting of the network in comparison with the state-of-the-art. The experiments are conducted on two standard classification benchmarks, namely CIFAR-100 and ILSVRC2012. We test various network architectures to show the generalizability of our MetaDistiller. The experimental results on the two datasets strongly demonstrate the effectiveness of our method.
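The abstract describes a label generator that fuses feature maps from deeper stages in a top-down manner to produce soft targets for shallower stages, with the generator itself tuned by meta-learning. The snippet below is a minimal, hypothetical PyTorch sketch of such a generator, not the authors' implementation: the lateral 1x1 convolutions, nearest-neighbor upsampling, per-stage linear heads, temperature value, and all names are illustrative assumptions based only on the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LabelGenerator(nn.Module):
    """Fuses deeper-stage feature maps top-down into per-stage soft targets.

    Hypothetical sketch: the 1x1 channel alignment, nearest-neighbor
    upsampling, and one linear head per stage are assumptions, not the
    paper's exact design.
    """

    def __init__(self, stage_channels, num_classes, fused_channels=None):
        super().__init__()
        fused_channels = fused_channels or stage_channels[-1]
        # Align each stage's channel count before top-down fusion.
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, fused_channels, kernel_size=1) for c in stage_channels
        )
        # One classifier head per stage produces that stage's soft target.
        self.heads = nn.ModuleList(
            nn.Linear(fused_channels, num_classes) for _ in stage_channels
        )

    def forward(self, feats, temperature=4.0):
        # feats: feature maps ordered shallow -> deep, each of shape [B, C_i, H_i, W_i].
        soft_targets = [None] * len(feats)
        fused = None
        # Top-down: start at the deepest stage and propagate toward the shallowest.
        for i in reversed(range(len(feats))):
            x = self.lateral[i](feats[i])
            if fused is None:
                fused = x
            else:
                # Upsample the deeper fused map to this stage's resolution and merge.
                fused = x + F.interpolate(fused, size=x.shape[-2:], mode="nearest")
            pooled = F.adaptive_avg_pool2d(fused, 1).flatten(1)
            soft_targets[i] = F.softmax(self.heads[i](pooled) / temperature, dim=1)
        return soft_targets


if __name__ == "__main__":
    # Toy check with three fake stage outputs from a small backbone.
    gen = LabelGenerator(stage_channels=[64, 128, 256], num_classes=100)
    feats = [
        torch.randn(2, 64, 32, 32),
        torch.randn(2, 128, 16, 16),
        torch.randn(2, 256, 8, 8),
    ]
    targets = gen(feats)
    print([t.shape for t in targets])  # three tensors of shape [2, 100]
```

In a full pipeline, the abstract indicates this generator would itself be optimized by meta-learning; a typical bi-level scheme would update the backbone with a distillation loss against these soft targets and then update the generator so that the updated backbone improves on held-out data, but the exact procedure is given only in the full text.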


Related research

MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition (08/11/2022)
Unlike the conventional Knowledge Distillation (KD), Self-KD allows a ne...

Residual Knowledge Distillation (02/21/2020)
Knowledge distillation (KD) is one of the most potent ways for model com...

Self-Distillation from the Last Mini-Batch for Consistency Regularization (03/30/2022)
Knowledge distillation (KD) shows a bright promise as a powerful regular...

Multi-Label Knowledge Distillation (08/12/2023)
Existing knowledge distillation methods typically work by imparting the ...

DistPro: Searching A Fast Knowledge Distillation Process via Meta Optimization (04/12/2022)
Recent Knowledge distillation (KD) studies show that different manually ...

Learning to Retain while Acquiring: Combating Distribution-Shift in Adversarial Data-Free Knowledge Distillation (02/28/2023)
Data-free Knowledge Distillation (DFKD) has gained popularity recently, ...

Distilling a Powerful Student Model via Online Knowledge Distillation (03/26/2021)
Existing online knowledge distillation approaches either adopt the stude...
