Online Knowledge Distillation via Multi-branch Diversity Enhancement

10/02/2020
by Zheng Li, et al.

Knowledge distillation is an effective method to transfer knowledge from a cumbersome teacher model to a lightweight student model. Online knowledge distillation uses the ensembled prediction results of multiple student models as soft targets to train each student model. However, the homogenization among student models makes it difficult to further improve performance. In this work, we propose a new distillation method that enhances the diversity among multiple student models. We introduce a Feature Fusion Module (FFM), which improves the performance of the attention mechanism in the network by integrating the rich semantic information contained in the last block of each student model. Furthermore, we use a Classifier Diversification (CD) loss function to strengthen the differences between the student models and thereby deliver a better ensemble result. Extensive experiments show that our method significantly enhances the diversity among student models and yields better distillation performance. We evaluate our method on three image classification datasets: CIFAR-10/100 and CINIC-10. The results show that our method achieves state-of-the-art performance on these datasets.
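To make the training objective concrete, below is a minimal sketch of an online-distillation step in which each student branch is supervised by the hard labels, distilled toward the (detached) ensemble of all branches, and pushed apart by a diversity penalty. The exact Feature Fusion Module and Classifier Diversification loss used in the paper are not reproduced here; the function name, hyperparameters, and the cosine-similarity form of the diversity term are illustrative assumptions, not the authors' definitions.

```python
# Hypothetical sketch of an online-KD loss with an ensemble soft target and a
# diversity penalty between student branches. The diversity term below is an
# illustrative stand-in for the paper's Classifier Diversification (CD) loss.
import torch
import torch.nn.functional as F

def online_kd_losses(branch_logits, labels, T=3.0, lambda_kd=1.0, lambda_cd=0.1):
    """branch_logits: list of [batch, num_classes] tensors, one per student branch."""
    # Hard-label cross-entropy for every branch.
    ce = sum(F.cross_entropy(z, labels) for z in branch_logits)

    # Ensemble of branch predictions serves as the soft target (no gradient).
    ensemble = torch.stack(branch_logits).mean(dim=0).detach()
    soft_target = F.softmax(ensemble / T, dim=1)

    # Distill each branch toward the ensemble (standard online-KD KL term).
    kd = sum(
        F.kl_div(F.log_softmax(z / T, dim=1), soft_target,
                 reduction="batchmean") * T * T
        for z in branch_logits
    )

    # Assumed diversity term: penalize pairwise similarity between the branches'
    # predictive distributions so that they do not homogenize.
    probs = [F.softmax(z, dim=1) for z in branch_logits]
    cd = 0.0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            cd = cd + F.cosine_similarity(probs[i], probs[j], dim=1).mean()

    return ce + lambda_kd * kd + lambda_cd * cd
```

Under this formulation, increasing `lambda_cd` trades some per-branch accuracy for greater disagreement among branches, which is what allows the ensemble soft target to remain informative.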

