Ensemble Knowledge Distillation for Learning Improved and Efficient Networks

09/17/2019
by Umar Asif, et al.

Ensemble models comprising deep Convolutional Neural Networks (CNN) have shown significant improvements in model generalization, but at the cost of large computation and memory requirements. In this paper, we present a framework for learning compact CNN models with improved classification performance and model generalization. For this, we propose a CNN architecture of a compact student model with parallel branches which are trained using ground truth labels and information from high-capacity teacher networks in an ensemble learning fashion. Our framework provides two main benefits: i) Distilling knowledge from different teachers into the student network promotes heterogeneity in feature learning at different branches of the student network and enables the network to learn diverse solutions to the target problem. ii) Coupling the branches of the student network through ensembling encourages collaboration and improves the quality of the final predictions by reducing variance in the network outputs. Experiments on the CIFAR-10 and CIFAR-100 datasets show that our Ensemble Knowledge Distillation (EKD) improves classification accuracy and model generalization, especially in situations with limited training data. Experiments also show that our EKD-based compact networks outperform state-of-the-art knowledge distillation based methods in terms of mean accuracy on the test datasets.
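The abstract describes a compact student with parallel branches, each distilled from a different high-capacity teacher and coupled through ensembling of the branch predictions. Below is a minimal PyTorch sketch of that idea; the branch architecture, the temperature T, and the weighting alpha are illustrative assumptions, not the authors' published configuration.

```python
# Sketch of ensemble knowledge distillation (EKD) as outlined in the abstract.
# Assumptions (not from the paper): branch layout, temperature T, weight alpha.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BranchedStudent(nn.Module):
    """Compact student: a shared stem feeding parallel classification branches."""
    def __init__(self, num_classes=10, num_branches=2):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, num_classes),
            )
            for _ in range(num_branches)
        ])

    def forward(self, x):
        h = self.stem(x)
        branch_logits = [branch(h) for branch in self.branches]
        # Coupling the branches: the final prediction is the branch ensemble.
        ensemble_logits = torch.stack(branch_logits).mean(dim=0)
        return branch_logits, ensemble_logits

def ekd_loss(branch_logits, ensemble_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Ground-truth cross-entropy plus distillation from one teacher per branch."""
    loss = F.cross_entropy(ensemble_logits, labels)
    for s_logits, t_logits in zip(branch_logits, teacher_logits):
        kd = F.kl_div(
            F.log_softmax(s_logits / T, dim=1),
            F.softmax(t_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)
        loss = loss + alpha * kd + (1 - alpha) * F.cross_entropy(s_logits, labels)
    return loss

if __name__ == "__main__":
    # Toy usage with random tensors standing in for teacher outputs and data.
    student = BranchedStudent(num_classes=10, num_branches=2)
    x = torch.randn(8, 3, 32, 32)                              # CIFAR-sized inputs
    labels = torch.randint(0, 10, (8,))
    teacher_logits = [torch.randn(8, 10) for _ in range(2)]    # placeholder teachers
    branch_logits, ensemble_logits = student(x)
    loss = ekd_loss(branch_logits, ensemble_logits, teacher_logits, labels)
    loss.backward()
```

Distilling each branch from a different teacher is what encourages heterogeneity across branches, while averaging the branch logits reduces the variance of the final prediction.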


