Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup

12/17/2020
by Guodong Xu, et al.

Knowledge distillation, which involves extracting the "dark knowledge" from a teacher network to guide the learning of a student network, has emerged as an essential technique for model compression and transfer learning. Unlike previous works that focus on the accuracy of the student network, here we study a little-explored but important question: knowledge distillation efficiency. Our goal is to achieve performance comparable to conventional knowledge distillation at a lower computation cost during training. We show that UNcertainty-aware mIXup (UNIX) serves as a clean yet effective solution. An uncertainty sampling strategy evaluates the informativeness of each training sample, and adaptive mixup is applied to the uncertain samples to compact the knowledge to be transferred. We further show that the redundancy of conventional knowledge distillation lies in the excessive learning of easy samples. By combining uncertainty and mixup, our approach reduces this redundancy and makes better use of each query to the teacher network. We validate our approach on CIFAR-100 and ImageNet. Notably, with only 79% of the computation cost, we outperform conventional knowledge distillation on CIFAR-100 and achieve a comparable result on ImageNet.
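The sketch below illustrates one way the recipe described in the abstract could look in PyTorch: the student's predictive entropy is used as the uncertainty score, the most uncertain samples in each batch are mixed pairwise, and only the mixed batch is forwarded through the teacher for distillation. This is a minimal sketch under stated assumptions, not the authors' released implementation; the function and parameter names (prediction_entropy, keep_ratio, temperature, alpha) are illustrative.

```python
import torch
import torch.nn.functional as F

def prediction_entropy(logits):
    """Entropy of the softmax distribution; higher means more uncertain."""
    p = F.softmax(logits, dim=1)
    return -(p * F.log_softmax(logits, dim=1)).sum(dim=1)

def uncertainty_aware_mixup_kd_step(student, teacher, x, y,
                                    temperature=4.0, alpha=0.5, keep_ratio=0.5):
    # 1. Score each sample by the entropy of the student's prediction
    #    (no teacher queries needed at this stage).
    with torch.no_grad():
        entropy = prediction_entropy(student(x))
    n_keep = max(2, int(keep_ratio * x.size(0)))
    idx = entropy.topk(n_keep).indices
    xu, yu = x[idx], y[idx]

    # 2. Mix the uncertain samples pairwise so a single teacher query
    #    carries information about two training inputs.
    perm = torch.randperm(xu.size(0), device=x.device)
    lam = float(torch.distributions.Beta(alpha, alpha).sample())
    x_mix = lam * xu + (1.0 - lam) * xu[perm]

    # 3. Query the teacher only on the compact mixed batch.
    with torch.no_grad():
        t_logits = teacher(x_mix)
    s_logits = student(x_mix)

    # Standard temperature-scaled KD loss plus mixup cross-entropy.
    kd_loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=1),
        F.softmax(t_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    ce_loss = (lam * F.cross_entropy(s_logits, yu)
               + (1.0 - lam) * F.cross_entropy(s_logits, yu[perm]))
    return kd_loss + ce_loss
```

In this sketch the teacher sees only keep_ratio of each batch, which is where the training-time savings come from: easy (low-entropy) samples never trigger a teacher forward pass, while the uncertain ones are compacted by mixup before distillation.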

Related research:

Knowledge Distillation Meets Self-Supervision (06/12/2020)
Efficient Evaluation-Time Uncertainty Estimation by Improved Distillation (06/12/2019)
A scalable convolutional neural network for task-specified scenarios via knowledge distillation (09/19/2016)
Deep Epidemiological Modeling by Black-box Knowledge Distillation: An Accurate Deep Learning Model for COVID-19 (01/20/2021)
NormKD: Normalized Logits for Knowledge Distillation (08/01/2023)
Self-Referenced Deep Learning (11/19/2018)
Energy-efficient Knowledge Distillation for Spiking Neural Networks (06/14/2021)
