Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation

10/19/2021
by Sumanth Chennupati, et al.

Knowledge distillation has become one of the primary approaches to neural network compression: a smaller student model improves its generalization performance with guidance from a larger teacher model. This rise in applications has been accompanied by numerous algorithms for distilling knowledge, such as soft targets and hint layers. Despite these advances in distillation techniques, the aggregation of different distillation paths has not been studied comprehensively. This matters not only because different paths have different importance, but also because some paths can have negative effects on the generalization performance of the student model. Hence, the importance of each path must be adjusted adaptively to maximize the impact of distillation on the student. In this paper, we explore different approaches for aggregating these paths and introduce our proposed adaptive approach based on multitask learning methods. We empirically demonstrate the effectiveness of the proposed approach over other baselines for knowledge distillation in classification, semantic segmentation, and object detection tasks.
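The adaptive aggregation the abstract describes can be made concrete with a short sketch. The PyTorch snippet below is illustrative only: the class name, the uncertainty-style per-path weighting (in the spirit of multitask loss balancing, e.g. Kendall et al., 2018), and all hyperparameters are assumptions, not the paper's exact formulation. It combines two common distillation paths, soft targets and a hint layer, and learns one weight per path so that unhelpful paths can be down-weighted during training.

import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveDistillationLoss(nn.Module):
    """Aggregate several distillation paths with learnable per-path weights
    (hypothetical sketch, not the paper's exact method)."""

    def __init__(self, num_paths, temperature=4.0):
        super().__init__()
        # One learnable log-variance per path, initialized to zero so every
        # path starts with equal weight; training can shrink the weight of
        # paths that hurt the student.
        self.log_vars = nn.Parameter(torch.zeros(num_paths))
        self.temperature = temperature

    def soft_target_loss(self, student_logits, teacher_logits):
        # Hinton-style soft-target loss: KL divergence between softened logits.
        T = self.temperature
        return F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T * T)

    @staticmethod
    def hint_loss(student_feat, teacher_feat):
        # Hint-layer loss: match intermediate feature maps (FitNets-style).
        return F.mse_loss(student_feat, teacher_feat)

    def forward(self, path_losses):
        # path_losses: iterable of scalar losses, one per distillation path.
        total = torch.zeros((), device=self.log_vars.device)
        for loss, log_var in zip(path_losses, self.log_vars):
            precision = torch.exp(-log_var)  # adaptive importance weight
            total = total + precision * loss + log_var  # log_var regularizes
        return total


# Usage sketch (variable names are hypothetical):
# criterion = AdaptiveDistillationLoss(num_paths=2)
# paths = [criterion.soft_target_loss(s_logits, t_logits),
#          criterion.hint_loss(s_feat, t_feat)]
# loss = task_loss + criterion(paths)

Note that criterion.log_vars must be included in the optimizer's parameters so the path weights are learned jointly with the student model.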

Related research

02/26/2021
PURSUhInT: In Search of Informative Hint Points Based on Layer Clustering for Knowledge Distillation
We propose a novel knowledge distillation methodology for compressing de...

09/20/2023
Weight Averaging Improves Knowledge Distillation under Domain Shift
Knowledge distillation (KD) is a powerful model compression technique br...

09/19/2016
A scalable convolutional neural network for task-specified scenarios via knowledge distillation
In this paper, we explore the redundancy in convolutional neural network...

03/06/2023
KDSM: An uplift modeling framework based on knowledge distillation and sample matching
Uplift modeling aims to estimate the treatment effect on individuals, wi...

04/03/2023
Domain Generalization for Crop Segmentation with Knowledge Distillation
In recent years, precision agriculture has gradually oriented farming cl...

08/04/2020
Prime-Aware Adaptive Distillation
Knowledge distillation (KD) aims to improve the performance of a student ...

12/01/2020
Solvable Model for Inheriting the Regularization through Knowledge Distillation
In recent years the empirical success of transfer learning with neural n...
