InDistill: Transferring Knowledge From Pruned Intermediate Layers

05/20/2022
by Ioannis Sarridis, et al.

Deploying deep neural networks on hardware with limited resources, such as smartphones and drones, is challenging due to their computational complexity. Knowledge distillation approaches aim to transfer knowledge from a large model (the teacher) to a lightweight one (the student), and distilling the knowledge of intermediate layers provides additional supervision for that task. However, the capacity gap between the two models, the encoding of intermediate knowledge in a way that discards architectural alignment, and the absence of suitable learning schemes for transferring knowledge from multiple layers restrict the performance of existing methods. In this paper, we propose a novel method, termed InDistill, that can drastically improve the performance of existing single-layer knowledge distillation methods by leveraging the properties of channel pruning to both reduce the capacity gap between the models and retain architectural alignment. Furthermore, we propose a curriculum learning based scheme to enhance the effectiveness of transferring knowledge from multiple intermediate layers. The proposed method surpasses state-of-the-art performance on three benchmark image datasets.
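
To make the core idea concrete, the following is a minimal PyTorch-style sketch of distilling one intermediate layer against a channel-pruned teacher feature map, with a simple per-stage curriculum over layers. The function names (prune_channels, intermediate_distill_loss, curriculum_stage_loss) and the L1-norm pruning criterion are illustrative assumptions for this sketch, not the authors' released implementation.

```python
# Illustrative sketch only: hypothetical helpers, not the official InDistill code.
import torch
import torch.nn.functional as F


def prune_channels(teacher_feat: torch.Tensor, keep: int) -> torch.Tensor:
    """Keep the `keep` teacher channels with the largest L1 norm, so the
    pruned teacher feature map matches the student's channel count."""
    # teacher_feat: (batch, C_t, H, W), with keep <= C_t
    scores = teacher_feat.abs().sum(dim=(0, 2, 3))   # per-channel L1 norm over the batch
    idx = scores.topk(keep).indices
    return teacher_feat[:, idx]


def intermediate_distill_loss(student_feat: torch.Tensor,
                              teacher_feat: torch.Tensor) -> torch.Tensor:
    """MSE between the student feature map and the pruned (and, if needed,
    spatially resized) teacher feature map."""
    pruned = prune_channels(teacher_feat, keep=student_feat.size(1))
    if pruned.shape[-2:] != student_feat.shape[-2:]:
        pruned = F.adaptive_avg_pool2d(pruned, student_feat.shape[-2:])
    return F.mse_loss(student_feat, pruned.detach())


def curriculum_stage_loss(student_feats, teacher_feats, stage):
    """Curriculum over intermediate layers: at training stage k, transfer only
    the k-th intermediate layer (shallowest first), before the usual
    logit-level distillation in the final stage."""
    return intermediate_distill_loss(student_feats[stage], teacher_feats[stage])
```

In this sketch the retained channels are scored per batch for simplicity; a method that actually prunes the teacher would select the channels once and keep them fixed, so the student sees a consistent, architecturally aligned target throughout training.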


Related research

- 04/10/2023: A Survey on Recent Teacher-student Learning Studies
  Knowledge distillation is a method of transferring the knowledge from a ...
- 05/02/2020: Heterogeneous Knowledge Distillation using Information Flow Modeling
  Knowledge Distillation (KD) methods are capable of transferring the know...
- 01/18/2022: It's All in the Head: Representation Knowledge Distillation through Classifier Sharing
  Representation knowledge distillation aims at transferring rich informat...
- 06/16/2021: Topology Distillation for Recommender System
  Recommender Systems (RS) have employed knowledge distillation which is a...
- 02/27/2023: Leveraging Angular Distributions for Improved Knowledge Distillation
  Knowledge distillation as a broad class of methods has led to the develo...
- 11/05/2021: Visualizing the Emergence of Intermediate Visual Patterns in DNNs
  This paper proposes a method to visualize the discrimination power of in...
- 07/13/2020: Towards practical lipreading with distilled and efficient models
  Lipreading has witnessed a lot of progress due to the resurgence of neur...
