Categories of Response-Based, Feature-Based, and Relation-Based Knowledge Distillation

06/19/2023
by Chuanguang Yang, et al.

Deep neural networks have achieved remarkable performance on artificial intelligence tasks. This success, however, often relies on large-scale models with high computational complexity and storage costs. Such over-parameterized networks are typically easier to optimize and achieve better performance, but they are difficult to deploy on resource-limited edge devices. Knowledge Distillation (KD) aims to optimize a lightweight network from the perspective of over-parameterized training. Traditional offline KD transfers knowledge from a cumbersome teacher to a small and fast student network. When a sizeable pre-trained teacher is unavailable, online KD improves a group of models through collaborative or mutual learning. Without requiring extra models, Self-KD boosts a network by itself using attached auxiliary architectures. KD mainly involves two aspects: knowledge extraction and distillation strategies. Beyond these KD schemes, various KD algorithms are widely used in practical applications, such as multi-teacher KD, cross-modal KD, attention-based KD, data-free KD, and adversarial KD. This paper provides a comprehensive KD survey covering knowledge categories, distillation schemes and algorithms, as well as empirical studies comparing their performance. Finally, we discuss the open challenges of existing KD work and suggest future directions.
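As an illustration of the response-based category named in the title, the sketch below shows the standard logit-matching KD objective (a softened KL term from the teacher plus a hard-label cross-entropy term), which is the form popularized by Hinton et al. This is a minimal sketch, not code from the surveyed paper; the function name and the default temperature `T` and weight `alpha` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def response_kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Minimal response-based (logit-matching) KD loss sketch.

    T (temperature) and alpha (soft/hard weighting) are illustrative
    defaults, not values prescribed by the survey.
    """
    # Soft-target term: KL divergence between temperature-softened
    # teacher and student distributions. The T*T factor keeps gradient
    # magnitudes comparable across temperatures.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

    # Hard-label term: ordinary cross-entropy on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

In the offline scheme described in the abstract, `teacher_logits` would come from a frozen pre-trained teacher (computed under `torch.no_grad()`), while online and self-KD variants replace the teacher signal with peer models or auxiliary branches of the same network.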


