Review helps learn better: Temporal Supervised Knowledge Distillation

07/03/2023
by Dongwei Wang, et al.

Reviewing plays an important role in learning. Knowledge acquired at a given time point can be strongly reinforced by previous experience, so the growth of knowledge should exhibit a strong correlation along the temporal dimension. In our research, we find that during network training the evolution of feature maps follows this temporal sequence property, and that proper temporal supervision may further improve training performance. Inspired by this observation, we design a novel knowledge distillation method. Specifically, we extract spatiotemporal features from the different training phases of the student with a convolutional long short-term memory network (Conv-LSTM). We then train the student toward a dynamic target rather than static teacher-network features. This process refines the old knowledge in the student network and uses it to assist current learning. Extensive experiments across various network architectures and different tasks (image classification and object detection) verify the effectiveness and advantages of our method over existing knowledge distillation methods.
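
The abstract describes the method only at a high level. As a rough illustration of the idea, the sketch below shows how feature maps saved from earlier training phases of the student could be fused by a Conv-LSTM into a dynamic distillation target. This is a minimal PyTorch sketch under our own assumptions: the names (ConvLSTMCell, TemporalDistiller, dynamic_target), the history length, and the equal-weight fusion with the teacher feature are illustrative guesses, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell: all four gates come from one
    convolution over the concatenated input and hidden state."""

    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, g, o = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)
        h = o * torch.tanh(c)
        return h, c


class TemporalDistiller(nn.Module):
    """Hypothetical helper: keeps a short history of student feature maps
    from earlier training phases and fuses them into a dynamic target."""

    def __init__(self, feat_ch, history=4):
        super().__init__()
        self.cell = ConvLSTMCell(feat_ch, feat_ch)
        self.history = history
        self.buffer = []  # detached feature maps from previous phases

    @torch.no_grad()
    def update(self, student_feat):
        # Snapshot the current student feature map for later reuse.
        # Assumes a fixed batch size so stored maps share one shape.
        self.buffer.append(student_feat.detach())
        self.buffer = self.buffer[-self.history:]

    def dynamic_target(self, teacher_feat):
        # Roll the Conv-LSTM over the stored history; the final hidden
        # state summarizes the student's learning trajectory so far.
        h = torch.zeros_like(teacher_feat)
        c = torch.zeros_like(teacher_feat)
        for past in self.buffer:
            h, c = self.cell(past, (h, c))
        # Illustrative fusion (our assumption): mix the temporal summary
        # with the teacher feature so the target moves as training runs.
        return 0.5 * h + 0.5 * teacher_feat


def distill_loss(student_feat, target_feat):
    # Standard feature-mimicking loss against the (frozen) dynamic target.
    return F.mse_loss(student_feat, target_feat.detach())
```

A hypothetical training step would call distiller.update(student_feat) at each phase boundary and minimize distill_loss(student_feat, distiller.dynamic_target(teacher_feat)) alongside the task loss; how often the history is refreshed and how the Conv-LSTM itself is optimized are details the abstract does not specify.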


Related research

02/01/2023
Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection
Knowledge distillation addresses the problem of transferring knowledge f...

12/05/2018
Knowledge Distillation from Few Samples
Current knowledge distillation methods require full training data to dis...

09/24/2019
FEED: Feature-level Ensemble for Knowledge Distillation
Knowledge Distillation (KD) aims to transfer knowledge in a teacher-stud...

03/02/2020
Long Short-Term Sample Distillation
In the past decade, there has been substantial progress at training incr...

03/04/2019
TKD: Temporal Knowledge Distillation for Active Perception
Deep neural networks based methods have been proved to achieve outstandi...

08/10/2023
Towards General and Fast Video Derain via Knowledge Distillation
As a common natural weather condition, rain can obscure video frames and...

08/24/2023
Fall Detection using Knowledge Distillation Based Long short-term memory for Offline Embedded and Low Power Devices
This paper presents a cost-effective, low-power approach to unintentiona...
