Knowledge Distillation via Instance-level Sequence Learning

06/21/2021
by Haoran Zhao, et al.

Recently, distillation approaches have been proposed to extract general knowledge from a teacher network to guide a student network. Most existing methods transfer knowledge from the teacher to the student by feeding a sequence of random mini-batches sampled uniformly from the data. Instead, we argue that the compact student network should be guided gradually with samples ordered in a meaningful sequence, so that the gap in feature representation between the teacher and the student can be bridged step by step. In this work, we present a curriculum learning knowledge distillation framework based on instance-level sequence learning. It takes the student network from an early epoch as a snapshot and uses it to create a curriculum for the student's next training phase. We carry out extensive experiments on the CIFAR-10, CIFAR-100, SVHN and CINIC-10 datasets. Compared with several state-of-the-art methods, our framework achieves the best performance with fewer iterations.
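The abstract outlines the core loop: take a snapshot of the student at an early epoch, use it to order the training instances from easy to hard, and run the next training phase on that ordered sequence while distilling from the teacher. The sketch below is a minimal PyTorch illustration of that idea; the ranking criterion (per-sample divergence between the snapshot and the teacher), the linear pacing schedule, and all function names are assumptions made for illustration, not the authors' exact algorithm.

```python
# Hypothetical sketch of instance-level curriculum knowledge distillation.
# Assumes `student`, `teacher`, and `dataset` already exist and that the
# models live on `device`; names and hyperparameters are illustrative only.
import copy
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset

def kd_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    """Standard distillation loss: softened KL term plus hard-label CE."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard

@torch.no_grad()
def rank_instances(snapshot, teacher, dataset, device="cuda"):
    """Score each instance by the snapshot student's divergence from the
    teacher and return dataset indices ordered easy -> hard."""
    snapshot.eval(); teacher.eval()
    scores = []
    for x, _ in DataLoader(dataset, batch_size=256, shuffle=False):
        x = x.to(device)
        s = F.log_softmax(snapshot(x) / 4.0, dim=1)
        t = F.softmax(teacher(x) / 4.0, dim=1)
        per_sample = F.kl_div(s, t, reduction="none").sum(dim=1)
        scores.append(per_sample.cpu())
    return torch.cat(scores).argsort().tolist()   # small divergence = "easy"

def train_phase(student, teacher, dataset, order, epochs, device="cuda"):
    """Train one phase with mini-batches drawn in curriculum order."""
    opt = torch.optim.SGD(student.parameters(), lr=0.05, momentum=0.9)
    loader = DataLoader(Subset(dataset, order), batch_size=128, shuffle=False)
    teacher.eval()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            with torch.no_grad():
                t_logits = teacher(x)
            loss = kd_loss(student(x), t_logits, y)
            opt.zero_grad(); loss.backward(); opt.step()

def curriculum_kd(student, teacher, dataset, phases=3, epochs_per_phase=40):
    """Repeat: snapshot the student, re-rank the data, train the next phase
    on a gradually growing easy-to-hard subset (simple linear pacing)."""
    for phase in range(phases):
        snapshot = copy.deepcopy(student)              # early-epoch student as ranker
        order = rank_instances(snapshot, teacher, dataset)
        keep = int(len(order) * (phase + 1) / phases)  # grow the curriculum each phase
        train_phase(student, teacher, dataset, order[:keep], epochs_per_phase)
    return student
```

In this sketch the snapshot is only used to rank instances, while the gradients always flow through the current student; how the difficulty score and the pacing schedule are actually defined in the paper may differ.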


Related research

Improved Knowledge Distillation via Teacher Assistant: Bridging the Gap Between Student and Teacher (02/09/2019)
Despite the fact that deep neural networks are powerful models and achie...

CES-KD: Curriculum-based Expert Selection for Guided Knowledge Distillation (09/15/2022)
Knowledge distillation (KD) is an effective tool for compressing deep cl...

Student-Teacher Curriculum Learning via Reinforcement Learning: Predicting Hospital Inpatient Admission Location (07/01/2020)
Accurate and reliable prediction of hospital admission location is impor...

Knowledge Network and a Knowledge Network Example (11/22/2019)
Knowledge networks can be defined as social networks that enable the tra...

How to Teach: Learning Data-Free Knowledge Distillation from Curriculum (08/29/2022)
Data-free knowledge distillation (DFKD) aims at training lightweight stu...

Paced-Curriculum Distillation with Prediction and Label Uncertainty for Image Segmentation (02/02/2023)
Purpose: In curriculum learning, the idea is to train on easier samples ...

Preparing Lessons: Improve Knowledge Distillation with Better Supervision (11/18/2019)
Knowledge distillation (KD) is widely used for training a compact model ...
