Self-Referenced Deep Learning

11/19/2018
by Xu Lan, et al.

Knowledge distillation is an effective approach to transferring knowledge from a teacher neural network to a student target network in order to satisfy the low-memory and fast-running requirements of practical deployment. While it produces stronger target networks than the vanilla, non-teacher-based learning strategy, this scheme additionally requires training a large teacher model at considerable computational cost. In this work, we present a Self-Referenced Deep Learning (SRDL) strategy. Unlike both vanilla optimisation and existing knowledge distillation, SRDL distils the knowledge discovered by the in-training target model back into itself to regularise the subsequent learning procedure, thereby eliminating the need to train a large teacher model. SRDL improves model generalisation compared to vanilla learning and conventional knowledge distillation approaches with negligible extra computational cost. Extensive evaluations show that a variety of deep networks benefit from SRDL, resulting in enhanced deployment performance on both coarse-grained object categorisation tasks (CIFAR10, CIFAR100, Tiny ImageNet, and ImageNet) and the fine-grained person instance identification task (Market-1501).
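The abstract describes the mechanism only at a high level, but the core idea of regularising later training with soft targets produced by an earlier snapshot of the same network (rather than by a separate teacher) can be sketched roughly as follows. This is a minimal illustrative sketch assuming a standard temperature-scaled KL distillation loss; the function name, the temperature T, and the mixing weight alpha are assumptions for illustration, not details taken from the paper.

import torch
import torch.nn.functional as F

def self_distillation_step(model, snapshot, images, labels, optimizer,
                           T=4.0, alpha=0.5):
    # Forward pass of the current (in-training) target model.
    logits = model(images)
    ce_loss = F.cross_entropy(logits, labels)

    # Soft targets come from a frozen earlier snapshot of the *same*
    # network; no separate teacher model is ever trained.
    with torch.no_grad():
        ref_logits = snapshot(images)

    # Temperature-scaled KL divergence, as in standard distillation.
    kd_loss = F.kl_div(
        F.log_softmax(logits / T, dim=1),
        F.softmax(ref_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

    # Combine the supervised loss with the self-referenced soft-target loss.
    loss = (1.0 - alpha) * ce_loss + alpha * kd_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

In a two-phase schedule one would first train normally, copy the weights into the frozen snapshot, and then continue training with this combined objective; the exact schedule used by SRDL may differ from this sketch.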


Related research

06/12/2018  Knowledge Distillation by On-the-Fly Native Ensemble
Knowledge distillation is effective to train small and generalisable net...

02/04/2022  Iterative Self Knowledge Distillation – From Pothole Classification to Fine-Grained and COVID Recognition
Pothole classification has become an important task for road inspection ...

12/17/2020  Computation-Efficient Knowledge Distillation via Uncertainty-Aware Mixup
Knowledge distillation, which involves extracting the "dark knowledge" f...

06/24/2021  DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval
In this paper, we address the problem of high performance and computatio...

05/25/2023  VanillaKD: Revisit the Power of Vanilla Knowledge Distillation from Small Scale to Large Scale
The tremendous success of large models trained on extensive datasets dem...

11/21/2017  Knowledge Concentration: Learning 100K Object Classifiers in a Single CNN
Fine-grained image labels are desirable for many computer vision applica...

05/10/2023  Explainable Knowledge Distillation for On-device Chest X-Ray Classification
Automated multi-label chest X-rays (CXR) image classification has achiev...
