Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation

05/17/2019
by Linfeng Zhang, et al.

Convolutional neural networks have been widely deployed in various application scenarios. To extend their applicability to accuracy-critical domains, researchers have investigated deeper or wider network structures, which bring a steep increase in computational and storage cost and slow down response time. In this paper, we propose a general training framework named self distillation, which notably enhances the accuracy of convolutional neural networks by shrinking the network rather than enlarging it. Unlike traditional knowledge distillation, a knowledge-transfer method between networks that forces a student network to approximate the softmax outputs of a pre-trained teacher network, the proposed self distillation framework distills knowledge within the network itself. The network is first divided into several sections, and the knowledge in the deeper sections is then squeezed into the shallower ones. Experiments further demonstrate the generality of the framework: accuracy improves by 2.65% on average. In addition, self distillation enables flexible, depth-wise scalable inference on resource-limited edge devices. Our code will be released on GitHub soon.
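To make the framework concrete, the following is a minimal, hypothetical PyTorch-style sketch of a self-distillation training loss: each shallow section's classifier is trained on the ground-truth labels and on the softened outputs of the deepest classifier, which acts as the in-network teacher. The function name, the hyperparameters (alpha, temperature), and the use of a plain KL term are illustrative assumptions, not the authors' released implementation (which also includes feature-level hints).

```python
# Hypothetical sketch of a self-distillation loss, not the paper's official code.
import torch
import torch.nn.functional as F


def self_distillation_loss(section_logits, labels, alpha=0.3, temperature=3.0):
    """section_logits: list of logits [shallow_1, ..., shallow_k, deepest]."""
    deepest = section_logits[-1]
    # The deepest classifier is supervised with ordinary cross-entropy.
    loss = F.cross_entropy(deepest, labels)
    # Softened outputs of the deepest section serve as the in-network teacher.
    soft_targets = F.softmax(deepest.detach() / temperature, dim=1)
    for logits in section_logits[:-1]:
        # Each shallow section learns from both the labels and the teacher.
        hard = F.cross_entropy(logits, labels)
        soft = F.kl_div(
            F.log_softmax(logits / temperature, dim=1),
            soft_targets,
            reduction="batchmean",
        ) * (temperature ** 2)
        loss = loss + (1.0 - alpha) * hard + alpha * soft
    return loss


if __name__ == "__main__":
    # Random tensors stand in for the outputs of three network sections.
    labels = torch.randint(0, 10, (8,))
    logits = [torch.randn(8, 10, requires_grad=True) for _ in range(3)]
    print(self_distillation_loss(logits, labels).item())
```

At inference time the same structure permits depth-wise scalable prediction: a resource-limited device can stop at an early section's classifier instead of running the full network.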
