1 Introduction
Deep neural networks (DNNs) have achieved massive success in artificial intelligence by substantially improving the state-of-the-art performance in various applications. For large-scale image classification [32], one of the core applications in computer vision, the accuracy reached by DNNs has become comparable to that of humans on several benchmark datasets. The recent progress towards such impressive accomplishments is largely driven by exploring deeper and wider network architectures. Despite the significant performance boost of modern DNNs [11, 46, 42], the heavy computation and memory cost of these deep and wide networks makes it difficult to directly deploy the trained networks on embedded systems for real-time applications. In the meantime, the demand for low-cost networks is increasing for applications on mobile devices and autonomous cars.
Do DNNs really need to be deep and wide? Early theoretical studies suggest that shallow networks are powerful and can approximate arbitrary functions [8, 13]. More recent theoretical results show that depth is indeed beneficial for the expressive capacity of networks [9, 39, 23, 33]. Moreover, overparameterized and redundant networks, which can easily memorize and overfit the training data, surprisingly generalize well in practice [48, 2]. Various explanations have been investigated, but the secret of deep and wide networks remains an open problem.
On the other hand, empirical studies suggest that the performance of shallow networks can be improved by learning from large networks following the student-teacher strategy [4, 3, 40, 12]. In these approaches, the student networks are forced to mimic the output probability distribution of the teacher networks in order to transfer the knowledge embedded in the soft targets. The intuition is that the
dark knowledge [12], which contains the relative probabilities of "incorrect" answers provided by deep and wide networks, is informative and representative. For example, suppose we want to classify an image over the label set (dog, cat, car). Given an image of a dog, a good teacher network may mistakenly recognize it as a cat with small probability, but should seldom recognize it as a car; the soft target of the output distribution over categories for this image, e.g., (0.7, 0.3, 0), contains more information, such as categorical correlation, than the hard target of the one-hot vector, (1, 0, 0). Training is accomplished by minimizing a predetermined loss that measures the similarity between student and teacher outputs, such as the Kullback-Leibler (KL) divergence.
In previous studies, shallow and wide student networks, which potentially have more parameters than the deep teacher networks, have been trained by knowledge transfer [3, 40]; an ensemble of networks has been used as the teacher to train a student network of similar architecture and capacity [12]; and, in particular, a small deep and thin network has been trained to replace a shallow and wide network for acceleration [30], given that the best teacher at that time was the shallow and wide VGGNet [36]. Since then, the design of network architectures has advanced. ResNet [11]
has significantly deepened the networks by introducing residual connections, and wide residual networks (WRNs)
[46] suggest that widening the networks leads to better performance. It is unclear whether the dark knowledge from the state-of-the-art networks based on residual connections, which are both deep and wide, can help train a shallow and/or thin network (also with residual connections) for acceleration.
In this paper, we focus on improving the performance of a shallow and thin modern network (student) by learning from the dark knowledge of a deep and wide network (teacher). Both the student and teacher networks are convolutional neural networks (CNNs) with residual connections, and the student network is shallow and thin so that it runs much faster than the teacher network during inference. Instead of adopting the classic student-teacher strategy of forcing the output of a student network to exactly mimic the soft targets produced by a teacher network, we introduce conditional adversarial networks to transfer the dark knowledge from teacher to student. We empirically show that the loss learned by adversarial training has an advantage over the predetermined loss of the student-teacher strategy, especially when the student network has relatively small capacity.
Our learning-loss approach is inspired by the recent success of conditional adversarial networks for various image-to-image translation applications [17]. We show that generative adversarial nets (GANs) can benefit a task that is very different from image generation. In the student-teacher strategy, GAN can help preserve the multimodal nature of the output distribution. (We explain the multimodality with the previous example: the output distribution for a dog image can also be (0.8, 0.2, 0); in fact, there are infinitely many soft targets that correctly predict the label.) It is not only unnecessary, but also difficult, to force a student network to exactly mimic one of the soft targets (or the average/ensemble of several teacher networks), because the student has smaller capacity than the teacher. By introducing a discriminator as in GAN, the network automatically learns a good loss to transfer the correlation between classes, i.e., the dark knowledge from the teacher, while also preserving the multimodality. We summarize the motivation for our approach in Figure 1.
2 Related work
Network acceleration has gained increasing interest due to the growing needs of real-time applications in artificial intelligence. The techniques can be roughly divided into three categories: low precision, parameter pruning and factorization, and knowledge distillation. Low-precision methods use a limited number of bits to store and operate on the network weights; the extreme case is binary networks that use only 1 bit to represent each number [28, 21]. The acceleration of these methods is somewhat conceptual because mainstream GPUs have only limited support for low-precision computation. Networks can also be directly modified by pruning and factorizing the redundant weights, either as a post-processing step after training, or as a fine-tuning stage [22, 14]. These methods often assume the network weights are sparse or low rank, and aim to construct networks of similar architecture with a reduced number of weights. Moreover, network pruning papers mostly report speedup indirectly measured in the number of basic operations, rather than inference time directly.
Knowledge distillation is a principled approach to training small neural networks for acceleration. We slightly generalize the term knowledge distillation to represent all methods that train student networks by transferring knowledge from teacher networks. Bucilua et al. [4] pioneered this approach for model compression. Ba and Caruana [3], and Urban et al. [40], trained shallow but wide students by learning from a deep teacher, which was not primarily designed for acceleration. Hinton et al. [12] generalized the previous methods by introducing a new metric between the output distributions of teacher and student, as well as a tuning parameter. Variants of knowledge distillation have also been applied to many different tasks, such as semantic segmentation [31], pedestrian detection [35], face model compression [24], metric learning [6, 38], and regularization [34]. A recent preprint [18] presented promising preliminary results on CIFAR-10 by learning a small ResNet from a large ResNet. Another line of research focuses on transferring intermediate features instead of soft targets from teacher to student [30, 41, 47, 44, 15, 49, 45]. Our approach is complementary to those methods: we directly follow [12] in designing a new metric between the output distributions of teacher and student, and adversarial networks are used to learn the metric in place of hand-engineering.
Generative adversarial networks (GANs) have been extensively studied since [10]. GAN trains two neural networks, the generator and the discriminator, in an adversarial learning process that alternately updates the two networks. We apply GAN in the conditional setting [26, 17, 29, 27], where the generator is conditioned on input images. Unlike previous works that focused on image generation, we aim at learning a loss function for knowledge distillation, which requires quite different architectural choices for our generator and discriminator.
3 Learning loss for knowledge distillation
In this section, we introduce the learning-loss approach based on conditional adversarial networks. We start with a recap of modern network architectures (section 3.1), and then describe the dark knowledge that can be transferred from teacher to student networks (section 3.2). The GAN-based approach for learning the loss is detailed in section 3.3.
3.1 Neural networks with residual connection
Modern neural networks are built by stacking basic components. For computer vision tasks, residual blocks [11, 46] are the basic components used to build deep neural networks that achieve state-of-the-art performance. Both the student and teacher networks in this paper are based on the residual convolutional blocks shown in Figure 2 (left). The first layer is a convolution with 16 filters, followed by 3 groups of residual blocks (2n convolution layers per group), where each block contains two convolution layers equipped with batch normalization [16], ReLU [20] and dropout [37]. The output feature map is subsampled twice, and the number of filters is doubled at each subsampling, as shown in Table 1. After the last residual block come global average pooling, a fully-connected layer, and softmax. In the following sections, the architecture of wide residual networks (WRNs) is denoted as WRN-d-m following [46], where d is the total depth and m is the widen factor used to increase the number of filters in each residual block. Our teacher network is a deep and wide WRN with large d and m, while the student network is a shallow and thin WRN with small d and m.
Group  Output size  # layers  # filters
group1  32×32  2n  16m
group2  16×16  2n  32m
group3  8×8  2n  64m
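As a concrete illustration of Table 1, the sketch below computes the per-group layer and filter counts from the number of residual blocks per group n and the widen factor m. The depth formula d = 6n + 4 is our assumption, chosen to be consistent with the architectures used later (the teacher WRN-40-10 and the student WRN-10-4); it is not stated in the table itself.

```python
def wrn_config(n, m):
    """Per-group layer/filter counts for a WRN-d-m, following Table 1."""
    groups = [
        {"output": "32x32", "layers": 2 * n, "filters": 16 * m},
        {"output": "16x16", "layers": 2 * n, "filters": 32 * m},
        {"output": "8x8",   "layers": 2 * n, "filters": 64 * m},
    ]
    depth = 6 * n + 4   # assumed depth formula; see the lead-in above
    return depth, groups

teacher_depth, teacher_groups = wrn_config(n=6, m=10)   # WRN-40-10
student_depth, student_groups = wrn_config(n=1, m=4)    # WRN-10-4
```

Under this assumption the teacher's last group has 64 × 10 = 640 filters, while the student's first group has only 64.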
3.2 Knowledge distillation
The output of a neural network for image classification is a probability distribution over categories. The probability is generated by applying a softmax function over the output of the last fully-connected layer, also known as the logits. The dimensions of the logits from the student and teacher networks are both equal to the number of categories. Rich information is embedded in the output of a teacher network, and we can use the logits to transfer this knowledge to the student network [4, 3, 40, 12]. We review the method in [12], which provides a metric between student and teacher logits that generalized previous methods for knowledge distillation. We denote this work as KD for simplicity.
The logits vector generated by the pretrained teacher network for an input image is represented by z^t, where the dimension of the vector is the number of categories C. We now consider training a student network S to generate student logits z^s. By introducing a parameter T called the temperature
, the generalized softmax layer converts a logits vector z to a probability distribution p,

    p_i = exp(z_i / T) / Σ_j exp(z_j / T),    (1)

where a higher temperature T produces a softer probability distribution over categories. The regular softmax for classification is the special case of the generalized softmax with T = 1. We write p^t and p^s for the softened teacher and student distributions.
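A minimal NumPy sketch of the generalized softmax in Equation (1); the logits values are hypothetical, chosen to match the running (dog, cat, car) example.

```python
import numpy as np

def soft_probs(z, T=1.0):
    e = np.exp((z - z.max()) / T)   # subtract the max for numerical stability
    return e / e.sum()

z = np.array([3.0, 1.0, -2.0])      # hypothetical logits for (dog, cat, car)
p1 = soft_probs(z, T=1.0)           # regular softmax
p5 = soft_probs(z, T=5.0)           # softer distribution, same argmax
```

Raising T from 1 to 5 spreads probability mass across the classes while leaving the predicted label unchanged.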
Hinton et al. [12] proposed to minimize the KL divergence between the teacher and student outputs,

    L_KD = KL(p^t || p^s) = Σ_i p^t_i log(p^t_i / p^s_i).    (2)

It can be shown that when T is very large, minimizing L_KD becomes equivalent to minimizing the Euclidean distance between teacher and student logits, ||z^t - z^s||^2.
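The large-T behavior can be checked numerically. The sketch below assumes zero-mean logits, under which a second-order expansion gives KL(p^t || p^s) ≈ ||z^t - z^s||^2 / (2 C T^2), where p^t and p^s are the teacher and student distributions softened at temperature T; the exact 1/(2CT^2) constant is our derivation, not stated in the text.

```python
import numpy as np

def soft_probs(z, T):
    e = np.exp((z - z.max()) / T)
    return e / e.sum()

rng = np.random.default_rng(0)
C, T = 10, 100.0
zt = rng.normal(size=C); zt -= zt.mean()   # zero-mean "teacher" logits
zs = rng.normal(size=C); zs -= zs.mean()   # zero-mean "student" logits

pt, ps = soft_probs(zt, T), soft_probs(zs, T)
kl = np.sum(pt * (np.log(pt) - np.log(ps)))
approx = np.sum((zt - zs) ** 2) / (2 * C * T ** 2)
# kl and approx agree to leading order in 1/T
```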
When image-label pairs are provided, the cross-entropy loss for supervised training of a neural network can be represented as

    L_S = - Σ_i y_i log p^s_i,    (3)

where y is the one-hot label vector and p^s is computed with temperature T = 1. L_S is the commonly used loss for pure supervised learning in image classification from annotated data.
Finally, Hinton et al. [12] proposed to minimize a weighted sum of the L_KD loss and the L_S loss to train a student network,

    min_S  λ L_KD + L_S.    (4)
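Putting Equations (2)-(4) together, here is a sketch of the KD objective for a single example; the specific weight λ (`lam` below) and logits values are illustrative choices, not taken from the paper.

```python
import numpy as np

def soft_probs(z, T):
    e = np.exp((z - z.max()) / T)
    return e / e.sum()

def kd_loss(zs, zt, y, T=5.0, lam=0.9):
    pt, ps = soft_probs(zt, T), soft_probs(zs, T)
    l_kd = np.sum(pt * (np.log(pt) - np.log(ps)))   # Eq. (2): KL(p^t || p^s)
    l_s = -np.log(soft_probs(zs, 1.0)[y])           # Eq. (3): cross-entropy, T = 1
    return lam * l_kd + l_s                         # Eq. (4): weighted sum

zt = np.array([2.0, 0.5, -1.0])                     # hypothetical teacher logits
same = kd_loss(zt, zt, y=0)     # KL term vanishes when student matches teacher
worse = kd_loss(np.array([0.0, 2.0, 0.0]), zt, y=0)
```

When the student logits equal the teacher logits, only the supervised term remains; a mismatched student pays both a KL penalty and a larger cross-entropy.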
3.3 Learning loss with adversarial networks
3.3.1 Overview
The main idea of learning the loss for transferring knowledge from teacher to student is depicted in Figure 3. Instead of forcing the student to exactly mimic the teacher by minimizing the KL divergence term in Equation (4), the knowledge is transferred from teacher to student through a discriminator in our GAN-based approach. This discriminator is trained to distinguish whether the output logits come from the teacher or the student network, while the student (the generator) is adversarially trained to fool the discriminator, i.e., to output logits so similar to the teacher logits that the discriminator cannot distinguish them.
There are several benefits to the proposed method. First, the learned loss can be effective, as has already been demonstrated for several image-to-image translation tasks [17]. Moreover, the GAN-based approach relieves the pain of hand-engineering the loss. Though the parameter tuning and hand-engineering of the loss are, in some sense, replaced by hand-engineering the discriminator network, our empirical study shows that the performance is less sensitive to the discriminator architecture than to the temperature parameter in knowledge distillation. The second benefit is closely related to the multimodality of the network output. Let us revisit the example of classifying a dog image over the label set (dog, cat, car). Both (0.7, 0.3, 0) and (0.8, 0.2, 0) are outputs that give the correct prediction (dog), so it is not necessary to exactly mimic the output of one teacher network to achieve good student performance. Given the small capacity of the student network, it may not be able to exactly reproduce one particular output modality. The use of a discriminator relaxes the rigid coupling between student and teacher. The relative similarities between the categories can be captured by the discriminator trained from the multimodal logits of the teacher. Knowledge transferred from the discriminator directs the student to produce outputs similar to the two vectors above and different from a vector like (0.5, 0.1, 0.4).
3.3.2 Discriminator update
We now describe the proposed method in a more rigorous way. The student and discriminator in Figure 3 are alternately updated in the GAN-based approach. Let us first look at the update of the discriminator, which is trained to distinguish teacher and student logits. We use a multilayer perceptron (MLP) as the discriminator; its building block, a residual block, is shown in Figure 2 (right). The number of nodes in each layer is the same as the dimension of the logits, i.e., the number of categories C. We denote the discriminator that predicts the binary value "Real/Fake" as D. To train D, we fix the student network and seek to maximize the log-likelihood (equivalently, to minimize the binary cross-entropy loss),

    L_adv = E_{z^t}[log D(z^t)] + E_{z^s}[log(1 - D(z^s))].
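As a toy illustration of the discriminator update (not the paper's MLP discriminator), the sketch below fits a single logistic unit D(z) = σ(w·z + b) by gradient ascent on the binary log-likelihood above, using synthetic stand-ins for teacher and student logits.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_step(w, b, z_teacher, z_student, lr=0.1):
    # One gradient-ascent step on  mean log D(z_t) + mean log(1 - D(z_s)).
    dt = sigmoid(z_teacher @ w + b)
    ds = sigmoid(z_student @ w + b)
    grad_w = z_teacher.T @ (1 - dt) / len(dt) - z_student.T @ ds / len(ds)
    grad_b = np.mean(1 - dt) - np.mean(ds)
    return w + lr * grad_w, b + lr * grad_b

rng = np.random.default_rng(1)
C = 5
zt = rng.normal(1.0, 1.0, size=(64, C))   # stand-in "teacher" logits
zs = rng.normal(-1.0, 1.0, size=(64, C))  # stand-in "student" logits
w, b = np.zeros(C), 0.0
for _ in range(50):
    w, b = discriminator_step(w, b, zt, zs)

d_teacher = sigmoid(zt @ w + b).mean()    # should approach 1 ("Real")
d_student = sigmoid(zs @ w + b).mean()    # should approach 0 ("Fake")
```

After a few dozen steps the unit separates the two synthetic distributions, mirroring what the MLP discriminator does on real logits.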
The plain adversarial loss for knowledge distillation, which follows the original GAN [10], faces two major challenges. First, the adversarial training process is difficult [43]. Even if we replace the log-likelihood with advanced techniques such as Wasserstein GAN [1] or Least Squares GAN [25], the training is still slow and unstable in our experiments. Second, the discriminator captures the high-level statistics of teacher and student outputs, but low-level alignment is missing. The student output for an image can be aligned to a completely unrelated teacher sample when optimizing L_adv, which means a dog image can generate a logits vector that predicts cat. One extreme example is that the student always mispredicts dog as cat and cat as dog, yet the overall output distribution may still be close to the teacher's.
To tackle these problems, we modify the discriminator objective to also predict the class labels, inspired by [5, 27]. In this case, the output of the discriminator is a (C+1)-dimensional vector with C Label predictions and one Real/Fake prediction. We now maximize

    L_adv + L_L,    (5)
where L_adv is the previously defined adversarial loss over Real/Fake, and L_L is the supervised log-likelihood of the discriminator over Labels, written as

    L_L = E_{z^t}[log D_cls(z^t)_y] + E_{z^s}[log D_cls(z^s)_y],

where D_cls denotes the Label predictions of the discriminator and y is the ground-truth category.
We assume Label and Real/Fake are conditionally independent in Equation (5). To avoid this assumption, we could instead maximize the log-likelihood of the discriminator to predict the tuple (Label, Real/Fake), which requires the discriminator to predict a 2C-dimensional vector. In our experiments, optimizing the GAN-based loss with or without the independence assumption achieves almost identical results. Hence we always use the independence assumption for a more compact discriminator. Note that Equation (5) has the same form as the auxiliary classifier GANs [27].
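The sketch below evaluates the objective of Equation (5) for one (teacher, student) pair; the output layout (C Label scores followed by one Real/Fake score) and the softmax/sigmoid readouts are illustrative assumptions consistent with the independence assumption above.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_objective(d_teacher, d_student, y):
    """d_* are (C+1)-dim discriminator outputs: d[:C] Label scores, d[C] Real/Fake."""
    l_adv = np.log(sigmoid(d_teacher[-1])) + np.log(1.0 - sigmoid(d_student[-1]))
    l_label = (np.log(softmax(d_teacher[:-1])[y])
               + np.log(softmax(d_student[:-1])[y]))
    return l_adv + l_label   # maximized when updating the discriminator

# Toy outputs for C = 3 categories, true label y = 0.
dt = np.array([4.0, 0.0, 0.0, 3.0])    # confident label 0, scored "Real"
ds = np.array([4.0, 0.0, 0.0, -3.0])   # confident label 0, scored "Fake"
obj = discriminator_objective(dt, ds, y=0)
wrong = discriminator_objective(np.array([0.0, 4.0, 0.0, 3.0]), ds, y=0)
```

Both log-likelihood terms are negative, and mislabeling the teacher's sample lowers the objective, which is what drives the category-level alignment.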
The adversarial training becomes much more stable when the proposed discriminator predicts category Labels in addition to Real/Fake. Moreover, the discriminator can provide category-level alignment between the outputs of student and teacher: the student outputs for a dog image are more likely to learn from teacher outputs that predict dog.
The GAN-based loss still lacks instance-level knowledge. To exploit this knowledge to further boost performance, we started by investigating conditional discriminators, in which the input of the discriminator is the logits concatenated with a conditional vector. We tried the following conditional vectors: the image with a convolutional embedding; the label one-hot vector with an embedding; and the extracted teacher logits. The embedding includes several weight layers and outputs a vector of the same size as the logits. However, it turns out the conditional vectors are easily ignored during the training of the discriminator. The conditional discriminator does not help in practice, and we instead introduce a more direct instance-level alignment for training the student network below.
3.3.3 Student update
We update the student network after updating the discriminator in each iteration. When updating the student network S, we aim to fool the discriminator by fixing the discriminator and minimizing the adversarial loss L_adv. In the meantime, the student network is also trained to satisfy the auxiliary classifier of the discriminator by maximizing L_L. Besides the category-level alignment provided by L_L, we introduce instance-level alignment between teacher and student outputs as

    L_L1 = ||z^t - z^s||_1.    (6)
The L1 norm has been found helpful in GAN-based approaches for image-to-image translation [17].
Finally, we combine the learned loss with the supervised loss L_S in (3), and minimize the following objective for the student network S,

    min_S  L_S + L_adv - L_L + L_L1.    (7)
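The sketch below shows how the pieces of Equation (7) combine for one example, with the scalar loss terms passed in precomputed; only the L1 term of Equation (6) is expanded, and all numeric values are hypothetical.

```python
import numpy as np

def student_objective(l_s, l_adv, l_label, z_student, z_teacher):
    l_l1 = np.abs(z_teacher - z_student).sum()   # Equation (6): L1 alignment
    return l_s + l_adv - l_label + l_l1          # Equation (7): student objective

z_t = np.array([2.0, -1.0, 0.5])                 # hypothetical teacher logits
close = student_objective(0.3, -0.7, -0.2, np.array([1.8, -0.9, 0.4]), z_t)
far = student_objective(0.3, -0.7, -0.2, np.array([-2.0, 1.0, 0.0]), z_t)
```

With the other terms held fixed, a student whose logits drift away from the teacher's pays a proportionally larger L1 penalty.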
The sign of L_L is flipped between (5) and (7) because both the discriminator and the student are trained to preserve the category-level knowledge: the discriminator maximizes L_L in (5), while the student minimizes -L_L, i.e., also maximizes L_L, in (7).
The final loss in (7) is a combination of the learned loss for knowledge distillation and the supervised loss for the neural network, and may look complicated at first glance. However, each component of the loss is relatively simple. Moreover, since both the student and the discriminator are learned, there are no explicit parameters to be tuned in the loss function. Our experiments in the next section suggest that the performance of the proposed method is reasonably insensitive to the discriminator architecture, and that the learned loss can outperform the hand-engineered loss for knowledge distillation.
4 Experiments
We present the experimental results in this section. The implementation details and experimental settings are provided in section 4.1. We show the benefits of our proposed method compared to knowledge distillation in section 4.2, and then analyze the different loss components of the proposed method in section 4.3. The effect of the depth and width of the student network is presented in section 4.4, followed by a discussion of the trade-off between classification accuracy and inference time in section 4.5. Finally, in section 4.6, we show a qualitative visualization of the output distributions for student, teacher, and knowledge distillation.
4.1 Experimental setting
We consider three image classification datasets: ImageNet32 [7], CIFAR-10 and CIFAR-100 [19]. ImageNet32 is a downsampled version of the ImageNet 2012 challenge dataset [32], which contains 1.28M training images and 50K validation images for 1K classes; all images are downsampled to 32×32. The CIFAR datasets contain 50K training images and 10K validation images of 10 and 100 classes, respectively. The images are also 32×32. In all the experiments, we perform light data augmentation with horizontal flipping, padding and cropping on input images as in [11].
We use wide residual networks (WRNs) [46] as both student and teacher networks. The residual blocks are shown in Figure 2 (left) and the network architectures are in Table 1. WRN-d-m denotes a network with depth d and widen factor m. The teacher network is a fixed WRN-40-10, while the student network has varying depth and width in different experiments. A dropout ratio of 0.3 is used for all WRNs. We use stochastic gradient descent (SGD) as the optimizer, and set the initial learning rate to 0.1, momentum to 0.9, and weight decay to 1e-4. For the CIFAR datasets, we use minibatch size 128 and train for 200 epochs with the learning rate divided by 10 at epochs 80 and 160. For ImageNet32, we use minibatch size 256 and train for 70 epochs with the learning rate divided by 10 at epochs 25 and 50.
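The step schedules above can be written as a small helper; this is an illustration of the stated hyperparameters, not the paper's training code.

```python
# Learning-rate schedules described above: divide by 10 at the stated epochs.
def learning_rate(epoch, dataset="cifar", base=0.1):
    milestones = (80, 160) if dataset == "cifar" else (25, 50)  # else: ImageNet32
    drops = sum(epoch >= m for m in milestones)
    return base / (10 ** drops)
```

For example, the CIFAR schedule yields 0.1 for epochs 0-79, 0.01 for epochs 80-159, and 0.001 afterwards.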
We use a multilayer perceptron (MLP) as the discriminator in the GAN-based approach. A 3-layer MLP is used for most of the experiments except in section 4.3, where we study the effect of discriminator depth. To speed up the experiments, the logits of the teacher network are generated offline and stored in memory. For training the discriminator, we use SGD with the same scheduler as in training the student network, but a smaller initial learning rate of 1e-3. The logits pass through a batch normalization layer before the MLP. The dropout ratio is also set to 0.3.
The implementation is in PyTorch. The results below are the median of five random runs.
4.2 Benefits of learning loss
We first show that the proposed method is effective for transferring knowledge from teacher to student. Table 2 shows the classification error rates on the three benchmark datasets. The teacher is the deep and wide WRN-40-10. The student is much shallower and thinner: WRN-10-4 for the CIFAR datasets, and WRN-22-4 for ImageNet32. We choose a larger student network for ImageNet32 because the dataset contains more samples and categories. Sections 4.4 and 4.5 discuss how to choose the student architecture wisely.
Method  CIFAR-10  CIFAR-100  ImageNet32
Student  7.46  28.52  48.2
Teacher  4.19  20.62  38.41
KD (T=1)  7.27  28.62  49.37
KD (T=2)  7.3  28.33  49.48
KD (T=5)  7.02  27.06  49.63
KD (T=10)  6.94  27.07  51.12
Ours  6.09  25.75  47.39
The first two rows of Table 2 show the performance of pure supervised learning for the student and teacher networks, without any knowledge transfer. We then compare our GAN-based approach with the knowledge distillation (KD) proposed in [12] and reviewed in section 3.2. We choose the temperature parameter T following the original work. The GAN-based approach is detailed in section 3.3, and no parameter is tuned.
We have several observations from Table 2. The deep and wide teacher performs much better than the shallow and thin student trained by pure supervised learning. The error rate of the small network trained with the student-teacher strategy is lower bounded by the teacher's error rate, as expected. The baseline method KD helps the training of small networks on the two CIFAR datasets, but does not help on ImageNet32. We conjecture the reason to be that the capacity of the student is too small to learn from knowledge distillation on a larger dataset such as ImageNet32. The temperature parameter T introduced in KD is useful: for the CIFAR datasets, KD performs better when T is large, and T = 5 and T = 10 perform similarly. The proposed method improves the performance of the small network on all three datasets, and outperforms KD by a clear margin.
4.3 Analysis of the proposed method
We discuss the proposed method in more detail in this section. Figure 4 presents the training curves of the small student network, WRN-10-4, on the CIFAR-100 dataset. The loss of the discriminator (blue solid line) gradually decreases, which suggests the adversarial training steadily makes progress. The error rates of the GAN-based method on both training and testing data are decreasing. The testing error rate of the GAN-based method is consistently better than that of pure supervised training of the student model, and looks more stable between epochs 50 and 100. Surprisingly, the training error rate of the GAN-based method is slightly worse than pure supervised learning, which suggests knowledge transfer can be more beneficial for generalization.
Loss composition  CIFAR-10  CIFAR-100
L_S  7.46  28.52
L_adv + L_L  14.82  47.04
L_S + L_adv + L_L  6.56  27.27
L_S + L_L1  6.44  26.66
L_S + L_adv + L_L + L_L1  6.09  25.75
Next, we look into the effect of enabling and disabling different components of the GAN-based approach, as shown in Table 3. By combining the adversarial loss L_adv and the category-level knowledge transfer L_L (Equation (5)), the learned loss performs reasonably well. However, the indirect knowledge provided by the learned loss alone is not as good as pure supervised learning with L_S. Both category-level knowledge transfer by L_L and instance-level knowledge transfer by L_L1 can improve the performance of training the student network. The final approach combines these components and performs the best without parameter tuning.
Depth of MLP  1  2  3  4
Error rate  26.13  25.88  25.75  27.42
Finally, we present the effect of the depth of the MLP discriminator in Table 4. The error rate is relatively insensitive to the depth of the discriminator. The error rate slightly decreases as the depth increases while the discriminator remains generally shallow. When the discriminator becomes deeper, the error rate increases as the adversarial training becomes unstable. Decreasing the learning rate of the discriminator sometimes helps, but it would introduce parameter tuning into the proposed method. The 3-layer MLP works reasonably well and is used for all our experiments to keep the GAN-based method simple.
4.4 Does WRN need to be deep and wide?
WRN  Size (M)  Time (s)  Student  KD (T=5)  Ours
10-2  0.32  0.14  33.22  32.74  32.1
10-4  1.22  0.32  28.52  27.16  25.75
10-6  2.72  0.60  27.27  25.39  24.39
10-8  4.81  0.82  26.23  24.31  23.38
10-10  7.49  1.17  26.04  23.49  23.02
16-4  2.77  0.71  24.73  22.9  22.73
22-4  4.32  1.07  23.61  22.02  21.66
28-4  5.87  1.44  23.2  21.61  21.00
34-4  7.42  1.73  23.22  21.2  20.73
40-10  55.9  8.73  20.62  -  -
Urban et al. [40] asked a similar question for convolutional neural networks and claimed that the network should have at least a few layers of convolutions. We study the modern WRN architecture built from residual blocks. Our empirical study suggests that even for WRN, the network has to be deep and wide to some extent.
Table 5 presents the results of pure supervised learning, knowledge distillation [12] and the GAN-based approach for different student networks on CIFAR-100. We first fix the depth of the WRN at 10, and change the widen factor from 2 to 10; 10 is the minimum depth for our WRN architecture, as the depth has to be of the form 6n + 4. We then fix the width at 4, and increase the depth from 10 to 34. The parameter size is in millions, and the inference time is in seconds per minibatch of 100 samples on CPU.
When the student is very small, such as WRN-10-2, it is difficult to transfer knowledge from teacher to student because the student is limited by its network capacity. When the student is large, such as WRN-34-4, both knowledge distillation and the GAN-based approach can improve the performance close to the level of the teacher. The advantage of the proposed method is observed at all depths and widths, but is most pronounced for relatively small students such as WRN-10-4. Increasing depth is more effective than increasing width for WRN: for example, WRN-34-4 has fewer parameters than WRN-10-10, but achieves a lower error rate.
4.5 Training student for acceleration
A shallow and thin network is much easier to deploy in practice. We present the trade-off between error rate, inference time and parameter size in Figure 5. The figure is generated from Table 5 by changing the architecture of the student network. A larger student network is more accurate but also slower. For networks of similar size, such as WRN-10-10 and WRN-34-4, the deeper network achieves a lower error rate, while the wider network runs slightly faster. The student-teacher strategy can help improve the classification performance of the student network. When the student network is relatively large, such as WRN-34-4, the student network trained by the GAN-based approach can achieve an error rate comparable to the teacher WRN-40-10, while being 7x smaller and 5x faster. Compared to the baseline student trained by pure supervised learning, the GAN-based approach decreases the absolute error rate by 2.5%.
4.6 Visualization of distribution
In this last section of experimental results, we present a qualitative visualization for the GAN-based approach. Figure 6 presents the scaled histograms of the predictions for category 85 in CIFAR-100. The histograms are calculated on the 10K testing samples, in which 100 samples are from category 85 and labeled as positive (green in the figure), and the other 9.9K are labeled as negative (blue in the figure). Each histogram is normalized to sum to one for positives and negatives, respectively. The three plots represent the distributions predicted by the student network trained by pure supervised learning, the student network trained by the GAN-based approach, and the teacher network. The histogram in the middle is similar to the histogram on the right, which suggests the GAN-based approach effectively transfers knowledge from teacher to student.
5 Conclusion and discussion
We study the student-teacher strategy for network acceleration in this paper. We propose a GAN-based approach to learn the loss for transferring knowledge from teacher to student. We show that the GAN-based approach can improve the training of the student network, especially when the student network is shallow and thin. Moreover, we empirically study the effect of network capacity when adopting a modern network as the student, and provide guidelines for wisely choosing a student to balance error rate and inference time. In specific settings, we can train a student that is 7x smaller and 5x faster than the teacher without loss of accuracy.
The GAN-based approach is stable and easy to implement after applying several advanced techniques from the GAN literature. The current implementation uses the stored logits from the teacher network to save GPU memory and computation; generating teacher logits on the fly with dropout could make the adversarial training more reliable. Finally, the GAN-based approach can be naturally extended to use an ensemble of networks as the teacher: the logits of multiple teacher networks can be fed into the discriminator for better performance. We will investigate these ideas in future work.
References
 [1] M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN. ICML, 2017.
 [2] D. Arpit, S. Jastrzebski, N. Ballas, D. Krueger, E. Bengio, M. S. Kanwal, T. Maharaj, A. Fischer, A. Courville, Y. Bengio, et al. A closer look at memorization in deep networks. ICML, 2017.
 [3] J. Ba and R. Caruana. Do deep nets really need to be deep? In NIPS, pages 2654–2662, 2014.
 [4] C. Buciluǎ, R. Caruana, and A. Niculescu-Mizil. Model compression. In KDD, pages 535–541. ACM, 2006.
 [5] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel. Infogan: Interpretable representation learning by information maximizing generative adversarial nets. In NIPS, pages 2172–2180, 2016.
 [6] Y. Chen, N. Wang, and Z. Zhang. Darkrank: Accelerating deep metric learning via cross sample similarities transfer. arXiv preprint arXiv:1707.01220, 2017.
 [7] P. Chrabaszcz, I. Loshchilov, and F. Hutter. A downsampled variant of imagenet as an alternative to the cifar datasets. arXiv preprint arXiv:1707.08819, 2017.

 [8] G. Cybenko. Approximation by superpositions of a sigmoidal function. MCSS, 2(4):303–314, 1989.
 [9] R. Eldan and O. Shamir. The power of depth for feedforward neural networks. In COLT, pages 907–940, 2016.
 [10] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In NIPS, pages 2672–2680, 2014.
 [11] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, pages 770–778, 2016.
 [12] G. Hinton, O. Vinyals, and J. Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015.
 [13] K. Hornik, M. Stinchcombe, and H. White. Multilayer feedforward networks are universal approximators. Neural networks, 2(5):359–366, 1989.
 [14] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017.
 [15] Z. Huang and N. Wang. Like what you like: Knowledge distill via neuron selectivity transfer. arXiv preprint arXiv:1707.01219, 2017.
 [16] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, pages 448–456, 2015.
 [17] P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. CVPR, 2017.
 [18] S. W. Kim and H.-E. Kim. Transferring knowledge to smaller network with class-distance loss. ICLR Workshop, 2017.
 [19] A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. 2009.
 [20] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, pages 1097–1105, 2012.
 [21] H. Li, S. De, Z. Xu, C. Studer, H. Samet, and T. Goldstein. Training quantized nets: A deeper understanding. arXiv preprint arXiv:1706.02379, 2017.
 [22] H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf. Pruning filters for efficient convnets. ICLR, 2017.
 [23] S. Liang and R. Srikant. Why deep neural networks for function approximation? ICLR, 2017.
 [24] P. Luo, Z. Zhu, Z. Liu, X. Wang, X. Tang, et al. Face model compression by distilling knowledge from neurons. In AAAI, pages 3560–3566, 2016.
 [25] X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang, and S. P. Smolley. Least squares generative adversarial networks. arXiv preprint arXiv:1611.04076, 2016.
 [26] M. Mirza and S. Osindero. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784, 2014.
 [27] A. Odena, C. Olah, and J. Shlens. Conditional image synthesis with auxiliary classifier GANs. ICML, 2017.
 [28] M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. In ECCV, pages 525–542. Springer, 2016.
 [29] S. Reed, Z. Akata, X. Yan, L. Logeswaran, B. Schiele, and H. Lee. Generative adversarial text to image synthesis. ICML, 2016.
 [30] A. Romero, N. Ballas, S. E. Kahou, A. Chassang, C. Gatta, and Y. Bengio. FitNets: Hints for thin deep nets. ICLR, 2015.
 [31] G. Ros, S. Stent, P. F. Alcantarilla, and T. Watanabe. Training constrained deconvolutional networks for road scene semantic segmentation. arXiv preprint arXiv:1604.01545, 2016.
 [32] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al. ImageNet large scale visual recognition challenge. IJCV, 115(3):211–252, 2015.
 [33] I. Safran and O. Shamir. Depth-width tradeoffs in approximating natural functions with neural networks. In ICML, pages 2979–2987, 2017.
 [34] B. B. Sau and V. N. Balasubramanian. Deep model compression: Distilling knowledge from noisy teachers. arXiv preprint arXiv:1610.09650, 2016.
 [35] J. Shen, N. Vesdapunt, V. N. Boddeti, and K. M. Kitani. In teacher we trust: Learning compressed models for pedestrian detection. arXiv preprint arXiv:1612.00478, 2016.
 [36] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
 [37] N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: a simple way to prevent neural networks from overfitting. JMLR, 15(1):1929–1958, 2014.
 [38] Y. W. Teh, V. Bapst, W. M. Czarnecki, J. Quan, J. Kirkpatrick, R. Hadsell, N. Heess, and R. Pascanu. Distral: Robust multitask reinforcement learning. arXiv preprint arXiv:1707.04175, 2017.
 [39] M. Telgarsky. Benefits of depth in neural networks. arXiv preprint arXiv:1602.04485, 2016.
 [40] G. Urban, K. J. Geras, S. E. Kahou, O. Aslan, S. Wang, R. Caruana, A. Mohamed, M. Philipose, and M. Richardson. Do deep convolutional nets really need to be deep and convolutional? ICLR, 2017.
 [41] J. Wang, Z. Wei, T. Zhang, and W. Zeng. Deeply-fused nets. arXiv preprint arXiv:1605.07716, 2016.
 [42] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He. Aggregated residual transformations for deep neural networks. CVPR, 2017.
 [43] A. Yadav, S. Shah, Z. Xu, D. Jacobs, and T. Goldstein. Stabilizing adversarial nets with prediction methods. arXiv preprint arXiv:1705.07364, 2017.
 [44] J. Yim, D. Joo, J. Bae, and J. Kim. A gift from knowledge distillation: Fast optimization, network minimization and transfer learning. CVPR, 2017.
 [45] S. You, C. Xu, C. Xu, and D. Tao. Learning from multiple teacher networks. In KDD, pages 1285–1294. ACM, 2017.
 [46] S. Zagoruyko and N. Komodakis. Wide residual networks. arXiv preprint arXiv:1605.07146, 2016.
 [47] S. Zagoruyko and N. Komodakis. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. ICLR, 2017.
 [48] C. Zhang, S. Bengio, M. Hardt, B. Recht, and O. Vinyals. Understanding deep learning requires rethinking generalization. ICLR, 2017.
 [49] G. Zhou, Y. Fan, R. Cui, W. Bian, X. Zhu, and K. Gai. Rocket launching: A universal and efficient framework for training well-performing light net. arXiv preprint arXiv:1708.04106, 2017.