ExpandNets: Exploiting Linear Redundancy to Train Small Networks

11/26/2018
by Shuxuan Guo, et al.

While very deep networks can achieve great performance, they are ill-suited to applications in resource-constrained environments. Knowledge transfer, which leverages a deep teacher network to train a given small network, has emerged as one of the most popular strategies to address this problem. In this paper, we introduce an alternative approach to training a given small network, based on the intuition that parameter redundancy facilitates learning. We propose to expand each linear layer of a small network into multiple linear layers, without adding any nonlinearity. As such, the resulting expanded network can be compressed back to the small one algebraically, but, as evidenced by our experiments, consistently outperforms training the small network from scratch. This strategy is orthogonal to knowledge transfer. We therefore further show on several standard benchmarks that, for any knowledge transfer technique, using our expanded network as student systematically improves over using the small network.
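To illustrate the idea, below is a minimal sketch (not the authors' released code) of expanding a single fully connected layer into two consecutive linear layers with a wider hidden width, then collapsing the trained pair back into one layer algebraically. The layer sizes and the expansion rate are assumptions chosen purely for illustration; the paper applies the same principle to convolutional layers as well.

```python
import torch
import torch.nn as nn

# Assumed sizes for illustration only.
in_features, out_features, expand_rate = 64, 10, 4

# Expanded replacement for nn.Linear(in_features, out_features):
# two linear layers with NO nonlinearity in between, so the mapping
# stays linear and can later be collapsed back into a single layer.
expanded = nn.Sequential(
    nn.Linear(in_features, expand_rate * in_features),
    nn.Linear(expand_rate * in_features, out_features),
)

# ... train `expanded` in place of the original small layer ...

# Compress back to one linear layer: since both layers are linear,
# y = W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2).
with torch.no_grad():
    w1, b1 = expanded[0].weight, expanded[0].bias
    w2, b2 = expanded[1].weight, expanded[1].bias
    compressed = nn.Linear(in_features, out_features)
    compressed.weight.copy_(w2 @ w1)
    compressed.bias.copy_(w2 @ b1 + b2)
```

At inference time, `compressed` computes exactly the same function as the trained `expanded` pair, so the deployed model keeps the original small architecture and cost.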
