Channel Planting for Deep Neural Networks using Knowledge Distillation

11/04/2020
by Kakeru Mitsuno, et al.

In recent years, deeper and wider neural networks have shown excellent performance in computer vision tasks, but their enormous number of parameters leads to increased computational cost and overfitting. Several methods have been proposed to compress networks without reducing their performance. Network pruning removes redundant and unnecessary parameters from a network. Knowledge distillation transfers the knowledge of deeper and wider networks to smaller networks. However, the performance of the smaller network obtained by these methods is bounded by the predefined network. Neural architecture search has been proposed to automatically search network architectures and break this structural limitation. There are also dynamic configuration methods that train networks incrementally as sub-networks. In this paper, we present a novel incremental training algorithm for deep neural networks called planting. Planting searches for an optimal network architecture with a smaller number of parameters by incrementally augmenting the channels of the layers of an initial network while keeping the earlier trained parameters fixed. We also propose using knowledge distillation to train the planted channels. By transferring the knowledge of deeper and wider networks, we can grow networks effectively and efficiently. We evaluate the effectiveness of the proposed method on datasets such as CIFAR-10/100 and STL-10. For the STL-10 dataset, we show that we are able to achieve comparable performance with only 7% of the parameters of the dense network and reduce the overfitting caused by the small amount of data.
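
The abstract describes channel planting only at a high level; the sketch below illustrates one way the idea could look in code, assuming a PyTorch setting. The helper names (`plant_channels`, `distillation_loss`), the gradient-mask freezing, and the KD hyperparameters (temperature `T`, weight `alpha`) are illustrative assumptions rather than the authors' implementation, and widening the input channels of the following layer is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def plant_channels(conv: nn.Conv2d, extra_out: int) -> nn.Conv2d:
    """Grow a convolution by `extra_out` output channels (illustrative helper).

    Previously trained filters are copied over and frozen with a gradient mask,
    so only the newly planted filters receive updates.
    """
    grown = nn.Conv2d(
        conv.in_channels,
        conv.out_channels + extra_out,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        grown.weight[: conv.out_channels] = conv.weight
        if conv.bias is not None:
            grown.bias[: conv.out_channels] = conv.bias

    # Zero the gradients of the earlier trained filters so they stay fixed.
    w_mask = torch.zeros_like(grown.weight)
    w_mask[conv.out_channels:] = 1.0
    grown.weight.register_hook(lambda g, m=w_mask: g * m)
    if conv.bias is not None:
        b_mask = torch.zeros_like(grown.bias)
        b_mask[conv.out_channels:] = 1.0
        grown.bias.register_hook(lambda g, m=b_mask: g * m)
    return grown


def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    """Hinton-style knowledge distillation loss for training the planted channels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard
```

In a full training loop, planting would be applied layer by layer on some growth schedule, with a frozen deeper or wider teacher network providing the soft targets consumed by `distillation_loss`.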

Related research

10/02/2020 · Neighbourhood Distillation: On the benefits of non end-to-end distillation
End-to-end training with back propagation is the standard method for tra...

03/14/2019 · Improving Neural Architecture Search Image Classifiers via Ensemble Learning
Finding the best neural network architecture requires significant time, ...

05/23/2019 · Network Pruning via Transformable Architecture Search
Network pruning reduces the computation costs of an over-parameterized n...

10/09/2020 · Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer
Deep neural network architectures have attained remarkable improvements ...

06/16/2020 · Prior knowledge distillation based on financial time series
One of the major characteristics of financial time series is that they c...

09/08/2023 · Towards Mitigating Architecture Overfitting in Dataset Distillation
Dataset distillation methods have demonstrated remarkable performance fo...

05/23/2023 · Transferring Learning Trajectories of Neural Networks
Training deep neural networks (DNNs) is computationally expensive, which...