Dynamic Collective Intelligence Learning: Finding Efficient Sparse Model via Refined Gradients for Pruned Weights

09/10/2021
by   Jangho Kim, et al.

With the growth of deep neural networks (DNNs), the number of DNN parameters has increased drastically. This makes DNN models hard to deploy on resource-limited embedded systems. To alleviate this problem, dynamic pruning methods have emerged, which search for diverse sparsity patterns during training by using the Straight-Through Estimator (STE) to approximate the gradients of pruned weights. STE allows pruned weights to revive while dynamic sparsity patterns are explored. However, these coarse gradients cause training instability and performance degradation because the STE approximation yields an unreliable gradient signal. In this work, we tackle this issue by introducing refined gradients to update the pruned weights, formed via dual forwarding paths through the two sets of weights (pruned and unpruned). We propose a novel Dynamic Collective Intelligence Learning (DCIL), which exploits the learning synergy between the collective intelligence of both weight sets. We verify the usefulness of the refined gradients by showing improved training stability and model performance on the CIFAR and ImageNet datasets. DCIL outperforms various previously proposed pruning schemes, including other dynamic pruning methods, with enhanced stability during training.
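
To make the mechanism concrete, below is a minimal PyTorch sketch contrasting the two gradient paths described in the abstract: an STE-style masked layer, in which pruned weights receive the same coarse gradient as active ones, and a dual-forwarding layer that runs both the dense and the sparse weight sets on the same input so that each set gets its own exact gradient. The class names, magnitude-based masking criterion, 50% sparsity level, and joint loss are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F


class STEMask(torch.autograd.Function):
    """STE-style pruning: forward uses the masked weights, backward passes the
    unmodified gradient to all weights, including the pruned ones (the
    'coarse gradient' the abstract refers to)."""

    @staticmethod
    def forward(ctx, weight, mask):
        return weight * mask

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None  # pruned weights get the same gradient signal


class DualPathLinear(torch.nn.Module):
    """Hypothetical dual-forwarding layer: one path through the dense
    (unpruned) weights and one through the sparse (masked) weights, so each
    weight set is updated with its own exact gradient instead of an STE
    approximation."""

    def __init__(self, in_features, out_features, sparsity=0.5):
        super().__init__()
        self.weight = torch.nn.Parameter(0.02 * torch.randn(out_features, in_features))
        self.register_buffer("mask", torch.ones(out_features, in_features))
        self.sparsity = sparsity

    @torch.no_grad()
    def update_mask(self):
        # Illustrative magnitude-based criterion: prune the smallest-magnitude
        # weights. The mask is recomputed each step, so pruned weights can
        # revive (a dynamic sparsity pattern).
        k = max(1, int(self.weight.numel() * self.sparsity))
        threshold = self.weight.abs().flatten().kthvalue(k).values
        self.mask.copy_((self.weight.abs() > threshold).float())

    def forward(self, x):
        dense_out = F.linear(x, self.weight)               # dense path
        sparse_out = F.linear(x, self.weight * self.mask)  # sparse path
        return dense_out, sparse_out


# Toy usage: both paths are trained jointly; the summed loss is only a
# plausible stand-in for the paper's collective-intelligence objective.
layer = DualPathLinear(16, 4)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
for _ in range(3):
    layer.update_mask()
    dense_out, sparse_out = layer(x)
    loss = F.cross_entropy(dense_out, y) + F.cross_entropy(sparse_out, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```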

Related research

06/12/2020  Dynamic Model Pruning with Feedback
Deep neural networks often have millions of parameters. This can hinder ...

01/08/2019  Spatial-Winograd Pruning Enabling Sparse Winograd Convolution
Deep convolutional neural networks (CNNs) are deployed in various applic...

09/11/2020  Achieving Adversarial Robustness via Sparsity
Network pruning has been known to produce compact models without much ac...

09/27/2019  Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
Deep Neural Network (DNN) is powerful but computationally expensive and ...

05/31/2019  Learning Sparse Networks Using Targeted Dropout
Neural networks are easier to optimise when they have many more weights ...

05/21/2019  Revisiting hard thresholding for DNN pruning
The most common method for DNN pruning is hard thresholding of network w...

08/13/2023  Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training
Binarization of neural networks is a dominant paradigm in neural network...
