Joint Pruning on Activations and Weights for Efficient Neural Networks

06/19/2019
by   Qing Yang, et al.

With the rapid scaling up of deep neural networks (DNNs), extensive research on network model compression, such as weight pruning, has been conducted to improve deployment efficiency. This work aims to advance compression beyond the weights to neuron activations. We propose an end-to-end Joint Pruning (JP) technique that integrates activation pruning with weight pruning. By distinguishing and leveraging the different significance of neuron responses and connections during learning, the generated network, namely JPnet, optimizes the sparsity of activations and weights to improve execution efficiency. To the best of our knowledge, JP is the first technique that simultaneously explores the redundancy in both weights and activations. The deep sparsification achieved by JPnet reveals further optimization potential for existing DNN accelerators dedicated to sparse matrix operations. The effectiveness of the JP technique is thoroughly evaluated on various network models with different activation functions and datasets. With less than 0.4% degradation in testing accuracy, a JPnet can save 71.1%∼96.35% of the computation cost compared to the original dense models, with up to 5.8× and 10× reductions in activation and weight counts, respectively. Compared to a state-of-the-art weight pruning technique, JPnet further reduces the computation cost by 1.2×∼2.7×.
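The abstract describes joint pruning of weights and activations only at a high level. The following is a minimal PyTorch sketch of the general idea, combining magnitude-based weight masking with a thresholded activation function; the layer sizes, thresholds, and helper names (`ThresholdReLU`, `prune_weights`) are illustrative assumptions and not the paper's actual JP procedure.

```python
# Minimal sketch of joint weight/activation pruning (illustrative only;
# thresholds, sparsity levels, and names are assumptions, not the paper's JP method).
import torch
import torch.nn as nn


class ThresholdReLU(nn.Module):
    """ReLU that also zeroes activations below a magnitude threshold,
    increasing activation sparsity (a stand-in for activation pruning)."""

    def __init__(self, threshold: float = 0.1):
        super().__init__()
        self.threshold = threshold

    def forward(self, x):
        x = torch.relu(x)
        return torch.where(x > self.threshold, x, torch.zeros_like(x))


def prune_weights(layer: nn.Linear, sparsity: float = 0.5) -> None:
    """Magnitude-based weight pruning: zero out the smallest-magnitude weights."""
    with torch.no_grad():
        flat = layer.weight.abs().flatten()
        k = int(sparsity * flat.numel())
        if k == 0:
            return
        cutoff = torch.kthvalue(flat, k).values
        mask = (layer.weight.abs() > cutoff).to(layer.weight.dtype)
        layer.weight.mul_(mask)


# Toy model combining both forms of pruning.
model = nn.Sequential(
    nn.Linear(784, 256),
    ThresholdReLU(threshold=0.1),
    nn.Linear(256, 10),
)

for layer in model:
    if isinstance(layer, nn.Linear):
        prune_weights(layer, sparsity=0.5)

x = torch.randn(8, 784)
out = model(x)
print("weight sparsity:", (model[0].weight == 0).float().mean().item())
print("activation sparsity:", (model[1](model[0](x)) == 0).float().mean().item())
```

In an end-to-end setting such as JP, the weight masks and activation thresholds would be learned or updated during training rather than applied once after the fact, so that both forms of sparsity are optimized jointly.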
