Parameterized Structured Pruning for Deep Neural Networks

06/12/2019
by Guenther Schindler, et al.

As Deep Neural Networks (DNNs) grow in size, the gap between their memory and compute requirements and hardware capabilities widens. To compress DNNs effectively, quantization and connection pruning are usually considered. However, unconstrained pruning typically produces unstructured sparsity, which maps poorly to massively parallel processors and substantially reduces the efficiency of general-purpose processors. The same applies to quantization, which often requires dedicated hardware. We propose Parameterized Structured Pruning (PSP), a novel method to dynamically learn the shape of DNNs through structured sparsity. PSP parameterizes structures (e.g., channel- or layer-wise) in a weight tensor and leverages weight decay to learn a clear distinction between important and unimportant structures. As a result, PSP maintains prediction performance while creating a substantial amount of sparsity that is structured and therefore easy and efficient to map to a variety of massively parallel processors, which are essential for high compute power and energy efficiency. PSP is experimentally validated on the popular CIFAR10/100 and ILSVRC2012 datasets using ResNet and DenseNet architectures, respectively.
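To make the channel-wise case concrete, the following is a minimal PyTorch-style sketch, not the paper's actual implementation: a learnable parameter per output channel scales that channel, and a stronger weight decay on these parameters drives unimportant channels toward zero so they can be removed as whole structures. The class name `ChannelParameterizedConv`, the parameter name `alpha`, and the pruning threshold are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ChannelParameterizedConv(nn.Module):
    """Sketch of channel-wise parameterized pruning: each output channel
    is scaled by a learnable structure parameter; weight decay on these
    parameters pushes unimportant channels toward zero (structured sparsity)."""

    def __init__(self, in_channels, out_channels, kernel_size, prune_threshold=1e-2):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size,
                              padding=kernel_size // 2)
        # One structure parameter per output channel (hypothetical name `alpha`).
        self.alpha = nn.Parameter(torch.ones(out_channels))
        self.prune_threshold = prune_threshold

    def forward(self, x):
        out = self.conv(x)
        # Scale every output channel by its structure parameter.
        return out * self.alpha.view(1, -1, 1, 1)

    def channel_mask(self):
        # Channels whose parameter magnitude stays below the (assumed)
        # threshold are treated as unimportant and can be pruned as a whole.
        return self.alpha.abs() > self.prune_threshold


# Usage sketch: apply a stronger weight decay to the structure parameters
# than to the convolution weights, so unimportant channels decay to zero.
layer = ChannelParameterizedConv(3, 16, 3)
optimizer = torch.optim.SGD(
    [
        {"params": layer.conv.parameters(), "weight_decay": 1e-4},
        {"params": [layer.alpha], "weight_decay": 1e-3},
    ],
    lr=0.1,
)

x = torch.randn(8, 3, 32, 32)
loss = layer(x).pow(2).mean()   # placeholder loss for the sketch
loss.backward()
optimizer.step()
print("channels kept:", int(layer.channel_mask().sum()))
```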

