Pruning Filters for Efficient ConvNets

08/31/2016
by Hao Li, et al.
University of Maryland
NEC Laboratories America

The success of CNNs in various applications is accompanied by a significant increase in computation and parameter storage costs. Recent efforts to reduce these overheads involve pruning and compressing the weights of various layers without hurting the original accuracy. However, magnitude-based pruning of weights mainly reduces parameters in the fully connected layers and may not adequately reduce computation costs in the convolutional layers due to irregular sparsity in the pruned networks. We present an acceleration method for CNNs, where we prune filters that are identified as having a small effect on the output accuracy. By removing whole filters in the network together with their connecting feature maps, the computation costs are reduced significantly. In contrast to pruning weights, this approach does not result in sparse connectivity patterns. Hence, it does not need the support of sparse convolution libraries and can work with existing efficient BLAS libraries for dense matrix multiplications. We show that even simple filter pruning techniques can reduce inference costs for VGG-16 by up to 34% and ResNet-110 by up to 38% on CIFAR10 while regaining close to the original accuracy by retraining the networks.
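The abstract describes ranking filters by a magnitude criterion and removing those with a small effect on the output. A minimal sketch of the idea, assuming L1-norm ranking of convolutional filters (the criterion used in this line of work) and hypothetical weight shapes, might look like:

```python
# Sketch of L1-norm filter pruning: rank each conv filter by the sum of
# its absolute kernel weights and keep only the largest-norm filters.
# Shapes and the prune_ratio parameter are illustrative assumptions.
import numpy as np

def select_filters(weights, prune_ratio):
    """weights: array of shape (n_filters, in_channels, k, k).
    Returns sorted indices of the filters to keep."""
    # L1 norm of each filter across all its kernel weights
    l1 = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    n_keep = int(weights.shape[0] * (1 - prune_ratio))
    # keep the n_keep filters with the largest L1 norm
    keep = np.argsort(l1)[::-1][:n_keep]
    return np.sort(keep)

# Example: a conv layer with 8 filters of shape 3x3x3, pruning 50%
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))
kept = select_filters(w, 0.5)
```

Because whole filters are removed, the corresponding output feature maps (and the matching input channels of the next layer) can simply be dropped, leaving a smaller dense layer rather than a sparse one.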



Code Repositories

model_compression

Deep learning model compression based on Keras.
