Efficient Inference of CNNs via Channel Pruning

08/08/2019
by Boyu Zhang, et al.

The deployment of Convolutional Neural Networks (CNNs) on resource-constrained platforms such as mobile devices and embedded systems has been greatly hindered by their high implementation cost, which has motivated considerable research interest in compressing and accelerating trained CNN models. Among the various techniques proposed in the literature, structured pruning, especially channel pruning, has attracted much attention because of 1) its superior reduction of memory, computation, and energy; and 2) its compatibility with existing hardware and software libraries. In this paper, we investigate the intermediate results of convolutional layers and present a novel channel pruning technique based on pivoted QR factorization that can prune any specified number of input channels of any layer. We also explore further pruning opportunities in ResNet-like architectures by applying two tweaks to our technique. Experimental results on VGG-16 and ResNet-50 models with the ImageNet ILSVRC 2012 dataset are very impressive, with 4.29X and 2.84X computation reduction, respectively, while sacrificing only about 1.40% top-5 accuracy. Compared to many prior works, the pruned models produced by our technique require up to 47.7% less computation while still achieving higher accuracies.
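The abstract does not include code, but the core idea of ranking input channels with column-pivoted QR can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the intermediate activations of a layer have been flattened into a samples-by-channels matrix, and the function name `select_channels` and the sampling setup are hypothetical.

```python
import numpy as np
from scipy.linalg import qr

def select_channels(features, k):
    """Rank a layer's input channels with column-pivoted QR and
    return the indices of the k channels to keep (hypothetical helper).

    features: (n_samples, n_channels) matrix of sampled intermediate
              activations, one column per input channel.
    k:        number of channels to keep after pruning.
    """
    # Column-pivoted QR greedily orders columns so that each chosen
    # column adds the most new "energy" beyond the span of the columns
    # chosen before it; the first k pivots are the least redundant channels.
    _, _, piv = qr(features, mode='economic', pivoting=True)
    return np.sort(piv[:k])

# Toy data: 4 independent channels plus 4 nearly redundant mixtures of them.
rng = np.random.default_rng(0)
base = rng.standard_normal((256, 4))
mixed = base @ rng.standard_normal((4, 4))
acts = np.hstack([base, mixed + 1e-3 * rng.standard_normal((256, 4))])

kept = select_channels(acts, 4)
print(kept)  # indices of the 4 channels judged most informative
```

Because the second group of channels is (almost) a linear combination of the first, pivoted QR will select a non-redundant subset of four; the remaining channels, and the corresponding filter slices in the next layer, could then be removed.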



