Accelerate Your CNN from Three Dimensions: A Comprehensive Pruning Framework

by Wenxiao Wang et al.

To deploy a pre-trained deep CNN on resource-constrained mobile devices, neural network pruning is often used to cut down the model's computational cost. For example, filter-level pruning (reducing the model's width) and layer-level pruning (reducing the model's depth) can both save computation at some cost in accuracy. Reducing the resolution of input images can reach the same goal. Most previous methods reduce only one or two of these dimensions (i.e., depth, width, and image resolution) for acceleration. However, excessive reduction of any single dimension leads to unacceptable accuracy loss, so all three dimensions must be pruned jointly to yield the best result. In this paper, a simple yet effective pruning framework is proposed that considers these three dimensions comprehensively. The framework consists of two steps: 1) determining the optimal depth (d*), width (w*), and image resolution (r*) for the model; 2) pruning the model according to (d*, w*, r*). Specifically, in the first step, model acceleration is formulated as an optimization problem that takes depth (d), width (w), and image resolution (r) as variables and the model's accuracy as the objective. Although the exact expression of the objective function is hard to determine, approximating it with polynomials is feasible, and several properties of the objective function are exploited to ease and speed up the fitting process. The optimal d*, w*, and r* are then obtained by maximizing the objective function via the Lagrange multiplier theorem and KKT conditions. Extensive experiments on several popular architectures and datasets show that the proposed framework outperforms state-of-the-art pruning methods. The code will be published soon.
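To make the two-step procedure concrete, the following is a minimal sketch (not the authors' code) of the first step: fit a polynomial surrogate of accuracy over the pruning ratios (d, w, r) from a handful of measured prunings, then maximize the surrogate under a FLOPs budget. The sample accuracies, the degree-2 basis, the FLOPs model d·w²·r², and the grid search (in place of an analytical KKT solution) are all illustrative assumptions.

```python
import numpy as np
from itertools import product

# Hypothetical measurements: (depth ratio, width ratio, resolution ratio, accuracy).
# In the paper's setting these would come from evaluating a few pruned models.
samples = [
    (1.0, 1.0, 1.0, 0.760),
    (0.5, 1.0, 1.0, 0.720),
    (1.0, 0.5, 1.0, 0.710),
    (1.0, 1.0, 0.5, 0.700),
    (0.7, 0.7, 0.7, 0.730),
    (0.5, 0.5, 0.5, 0.650),
    (0.5, 0.5, 1.0, 0.680),
    (0.5, 1.0, 0.5, 0.670),
    (1.0, 0.5, 0.5, 0.660),
    (0.8, 0.8, 0.8, 0.745),
]

def features(d, w, r):
    # Degree-2 polynomial basis in (d, w, r) for the accuracy surrogate.
    return [1.0, d, w, r, d * d, w * w, r * r, d * w, d * r, w * r]

X = np.array([features(d, w, r) for d, w, r, _ in samples])
y = np.array([acc for *_, acc in samples])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares polynomial fit

def surrogate(d, w, r):
    # Predicted accuracy of a model pruned to ratios (d, w, r).
    return float(np.dot(coef, features(d, w, r)))

# FLOPs of a pruned CNN scale roughly as d * w^2 * r^2 (depth is linear,
# width and resolution are quadratic), so the budget defines the feasible
# region. Here a grid scan stands in for the analytical KKT solution.
B = 0.5  # keep at most 50% of the original FLOPs
grid = np.linspace(0.3, 1.0, 15)
best = max(
    ((d, w, r) for d, w, r in product(grid, repeat=3) if d * w**2 * r**2 <= B),
    key=lambda t: surrogate(*t),
)
print("optimal (d*, w*, r*):", best)
```

The second step of the framework would then prune the network's depth, width, and input resolution to the ratios found here.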


