DBP: Discrimination Based Block-Level Pruning for Deep Model Acceleration

12/21/2019
by   Wenxiao Wang, et al.
0

Neural network pruning is one of the most popular methods of accelerating the inference of deep convolutional neural networks (CNNs). The dominant pruning methods, filter-level pruning methods, evaluate their performance through the reduction ratio of computations and deem that a higher reduction ratio of computations is equivalent to a higher acceleration ratio in terms of inference time. However, we argue that they are not equivalent if parallel computing is considered. Given that filter-level pruning only prunes filters in layers and computations in a layer usually run in parallel, most computations reduced by filter-level pruning usually run in parallel with the un-reduced ones. Thus, the acceleration ratio of filter-level pruning is limited. To get a higher acceleration ratio, it is better to prune redundant layers because computations of different layers cannot run in parallel. In this paper, we propose our Discrimination based Block-level Pruning method (DBP). Specifically, DBP takes a sequence of consecutive layers (e.g., Conv-BN-ReLu) as a block and removes redundant blocks according to the discrimination of their output features. As a result, DBP achieves a considerable acceleration ratio by reducing the depth of CNNs. Extensive experiments show that DBP has surpassed state-of-the-art filter-level pruning methods in both accuracy and acceleration ratio. Our code will be made available soon.

READ FULL TEXT
research
11/29/2020

Layer Pruning via Fusible Residual Convolutional Block for Deep Neural Networks

In order to deploy deep convolutional neural networks (CNNs) on resource...
research
04/26/2023

Filter Pruning via Filters Similarity in Consecutive Layers

Filter pruning is widely adopted to compress and accelerate the Convolut...
research
05/06/2021

Exact Acceleration of K-Means++ and K-Means

K-Means++ and its distributed variant K-Means have become de facto tools...
research
10/06/2020

Comprehensive Online Network Pruning via Learnable Scaling Factors

One of the major challenges in deploying deep neural network architectur...
research
03/02/2023

Practical Network Acceleration with Tiny Sets: Hypothesis, Theory, and Algorithm

Due to data privacy issues, accelerating networks with tiny training set...
research
10/10/2020

Accelerate Your CNN from Three Dimensions: A Comprehensive Pruning Framework

To deploy a pre-trained deep CNN on resource-constrained mobile devices,...
research
01/07/2021

L2PF – Learning to Prune Faster

Various applications in the field of autonomous driving are based on con...

Please sign up or login with your details

Forgot password? Click here to reset