Towards Optimal Filter Pruning with Balanced Performance and Pruning Speed

by   Dong Li, et al.

Filter pruning has drawn more attention since resource constrained platform requires more compact model for deployment. However, current pruning methods suffer either from the inferior performance of one-shot methods, or the expensive time cost of iterative training methods. In this paper, we propose a balanced filter pruning method for both performance and pruning speed. Based on the filter importance criteria, our method is able to prune a layer with approximate layer-wise optimal pruning rate at preset loss variation. The network is pruned in the layer-wise way without the time consuming prune-retrain iteration. If a pre-defined pruning rate for the entire network is given, we also introduce a method to find the corresponding loss variation threshold with fast converging speed. Moreover, we propose the layer group pruning and channel selection mechanism for channel alignment in network with short connections. The proposed pruning method is widely applicable to common architectures and does not involve any additional training except the final fine-tuning. Comprehensive experiments show that our method outperforms many state-of-the-art approaches.


page 1

page 2

page 3

page 4


To filter prune, or to layer prune, that is the question

Recent advances in pruning of neural networks have made it possible to r...

AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference

Channel pruning is an important family of methods to speedup deep model'...

Channel Pruning via Multi-Criteria based on Weight Dependency

Channel pruning has demonstrated its effectiveness in compressing ConvNe...

2PFPCE: Two-Phase Filter Pruning Based on Conditional Entropy

Deep Convolutional Neural Networks (CNNs) offer remarkable performance o...

Speeding up convolutional networks pruning with coarse ranking

Channel-based pruning has achieved significant successes in accelerating...

Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot

Network pruning is a method for reducing test-time computational resourc...

Data-Efficient Structured Pruning via Submodular Optimization

Structured pruning is an effective approach for compressing large pre-tr...