Network Automatic Pruning: Start NAP and Take a Nap

01/17/2021
by Wenyuan Zeng, et al.

Network pruning can significantly reduce the computation and memory footprint of large neural networks. To achieve a good trade-off between model size and performance, popular pruning techniques usually rely on hand-crafted heuristics and require manually setting the compression ratio for each layer. This process is typically time-consuming and requires expert knowledge to achieve good results. In this paper, we propose NAP, a unified and automatic pruning framework for both fine-grained and structured pruning. It identifies unimportant components of a network and automatically chooses an appropriate compression ratio for each layer, based on a theoretically sound criterion. To this end, NAP evaluates the importance of each component using an efficient approximation of the Hessian obtained via a Kronecker-factored Approximate Curvature method. Despite being simple to use, NAP outperforms previous pruning methods by large margins. For fine-grained pruning, NAP compresses AlexNet and VGG16 by 25x, and ResNet-50 by 6.7x, without loss in accuracy on ImageNet. For structured pruning (e.g., channel pruning), it reduces the FLOPs of VGG16 by 5.4x and of ResNet-50 by 2.3x, with only a single hyper-parameter to tune and no expert knowledge required. You can start NAP and then take a nap!
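The abstract contains no code; the following is a minimal sketch of the kind of second-order saliency computation it describes, assuming a diagonal Kronecker-factored (K-FAC style) curvature estimate and an Optimal-Brain-Damage-style score of 0.5 * H_ii * w_i^2. The exact criterion in the paper may differ, and all function names and shapes here are hypothetical.

```python
import numpy as np

def kfac_diag_curvature(acts, grads_out):
    """Diagonal of a Kronecker-factored curvature estimate for one
    fully-connected layer: outer(diag(G), diag(A)), where A = E[a a^T]
    over layer inputs and G = E[g g^T] over output gradients.

    acts:      (batch, in_dim)  layer inputs
    grads_out: (batch, out_dim) loss gradients w.r.t. layer outputs
    returns:   (out_dim, in_dim) per-weight curvature estimates
    """
    a_diag = np.mean(acts ** 2, axis=0)       # shape (in_dim,)
    g_diag = np.mean(grads_out ** 2, axis=0)  # shape (out_dim,)
    return np.outer(g_diag, a_diag)

def obd_saliency(weights, curvature):
    """OBD-style importance: estimated loss increase from zeroing a weight,
    0.5 * H_ii * w_i^2 under a local quadratic approximation of the loss."""
    return 0.5 * curvature * weights ** 2

def global_prune_mask(saliency, sparsity):
    """Single global threshold over all saliencies; per-layer compression
    ratios then emerge automatically instead of being hand-tuned."""
    flat = saliency.ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return np.ones_like(saliency, dtype=bool)
    thresh = np.partition(flat, k - 1)[k - 1]
    return saliency > thresh

# Toy usage on a single layer (hypothetical shapes):
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 128))       # layer weights (out_dim, in_dim)
acts = rng.normal(size=(32, 128))    # a mini-batch of layer inputs
grads = rng.normal(size=(32, 64))    # matching output gradients
saliency = obd_saliency(W, kfac_diag_curvature(acts, grads))
W_pruned = W * global_prune_mask(saliency, sparsity=0.9)  # ~90% of weights removed
```

Ranking all weights against one global threshold is the design choice that makes per-layer compression ratios fall out automatically, which is the behavior the abstract highlights.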
