Pruning Pre-trained Language Models Without Fine-Tuning

10/12/2022
by Ting Jiang, et al.

To overcome the overparameterization of Pre-trained Language Models (PLMs), pruning is widely used as a simple and direct compression method that removes unimportant weights. Previous first-order methods successfully compress PLMs to extremely high sparsity with little performance drop. These methods, such as movement pruning, use first-order information to prune PLMs while fine-tuning the remaining weights. In this work, we argue that fine-tuning is redundant for first-order pruning, since first-order pruning alone is sufficient to converge PLMs to downstream tasks without fine-tuning. Motivated by this, we propose Static Model Pruning (SMP), which uses only first-order pruning to adapt PLMs to downstream tasks while achieving the target sparsity level. In addition, we design a new masking function and training objective to further improve SMP. Extensive experiments at various sparsity levels show that SMP yields significant improvements over first-order and zero-order methods. Unlike previous first-order methods, SMP is also applicable at low sparsity, where it outperforms zero-order methods. Meanwhile, SMP is more parameter efficient than other methods because it does not require fine-tuning.
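The core idea, first-order pruning with frozen weights, can be sketched in a few lines. The snippet below is an illustrative toy, not the paper's implementation: `topk_mask` and `smp_step` are hypothetical names, and the score update uses the movement-pruning-style straight-through gradient (score gradient approximated by the gradient of the masked weight times the weight). The actual SMP masking function and training objective differ in detail.

```python
import numpy as np

def topk_mask(scores, sparsity):
    """Binary mask keeping the top (1 - sparsity) fraction of entries by score."""
    k = int(round(scores.size * (1.0 - sparsity)))
    threshold = np.sort(scores.flatten())[::-1][k - 1]  # k-th largest score
    return (scores >= threshold).astype(scores.dtype)

def smp_step(weights, scores, grad_wrt_masked, lr=0.1, sparsity=0.5):
    """One illustrative SMP-style step: the pre-trained weights stay frozen;
    only the importance scores are updated with a first-order
    (straight-through) gradient, as in movement pruning."""
    mask = topk_mask(scores, sparsity)
    # Forward pass would use the masked weights: w_eff = weights * mask.
    # Straight-through estimate: dL/dS ≈ (dL/dw_eff) * weights.
    scores = scores - lr * grad_wrt_masked * weights
    return scores, mask
```

Because `weights` are never updated, the only task-specific parameters are the scores (and their derived binary mask), which is why the method is more parameter efficient than pruning-plus-fine-tuning approaches.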

Related research

- 05/15/2020: Movement Pruning: Adaptive Sparsity by Fine-Tuning
  Magnitude pruning is a widely used strategy for reducing model size in p...
- 05/28/2023: Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning
  Large pre-trained models (LPMs), such as LLaMA and ViT-G, have shown exc...
- 05/15/2023: Parameter-Efficient Fine-Tuning with Layer Pruning on Free-Text Sequence-to-Sequence Modeling
  The increasing size of language models raises great research interests i...
- 09/15/2023: Frustratingly Simple Memory Efficiency for Pre-trained Language Models via Dynamic Embedding Pruning
  The extensive memory footprint of pre-trained language models (PLMs) can...
- 09/05/2020: FlipOut: Uncovering Redundant Weights via Sign Flipping
  Modern neural networks, although achieving state-of-the-art results on m...
- 03/09/2022: Data-Efficient Structured Pruning via Submodular Optimization
  Structured pruning is an effective approach for compressing large pre-tr...
- 03/20/2023: Greedy Pruning with Group Lasso Provably Generalizes for Matrix Sensing and Neural Networks with Quadratic Activations
  Pruning schemes have been widely used in practice to reduce the complexi...
