Structured Pruning of Large Language Models

10/10/2019
by Ziheng Wang, et al.

Large language models have recently achieved state-of-the-art performance across a wide variety of natural language tasks. Meanwhile, the size and latency of these models have grown significantly, making them costly to deploy and raising an interesting question: do language models need to be large? We study this question through the lens of model compression, presenting a novel structured pruning approach based on low-rank factorization and augmented-Lagrangian L0-norm regularization. Our structured approach achieves significant inference speedups while matching or outperforming our unstructured pruning baseline at various sparsity levels. We apply our method to state-of-the-art models on the enwik8 dataset and obtain a 1.19 perplexity score with just 5M parameters, vastly outperforming a model of the same size trained from scratch. We also demonstrate that our method can be applied to language-model fine-tuning by pruning the BERT model on several downstream classification benchmarks.
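The abstract compresses several moving parts, so here is a minimal PyTorch sketch of how these ingredients typically fit together: each weight matrix is factorized as W ≈ P diag(z) Q, the gates z follow the hard-concrete L0 relaxation of Louizos et al. (2018), and augmented-Lagrangian multipliers push the expected number of surviving rank-1 components toward a target. All names (FactorizedLinear, sparsity_penalty), hyperparameters, and shapes here are illustrative assumptions, not the authors' released code.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hard-concrete distribution constants from Louizos et al. (2018).
BETA, GAMMA, ZETA = 2.0 / 3.0, -0.1, 1.1


class FactorizedLinear(nn.Module):
    """Linear layer parameterized as W ~= P diag(z) Q, with one stochastic
    L0 gate z_k per rank-1 component. Zeroing a gate removes an entire
    column of P and row of Q, so the resulting sparsity is structured."""

    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.P = nn.Parameter(torch.randn(d_out, rank) * 0.02)
        self.Q = nn.Parameter(torch.randn(rank, d_in) * 0.02)
        self.log_alpha = nn.Parameter(torch.zeros(rank))  # gate logits

    def gates(self) -> torch.Tensor:
        if self.training:
            # Stretched, rectified concrete sample (reparameterized).
            u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
            s = torch.sigmoid((u.log() - (-u).log1p() + self.log_alpha) / BETA)
        else:
            # Deterministic gate estimate at test time.
            s = torch.sigmoid(self.log_alpha)
        return (s * (ZETA - GAMMA) + GAMMA).clamp(0.0, 1.0)

    def expected_nonzero(self) -> torch.Tensor:
        # P(z_k != 0) under the hard-concrete distribution, summed over ranks.
        return torch.sigmoid(self.log_alpha - BETA * math.log(-GAMMA / ZETA)).sum()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.gates()
        # Equivalent to x @ (P diag(z) Q)^T, but never materializes W.
        return F.linear(F.linear(x, self.Q) * z, self.P)


def sparsity_penalty(model, lam1, lam2, target_ratio, total_rank):
    """Augmented-Lagrangian term lam1*g + lam2*g^2, where g measures how far
    the expected fraction of kept components sits above the target."""
    expected = sum(m.expected_nonzero() for m in model.modules()
                   if isinstance(m, FactorizedLinear))
    g = expected / total_rank - target_ratio
    return lam1 * g + lam2 * g.pow(2)
```

The multipliers are maximized while the model parameters are minimized, turning the sparsity constraint into a min-max game; a toy loop under the same assumptions (using PyTorch's `maximize=True` optimizer flag for the ascent step) might look like:

```python
layer = FactorizedLinear(64, 64, rank=32)
opt = torch.optim.Adam(layer.parameters(), lr=1e-3)
lam1 = torch.zeros((), requires_grad=True)
lam2 = torch.zeros((), requires_grad=True)
opt_lam = torch.optim.Adam([lam1, lam2], lr=1e-2, maximize=True)

x, y = torch.randn(8, 64), torch.randn(8, 64)
for _ in range(100):
    loss = F.mse_loss(layer(x), y) + sparsity_penalty(
        layer, lam1, lam2, target_ratio=0.5, total_rank=32)
    opt.zero_grad(); opt_lam.zero_grad()
    loss.backward()
    opt.step(); opt_lam.step()
```

After training, components whose gates are zero can be dropped and the surviving diag(z) folded into P, leaving genuinely smaller dense factor matrices; this is why the structured approach yields real inference speedups rather than the irregular sparsity patterns of unstructured pruning.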

Related research

The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models (03/14/2022)
Pre-trained Transformer-based language models have become a key building...

Movement Pruning: Adaptive Sparsity by Fine-Tuning (05/15/2020)
Magnitude pruning is a widely used strategy for reducing model size in p...

GMP*: Well-Tuned Global Magnitude Pruning Can Outperform Most BERT-Pruning Methods (10/12/2022)
We revisit the performance of the classic gradual magnitude pruning (GMP...

Parameter Efficient Diff Pruning for Bias Mitigation (05/30/2022)
In recent years language models have achieved state of the art performan...

BERMo: What can BERT learn from ELMo? (10/18/2021)
We propose BERMo, an architectural modification to BERT, which makes pre...

Sparse*BERT: Sparse Models are Robust (05/25/2022)
Large Language Models have become the core architecture upon which most ...

Pruning Large Language Models via Accuracy Predictor (09/18/2023)
Large language models (LLMs) containing tens of billions of parameters (o...
