LLM-Pruner: On the Structural Pruning of Large Language Models

05/19/2023
by   Xinyin Ma, et al.

Large language models (LLMs) have shown remarkable capabilities in language understanding and generation. However, such impressive capability typically comes with a substantial model size, which presents significant challenges in the deployment, inference, and training stages. Since LLMs are general-purpose task solvers, we explore their compression in a task-agnostic manner, aiming to preserve the multi-task solving and language generation ability of the original LLM. One challenge to achieving this is the enormous size of the LLM's training corpus, which makes both data transfer and model post-training over-burdensome. Thus, we tackle the compression of LLMs under two constraints: being task-agnostic and minimizing reliance on the original training dataset. Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures based on gradient information, maximally preserving the LLM's functionality. The performance of pruned models can then be efficiently recovered with tuning techniques such as LoRA in merely 3 hours, requiring only 50K data samples. We validate LLM-Pruner on three LLMs, LLaMA, Vicuna, and ChatGLM, and demonstrate that the compressed models still exhibit satisfactory capabilities in zero-shot classification and generation. The code is available at: https://github.com/horseee/LLM-Pruner
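As a rough illustration of gradient-informed importance scoring for structural pruning, the sketch below zeroes out the least important coupled hidden channels of a toy two-layer MLP. The toy model, the random calibration batch, and the first-order Taylor score |w * dL/dw| are assumptions made here for illustration; they are not code from the LLM-Pruner repository, which operates on coupled structures inside full transformer blocks.

# Minimal sketch of gradient-informed structural pruning (illustrative only,
# not LLM-Pruner's actual implementation).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "coupled structure": the i-th output channel of fc1 is coupled with
# the i-th input channel of fc2, so both must be removed together.
fc1 = nn.Linear(64, 128)
fc2 = nn.Linear(128, 64)
model = nn.Sequential(fc1, nn.ReLU(), fc2)

# Small random calibration batch standing in for the limited public data
# used for importance estimation (assumed surrogate loss).
x = torch.randn(32, 64)
loss = model(x).pow(2).mean()
loss.backward()

# First-order Taylor importance per hidden channel: sum |w * grad| over the
# weights belonging to that channel in both coupled layers.
imp_fc1 = (fc1.weight * fc1.weight.grad).abs().sum(dim=1)  # [128], per output channel
imp_fc2 = (fc2.weight * fc2.weight.grad).abs().sum(dim=0)  # [128], per input channel
importance = imp_fc1 + imp_fc2

# Zero out the 20% least important coupled channels.
num_prune = int(0.2 * importance.numel())
prune_idx = importance.argsort()[:num_prune]
with torch.no_grad():
    fc1.weight[prune_idx, :] = 0.0
    fc1.bias[prune_idx] = 0.0
    fc2.weight[:, prune_idx] = 0.0

print(f"zeroed {num_prune} of {importance.numel()} coupled hidden channels")

In practice the selected rows and columns would be sliced out of the weight matrices so the model actually shrinks, and the pruned network would then be briefly fine-tuned, e.g. with LoRA adapters, to recover performance.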
