Gradient-Free Structured Pruning with Unlabeled Data

03/07/2023
by Azade Nova et al.

Large Language Models (LLMs) have achieved great success in solving difficult tasks across many domains, but such success comes with high computation cost and inference latency. As developers and third parties customize these models, the need for efficient inference has increased. Many efforts have attempted to reduce inference cost through model compression techniques such as pruning and distillation. However, these techniques either require labeled data or are time-consuming, as they require the compressed model to be retrained to regain accuracy. In this paper, we propose a gradient-free structured pruning framework that uses only unlabeled data. An evaluation on the GLUE and SQuAD benchmarks using BERT_BASE and DistilBERT illustrates the effectiveness of the proposed approach. By only using the weights of the pre-trained model and unlabeled data, in a matter of a few minutes on a single GPU, up to 40% of the original FLOP count can be reduced with less than a 4% accuracy loss across all tasks considered.
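To make the gradient-free idea concrete, the PyTorch sketch below illustrates the general recipe such methods follow: score structured units (here, whole neurons of a toy two-layer network) with a forward-pass statistic computed on unlabeled inputs, then structurally remove the lowest-scoring units by slicing the weight matrices. This is not the paper's algorithm; the mean-absolute-activation score, the 60% keep ratio, the toy model, and the random stand-in for unlabeled data are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative sketch only, not the paper's method: gradient-free structured
# pruning of whole neurons, using a forward-pass saliency statistic computed
# on unlabeled data. No labels and no backpropagation are involved.

torch.manual_seed(0)

hidden = nn.Linear(64, 128)        # layer whose output neurons we prune
output = nn.Linear(128, 10)
unlabeled = torch.randn(512, 64)   # stand-in for real unlabeled inputs

with torch.no_grad():              # gradient-free: forward passes only
    acts = torch.relu(hidden(unlabeled))   # (512, 128) activations
    scores = acts.abs().mean(dim=0)        # per-neuron saliency (assumed rule)

    # Keep the top 60% of neurons and structurally drop the rest by slicing
    # the weight matrices into smaller modules.
    k = int(0.6 * scores.numel())
    keep = scores.topk(k).indices.sort().values

    pruned_hidden = nn.Linear(64, k)
    pruned_hidden.weight.copy_(hidden.weight[keep])
    pruned_hidden.bias.copy_(hidden.bias[keep])

    pruned_output = nn.Linear(k, 10)
    pruned_output.weight.copy_(output.weight[:, keep])
    pruned_output.bias.copy_(output.bias)

    # Sanity check: the pruned network still runs end to end.
    logits = pruned_output(torch.relu(pruned_hidden(unlabeled)))

print(f"kept {k}/128 neurons; pruned hidden weight shape: "
      f"{tuple(pruned_hidden.weight.shape)}; logits shape: {tuple(logits.shape)}")
```

The key design point this sketch shares with gradient-free pruning in general is that forward-pass statistics replace gradient-based saliency, so pruning needs neither labels nor a backward pass, which is what makes it fast enough to run in minutes on a single GPU.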

Related research:

- 04/01/2022 · Structured Pruning Learns Compact and Accurate Models
  The growing size of neural language models has led to increased attentio...

- 05/18/2023 · PDP: Parameter-free Differentiable Pruning is All You Need
  DNN pruning is a popular way to reduce the size of a model, improve the ...

- 12/15/2022 · Gradient-based Intra-attention Pruning on Pre-trained Language Models
  Pre-trained language models achieve superior performance, but they are c...

- 05/24/2023 · PruMUX: Augmenting Data Multiplexing with Model Compression
  As language models increase in size by the day, methods for efficient in...

- 04/26/2019 · Neural Chinese Word Segmentation with Lexicon and Unlabeled Data via Posterior Regularization
  Existing methods for CWS usually rely on a large number of labeled sente...

- 07/13/2019 · Bringing Giant Neural Networks Down to Earth with Unlabeled Data
  Compressing giant neural networks has gained much attention for their ex...

- 04/20/2018 · Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
  Many efforts have been made to facilitate natural language processing ta...
