
GMP*: Well-Tuned Global Magnitude Pruning Can Outperform Most BERT-Pruning Methods

10/12/2022
by Eldar Kurtic, et al.
Institute of Science and Technology Austria

We revisit the performance of the classic gradual magnitude pruning (GMP) baseline for large language models, focusing on the standard BERT benchmark across several popular tasks. Despite existing evidence in the literature that GMP performs poorly, we show that a simple and general variant, which we call GMP*, can match and sometimes outperform more complex state-of-the-art methods. Our results provide a simple yet strong baseline for future work, highlight the importance of parameter tuning for baselines, and even improve the performance of the state-of-the-art second-order pruning method in this setting.
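To make the GMP baseline concrete, below is a minimal PyTorch sketch of gradual magnitude pruning with a single global threshold and a cubic sparsity schedule. It illustrates the general technique only: the schedule, the restriction to Linear layers, and all hyperparameters are assumptions for illustration, not the tuned GMP* recipe, which the abstract emphasizes hinges on careful parameter tuning.

```python
import torch
import torch.nn as nn


def cubic_sparsity(step, total_steps, final_sparsity, initial_sparsity=0.0):
    """Cubic sparsity schedule: ramp from initial to final sparsity over total_steps."""
    frac = min(step / total_steps, 1.0)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - frac) ** 3


def global_magnitude_prune(model, sparsity):
    """Zero the globally smallest-magnitude weights across all Linear layers.

    One threshold is computed over the concatenated weight magnitudes (global
    pruning), rather than one threshold per layer (uniform pruning). Returns
    the binary masks so pruned weights can be kept at zero during training.
    """
    layers = [m for m in model.modules() if isinstance(m, nn.Linear)]
    magnitudes = torch.cat([l.weight.detach().abs().flatten() for l in layers])
    k = int(sparsity * magnitudes.numel())
    if k == 0:
        return [torch.ones_like(l.weight) for l in layers]
    threshold = torch.kthvalue(magnitudes, k).values
    masks = []
    with torch.no_grad():
        for l in layers:
            mask = (l.weight.abs() > threshold).float()
            l.weight.mul_(mask)
            masks.append(mask)
    return masks


# Usage sketch: prune periodically during fine-tuning and re-apply the masks
# after every optimizer step so pruned weights stay at zero.
model = nn.Sequential(nn.Linear(768, 3072), nn.GELU(), nn.Linear(3072, 768))
total_steps, prune_every, final_sparsity = 1000, 100, 0.9
masks = None
for step in range(total_steps):
    # ... forward pass, loss.backward() and optimizer.step() omitted ...
    if step % prune_every == 0:
        masks = global_magnitude_prune(
            model, cubic_sparsity(step, total_steps, final_sparsity)
        )
    if masks is not None:
        layers = [m for m in model.modules() if isinstance(m, nn.Linear)]
        with torch.no_grad():
            for mask, layer in zip(masks, layers):
                layer.weight.mul_(mask)
```

Because global pruning ranks all weights together, layers whose weights are small in magnitude end up sparser than layers with larger weights; this is the main difference from per-layer (uniform) magnitude pruning.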

Related research

10/15/2020

A Deeper Look at the Layerwise Sparsity of Magnitude-based Pruning

Recent discoveries on neural network pruning reveal that, with a careful...
10/10/2019

Structured Pruning of Large Language Models

Large language models have recently achieved state of the art performanc...
09/29/2022

Is Complexity Required for Neural Network Pruning? A Case Study on Global Magnitude Pruning

Pruning neural networks has become popular in the last decade when it wa...
06/20/2023

A Simple and Effective Pruning Approach for Large Language Models

As their size increases, Large Languages Models (LLMs) are natural candi...
11/26/2021

How Well Do Sparse Imagenet Models Transfer?

Transfer learning is a classic paradigm by which models pretrained on la...
09/30/2020

AUBER: Automated BERT Regularization

How can we effectively regularize BERT? Although BERT proves its effecti...
10/18/2021

BERMo: What can BERT learn from ELMo?

We propose BERMo, an architectural modification to BERT, which makes pre...