UniPELT: A Unified Framework for Parameter-Efficient Language Model Tuning

10/14/2021
by Yuning Mao, et al.

Conventional fine-tuning of pre-trained language models tunes all model parameters and stores a full model copy for each downstream task, which has become increasingly infeasible as model sizes grow. Recent parameter-efficient language model tuning (PELT) methods manage to match the performance of fine-tuning with far fewer trainable parameters and perform especially well when training data is limited. However, different PELT methods may perform rather differently on the same task, making it nontrivial to select the most appropriate method for a specific task, especially considering the fast-growing number of new PELT methods and downstream tasks. In light of model diversity and the difficulty of model selection, we propose a unified framework, UniPELT, which incorporates different PELT methods as submodules and learns to activate the ones that best suit the current data or task setup. Remarkably, on the GLUE benchmark, UniPELT consistently achieves 1-3pt gains over the best individual PELT method that it incorporates and even outperforms fine-tuning under different setups. Moreover, UniPELT often surpasses the upper bound obtained by taking the best performance of all its submodules used individually on each task, indicating that a mixture of multiple PELT methods may be inherently more effective than any single method.
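
For readers curious how such learned activation of submodules might look, below is a minimal, hypothetical PyTorch sketch (not the authors' implementation): a frozen linear projection augmented with a LoRA-style low-rank update whose contribution is scaled by a small, learned, input-dependent gate. UniPELT gates several PELT submodules (adapters, prefix-tuning, LoRA) in this spirit; the sketch covers only a single LoRA branch, and all class and parameter names are illustrative.

```python
import torch
import torch.nn as nn


class GatedLoRALinear(nn.Module):
    """Illustrative sketch (not the official UniPELT code): a frozen linear
    layer plus a LoRA-style low-rank update, where a learned sigmoid gate
    decides how strongly the update is activated for the current input."""

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # pre-trained weights stay frozen
        d_in, d_out = base.in_features, base.out_features
        # LoRA submodule: trainable low-rank factors A and B
        self.lora_A = nn.Parameter(torch.randn(rank, d_in) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(d_out, rank))
        # Gate: a tiny linear head whose sigmoid output (in [0, 1]) scales
        # the submodule's contribution, so the model can learn to switch
        # the submodule on or off depending on the input.
        self.gate = nn.Linear(d_in, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate(x))                      # (..., 1)
        lora_update = x @ self.lora_A.T @ self.lora_B.T      # (..., d_out)
        return self.base(x) + g * lora_update


# Usage sketch: wrap a projection of a frozen backbone and train only the
# gate and LoRA parameters.
layer = GatedLoRALinear(nn.Linear(768, 768), rank=8)
out = layer(torch.randn(2, 16, 768))
trainable = [n for n, p in layer.named_parameters() if p.requires_grad]
print(trainable)  # ['lora_A', 'lora_B', 'gate.weight', 'gate.bias']
```

As the abstract describes, the full framework attaches analogous gates to each of its PELT submodules and trains them jointly, so the model can favor whichever submodule best suits the current data or task setup.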
