One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning

05/28/2023
by Guangtao Zeng, et al.

Fine-tuning pre-trained language models for multiple tasks tends to be expensive in terms of storage. Parameter-efficient transfer learning (PETL) methods have been proposed to address this issue, but they still require a significant number of parameters and considerable storage when applied to broader ranges of tasks. To achieve even greater storage reduction, we propose ProPETL, a novel method that enables efficient sharing of a single PETL module, which we call a prototype network (e.g., an adapter, LoRA, or prefix-tuning module), across layers and tasks. We then learn binary masks that select different sub-networks from the shared prototype network and apply them as PETL modules to different layers. We find that the binary masks alone can encode crucial task information, an aspect often overlooked in previous studies. Our work can also be seen as a type of pruning, revealing that overparameterization exists even in seemingly small PETL modules. We evaluate ProPETL on various downstream tasks and show that it can outperform other PETL methods while requiring approximately 10% of the parameter storage of the latter.
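To make the mechanism concrete, here is a minimal PyTorch sketch of the idea described above: one prototype adapter whose weights are shared across all transformer layers, with each layer owning only a learnable binary mask that carves out its own sub-network. This is an illustrative reconstruction, not the authors' released implementation; the class names (`BinaryMask`, `ProtoAdapter`), the sigmoid straight-through estimator, and the hyperparameters (`d_model`, `bottleneck`, `n_layers`) are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinaryMask(nn.Module):
    """Learnable binary mask over a weight tensor. Real-valued scores are
    thresholded to {0, 1} on the forward pass; gradients reach the scores
    through a sigmoid straight-through estimator (an assumption here)."""

    def __init__(self, shape):
        super().__init__()
        self.scores = nn.Parameter(torch.zeros(shape))

    def forward(self) -> torch.Tensor:
        hard = (self.scores > 0).float()    # the actual 0/1 mask
        soft = torch.sigmoid(self.scores)   # differentiable surrogate
        return hard + soft - soft.detach()  # forward: hard, backward: soft


class ProtoAdapter(nn.Module):
    """A single prototype adapter shared by every layer; each layer stores
    only the binary masks that select its sub-network (hypothetical names)."""

    def __init__(self, d_model: int = 768, bottleneck: int = 64, n_layers: int = 12):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)  # shared prototype weights
        self.up = nn.Linear(bottleneck, d_model)
        self.down_masks = nn.ModuleList(BinaryMask(self.down.weight.shape)
                                        for _ in range(n_layers))
        self.up_masks = nn.ModuleList(BinaryMask(self.up.weight.shape)
                                      for _ in range(n_layers))

    def forward(self, h: torch.Tensor, layer_idx: int) -> torch.Tensor:
        # Each layer multiplies the shared weights by its own binary mask.
        w_down = self.down.weight * self.down_masks[layer_idx]()
        w_up = self.up.weight * self.up_masks[layer_idx]()
        z = F.relu(F.linear(h, w_down, self.down.bias))
        return h + F.linear(z, w_up, self.up.bias)  # residual adapter update


# Toy usage: the same prototype serves layers 3 and 7 through different masks.
adapter = ProtoAdapter()
h = torch.randn(2, 16, 768)       # (batch, seq, d_model)
out3 = adapter(h, layer_idx=3)
out7 = adapter(h, layer_idx=7)    # different sub-network, same stored weights
```

Under these assumptions, per-task storage scales as one set of real-valued prototype weights plus one bit per weight per layer for the masks, which is where the claimed order-of-magnitude storage reduction would come from.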

Related research

01/27/2023
Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning
As the size of the pre-trained language model (PLM) continues to increas...

01/13/2020
Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation
Inductive transfer learning has had a big impact on computer vision and ...

01/13/2020
Parameter-Efficient Transfer from Sequential Behaviors for User Profiling and Recommendation
Inductive transfer learning has greatly impacted the computer vision and...

07/27/2023
Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-trained Vision-Language Models
Prompt tuning and adapter tuning have shown great potential in transferr...

12/25/2021
Neural Network Module Decomposition and Recomposition
We propose a modularization method that decomposes a deep neural network...

12/06/2022
Parameter Efficient Transfer Learning for Various Speech Processing Tasks
Fine-tuning of self-supervised models is a powerful transfer learning me...

03/27/2018
Efficient parametrization of multi-domain deep neural networks
A practical limitation of deep neural networks is their high degree of s...
