On Transferability of Prompt Tuning for Natural Language Understanding

11/12/2021
by Yusheng Su, et al.

Prompt tuning (PT) is a promising parameter-efficient method for utilizing extremely large pre-trained language models (PLMs): it can achieve performance comparable to full-parameter fine-tuning by tuning only a few soft prompts. However, compared to fine-tuning, PT empirically requires many more training steps. To explore whether we can improve the efficiency of PT by reusing trained soft prompts and sharing learned knowledge, we empirically investigate the transferability of soft prompts across different tasks and models. In cross-task transfer, we find that trained soft prompts transfer well to similar tasks and can initialize PT for them, accelerating training and improving performance. Moreover, to explore which factors influence prompts' transferability across tasks, we investigate how to measure prompt similarity and find that the overlapping rate of activated neurons correlates strongly with transferability. In cross-model transfer, we explore how to project the prompts of one PLM onto another and successfully train a projector that achieves non-trivial transfer performance on similar tasks. However, initializing PT with the projected prompts does not work well, which may be caused by optimization preferences and PLMs' high redundancy. Our findings show that improving PT with knowledge transfer is possible and promising, and that prompts' cross-task transferability is generally better than their cross-model transferability.
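The techniques described above lend themselves to a short illustration. Below is a minimal, self-contained PyTorch sketch, not the authors' implementation: the tiny `FrozenEncoder` stands in for a frozen PLM, and `SoftPrompt`, `neuron_overlap_rate`, and all hyperparameters are illustrative assumptions. It shows (i) tuning only a few soft prompt vectors while the model stays frozen, (ii) initializing the prompt for a target task from one trained on a similar source task, and (iii) computing an overlapping rate of activated neurons as a proxy for prompt similarity.

```python
# Sketch of soft-prompt tuning with cross-task prompt reuse and an
# activation-overlap similarity measure. FrozenEncoder is a toy stand-in
# for a frozen PLM; all names and hyperparameters are illustrative.
import torch
import torch.nn as nn

class FrozenEncoder(nn.Module):
    """Stand-in for a frozen pre-trained encoder (one FFN block + head)."""
    def __init__(self, d_model=128, d_ff=512, n_classes=2):
        super().__init__()
        self.ffn_in = nn.Linear(d_model, d_ff)   # the "neurons" we inspect
        self.ffn_out = nn.Linear(d_ff, d_model)
        self.head = nn.Linear(d_model, n_classes)
        for p in self.parameters():
            p.requires_grad = False              # the PLM stays frozen in PT

    def forward(self, embeds):                   # embeds: (batch, seq, d_model)
        h = torch.relu(self.ffn_in(embeds))      # FFN activations
        pooled = self.ffn_out(h).mean(dim=1)
        return self.head(pooled), h              # logits and activations

class SoftPrompt(nn.Module):
    """A few trainable prompt vectors prepended to the input embeddings."""
    def __init__(self, n_tokens=20, d_model=128, init_from=None):
        super().__init__()
        if init_from is not None:                # cross-task transfer: reuse a
            weight = init_from.clone()           # prompt trained on a similar task
        else:
            weight = torch.randn(n_tokens, d_model) * 0.02
        self.prompt = nn.Parameter(weight)

    def forward(self, input_embeds):             # (batch, seq, d_model)
        batch = input_embeds.size(0)
        p = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([p, input_embeds], dim=1)

def neuron_overlap_rate(model, prompt_a, prompt_b):
    """Overlapping rate of activated FFN neurons for two prompts."""
    with torch.no_grad():
        _, act_a = model(prompt_a.prompt.unsqueeze(0))
        _, act_b = model(prompt_b.prompt.unsqueeze(0))
    on_a = (act_a > 0).reshape(-1, act_a.size(-1)).any(dim=0)  # fired for any token
    on_b = (act_b > 0).reshape(-1, act_b.size(-1)).any(dim=0)
    inter = (on_a & on_b).sum().float()
    union = (on_a | on_b).sum().float().clamp(min=1)
    return (inter / union).item()

# Usage: only the target prompt is optimized; the encoder is untouched.
encoder = FrozenEncoder()
source_prompt = SoftPrompt()                     # pretend: trained on a source task
target_prompt = SoftPrompt(init_from=source_prompt.prompt.detach())
optimizer = torch.optim.Adam(target_prompt.parameters(), lr=3e-2)

x = torch.randn(8, 16, 128)                      # fake input embeddings and labels
y = torch.randint(0, 2, (8,))
logits, _ = encoder(target_prompt(x))
loss = nn.functional.cross_entropy(logits, y)
loss.backward()
optimizer.step()

print("activated-neuron overlap:",
      neuron_overlap_rate(encoder, source_prompt, target_prompt))
```

Cross-model transfer would additionally require a projector, e.g. a small MLP mapping prompts from one PLM's embedding space into another's; the abstract reports that such a projector transfers non-trivially on similar tasks but does not work well as a PT initializer.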
