Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models

03/07/2022
by Shengnan An, et al.

Recently, the prompt-tuning paradigm has attracted significant attention. By only tuning continuous prompts with a frozen pre-trained language model (PLM), prompt-tuning takes a step towards deploying a shared frozen PLM to serve numerous downstream tasks. Although prompt-tuning shows good performance on certain natural language understanding (NLU) tasks, its effectiveness on natural language generation (NLG) tasks is still under-explored. In this paper, we argue that one of the factors hindering the development of prompt-tuning on NLG tasks is unfamiliar inputs (i.e., inputs that are linguistically different from the pretraining corpus). For example, our preliminary exploration reveals a large performance gap between prompt-tuning and fine-tuning when unfamiliar inputs occur frequently in NLG tasks. This motivates us to propose input-tuning, which fine-tunes both the continuous prompts and the input representations, leading to a more effective way to adapt unfamiliar inputs to frozen PLMs. Our proposed input-tuning is conceptually simple and empirically powerful. Experimental results on seven NLG tasks demonstrate that input-tuning is significantly and consistently better than prompt-tuning. Furthermore, on three of these tasks, input-tuning can achieve a comparable or even better performance than fine-tuning.
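As a rough illustration of the idea (not the authors' exact architecture), the sketch below freezes a pretrained GPT-2, prepends trainable continuous prompts, and passes the frozen token embeddings through a small trainable adapter before the transformer. The prompt length, adapter size, and residual connection are assumptions made for the example.

import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel

class InputTunedLM(nn.Module):
    """Frozen PLM + trainable soft prompts + trainable input adapter (sketch)."""
    def __init__(self, model_name="gpt2", prompt_len=10, adapter_dim=256):
        super().__init__()
        self.plm = GPT2LMHeadModel.from_pretrained(model_name)
        for p in self.plm.parameters():          # keep the pretrained LM frozen
            p.requires_grad = False
        hidden = self.plm.config.n_embd
        # continuous prompts, as in prompt-tuning
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)
        # lightweight adapter that re-maps (unfamiliar) input embeddings
        self.input_adapter = nn.Sequential(
            nn.Linear(hidden, adapter_dim),
            nn.Tanh(),
            nn.Linear(adapter_dim, hidden),
        )

    def forward(self, input_ids, labels=None):
        embeds = self.plm.transformer.wte(input_ids)        # frozen token embeddings
        embeds = embeds + self.input_adapter(embeds)        # residual input adaptation
        prompt = self.soft_prompt.unsqueeze(0).expand(embeds.size(0), -1, -1)
        inputs_embeds = torch.cat([prompt, embeds], dim=1)  # prepend soft prompts
        if labels is not None:
            pad = labels.new_full((labels.size(0), prompt.size(1)), -100)
            labels = torch.cat([pad, labels], dim=1)        # no loss on prompt slots
        return self.plm(inputs_embeds=inputs_embeds, labels=labels)

Under this sketch, only the soft prompts and the input adapter receive gradients, so the optimizer is built over those few parameters (e.g., torch.optim.AdamW over parameters with requires_grad=True) while the PLM itself stays shared and frozen across tasks.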

Related research

Prefix-Tuning: Optimizing Continuous Prompts for Generation (01/01/2021)
Fine-tuning is the de facto way to leverage large pretrained language mo...

GPT Understands, Too (03/18/2021)
While GPTs with traditional fine-tuning fail to achieve strong results o...

Context-Tuning: Learning Contextualized Prompts for Natural Language Generation (01/21/2022)
Recently, pretrained language models (PLMs) have made exceptional succes...

Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE (02/18/2023)
This technical report briefly describes our JDExplore d-team's submissio...

A Closer Look at Parameter-Efficient Tuning in Diffusion Models (03/31/2023)
Large-scale diffusion models like Stable Diffusion are powerful and find...

Enhancing Transformers with Gradient Boosted Decision Trees for NLI Fine-Tuning (05/08/2021)
Transfer learning has become the dominant paradigm for many natural lang...

On-the-fly Improving Performance of Deep Code Models via Input Denoising (08/19/2023)
Deep learning has been widely adopted to tackle various code-based tasks...
