Prefix Propagation: Parameter-Efficient Tuning for Long Sequences

05/20/2023
by Jonathan Li, et al.

Parameter-efficient tuning aims to mitigate the large memory requirements of adapting pretrained language models for downstream tasks. For example, one popular method, prefix-tuning, prepends trainable tokens to sequences while freezing the rest of the model's parameters. Although such models attain comparable performance to fine-tuning on sequences of short to moderate length, we show their inferior performance when modelling long sequences. To bridge this gap, we propose prefix-propagation, a simple but effective approach that conditions prefixes on previous hidden states. We empirically demonstrate that prefix-propagation outperforms prefix-tuning across long-document tasks, while using 50% fewer parameters. To further investigate the proposed architecture, we also show its advantage in calibration, and perform an additional study on its relationship with kernel attention. To the best of our knowledge, this work is the first to focus on parameter-efficient learning for long-sequence language tasks.
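To make the contrast concrete, here is a minimal sketch of the prefix-tuning attention step the abstract describes: trainable prefix key/value vectors are prepended to the (frozen) model's keys and values before attention. All names and shapes below are illustrative assumptions, not the paper's implementation; in prefix-propagation, by contrast, the prefix states would additionally be conditioned on the previous layer's hidden states rather than staying fixed per layer.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def prefix_tuning_attention(q, k, v, prefix_k, prefix_v):
    """Single-head attention with trainable prefix vectors (a sketch).

    q, k, v          : (n, d) projections from the frozen model
    prefix_k/prefix_v: (p, d) trainable prefix keys/values -- the only
                       parameters updated during prefix-tuning
    Returns a (n, d) attention output over the p + n positions.
    """
    k_all = np.concatenate([prefix_k, k], axis=0)          # (p + n, d)
    v_all = np.concatenate([prefix_v, v], axis=0)          # (p + n, d)
    scores = q @ k_all.T / np.sqrt(q.shape[-1])            # (n, p + n)
    return softmax(scores, axis=-1) @ v_all                # (n, d)

# Illustrative usage with random tensors (n = 4 tokens, p = 2 prefixes)
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((4, 8)) for _ in range(3))
prefix_k, prefix_v = (rng.standard_normal((2, 8)) for _ in range(2))
out = prefix_tuning_attention(q, k, v, prefix_k, prefix_v)  # shape (4, 8)
```

Because only `prefix_k` and `prefix_v` receive gradients, the memory cost of adaptation is tiny relative to fine-tuning the full model.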


