Composing Parameter-Efficient Modules with Arithmetic Operations

06/26/2023
by Jinghan Zhang, et al.

As an efficient alternative to conventional full finetuning, parameter-efficient finetuning (PEFT) is becoming the prevailing method to adapt pretrained language models. In PEFT, a lightweight module is learned on each dataset while the underlying pretrained language model remains unchanged, resulting in multiple compact modules representing diverse skills when applied across various domains and tasks. In this paper, we propose to compose these parameter-efficient modules through linear arithmetic operations in the weight space, thereby integrating different module capabilities. Specifically, we first define addition and negation operators for the modules, and then further compose these two basic operators to perform flexible arithmetic. Our approach requires no additional training and enables highly flexible module composition. We apply different arithmetic operations to compose the parameter-efficient modules for (1) distribution generalization, (2) multi-tasking, (3) unlearning, and (4) domain transfer. Additionally, we extend our approach to detoxify Alpaca-LoRA, the latest instruction-tuned large language model based on LLaMA. Empirical results demonstrate that our approach produces new and effective parameter-efficient modules that significantly outperform existing ones across all settings.
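To make the weight-space arithmetic concrete, the sketch below shows one way such composition could look for LoRA-style modules, assuming each module is represented as a dictionary of weight-space deltas keyed by parameter name. The helper names (add_modules, negate_module, the scale factors) and this representation are illustrative assumptions, not the operators or API defined in the paper.

```python
# Minimal sketch of weight-space composition for parameter-efficient modules.
# Assumes each module is a dict mapping parameter names to delta tensors.
from typing import Dict
import torch


def add_modules(module_a: Dict[str, torch.Tensor],
                module_b: Dict[str, torch.Tensor],
                scale_a: float = 1.0,
                scale_b: float = 1.0) -> Dict[str, torch.Tensor]:
    """Addition operator: linearly combine two modules' weight deltas."""
    assert module_a.keys() == module_b.keys(), "modules must share parameter names"
    return {name: scale_a * module_a[name] + scale_b * module_b[name]
            for name in module_a}


def negate_module(module: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]:
    """Negation operator: flip the sign of a module's deltas (e.g. for unlearning)."""
    return {name: -delta for name, delta in module.items()}


# Example composition: merge skills A and B while removing an unwanted skill C.
# composed = add_modules(add_modules(lora_a, lora_b), negate_module(lora_c))
```

Because these operators act linearly and directly on module parameters, composing modules in this way requires no gradient updates, which is consistent with the training-free property highlighted in the abstract.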
