Task Adaptive Parameter Sharing for Multi-Task Learning

03/30/2022
by Matthew Wallingford, et al.

Adapting pre-trained models with broad capabilities has become standard practice for learning a wide range of downstream tasks. The typical approach of fine-tuning a separate model for each task achieves strong accuracy, but incurs a substantial memory cost, since a full copy of the weights must be stored per task. To learn multiple downstream tasks efficiently, we introduce Task Adaptive Parameter Sharing (TAPS), a general method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers. This enables multi-task learning while minimizing the resources used and the competition between tasks. TAPS solves a joint optimization problem that determines both which layers to share with the base model and the values of the task-specific weights. A sparsity penalty on the number of active layers further encourages weight sharing with the base model. Compared with other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters. Moreover, TAPS is agnostic to the model architecture and requires only minor changes to the training scheme. We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
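The joint optimization described above can be pictured with a short sketch. The snippet below is a minimal, illustrative PyTorch rendering of the idea, not the authors' reference implementation: each layer keeps a frozen shared weight, adds a task-specific residual, and learns a gate via a straight-through estimator, with a penalty on the gates standing in for the sparsity term on active layers. All names here (TAPSLinear, threshold, lam) are our own assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TAPSLinear(nn.Module):
    """Sketch of a TAPS-style layer (illustrative, not the paper's code).
    The frozen base weight is shared across tasks; a task-specific
    residual `delta` is applied only when a learnable gate is active."""

    def __init__(self, base_layer: nn.Linear, threshold: float = 0.1):
        super().__init__()
        self.base = base_layer
        for p in self.base.parameters():  # shared weights stay frozen
            p.requires_grad_(False)
        # Task-specific residual, initialized to zero so training
        # starts exactly from the pre-trained solution.
        self.delta = nn.Parameter(torch.zeros_like(self.base.weight))
        # Learnable score; the layer is "active" when score > threshold.
        self.score = nn.Parameter(torch.tensor(0.5))
        self.threshold = threshold

    def gate(self) -> torch.Tensor:
        hard = (self.score > self.threshold).float()
        # Straight-through estimator: the forward pass uses the hard 0/1
        # gate, the backward pass sends gradients to the continuous score.
        return hard + self.score - self.score.detach()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weight = self.base.weight + self.gate() * self.delta
        return F.linear(x, weight, self.base.bias)


def sparsity_penalty(model: nn.Module, lam: float = 0.25) -> torch.Tensor:
    """Penalize the number of active (task-specific) layers, encouraging
    most layers to fall back to the shared base weights."""
    gates = [m.gate() for m in model.modules() if isinstance(m, TAPSLinear)]
    return lam * torch.stack(gates).sum()
```

During training, the task loss plus sparsity_penalty(model) is minimized; after training, any layer whose gate is inactive reverts exactly to the shared base weights, so only the deltas of the active layers need to be stored per task.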


Related Research

06/08/2021
Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks
State-of-the-art parameter-efficient fine-tuning methods rely on introdu...

07/01/2019
Pentagon at MEDIQA 2019: Multi-task Learning for Filtering and Re-ranking Answers using Language Inference and Question Entailment
Parallel deep learning architectures like fine-tuned BERT and MT-DNN, ha...

11/17/2022
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
Despite the remarkable success of foundation models, their task-specific...

03/08/2023
Provable Pathways: Learning Multiple Tasks over Multiple Paths
Constructing useful representations across a large number of tasks is a ...

06/29/2019
NetTailor: Tuning the Architecture, Not Just the Weights
Real-world applications of object recognition often require the solution...

05/04/2023
Neuralizer: General Neuroimage Analysis without Re-Training
Neuroimage processing tasks like segmentation, reconstruction, and regis...

12/14/2020
Parameter-Efficient Transfer Learning with Diff Pruning
While task-specific finetuning of pretrained networks has led to signifi...
