One Model, Multiple Tasks: Pathways for Natural Language Understanding

03/07/2022
by   Duyu Tang, et al.

This paper presents a Pathways approach to handling many tasks at once. Our approach is general-purpose and sparse. Unlike prevailing single-purpose models that overspecialize in individual tasks and learn from scratch when extended to new tasks, our approach is general-purpose, with the ability to stitch together existing skills to learn new tasks more effectively. Unlike traditional dense models that always activate all model parameters, our approach is sparsely activated: only the relevant parts of the model (like pathways through the network) are activated. We take natural language understanding as a case study and define a set of skills, such as understanding the sentiment of text and understanding natural language questions. These skills can be reused and combined to support many different tasks and situations. We build our system on a Transformer backbone. For each skill, we implement a skill-specific feed-forward network, which is activated only if the skill is relevant to the task. An appealing feature of our model is that it not only supports sparsely activated fine-tuning but also allows us to pretrain skills in the same sparse way with masked language modeling and next sentence prediction. We call this model SkillNet. We have three major findings. First, with only one model checkpoint, SkillNet performs better than task-specific fine-tuning and two multi-task learning baselines (a dense model and a Mixture-of-Experts model) on six tasks. Second, sparsely activated pre-training further improves overall performance. Third, SkillNet significantly outperforms the baseline systems when extended to new tasks.
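To make the sparse-activation idea concrete, here is a minimal PyTorch sketch of a Transformer layer whose feed-forward sub-layer is replaced by a pool of per-skill feed-forward networks, of which only the task-relevant ones run. The class names, skill names, and the averaging of active-skill outputs are illustrative assumptions, not the paper's released implementation.

```python
# Minimal sketch of a SkillNet-style sparsely activated layer.
# All names (SkillFFN, SkillNetLayer, the skill labels) are assumptions
# made for illustration, not the authors' code.

import torch
import torch.nn as nn

class SkillFFN(nn.Module):
    """One feed-forward block per skill (same shape as a standard Transformer FFN)."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class SkillNetLayer(nn.Module):
    """A Transformer layer whose FFN is a pool of skill-specific FFNs.

    For a given task, only the FFNs of the relevant skills are run and
    their outputs averaged; parameters of inactive skills take no part
    in the forward pass, so they receive no gradient during training.
    """
    def __init__(self, d_model: int, n_heads: int, d_hidden: int, skills: list):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.skill_ffns = nn.ModuleDict({s: SkillFFN(d_model, d_hidden) for s in skills})
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor, active_skills: list) -> torch.Tensor:
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + attn_out)
        # Sparse activation: run only the FFNs of task-relevant skills.
        ffn_out = torch.stack(
            [self.skill_ffns[s](x) for s in active_skills]
        ).mean(dim=0)
        return self.norm2(x + ffn_out)

# Example: a sentiment task might activate a generic language skill plus
# a sentiment skill (skill names here are hypothetical).
layer = SkillNetLayer(d_model=256, n_heads=4, d_hidden=1024,
                      skills=["generic", "sentiment", "question", "nli"])
hidden = torch.randn(2, 16, 256)                      # (batch, seq, d_model)
out = layer(hidden, active_skills=["generic", "sentiment"])
print(out.shape)                                      # torch.Size([2, 16, 256])
```

Because only the activated skill FFNs participate in the forward pass, only their parameters receive gradients, which is what makes both the fine-tuning and the MLM/NSP pre-training described above sparse.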
