Lifelong Learning of Few-shot Learners across NLP Tasks

04/18/2021
by Xisen Jin et al.

Recent advances in large pre-trained language models have greatly improved performance on a broad set of NLP tasks. However, adapting an existing model to new tasks often requires (repeated) re-training on large amounts of labeled data that are prohibitively expensive to obtain. Moreover, a model trained on new tasks may gradually "forget" the knowledge it learned from earlier tasks (i.e., catastrophic forgetting). In this paper, we study the challenge of lifelong learning to few-shot learn over a sequence of diverse NLP tasks by continuously fine-tuning a language model. We investigate the model's ability to generalize to new tasks from a few examples while retaining its performance on previously learned tasks. We explore existing continual learning methods for this problem and propose a continual meta-learning approach that learns to generate adapter weights from a few examples while regularizing changes to those weights to mitigate catastrophic forgetting. We demonstrate that our approach preserves model performance on the training tasks and leads to positive knowledge transfer when future tasks are learned.
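To make the general recipe in the abstract concrete, below is a minimal, hypothetical PyTorch sketch: a small hypernetwork generates bottleneck-adapter weights for a frozen language model from the mean embedding of a few support examples, and an L2 penalty on how far the generated weights drift from an earlier task's weights stands in for the forgetting regularizer. All module names, dimensions, and the exact form of the penalty are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch, assuming a hypernetwork-style adapter generator and a simple
# L2 drift penalty. Names, sizes, and the penalty form are illustrative
# assumptions; this is not the paper's implementation.
import torch
import torch.nn as nn


class AdapterGenerator(nn.Module):
    """Generates bottleneck-adapter weights from a few support-example embeddings."""

    def __init__(self, embed_dim: int = 768, bottleneck: int = 64):
        super().__init__()
        self.embed_dim = embed_dim
        self.bottleneck = bottleneck
        # One head per generated adapter matrix (down- and up-projection).
        self.gen_down = nn.Linear(embed_dim, bottleneck * embed_dim)
        self.gen_up = nn.Linear(embed_dim, embed_dim * bottleneck)

    def forward(self, support_embeds: torch.Tensor):
        # support_embeds: [num_shots, embed_dim]; average into a task representation.
        task_repr = support_embeds.mean(dim=0)
        w_down = self.gen_down(task_repr).view(self.bottleneck, self.embed_dim)
        w_up = self.gen_up(task_repr).view(self.embed_dim, self.bottleneck)
        return w_down, w_up


def adapter_forward(hidden, w_down, w_up):
    """Residual bottleneck adapter applied to hidden states of a frozen LM layer."""
    return hidden + torch.relu(hidden @ w_down.t()) @ w_up.t()


def drift_penalty(current, previous, strength: float = 0.1):
    """L2 penalty on how far generated weights move from an earlier task's weights."""
    return strength * sum(((c - p) ** 2).sum() for c, p in zip(current, previous))


if __name__ == "__main__":
    generator = AdapterGenerator()
    prev_support = torch.randn(16, 768)   # support set of a previously learned task
    prev_weights = [w.detach() for w in generator(prev_support)]

    new_support = torch.randn(16, 768)    # few-shot support set of the new task
    hidden = torch.randn(4, 768)          # hidden states from the frozen language model
    w_down, w_up = generator(new_support)

    output = adapter_forward(hidden, w_down, w_up)
    loss_reg = drift_penalty((w_down, w_up), prev_weights)
    print(output.shape, loss_reg.item())
```

In a sketch like this, only the generator is meta-trained across the task sequence, while the underlying language model stays frozen; the drift penalty is one simple way to discourage the generated adapters from overwriting what was learned on earlier tasks.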
