CALM: Continuous Adaptive Learning for Language Modeling

04/08/2020
by Kristjan Arumae et al.

Training large language representation models has become a standard in the natural language processing community. These pre-trained models can be fine-tuned on any number of downstream tasks, and their high capacity also allows them to continue training on domain-specific unlabeled data, making the initialization even more robust for supervised tasks in that domain. We demonstrate that in practice such continued pre-training causes performance deterioration, in the form of catastrophic forgetting, when the models are evaluated on tasks from a general domain such as GLUE. In this work we propose CALM, Continuous Adaptive Learning for Language Modeling: techniques that yield models which retain knowledge across multiple domains. With these methods we are able to reduce the performance gap on supervised tasks introduced by domain-specific models, which we demonstrate in a continual learning setting over the biomedical and clinical domains.
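
As a concrete illustration, below is a minimal sketch of one standard mitigation for this kind of forgetting: rehearsal during continued masked-LM pre-training, where batches from the original general domain are interleaved with batches from the new domain. It uses the Hugging Face transformers API; the corpora, model checkpoint, mixing ratio, and hyperparameters are illustrative assumptions, not the paper's exact CALM recipe.

```python
# Rehearsal-style continued pre-training (sketch). General-domain batches are
# interleaved with domain-specific batches under the masked-LM objective so
# that adapting to the new domain does not overwrite general-domain knowledge.
# Corpora, model, mixing ratio, and hyperparameters below are illustrative.
import random
from itertools import cycle

import torch
from torch.utils.data import DataLoader
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling)

# Tiny stand-in corpora; in practice these would be large unlabeled datasets.
general_corpus = [
    "The committee approved the new budget on Tuesday.",
    "Stock markets rallied after the announcement.",
] * 50
domain_corpus = [
    "The patient presented with acute dyspnea and tachycardia.",
    "MRI revealed a small lesion in the left temporal lobe.",
] * 50

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

def make_loader(texts):
    # Tokenize once; the collator pads each batch and masks 15% of tokens.
    encodings = [tokenizer(t, truncation=True, max_length=128) for t in texts]
    return DataLoader(encodings, batch_size=8, shuffle=True, collate_fn=collator)

general_batches = cycle(make_loader(general_corpus))
domain_batches = cycle(make_loader(domain_corpus))

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
rehearsal_ratio = 0.25  # assumed fraction of steps drawn from the general domain

model.train()
for step in range(200):
    # Rehearsal: occasionally train on the original (general) domain.
    if random.random() < rehearsal_ratio:
        batch = next(general_batches)
    else:
        batch = next(domain_batches)
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    if step % 50 == 0:
        print(f"step {step:4d}  loss {loss.item():.3f}")
```

The mixing ratio trades off plasticity on the new domain against retention of the old one; a held-out general-domain benchmark such as GLUE is the natural way to check that forgetting is actually reduced.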

Related research

12/05/2021 · Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning
Continual learning (CL) learns a sequence of tasks incrementally with th...

10/28/2016 · Towards a continuous modeling of natural language domains
Humans continuously adapt their style and language to a variety of domai...

12/14/2022 · MANTa: Efficient Gradient-Based Tokenization for Robust End-to-End Language Modeling
Static subword tokenization algorithms have been an essential component ...

06/25/2021 · Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains
Large pre-trained models have achieved great success in many natural lan...

03/20/2022 · Hierarchical Inductive Transfer for Continual Dialogue Learning
Pre-trained models have achieved excellent performance on the dialogue t...

07/18/2022 · The Multiple Subnetwork Hypothesis: Enabling Multidomain Learning by Isolating Task-Specific Subnetworks in Feedforward Neural Networks
Neural networks have seen an explosion of usage and research in the past...

01/31/2019 · Learning and Evaluating General Linguistic Intelligence
We define general linguistic intelligence as the ability to reuse previo...
