Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

04/23/2020
by Suchin Gururangan, et al.

Language models pretrained on text from a wide variety of sources form the foundation of today's NLP. In light of the success of these broad-coverage models, we investigate whether it is still helpful to tailor a pretrained model to the domain of a target task. We present a study across four domains (biomedical and computer science publications, news, and reviews) and eight classification tasks, showing that a second phase of pretraining in-domain (domain-adaptive pretraining) leads to performance gains, under both high- and low-resource settings. Moreover, adapting to the task's unlabeled data (task-adaptive pretraining) improves performance even after domain-adaptive pretraining. Finally, we show that adapting to a task corpus augmented using simple data selection strategies is an effective alternative, especially when resources for domain-adaptive pretraining might be unavailable. Overall, we consistently find that multi-phase adaptive pretraining offers large gains in task performance.
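To make the recipe concrete, below is a minimal sketch of the second pretraining phase, assuming a RoBERTa checkpoint and the Hugging Face transformers and datasets libraries (the paper's released code differs; the corpus file name, hyperparameters, and output directory here are illustrative placeholders, not the authors' settings). Domain-adaptive pretraining (DAPT) runs this on a large unlabeled domain corpus; task-adaptive pretraining (TAPT) runs the same procedure on the task's own unlabeled text.

```python
# Minimal sketch of a second phase of masked-LM pretraining
# (DAPT or TAPT), assuming Hugging Face transformers/datasets.
# Paths and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForMaskedLM.from_pretrained("roberta-base")

# Unlabeled adaptation text, one document per line;
# "domain_corpus.txt" is a hypothetical placeholder.
corpus = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic 15% token masking for the masked-LM objective.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="adapted-roberta",
    per_device_train_batch_size=8,
    num_train_epochs=1,      # one pass over the adaptation corpus
    learning_rate=1e-4,      # illustrative; tune for corpus size
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
).train()

# The adapted checkpoint in "adapted-roberta" is then fine-tuned
# on the labeled target task as usual.
```

The "simple data selection" alternative augments a small task corpus with its nearest neighbors from the larger domain corpus before running TAPT. The paper retrieves neighbors with lightweight VAMPIRE embeddings; the sketch below swaps in TF-IDF similarity via scikit-learn to show the shape of the idea, with placeholder documents and a hypothetical cutoff k.

```python
# Hypothetical sketch of nearest-neighbor data selection for TAPT,
# using TF-IDF cosine similarity in place of the paper's VAMPIRE
# embeddings. Documents and k are placeholders.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

task_docs = ["task sentence one", "task sentence two"]          # placeholder
domain_docs = ["domain doc a", "domain doc b", "domain doc c"]  # placeholder

vec = TfidfVectorizer().fit(task_docs + domain_docs)
T = vec.transform(task_docs)
D = vec.transform(domain_docs)

# For each task document, keep its k most similar domain documents.
k = 2
sims = cosine_similarity(T, D)              # shape (n_task, n_domain)
nearest = np.argsort(-sims, axis=1)[:, :k]  # top-k domain indices per row
selected = {domain_docs[j] for row in nearest for j in row}

# "selected" augments the task corpus before the second
# pretraining phase, approximating the paper's kNN-TAPT variant.
```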


Related research

07/22/2021
Back-Translated Task Adaptive Pretraining: Improving Accuracy and Robustness on Text Classification
Language models (LMs) pretrained on a large text corpus and fine-tuned o...

11/14/2021
Time Waits for No One! Analysis and Challenges of Temporal Misalignment
When an NLP model is trained on text data from one time period and teste...

05/22/2023
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, Toxicity
Pretraining is the preliminary and fundamental step in developing capabl...

04/17/2021
Combating Temporal Drift in Crisis with Adapted Embeddings
Language usage changes over time, and this can impact the effectiveness ...

05/28/2021
Domain-Adaptive Pretraining Methods for Dialogue Understanding
Language models like BERT and SpanBERT pretrained on open-domain data ha...

11/11/2021
Improving Large-scale Language Models and Resources for Filipino
In this paper, we improve on existing language resources for the low-res...

07/31/2023
Pretrained deep models outperform GBDTs in Learning-To-Rank under label scarcity
While deep learning (DL) models are state-of-the-art in text and image d...
