Pre-train or Annotate? Domain Adaptation with a Constrained Budget

09/10/2021
by Fan Bai et al.

Recent work has demonstrated that pre-training in-domain language models can boost performance when adapting to a new domain. However, the costs associated with pre-training raise an important question: given a fixed budget, what steps should an NLP practitioner take to maximize performance? In this paper, we study domain adaptation under budget constraints and approach it as a consumer choice problem between data annotation and pre-training. Specifically, we measure the annotation cost of three procedural text datasets and the pre-training cost of three in-domain language models. We then evaluate the utility of different combinations of pre-training and data annotation under varying budget constraints to determine which strategy works best. We find that, for small budgets, spending all funds on annotation leads to the best performance; once the budget becomes large enough, combining data annotation with in-domain pre-training yields better results. We therefore suggest that task-specific data annotation should be part of an economical strategy when adapting an NLP model to a new domain.
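To illustrate the budget-allocation trade-off described above, the sketch below enumerates a few candidate splits of a fixed budget between in-domain pre-training and data annotation and picks the one with the highest estimated utility. This is a minimal illustration, not the authors' code: the budget, the per-example annotation cost, the pre-training options, and the estimated_performance utility model are all hypothetical placeholders.

    import math

    # Illustrative sketch: split a fixed budget between in-domain pre-training
    # and task-specific annotation. All costs and the utility model below are
    # hypothetical placeholders, not figures from the paper.

    BUDGET = 1000.0          # total budget in dollars (hypothetical)
    COST_PER_EXAMPLE = 0.50  # hypothetical annotation cost per labeled example

    PRETRAINING_COSTS = {    # hypothetical compute cost of each pre-training option
        "no-pretraining": 0.0,
        "task-adaptive": 150.0,
        "domain-adaptive": 600.0,
    }

    def estimated_performance(n_examples, pretraining):
        """Placeholder utility model: diminishing returns on annotation plus a
        fixed boost from in-domain pre-training (illustrative numbers only)."""
        boost = {"no-pretraining": 0.0, "task-adaptive": 2.0, "domain-adaptive": 4.0}
        return 60.0 + 8.0 * math.log1p(n_examples / 100.0) + boost[pretraining]

    def best_allocation(budget):
        """Enumerate pre-training options; spend whatever remains on annotation."""
        best = None
        for option, pretrain_cost in PRETRAINING_COSTS.items():
            remaining = budget - pretrain_cost
            if remaining < 0:
                continue  # this pre-training option alone exceeds the budget
            n_examples = int(remaining / COST_PER_EXAMPLE)
            score = estimated_performance(n_examples, option)
            if best is None or score > best[0]:
                best = (score, option, n_examples)
        return best

    if __name__ == "__main__":
        score, option, n_examples = best_allocation(BUDGET)
        print(f"Best strategy: {option} with {n_examples} annotated examples "
              f"(estimated score {score:.1f})")

With a small enough budget, only the annotation-only option fits, which mirrors the paper's finding that small budgets are best spent entirely on labeling; larger budgets leave room for an in-domain pre-training option on top of annotation.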


Related research

09/17/2021
Task-adaptive Pre-training of Language Models with Word Embedding Regularization
Pre-trained language models (PTLMs) acquire domain-independent linguisti...

07/14/2022
BERTIN: Efficient Pre-Training of a Spanish Language Model using Perplexity Sampling
The pre-training of large language models usually requires massive amoun...

03/22/2022
A Broad Study of Pre-training for Domain Generalization and Adaptation
Deep models must learn robust and transferable representations in order ...

03/21/2021
AdaptSum: Towards Low-Resource Domain Adaptation for Abstractive Summarization
State-of-the-art abstractive summarization models generally rely on exte...

06/27/2023
Large Language Models as Annotators: Enhancing Generalization of NLP Models at Minimal Cost
State-of-the-art supervised NLP models achieve high accuracy but are als...

03/31/2022
Domain Adaptation for Sparse-Data Settings: What Do We Gain by Not Using Bert?
The practical success of much of NLP depends on the availability of trai...

11/24/2021
Temporal Effects on Pre-trained Models for Language Processing Tasks
Keeping the performance of language technologies optimal as time passes ...
