Domain Adaptation for Sparse-Data Settings: What Do We Gain by Not Using BERT?

03/31/2022
by Marina Sedinkina et al.

The practical success of much of NLP depends on the availability of training data. However, in real-world scenarios, training data is often scarce, not least because many application domains are restricted and specific. In this work, we compare different methods to handle this problem and provide guidelines for building NLP applications when there is only a small amount of labeled training data available for a specific domain. While transfer learning with pre-trained language models outperforms other methods across tasks, alternatives do not perform much worse while requiring much less computational effort, thus significantly reducing monetary and environmental cost. We examine the performance tradeoffs of several such alternatives, including models that can be trained up to 175K times faster and do not require a single GPU.
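The abstract does not name the cheaper alternatives it benchmarks, but a minimal sketch of the kind of CPU-only, seconds-to-train model it alludes to is a bag-of-words multinomial Naive Bayes classifier (illustrative only; the dataset, `NaiveBayes` class, and `tokenize` helper below are hypothetical stand-ins, not the paper's actual methods):

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase whitespace tokenization; a real system would use a proper tokenizer.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing: trains in seconds
    on a CPU, with no GPU or pre-trained language model required."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.priors = Counter(labels)                      # class frequencies
        self.word_counts = {c: Counter() for c in self.classes}
        self.totals = defaultdict(int)                     # tokens per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.totals[label] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        n_docs = sum(self.priors.values())
        v = len(self.vocab)
        best, best_lp = None, -math.inf
        for c in self.classes:
            # log prior + sum of smoothed log likelihoods
            lp = math.log(self.priors[c] / n_docs)
            for tok in tokenize(text):
                lp += math.log(
                    (self.word_counts[c][tok] + 1) / (self.totals[c] + v)
                )
            if lp > best_lp:
                best, best_lp = c, lp
        return best
```

With a handful of labeled in-domain sentences, `NaiveBayes().fit(texts, labels).predict(new_text)` returns a label immediately; the gap to a fine-tuned language model on such tasks is the tradeoff the paper quantifies.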


