On the Complementarity of Data Selection and Fine Tuning for Domain Adaptation

09/15/2021
by   Dan Iter, et al.

Domain adaptation of neural networks commonly relies on three training phases: pretraining, training on selected data, and fine-tuning. Data selection improves target-domain generalization by training further on pretraining data that is identified using a small sample of target-domain data. This work examines the benefit of data selection for language modeling and machine translation. Our experiments assess the complementarity of selection with fine-tuning and result in practical recommendations: (i) selected data must be similar to the fine-tuning domain, but not so similar as to erode the complementary effect of fine-tuning; (ii) there is a trade-off between selecting little data for fast but limited progress and selecting much data for slow but long-lasting progress; (iii) data selection can be applied early during pretraining, with performance gains comparable to those of a long pretraining session; (iv) data selection from domain classifiers is often more effective than the popular contrastive data selection method.
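To make the notion of contrastive data selection more concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. In one common formulation, each pretraining example is scored by the difference between its log-likelihood under a target-domain model and under a general model, and the highest-scoring examples are kept. Here both models are toy add-one-smoothed unigram models rather than neural networks, and the names `unigram_logprob`, `contrastive_scores`, and `select_top_k` are hypothetical.

```python
import math
from collections import Counter


def unigram_logprob(corpus, vocab_size):
    """Build an add-one smoothed unigram log-probability function (toy stand-in for a neural LM)."""
    counts = Counter(tok for sent in corpus for tok in sent.split())
    total = sum(counts.values()) + vocab_size
    return lambda tok: math.log((counts.get(tok, 0) + 1) / total)


def contrastive_scores(pool, target_sample):
    """Score each pool sentence by mean per-token log p_target(tok) - log p_general(tok)."""
    vocab = {tok for sent in pool + target_sample for tok in sent.split()}
    lp_target = unigram_logprob(target_sample, len(vocab))
    lp_general = unigram_logprob(pool, len(vocab))
    scores = []
    for sent in pool:
        toks = sent.split()
        diff = sum(lp_target(t) - lp_general(t) for t in toks)
        scores.append(diff / max(len(toks), 1))  # length-normalize; guard empty lines
    return scores


def select_top_k(pool, target_sample, k):
    """Return the k pool sentences that look most like the target sample."""
    scores = contrastive_scores(pool, target_sample)
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return [pool[i] for i in ranked[:k]]


# Hypothetical toy data: a general pretraining pool and a small target-domain sample.
pool = [
    "the patient received a dose of the drug",
    "the team won the football match",
    "the drug trial enrolled each patient",
    "stocks fell after the market opened",
]
target_sample = [
    "each patient received the drug",
    "the dose of the drug was reduced",
]
selected = select_top_k(pool, target_sample, k=2)
```

The choice of `k` corresponds to the trade-off in recommendation (ii): a small `k` yields a narrow, highly target-like subset, while a large `k` yields a broader one. A domain-classifier variant would replace the log-likelihood difference with the classifier's predicted probability that an example belongs to the target domain.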


