Multi-Stage Pre-training for Low-Resource Domain Adaptation

10/12/2020
by Rong Zhang, et al.

Transfer learning techniques are particularly useful in NLP tasks for which a sizable amount of high-quality annotated data is difficult to obtain. Current approaches directly adapt a pre-trained language model (LM) on in-domain text before fine-tuning it on downstream tasks. We show that extending the vocabulary of the LM with domain-specific terms leads to further gains. To an even greater effect, we utilize structure in the unlabeled data to create auxiliary synthetic tasks, which help the LM transfer to downstream tasks. We apply these approaches incrementally to a pre-trained RoBERTa-large LM and show considerable performance gains on three tasks in the IT domain: Extractive Reading Comprehension, Document Ranking, and Duplicate Question Detection.
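As a rough illustration of the vocabulary-extension step, the sketch below selects frequent in-domain words that are absent from a base vocabulary. This is a hypothetical frequency-based heuristic, not the paper's actual selection criterion; the toy corpus, the `select_domain_terms` helper, and its parameters are all invented for illustration.

```python
import re
from collections import Counter

def select_domain_terms(corpus, base_vocab, top_k=100, min_freq=2):
    """Pick frequent in-domain words missing from the base vocabulary.

    Hypothetical sketch: real systems would operate on subword units
    and a much larger corpus.
    """
    tokens = re.findall(r"[a-z][a-z0-9_]+", " ".join(corpus).lower())
    counts = Counter(t for t in tokens if t not in base_vocab)
    return [t for t, c in counts.most_common(top_k) if c >= min_freq]

# Toy IT-domain corpus and a tiny stand-in base vocabulary.
corpus = [
    "Restart the systemd service after editing the kubeconfig file",
    "The kubeconfig credentials expired, so systemd could not start it",
]
base_vocab = {"restart", "the", "service", "after", "editing", "file",
              "credentials", "expired", "so", "could", "not", "start", "it"}

print(select_domain_terms(corpus, base_vocab, top_k=5))
```

In practice, the selected terms would be appended to the LM's tokenizer vocabulary and their embeddings initialized before continued pre-training on in-domain text.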


