Adapting a Language Model While Preserving its General Knowledge

01/21/2023
by Zixuan Ke, et al.

Domain-adaptive pre-training (or DA-training for short), also known as post-training, aims to train a pre-trained general-purpose language model (LM) on an unlabeled corpus from a particular domain so that end-tasks in that domain achieve improved performance. However, existing DA-training methods are, in some sense, blind: they do not explicitly identify which knowledge in the LM should be preserved and which should be changed by the domain corpus. This paper shows that the existing methods are suboptimal and proposes a novel method that performs a more informed adaptation of the knowledge in the LM by (1) soft-masking the attention heads based on their importance, to best preserve the general knowledge in the LM, and (2) contrasting the representations of the general knowledge and the full knowledge (both general and domain-specific) to learn an integrated representation containing both. Experimental results demonstrate the effectiveness of the proposed approach.
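As a rough illustration of the two mechanisms described in the abstract, here is a minimal PyTorch sketch. The function names, the (1 - importance) gradient mask, and the InfoNCE-style loss with a 0.1 temperature are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F


def soft_mask_head_gradients(head_params, importance):
    """Scale each attention head's parameter gradients by (1 - importance),
    so heads judged important for general knowledge receive smaller updates
    during domain-adaptive pre-training. `importance` is assumed to be a
    non-negative score per head, however it was estimated."""
    mask = 1.0 - importance / (importance.max() + 1e-12)  # (num_heads,) in [0, 1]
    for param, m in zip(head_params, mask):
        if param.grad is not None:
            param.grad.mul_(m)


def general_vs_full_contrastive_loss(full_repr, general_repr, temperature=0.1):
    """InfoNCE-style loss: pull each sequence's full (general + domain)
    representation toward its own general-knowledge representation and push
    it away from those of the other sequences in the batch."""
    full = F.normalize(full_repr, dim=-1)       # (batch, hidden)
    gen = F.normalize(general_repr, dim=-1)     # (batch, hidden)
    logits = full @ gen.t() / temperature       # (batch, batch) similarity matrix
    targets = torch.arange(full.size(0), device=full.device)  # positives on the diagonal
    return F.cross_entropy(logits, targets)


if __name__ == "__main__":
    # Toy usage with random tensors standing in for real model outputs.
    full = torch.randn(8, 768, requires_grad=True)
    general = torch.randn(8, 768)
    loss = general_vs_full_contrastive_loss(full, general)
    loss.backward()
    print(f"contrastive loss: {loss.item():.4f}")
```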



Related research

02/07/2023 - Continual Pre-training of Language Models
Language models (LMs) have been instrumental for the rapid advance of na...

04/22/2022 - KALA: Knowledge-Augmented Language Model Adaptation
Pre-trained language models (PLMs) have achieved remarkable success on v...

06/10/2021 - Linguistically Informed Masking for Representation Learning in the Patent Domain
Domain-specific contextualized language models have demonstrated substan...

08/13/2019 - Domain Adaptive Training BERT for Response Selection
We focus on multi-turn response selection in a retrieval-based dialog sy...

09/17/2019 - K-BERT: Enabling Language Representation with Knowledge Graph
Pre-trained language representation models, such as BERT, capture a gene...

05/27/2019 - QuesNet: A Unified Representation for Heterogeneous Test Questions
Understanding learning materials (e.g. test questions) is a crucial issu...

07/10/2023 - KU-DMIS-MSRA at RadSum23: Pre-trained Vision-Language Model for Radiology Report Summarization
In this paper, we introduce CheXOFA, a new pre-trained vision-language m...
