KALA: Knowledge-Augmented Language Model Adaptation

04/22/2022
by Minki Kang, et al.

Pre-trained language models (PLMs) have achieved remarkable success on various natural language understanding tasks. However, simple fine-tuning of PLMs may be suboptimal for domain-specific tasks, because PLMs cannot possibly cover knowledge from all domains. While adaptive pre-training can help PLMs acquire domain-specific knowledge, it incurs a large training cost. Moreover, adaptive pre-training can harm a PLM's performance on the downstream task by causing catastrophic forgetting of its general knowledge. To overcome these limitations of adaptive pre-training for PLM adaptation, we propose a novel domain adaptation framework for PLMs, coined Knowledge-Augmented Language model Adaptation (KALA), which modulates the intermediate hidden representations of PLMs with domain knowledge consisting of entities and their relational facts. We validate KALA on question answering and named entity recognition tasks over multiple datasets across various domains. The results show that, despite being computationally efficient, KALA largely outperforms adaptive pre-training. Code is available at: https://github.com/Nardien/KALA/.
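The idea of modulating a PLM's intermediate hidden representations with entity knowledge can be illustrated with a minimal, hedged sketch. The function and weight names below are illustrative, not the paper's actual implementation: each token's hidden state is scaled and shifted by parameters predicted from an entity embedding (a FiLM-style affine modulation), so tokens without a linked entity (zero embedding) pass through unchanged.

```python
import numpy as np

def kfm_modulate(hidden, entity_emb, W_gamma, W_beta):
    """Knowledge-conditioned feature modulation (illustrative sketch).

    hidden:     (seq_len, hidden_dim) intermediate PLM hidden states
    entity_emb: (seq_len, entity_dim) per-token entity embeddings
                (all-zero rows for tokens with no linked entity)
    W_gamma, W_beta: (entity_dim, hidden_dim) projection matrices
    """
    gamma = entity_emb @ W_gamma          # per-token scale offset
    beta = entity_emb @ W_beta            # per-token shift
    return (1.0 + gamma) * hidden + beta  # FiLM-style affine modulation

rng = np.random.default_rng(0)
seq_len, hidden_dim, entity_dim = 4, 8, 6
hidden = rng.standard_normal((seq_len, hidden_dim))
W_gamma = rng.standard_normal((entity_dim, hidden_dim)) * 0.1
W_beta = rng.standard_normal((entity_dim, hidden_dim)) * 0.1

# Tokens with no linked entity (zero embedding) are left unmodified,
# since gamma and beta both become zero.
no_entity = np.zeros((seq_len, entity_dim))
assert np.allclose(kfm_modulate(hidden, no_entity, W_gamma, W_beta), hidden)
```

Because the modulation reduces to the identity when no entity is linked, such a layer can be added to a frozen or lightly fine-tuned PLM without disturbing its general-domain behavior, which is consistent with the abstract's goal of avoiding catastrophic forgetting.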


