TADA: Efficient Task-Agnostic Domain Adaptation for Transformers

05/22/2023
by Chia-Chien Hung, et al.

Intermediate training of pre-trained transformer-based language models on domain-specific data leads to substantial gains for downstream tasks. To increase efficiency and prevent the catastrophic forgetting that can result from full domain-adaptive pre-training, approaches such as adapters have been developed. However, these require additional parameters for each layer and are criticized for their limited expressiveness. In this work, we introduce TADA, a novel task-agnostic domain adaptation method which is modular, parameter-efficient, and thus data-efficient. Within TADA, we retrain the embeddings to learn domain-aware input representations and tokenizers for the transformer encoder, while freezing all other parameters of the model. Then, task-specific fine-tuning is performed. We further conduct experiments with meta-embeddings and newly introduced meta-tokenizers, resulting in a single model per task in multi-domain use cases. Our broad evaluation on 4 downstream tasks across 14 domains, covering single- and multi-domain setups as well as high- and low-resource scenarios, reveals that TADA is an effective and efficient alternative to full domain-adaptive pre-training and adapters for domain adaptation, while introducing neither additional parameters nor complex training steps.
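The central idea, retraining only the input embeddings of an otherwise frozen transformer encoder on in-domain data, is straightforward to prototype. Below is a minimal sketch using PyTorch and Hugging Face Transformers; the base model name and the use of the masked-language-modeling objective are illustrative assumptions, not the authors' exact configuration, and the domain-specific tokenizer retraining described in the abstract is not shown.

```python
from transformers import AutoModelForMaskedLM

# Assumed base encoder; TADA is model-agnostic, so any BERT-style checkpoint works here.
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Freeze every parameter of the pre-trained model ...
for param in model.parameters():
    param.requires_grad = False

# ... then unfreeze only the input (word) embeddings, so domain-aware
# representations are learned while the encoder itself stays fixed.
# Note: in BERT-style models the MLM output layer is weight-tied to the
# input embeddings, so both are updated together.
for param in model.get_input_embeddings().parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable:,} / {total:,}")

# From here, the embeddings would be trained with the usual masked-language-
# modeling objective on unlabeled in-domain text (e.g. via the Trainer API),
# after which the adapted embeddings are reused for task-specific fine-tuning.
```

Because only the embedding matrix receives gradients, the trainable-parameter count printed above is a small fraction of the full model, which is what makes the approach parameter- and data-efficient compared to full domain-adaptive pre-training.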

Related research

- 09/17/2021 · Task-adaptive Pre-training of Language Models with Word Embedding Regularization
  Pre-trained language models (PTLMs) acquire domain-independent linguisti...
- 03/21/2021 · AdaptSum: Towards Low-Resource Domain Adaptation for Abstractive Summarization
  State-of-the-art abstractive summarization models generally rely on exte...
- 10/12/2020 · Multi-Stage Pre-training for Low-Resource Domain Adaptation
  Transfer learning techniques are particularly useful in NLP tasks where ...
- 06/10/2021 · Linguistically Informed Masking for Representation Learning in the Patent Domain
  Domain-specific contextualized language models have demonstrated substan...
- 10/09/2020 · Style Attuned Pre-training and Parameter Efficient Fine-tuning for Spoken Language Understanding
  Neural models have yielded state-of-the-art results in deciphering spoke...
- 03/22/2022 · A Broad Study of Pre-training for Domain Generalization and Adaptation
  Deep models must learn robust and transferable representations in order ...
- 07/20/2023 · PASTA: Pretrained Action-State Transformer Agents
  Self-supervised learning has brought about a revolutionary paradigm shif...
