DS-TOD: Efficient Domain Specialization for Task Oriented Dialog

by Chia-Chien Hung, et al.

Recent work has shown that self-supervised dialog-specific pretraining on large conversational datasets yields substantial gains over traditional language modeling (LM) pretraining in downstream task-oriented dialog (TOD). These approaches, however, exploit general dialogic corpora (e.g., Reddit) and thus presumably fail to reliably embed domain-specific knowledge useful for concrete downstream TOD domains. In this work, we investigate the effects of domain specialization of pretrained language models (PLMs) for task-oriented dialog. Within our DS-TOD framework, we first automatically extract salient domain-specific terms, and then use them to construct DomainCC and DomainReddit – resources that we leverage for domain-specific pretraining, based on (i) masked language modeling (MLM) and (ii) response selection (RS) objectives, respectively. We further propose a resource-efficient and modular domain specialization by means of domain adapters – additional parameter-light layers in which we encode the domain knowledge. Our experiments with two prominent TOD tasks – dialog state tracking (DST) and response retrieval (RR) – encompassing five domains from the MultiWOZ TOD benchmark demonstrate the effectiveness of our domain specialization approach. Moreover, we show that the lightweight adapter-based specialization (1) performs comparably to full fine-tuning in single-domain setups and (2) is particularly suitable for multi-domain specialization, where, besides its advantageous computational footprint, it can offer better downstream performance.
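The modular specialization described in the abstract stores domain knowledge in small adapter layers rather than in the full PLM. As a rough illustration only (not the paper's exact architecture), the sketch below shows the standard bottleneck-adapter pattern such methods build on: down-project the hidden state, apply a nonlinearity, up-project, and add a residual connection. The hidden size, bottleneck size, and random initialization here are illustrative assumptions.

```python
import numpy as np

def adapter(h, W_down, W_up):
    """Bottleneck adapter: down-project, ReLU, up-project, residual add."""
    z = np.maximum(0.0, h @ W_down)  # nonlinearity in the bottleneck
    return h + z @ W_up              # residual connection keeps the PLM's signal

# Hypothetical dimensions: hidden size 768, bottleneck 48 (reduction factor 16)
rng = np.random.default_rng(0)
hidden, bottleneck = 768, 48
W_down = rng.normal(0.0, 0.02, (hidden, bottleneck))
W_up = rng.normal(0.0, 0.02, (bottleneck, hidden))

h = rng.normal(size=(4, hidden))     # a batch of 4 token representations
out = adapter(h, W_down, W_up)
assert out.shape == h.shape
```

Because only the two projection matrices are trained per domain (here 2 × 768 × 48 ≈ 74K parameters versus hundreds of millions in the full PLM), one can keep a separate adapter per domain and swap them in and out of a shared backbone, which is what makes the approach attractive for multi-domain specialization.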




