ConVEx: Data-Efficient and Few-Shot Slot Labeling

by Matthew Henderson, et al.

We propose ConVEx (Conversational Value Extractor), an efficient pretraining and fine-tuning neural approach for slot-labeling dialog tasks. Instead of relying on more general pretraining objectives from prior work (e.g., language modeling, response selection), ConVEx's pretraining objective, a novel pairwise cloze task using Reddit data, is well aligned with its intended usage on sequence labeling tasks. This enables learning domain-specific slot labelers by simply fine-tuning decoding layers of the pretrained general-purpose sequence labeling model, while the majority of the pretrained model's parameters are kept frozen. We report state-of-the-art performance of ConVEx across a range of diverse domains and data sets for dialog slot-labeling, with the largest gains in the most challenging, few-shot setups. We believe that ConVEx's reduced pretraining times (i.e., only 18 hours on 12 GPUs) and cost, along with its efficient fine-tuning and strong performance, promise wider portability and scalability for data-efficient sequence-labeling tasks in general.
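To make the pairwise cloze idea concrete, here is a minimal sketch of how one training example might be built from a pair of sentences that share a keyphrase. The abstract does not spell out the construction, so everything here is an assumption for illustration: the `BLANK` placeholder token, whitespace tokenization, the B/I/O tag scheme, and the helper names `find_span` and `make_pairwise_cloze_example` are all hypothetical and not taken from the authors' code.

```python
from typing import List, Optional, Tuple

BLANK = "BLANK"  # hypothetical placeholder token for the masked keyphrase


def find_span(tokens: List[str], phrase: List[str]) -> Optional[int]:
    """Return the start index of the first occurrence of phrase in tokens, or None."""
    n = len(phrase)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            return i
    return None


def make_pairwise_cloze_example(
    template_sentence: str, input_sentence: str, keyphrase: str
) -> Optional[Tuple[List[str], List[str], List[str]]]:
    """Build one (masked template, input tokens, B/I/O tags) triple.

    The keyphrase is blanked out in the template sentence, and the same
    keyphrase is tagged as the target span in the input sentence.
    Returns None if the keyphrase is empty or missing from either sentence.
    """
    template_tokens = template_sentence.split()  # assumption: whitespace tokens
    input_tokens = input_sentence.split()
    phrase_tokens = keyphrase.split()
    if not phrase_tokens:
        return None

    t_start = find_span(template_tokens, phrase_tokens)
    i_start = find_span(input_tokens, phrase_tokens)
    if t_start is None or i_start is None:
        return None

    n = len(phrase_tokens)
    # Replace the whole keyphrase in the template with a single placeholder.
    masked_template = template_tokens[:t_start] + [BLANK] + template_tokens[t_start + n:]

    # Tag the keyphrase span in the input sentence; everything else is "O".
    tags = ["O"] * len(input_tokens)
    tags[i_start] = "B"
    for j in range(i_start + 1, i_start + n):
        tags[j] = "I"
    return masked_template, input_tokens, tags


example = make_pairwise_cloze_example(
    "I love Table Mountain at sunset",
    "We hiked up Table Mountain yesterday",
    "Table Mountain",
)
```

Framing pretraining as span tagging over the input sentence is what lets the same decoding layers be reused for slot labeling, where the template role would be played by the slot description or the system's question.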




DS-TOD: Efficient Domain Specialization for Task Oriented Dialog

Recent work has shown that self-supervised dialog-specific pretraining o...

Improved and Efficient Conversational Slot Labeling through Question Answering

Transformer-based pretrained language models (PLMs) offer unmatched perf...

Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling

Contextualized word embeddings such as ELMo and BERT provide a foundatio...

On the Complementarity of Data Selection and Fine Tuning for Domain Adaptation

Domain adaptation of neural networks commonly relies on three training p...

ConveRT: Efficient and Accurate Conversational Representations from Transformers

General-purpose pretrained sentence encoders such as BERT are not ideal ...

Augmented Natural Language for Generative Sequence Labeling

We propose a generative framework for joint sequence labeling and senten...

Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations

We introduce Span-ConveRT, a light-weight model for dialog slot-filling ...