ConvFiT: Conversational Fine-Tuning of Pretrained Language Models

09/21/2021
by Ivan Vulić, et al.

Transformer-based language models (LMs) pretrained on large text collections have been shown to store a wealth of semantic knowledge. However, 1) they are not effective as sentence encoders when used off-the-shelf, and 2) they thus typically lag behind conversationally pretrained encoders (e.g., pretrained via response selection) on conversational tasks such as intent detection (ID). In this work, we propose ConvFiT, a simple and efficient two-stage procedure which turns any pretrained LM into a universal conversational encoder (after Stage 1 ConvFiT-ing) and a task-specialised sentence encoder (after Stage 2). We demonstrate that 1) full-blown conversational pretraining is not required, and that LMs can be quickly transformed into effective conversational encoders with much smaller amounts of unannotated data; and 2) pretrained LMs can be fine-tuned into task-specialised sentence encoders, optimised for the fine-grained semantics of a particular task. Consequently, such specialised sentence encoders allow ID to be treated as a simple semantic similarity task based on interpretable nearest-neighbour retrieval. We validate the robustness and versatility of the ConvFiT framework with such similarity-based inference on standard ID evaluation sets: ConvFiT-ed LMs achieve state-of-the-art ID performance across the board, with particular gains in the most challenging, few-shot setups.
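
As a concrete illustration of the similarity-based inference described in the abstract, intent detection reduces to embedding the labelled training utterances with the ConvFiT-ed encoder and assigning each test utterance the majority intent among its nearest neighbours. The sketch below is a minimal reconstruction under stated assumptions, not the paper's exact implementation: the function name, the choice of k, and the assumption that embeddings are pre-computed and L2-normalised are illustrative.

```python
import numpy as np

def knn_intent(train_vecs, train_labels, query_vec, k=5):
    """Label a query utterance with the majority intent among its k nearest
    training examples under cosine similarity.

    train_vecs   : (N, d) array of L2-normalised training-utterance embeddings
    train_labels : length-N list of intent labels
    query_vec    : (d,) L2-normalised embedding of the query utterance
    """
    sims = train_vecs @ query_vec            # cosine similarity; inputs are unit-norm
    top_k = np.argsort(-sims)[:k]            # indices of the k most similar examples
    votes = [train_labels[i] for i in top_k]
    return max(set(votes), key=votes.count)  # majority vote among the neighbours
```

Because the few-shot ID setups evaluated in the paper involve small labelled sets, exact search as above is sufficient; an approximate nearest-neighbour index would be the natural substitute at larger scale.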

Related research

09/28/2021 · Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders
Pretrained Masked Language Models (MLMs) have revolutionised NLP in rece...

11/09/2019 · ConveRT: Efficient and Accurate Conversational Representations from Transformers
General-purpose pretrained sentence encoders such as BERT are not ideal ...

06/08/2021 · Are Pretrained Transformers Robust in Intent Classification? A Missing Ingredient in Evaluation of Out-of-Scope Intent Detection
Pretrained Transformer-based models were reported to be robust in intent...

03/10/2020 · Efficient Intent Detection with Dual Sentence Encoders
Building conversational systems in new domains and with added functional...

09/01/2019 · Higher-order Comparisons of Sentence Encoder Representations
Representational Similarity Analysis (RSA) is a technique developed by n...
04/04/2020 · Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models
This paper presents an empirical study of conversational question reform...
