ConveRT: Efficient and Accurate Conversational Representations from Transformers

11/09/2019
by Matthew Henderson, et al.

General-purpose pretrained sentence encoders such as BERT are not ideal for real-world conversational AI applications; they are computationally heavy, slow, and expensive to train. We propose ConveRT (Conversational Representations from Transformers), a faster, more compact dual sentence encoder specifically optimized for dialog tasks. We pretrain using a retrieval-based response selection task, effectively leveraging quantization and subword-level parameterization in the dual encoder to build a lightweight memory- and energy-efficient model. In our evaluation, we show that ConveRT achieves state-of-the-art performance across widely established response selection tasks. We also demonstrate that the use of extended dialog history as context yields further performance gains. Finally, we show that pretrained representations from the proposed encoder can be transferred to the intent classification task, yielding strong results across three diverse data sets. ConveRT trains substantially faster than standard sentence encoders or previous state-of-the-art dual encoders. With its reduced size and superior performance, we believe this model promises wider portability and scalability for conversational AI applications.
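For readers unfamiliar with the dual-encoder, response-selection setup the abstract refers to, the sketch below illustrates the general pattern: a context encoder and a response encoder each produce a fixed-size vector, candidate responses are ranked by dot-product similarity, and training uses the other responses in a batch as negatives. This is a minimal, hypothetical illustration, not the authors' implementation; the layer sizes, pooling scheme, and class names are assumptions, and ConveRT itself uses subword embeddings, transformer layers, and quantization on top of this basic structure.

```python
# Minimal sketch of a dual-encoder response-selection model (hypothetical;
# dimensions and architecture are placeholders, not ConveRT's actual design).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SentenceEncoder(nn.Module):
    """Embeds a (sub)word-ID sequence into a single unit-length vector."""

    def __init__(self, vocab_size: int, dim: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim, padding_idx=0)
        self.proj = nn.Linear(dim, dim)

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        mask = (ids != 0).unsqueeze(-1).float()           # ignore padding tokens
        summed = (self.embed(ids) * mask).sum(dim=1)      # mean-pool over tokens
        pooled = summed / mask.sum(dim=1).clamp(min=1.0)
        return F.normalize(self.proj(pooled), dim=-1)


class DualEncoder(nn.Module):
    """Scores (context, response) pairs by the dot product of their encodings."""

    def __init__(self, vocab_size: int, dim: int = 512):
        super().__init__()
        self.context_enc = SentenceEncoder(vocab_size, dim)
        self.response_enc = SentenceEncoder(vocab_size, dim)

    def forward(self, context_ids: torch.Tensor, response_ids: torch.Tensor) -> torch.Tensor:
        c = self.context_enc(context_ids)                 # (batch, dim)
        r = self.response_enc(response_ids)               # (batch, dim)
        return c @ r.T                                    # (batch, batch) similarity matrix


def in_batch_loss(scores: torch.Tensor) -> torch.Tensor:
    """Each context's true response sits on the diagonal; other rows act as negatives."""
    targets = torch.arange(scores.size(0))
    return F.cross_entropy(scores, targets)


# Toy usage: a batch of 4 context/response pairs over a vocabulary of 1000 subword IDs.
model = DualEncoder(vocab_size=1000)
ctx = torch.randint(1, 1000, (4, 12))
rsp = torch.randint(1, 1000, (4, 12))
loss = in_batch_loss(model(ctx, rsp))
loss.backward()
```

Because the response encoder is independent of the context, response vectors can be precomputed and cached, which is what makes retrieval-based selection fast at inference time; the lightweight, quantized encoder the abstract describes targets exactly this deployment pattern.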


