Real-time Inference in Multi-sentence Tasks with Deep Pretrained Transformers

04/22/2019
by Samuel Humeau et al.

The use of deep pretrained bidirectional transformers has led to remarkable progress in learning multi-sentence representations for downstream language understanding tasks (Devlin et al., 2018). For tasks that make pairwise comparisons, e.g., matching a given context with a corresponding response, two approaches have permeated the literature. A Cross-encoder performs full self-attention over the concatenated pair; a Bi-encoder encodes each sequence separately, and the final score is a function of the two resulting representations. While Cross-encoders nearly always outperform Bi-encoders on various tasks, both in our work and others' (Urbanek et al., 2019), they are orders of magnitude slower, which hampers their ability to perform real-time inference. In this work, we develop a new architecture, the Poly-encoder, designed to approach the performance of the Cross-encoder while maintaining reasonable computation time. Additionally, we explore two pretraining schemes with different datasets to determine how they affect performance on our chosen dialogue tasks, ConvAI2 and DSTC7 Track 1. We show that our models achieve state-of-the-art results on both tasks; that the Poly-encoder is a suitable replacement for Bi-encoders and Cross-encoders; and that even better results can be obtained by pretraining on a large dialogue dataset.
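To make the distinction concrete, below is a minimal PyTorch sketch (not the authors' released code) of the three scoring schemes described above. A small, randomly initialized nn.TransformerEncoder stands in for a deep pretrained transformer, the inputs are pretend token embeddings rather than real tokenized text, and details such as the number of poly codes M and first-token pooling are illustrative assumptions.

```python
# Sketch of Bi-, Cross-, and Poly-encoder scoring. A toy TransformerEncoder
# stands in for a deep pretrained transformer; all weights are random here.
import torch
import torch.nn as nn

D, HEADS, LAYERS, M = 64, 4, 2, 4  # hidden size, attn heads, layers, poly codes (assumed values)

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D, nhead=HEADS, batch_first=True),
    num_layers=LAYERS,
)

def encode(tokens):             # tokens: (batch, seq, D) pretend-embedded inputs
    return encoder(tokens)      # contextualized states: (batch, seq, D)

def bi_encoder_score(ctx, cand):
    """Encode context and candidate separately; score the pair with a dot
    product of their first-token ("[CLS]"-style) representations."""
    h_ctx = encode(ctx)[:, 0]           # (batch, D)
    h_cand = encode(cand)[:, 0]         # (batch, D)
    return (h_ctx * h_cand).sum(-1)     # (batch,)

def cross_encoder_score(ctx, cand, w):
    """Full self-attention over the concatenated pair, then a linear score.
    Every candidate needs its own full forward pass, hence the slowness."""
    h = encode(torch.cat([ctx, cand], dim=1))[:, 0]  # (batch, D)
    return h @ w                                     # (batch,)

def poly_encoder_score(ctx, cand, codes):
    """Poly-encoder sketch: M learned codes attend over the context states,
    then the candidate vector attends over the resulting M context vectors."""
    h_ctx = encode(ctx)                               # (batch, seq, D)
    h_cand = encode(cand)[:, 0]                       # (batch, D)
    att = torch.softmax(codes @ h_ctx.transpose(1, 2), dim=-1)  # (batch, M, seq)
    y = att @ h_ctx                                   # (batch, M, D)
    w2 = torch.softmax((y @ h_cand.unsqueeze(-1)).squeeze(-1), dim=-1)  # (batch, M)
    ctx_emb = (w2.unsqueeze(-1) * y).sum(1)           # (batch, D)
    return (ctx_emb * h_cand).sum(-1)                 # (batch,)

# Toy usage with pretend-embedded inputs (batch of 2).
ctx, cand = torch.randn(2, 10, D), torch.randn(2, 6, D)
codes, w = torch.randn(M, D), torch.randn(D)   # learned in practice, random here
print(bi_encoder_score(ctx, cand).shape,
      cross_encoder_score(ctx, cand, w).shape,
      poly_encoder_score(ctx, cand, codes).shape)  # each: torch.Size([2])
```

Note the asymmetry in inference cost: the Bi- and Poly-encoders can precompute and cache candidate encodings, so scoring a new context against a large candidate pool is cheap, whereas the Cross-encoder must re-run the full transformer for every context-candidate pair.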

Related research

04/23/2020 | Distilling Knowledge for Fast Retrieval-based Chat-bots
Response retrieval is a subset of neural ranking in which a model select...

01/11/2019 | Grammatical Analysis of Pretrained Sentence Encoders with Acceptability Judgments
Recent pretrained sentence encoders achieve state of the art results on ...

03/12/2023 | MWE as WSD: Solving Multiword Expression Identification with Word Sense Disambiguation
Recent work in word sense disambiguation (WSD) utilizes encodings of the...

09/27/2021 | Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations
In NLP, a large volume of tasks involve pairwise comparison between two ...

10/16/2020 | Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks
There are two approaches for pairwise sentence scoring: Cross-encoders, ...

11/06/2019 | Enriching Conversation Context in Retrieval-based Chatbots
Work on retrieval-based chatbots, like most sequence pair matching tasks...

11/09/2019 | ConveRT: Efficient and Accurate Conversational Representations from Transformers
General-purpose pretrained sentence encoders such as BERT are not ideal ...
