Question Answering Infused Pre-training of General-Purpose Contextualized Representations

by Robin Jia, et al.

This paper proposes a pre-training objective based on question answering (QA) for learning general-purpose contextual representations, motivated by the intuition that the representation of a phrase in a passage should encode all questions that the phrase can answer in context. We accomplish this goal by training a bi-encoder QA model, which independently encodes passages and questions, to match the predictions of a more accurate cross-encoder model on 80 million synthesized QA pairs. By encoding QA-relevant information, the bi-encoder's token-level representations are useful for non-QA downstream tasks without extensive (or in some cases, any) fine-tuning. We show large improvements over both RoBERTa-large and previous state-of-the-art results on zero-shot and few-shot paraphrase detection on four datasets, few-shot named entity recognition on two datasets, and zero-shot sentiment analysis on three datasets.
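The distillation setup described above can be sketched in a few lines: a bi-encoder encodes the passage and question independently, scores each passage token as a candidate answer with a dot product, and is trained to match the soft answer distribution of a cross-encoder teacher. The code below is a minimal NumPy illustration under assumed toy dimensions; the encoders are stand-in random matrices and all names are hypothetical, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

# Toy sizes (illustrative only).
d = 16           # embedding dimension
passage_len = 10 # number of passage tokens

# Bi-encoder: passage tokens and the question are encoded *independently*,
# so passage representations can be computed once and reused.
passage_tokens = rng.normal(size=(passage_len, d))  # stand-in passage encoder output
question_vec = rng.normal(size=d)                   # stand-in question encoder output

# Student: score each passage token as a candidate answer position
# via a dot product between the independent encodings.
student_probs = softmax(passage_tokens @ question_vec)

# Teacher: a cross-encoder would attend over question and passage jointly;
# here its soft answer distribution is simply simulated.
teacher_probs = softmax(rng.normal(size=passage_len))

# Distillation objective: KL(teacher || student) over answer positions.
kl_loss = float(np.sum(teacher_probs *
                       (np.log(teacher_probs) - np.log(student_probs))))
```

Minimizing this KL term pushes the bi-encoder's token-level answer distribution toward the more accurate cross-encoder's, which is how QA-relevant information ends up encoded in the reusable token representations.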


CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training

With the rise of large-scale pre-trained language models, open-domain qu...

Encoding Explanatory Knowledge for Zero-shot Science Question Answering

This paper describes N-XKT (Neural encoding based on eXplanatory Knowled...

Intermediate Training on Question Answering Datasets Improves Generative Data Augmentation

Manually annotating datasets requires domain experts to read through man...

Unsupervised Pre-training for Biomedical Question Answering

We explore the suitability of unsupervised representation learning metho...

Self-supervised Knowledge Triplet Learning for Zero-shot Question Answering

The aim of all Question Answering (QA) systems is to be able to generali...

A Russian Jeopardy! Data Set for Question-Answering Systems

Question answering (QA) is one of the most common NLP tasks that relates...

General-Purpose Question-Answering with Macaw

Despite the successes of pretrained language models, there are still few...