
Span Selection Pre-training for Question Answering

by Michael Glass, et al.
Indian Institute of Science

BERT (Bidirectional Encoder Representations from Transformers) and related pre-trained Transformers have provided large gains across many language understanding tasks, achieving a new state-of-the-art (SOTA). BERT is pre-trained on two auxiliary tasks: Masked Language Model and Next Sentence Prediction. In this paper we introduce a new pre-training task inspired by reading comprehension, designed to avoid encoding general knowledge in the transformer network itself. We find significant and consistent improvements over both BERT-BASE and BERT-LARGE on multiple machine reading comprehension (MRC) and paraphrasing datasets. Our proposed model is supported by strong empirical evidence: it obtains SOTA results on Natural Questions, a new benchmark MRC dataset, outperforming BERT-LARGE by 3 F1 points on short answer prediction. We also establish a new SOTA on HotpotQA, improving answer prediction F1 by 4 points and supporting fact prediction F1 by 1 point. Moreover, we show that our pre-training approach is particularly effective when training data is limited, substantially improving the learning curve.
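To make the idea of a reading-comprehension-style pre-training task concrete, the sketch below constructs one cloze-style span-selection instance: an answer span is blanked out of a query sentence, and the model's target is the location of that span in a related passage. This is a minimal illustration under stated assumptions — the function name, the `[BLANK]` token, and character-offset targets are hypothetical choices for this example, not the authors' exact implementation.

```python
def make_span_selection_instance(query_sentence, answer_span, passage):
    """Build one cloze-style span-selection training instance.

    Blanks out `answer_span` in the query sentence and locates the same
    span in the passage. Returns (cloze_query, passage, (start, end)),
    where start/end are character offsets of the answer inside the
    passage, or None if either text lacks the span.
    """
    if answer_span not in query_sentence or answer_span not in passage:
        return None
    # Replace only the first occurrence of the answer with a blank token.
    cloze_query = query_sentence.replace(answer_span, "[BLANK]", 1)
    start = passage.index(answer_span)
    return cloze_query, passage, (start, start + len(answer_span))


query = "Transformers were introduced in 2017 by researchers at Google."
passage = ("The Transformer architecture was proposed in 2017 in the "
           "paper 'Attention Is All You Need'.")
instance = make_span_selection_instance(query, "2017", passage)
```

Training on such instances pushes the model to find the answer in the provided passage rather than memorize the fact in its parameters, which is the motivation the abstract gives for avoiding encoding general knowledge in the network itself.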



