End-to-End Open-Domain Question Answering with BERTserini

by Wei Yang, et al.

We demonstrate an end-to-end question answering system that integrates BERT with the open-source Anserini information retrieval toolkit. In contrast to most question answering and reading comprehension models today, which operate over small amounts of input text, our system integrates best practices from IR with a BERT-based reader to identify answers from a large corpus of Wikipedia articles in an end-to-end fashion. We report large improvements over previous results on a standard benchmark test collection, showing that fine-tuning pretrained BERT with SQuAD is sufficient to achieve high accuracy in identifying answer spans.
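The retrieve-then-read flow described in the abstract can be illustrated with a toy sketch. Everything in it is a stand-in and an assumption for illustration only: `retrieve` uses a simple term-weighting score in place of the Anserini toolkit, `toy_reader` uses a regular expression in place of a fine-tuned BERT reader, and the three-sentence corpus replaces Wikipedia. It is not the authors' implementation and does not reflect how their system scores or combines results.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, corpus, k=2):
    """Hypothetical retriever: rank passages by a TF-IDF-like overlap score.

    A stand-in for a real IR toolkit; returns up to k (passage, score) pairs.
    """
    q_terms = set(tokenize(question))
    n = len(corpus)
    df = Counter()  # document frequency of each term
    for passage in corpus:
        df.update(set(tokenize(passage)))
    scored = []
    for idx, passage in enumerate(corpus):
        tf = Counter(tokenize(passage))
        score = sum(tf[t] * math.log(1 + n / df[t]) for t in q_terms if t in tf)
        scored.append((score, idx))
    # Highest score first; ties broken by corpus order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(corpus[i], s) for s, i in scored[:k] if s > 0]


def toy_reader(question, passage):
    """Hypothetical reader: return the first four-digit number, or None.

    A stand-in for a span-extraction model; it ignores the question.
    """
    match = re.search(r"\b\d{4}\b", passage)
    return match.group(0) if match else None


def answer(question, corpus, reader=toy_reader, k=2):
    """Retrieve k passages, then return the first span the reader finds."""
    for passage, _score in retrieve(question, corpus, k):
        span = reader(question, passage)
        if span is not None:
            return span
    return None


corpus = [
    "Ottawa is the capital of Canada.",
    "The Eiffel Tower was completed in 1889 in Paris.",
    "BERT is a pretrained language model released in 2018.",
]
print(answer("When was the Eiffel Tower completed?", corpus))
```

The point of the sketch is the division of labor: the retriever narrows a large corpus to a few candidate passages, and the reader only ever sees those candidates.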




Neural Arabic Question Answering

This paper tackles the problem of open domain factual Arabic question an...

CFO: A Framework for Building Production NLP Systems

This paper introduces a novel orchestration framework, called CFO (COMPU...

Fine-tune the Entire RAG Architecture (including DPR retriever) for Question-Answering

In this paper, we illustrate how to fine-tune the entire Retrieval Augme...

Delaying Interaction Layers in Transformer-based Encoders for Efficient Open Domain Question Answering

Open Domain Question Answering (ODQA) on a large-scale corpus of documen...

A Study of BERT for Non-Factoid Question-Answering under Passage Length Constraints

We study the use of BERT for non-factoid question-answering, focusing on...

A Replication Study of Dense Passage Retriever

Text retrieval using learned dense representations has recently emerged ...

Data Augmentation for BERT Fine-Tuning in Open-Domain Question Answering

Recently, a simple combination of passage retrieval using off-the-shelf ...