A Replication Study of Dense Passage Retriever

by Xueguang Ma, et al.

Text retrieval using learned dense representations has recently emerged as a promising alternative to "traditional" text retrieval using sparse bag-of-words representations. One recent work that has garnered much attention is the dense passage retriever (DPR) technique proposed by Karpukhin et al. (2020) for end-to-end open-domain question answering. We present a replication study of this work, starting with model checkpoints provided by the authors, but otherwise from an independent implementation in our group's Pyserini IR toolkit and PyGaggle neural text ranking library. Although our experimental results largely verify the claims of the original paper, we arrived at two important additional findings that contribute to a better understanding of DPR: First, it appears that the original authors under-report the effectiveness of the BM25 baseline and hence also dense–sparse hybrid retrieval results. Second, by incorporating evidence from the retriever and an improved answer span scoring technique, we are able to improve end-to-end question answering effectiveness using exactly the same models as in the original work.
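To make the idea of dense–sparse hybrid retrieval concrete, the following is a minimal sketch of one common way to fuse two ranked lists: a weighted sum of the dense score and the sparse (e.g., BM25) score for each passage. The abstract does not specify the fusion formula, so everything here is an assumption for illustration: the function name `hybrid_fuse`, the weight `alpha`, and the fallback of using a list's minimum score for passages missing from that list are hypothetical and should not be read as the authors' implementation or as a Pyserini API.

```python
from typing import Dict, List, Tuple


def hybrid_fuse(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    alpha: float = 1.0,
    k: int = 10,
) -> List[Tuple[str, float]]:
    """Fuse dense and sparse retrieval scores by weighted sum.

    Assumption (not from the source): fused = dense + alpha * sparse.
    A passage missing from one list is given that list's minimum score,
    which is one simple fallback among several possible choices.
    """
    if not dense and not sparse:
        return []

    # Fallback scores for passages retrieved by only one of the two systems.
    min_dense = min(dense.values()) if dense else 0.0
    min_sparse = min(sparse.values()) if sparse else 0.0

    fused = {}
    for doc_id in set(dense) | set(sparse):
        d = dense.get(doc_id, min_dense)
        s = sparse.get(doc_id, min_sparse)
        fused[doc_id] = d + alpha * s

    # Sort by descending fused score; break ties by passage id for determinism.
    ranked = sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


# Toy scores, invented purely for illustration.
dense_hits = {"p1": 80.0, "p2": 78.0}
sparse_hits = {"p2": 12.0, "p3": 15.0}
ranking = hybrid_fuse(dense_hits, sparse_hits, alpha=1.0, k=3)
```

In a sketch like this, the sparse baseline's scores feed directly into the fused score, so any change to the BM25 run changes the hybrid ranking as well. That dependency is why the abstract ties the hybrid results to the BM25 baseline.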




