RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering

10/16/2020
by Yingqi Qu, Yuchen Ding, et al.

In open-domain question answering, dense passage retrieval has become a new paradigm for retrieving relevant passages for answer finding. Typically, a dual-encoder architecture is adopted to learn dense representations of questions and passages for matching. However, it is difficult to train an effective dual-encoder because of several challenges: the discrepancy between training and inference, the existence of unlabeled positives, and limited training data. To address these challenges, we propose an optimized training approach, called RocketQA, to improve dense passage retrieval. We make three major technical contributions in RocketQA, namely cross-batch negatives, denoised negative sampling, and data augmentation. Extensive experiments show that RocketQA significantly outperforms previous state-of-the-art models on both MSMARCO and Natural Questions. In addition, building upon RocketQA, we achieve first place on the leaderboard of the MSMARCO Passage Ranking Task.
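
The abstract names cross-batch negatives without giving details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes dot-product similarity between question and passage embeddings and a softmax negative log-likelihood loss; the function names (`nll_loss`, `batch_loss`), the toy 2-d embeddings, and the two "devices" are all hypothetical. Passages held by another device are simply appended as extra negatives, in contrast to using only the passages in the local batch.

```python
import math


def dot(u, v):
    """Dot-product similarity between two embedding vectors (assumed scoring function)."""
    return sum(a * b for a, b in zip(u, v))


def nll_loss(question, positive, negatives):
    """Negative log-likelihood of the positive passage under a softmax over scores."""
    scores = [dot(question, positive)] + [dot(question, p) for p in negatives]
    m = max(scores)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[0]


def batch_loss(questions, passages, extra_negatives=()):
    """Mean loss where passages[i] is the positive for questions[i].

    The other passages in the batch act as negatives; extra_negatives
    stands in for passages gathered from other (hypothetical) devices.
    """
    total = 0.0
    for i, q in enumerate(questions):
        negatives = [p for j, p in enumerate(passages) if j != i]
        negatives += list(extra_negatives)
        total += nll_loss(q, passages[i], negatives)
    return total / len(questions)


# Toy embeddings: device A holds two question-passage pairs,
# device B contributes two more passages.
device_a_q = [[1.0, 0.0], [0.0, 1.0]]
device_a_p = [[0.9, 0.1], [0.1, 0.9]]
device_b_p = [[0.7, 0.3], [0.3, 0.7]]

in_batch = batch_loss(device_a_q, device_a_p)
cross_batch = batch_loss(device_a_q, device_a_p, extra_negatives=device_b_p)
print(f"in-batch loss:    {in_batch:.4f}")
print(f"cross-batch loss: {cross_batch:.4f}")
```

With more negatives per question, the softmax is normalized over a larger candidate set, which is closer to the inference setting where a question is matched against a large passage collection.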

research
08/13/2021

PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval

Recently, dense passage retrieval has become a mainstream approach to fi...
research
10/23/2020

Neural Passage Retrieval with Improved Negative Contrast

In this paper we explore the effects of negative sampling in dual encode...
research
05/04/2022

Analysing the Robustness of Dual Encoders for Dense Retrieval Against Misspellings

Dense retrieval is becoming one of the standard approaches for document ...
research
10/07/2021

Adversarial Retriever-Ranker for dense text retrieval

Current dense text retrieval models face two typical challenges. First, ...
research
01/01/2021

UnitedQA: A Hybrid Approach for Open Domain Question Answering

To date, most of recent work under the retrieval-reader framework for op...
research
12/14/2021

You Only Need One Model for Open-domain Question Answering

Recent works for Open-domain Question Answering refer to an external kno...
research
09/29/2020

Neural Retrieval for Question Answering with Cross-Attention Supervised Data Augmentation

Neural models that independently project questions and answers into a sh...
