Neural Passage Retrieval with Improved Negative Contrast

10/23/2020
by Jing Lu, et al.

In this paper we explore the effects of negative sampling in dual encoder models used to retrieve passages for automatic question answering. We study four negative sampling strategies that complement the straightforward random sampling of negatives typically used to train dual encoder models. Three of the four strategies are retrieval-based and one is heuristic; the retrieval-based strategies exploit the semantic similarity and the lexical overlap between questions and passages. We train the dual encoder models in two stages, pre-training with synthetic data and fine-tuning with domain-specific data, and we apply negative sampling in both stages. The approach is evaluated on two passage retrieval tasks. Although no single sampling strategy works best across all tasks, our strategies clearly improve the contrast between the response and all the other passages. Furthermore, mixing the negatives from different strategies achieves performance on par with the best-performing strategy in every task. Our results establish a new state-of-the-art level of performance on two of the open-domain question answering datasets that we evaluated.
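
The contrastive setup described above can be made concrete with a small sketch: each question is scored against its gold passage plus K sampled negatives (here imagined as a mix drawn from the different sampling strategies), and a softmax cross-entropy loss pushes the gold passage above the negatives. The function name, tensor shapes, and PyTorch usage below are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(q_emb, pos_emb, neg_emb):
    # q_emb:   (B, d)    question embeddings
    # pos_emb: (B, d)    embeddings of the gold ("response") passages
    # neg_emb: (B, K, d) embeddings of K sampled negatives per question,
    #          e.g. a mix drawn from random, semantic-similarity, and
    #          lexical-overlap sampling (hypothetical mix)
    pos_scores = (q_emb * pos_emb).sum(dim=-1, keepdim=True)   # (B, 1)
    neg_scores = torch.einsum("bd,bkd->bk", q_emb, neg_emb)    # (B, K)
    scores = torch.cat([pos_scores, neg_scores], dim=1)        # (B, 1 + K)
    labels = torch.zeros(scores.size(0), dtype=torch.long, device=scores.device)
    return F.cross_entropy(scores, labels)  # gold passage sits at index 0

# Toy usage: random tensors stand in for the two encoders' outputs.
B, K, d = 4, 7, 128
loss = contrastive_loss(torch.randn(B, d, requires_grad=True),
                        torch.randn(B, d, requires_grad=True),
                        torch.randn(B, K, d, requires_grad=True))
loss.backward()  # in real training this would update both encoders
```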


Related research

10/16/2020
RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering
In open-domain question answering, dense passage retrieval has become a ...

12/04/2019
An Exploration of Data Augmentation and Sampling Techniques for Domain-Agnostic Question Answering
To produce a domain-agnostic question answering model for the Machine Re...

04/14/2022
Exploring Dual Encoder Architectures for Question Answering
Dual encoders have been used for question-answering (QA) and information...

10/07/2020
Cross-Thought for Sentence Encoder Pre-training
In this paper, we propose Cross-Thought, a novel approach to pre-trainin...

01/02/2021
End-to-End Training of Neural Retrievers for Open-Domain Question Answering
Recent work on training neural retrievers for open-domain question answe...

11/09/2022
Distribution-Aligned Fine-Tuning for Efficient Neural Retrieval
Dual-encoder-based neural retrieval models achieve appreciable performan...

10/07/2021
Adversarial Retriever-Ranker for dense text retrieval
Current dense text retrieval models face two typical challenges. First, ...
