PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval

08/13/2021
by Ruiyang Ren, et al.

Recently, dense passage retrieval has become a mainstream approach to finding relevant information in various natural language processing tasks. A number of studies have been devoted to improving the widely adopted dual-encoder architecture. However, most previous studies consider only the query-centric similarity relation when learning the dual-encoder retriever. To capture more comprehensive similarity relations, we propose a novel approach that leverages both query-centric and PAssage-centric sImilarity Relations (called PAIR) for dense passage retrieval. To implement our approach, we make three major technical contributions: introducing formal formulations of the two kinds of similarity relations, generating high-quality pseudo-labeled data via knowledge distillation, and designing an effective two-stage training procedure that incorporates a passage-centric similarity relation constraint. Extensive experiments show that our approach significantly outperforms previous state-of-the-art models on both the MSMARCO and Natural Questions datasets.
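
The abstract names the two similarity relations but does not give their formulas, so the following is only a minimal sketch of one way they could be expressed. It assumes dot-product similarity between dual-encoder embeddings and a softmax-style pairwise loss. The query-centric term asks that a query score its positive passage above a negative passage; the passage-centric term asks that the positive passage be closer to the query than to the negative passage. The function names, the weight `alpha`, and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def dot(u, v):
    """Dot-product similarity between two embedding vectors (an assumption)."""
    return sum(a * b for a, b in zip(u, v))


def pairwise_nll(anchor, positive, negative):
    """Negative log-probability that `positive` outranks `negative` for `anchor`.

    Equals -log(exp(s_pos) / (exp(s_pos) + exp(s_neg))), which simplifies to
    softplus(s_neg - s_pos); it is computed here in a numerically stable form.
    """
    diff = dot(anchor, negative) - dot(anchor, positive)
    return max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def combined_loss(q, p_pos, p_neg, alpha=0.1):
    """Weighted sum of the two relations; `alpha` is a hypothetical weight."""
    # Query-centric: s(q, p+) should exceed s(q, p-).
    query_centric = pairwise_nll(q, p_pos, p_neg)
    # Passage-centric: s(p+, q) should exceed s(p+, p-).
    passage_centric = pairwise_nll(p_pos, q, p_neg)
    return (1.0 - alpha) * query_centric + alpha * passage_centric


if __name__ == "__main__":
    # Toy embeddings: the query ranks p_pos above p_neg, yet p_pos is
    # closer to p_neg than to the query, so only the passage-centric
    # term is large.
    q, p_pos, p_neg = [1.0, 0.0], [1.0, 1.0], [0.0, 2.0]
    print(pairwise_nll(q, p_pos, p_neg))
    print(pairwise_nll(p_pos, q, p_neg))
    print(combined_loss(q, p_pos, p_neg))
```

In the toy case the query-centric term is already small while the passage-centric term is not, which illustrates why a constraint anchored on the passage can carry information that the query-anchored one misses.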


