Confidence-Calibrated Ensemble Dense Phrase Retrieval

06/28/2023
by William Yang, et al.

In this paper, we consider the extent to which the transformer-based Dense Passage Retrieval (DPR) algorithm, developed by Karpukhin et al. (2020), can be optimized without further pre-training. Our method rests on two insights: we apply the DPR context encoder at various phrase lengths (e.g., one-sentence versus five-sentence segments), and we take a confidence-calibrated ensemble prediction over all of these segmentations. This somewhat exhaustive approach achieves state-of-the-art results on benchmark datasets such as Google NQ and SQuAD. We also apply our method to domain-specific datasets, and the results suggest that different granularities are optimal for different domains.
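The abstract describes the method only at a high level. The sketch below is a minimal illustration of the idea using the publicly released Hugging Face DPR checkpoints: it scores a document at two segmentation widths, calibrates each granularity's scores with a temperature-scaled softmax, and averages the calibrated distributions. The non-overlapping segmentation, the specific temperature values, and the uniform averaging rule are our own illustrative assumptions, not details confirmed by the paper.

```python
# Illustrative sketch of multi-granularity DPR scoring with a
# confidence-calibrated ensemble. The checkpoint names are the public
# Hugging Face DPR models; the segmentation scheme, temperatures, and
# averaging rule below are assumptions for illustration only.
import torch
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

ctx_tok = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
ctx_enc = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
q_tok = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_enc = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")

def segment(sentences, width):
    """Group sentences into non-overlapping passages of `width` sentences."""
    return [" ".join(sentences[i:i + width]) for i in range(0, len(sentences), width)]

@torch.no_grad()
def embed_contexts(passages):
    inputs = ctx_tok(passages, padding=True, truncation=True, return_tensors="pt")
    return ctx_enc(**inputs).pooler_output  # shape: (num_passages, 768)

@torch.no_grad()
def embed_question(question):
    inputs = q_tok(question, return_tensors="pt")
    return q_enc(**inputs).pooler_output  # shape: (1, 768)

def calibrated_scores(q_emb, ctx_embs, temperature):
    """Dot-product scores, softmax-normalized at a per-granularity temperature."""
    logits = (q_emb @ ctx_embs.T).squeeze(0)
    return torch.softmax(logits / temperature, dim=-1)

sentences = [
    "Dense retrieval maps queries and passages into a shared vector space.",
    "DPR trains separate BERT encoders for questions and for contexts.",
    "Relevance is scored by the dot product of the two embeddings.",
    "Sparse methods such as BM25 rely on exact lexical overlap instead.",
    "Longer passages add context but can dilute the relevant phrase.",
    "Shorter passages are precise but may lack disambiguating context.",
]
question = "How does DPR score the relevance of a passage to a question?"
granularities = {1: 1.0, 5: 2.0}  # hypothetical per-width softmax temperatures

q_emb = embed_question(question)
# Spread each passage's calibrated probability over its member sentences,
# then average across granularities to form the ensemble score.
sentence_scores = torch.zeros(len(sentences))
for width, temp in granularities.items():
    passages = segment(sentences, width)
    probs = calibrated_scores(q_emb, embed_contexts(passages), temp)
    for pi, p in enumerate(probs):
        for si in range(pi * width, min((pi + 1) * width, len(sentences))):
            sentence_scores[si] += p / len(granularities)

best = int(sentence_scores.argmax())
print(f"Top-scoring sentence: {sentences[best]!r}")
```

In practice, the per-granularity temperatures would presumably be fit on held-out data (e.g., by minimizing negative log-likelihood), since raw dot-product scores produced at different segment lengths are not directly comparable; the fixed values above are placeholders.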

