Towards Consistency Filtering-Free Unsupervised Learning for Dense Retrieval

08/05/2023
by Haoxiang Shi, et al.

Domain transfer is a prevalent challenge in modern neural Information Retrieval (IR). To overcome this problem, previous research has utilized domain-specific manual annotations and synthetic data produced by consistency filtering to finetune a general ranker and produce a domain-specific ranker. However, training such consistency filters is computationally expensive, which significantly reduces model efficiency. In addition, consistency filtering often struggles to identify retrieval intentions and recognize query and corpus distributions in a target domain. In this study, we evaluate a more efficient solution: replacing the consistency filter with direct pseudo-labeling, pseudo-relevance feedback, or unsupervised keyword generation methods to achieve consistency filtering-free unsupervised dense retrieval. Our extensive experimental evaluations demonstrate that, on average, TextRank-based pseudo-relevance feedback outperforms the other methods. Furthermore, we analyze the training and inference efficiency of the proposed paradigm. The results indicate that filtering-free unsupervised learning can continuously improve training and inference efficiency while maintaining retrieval performance. In some cases, it can even improve performance on particular datasets.
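To make the keyword-generation idea concrete, the following is a minimal sketch of TextRank-style keyword extraction used to build a pseudo query from an unlabeled document. It is not the authors' implementation: the abstract does not describe their pipeline, so the function names (`tokenize`, `textrank_keywords`, `pseudo_query`), the stopword list, the co-occurrence window, and the damping factor are all assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
             "for", "on", "with", "by", "that", "this", "it", "as", "be", "or"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def textrank_keywords(text, top_k=3, window=2, damping=0.85, iterations=50):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    tokens = tokenize(text)

    # Connect each word to the distinct words that follow it within `window`.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1: i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    if not neighbors:
        return []

    # Fixed number of power-iteration steps of the PageRank update.
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def pseudo_query(document, top_k=3):
    """Join the top-ranked keywords into a pseudo query for the document."""
    return " ".join(textrank_keywords(document, top_k=top_k))
```

Under these assumptions, each (pseudo query, document) pair could serve as a training example for a dense retriever without passing through a separately trained consistency filter, which is the kind of saving the abstract refers to.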


research
08/20/2023

Offline Pseudo Relevance Feedback for Efficient and Effective Single-pass Dense Retrieval

Dense retrieval has made significant advancements in information retriev...
research
12/13/2022

Domain Adaptation for Dense Retrieval through Self-Supervision by Pseudo-Relevance Labeling

Although neural information retrieval has witnessed great improvements, ...
research
12/17/2022

Unsupervised Dense Retrieval Deserves Better Positive Pairs: Scalable Augmentation with Query Extraction and Generation

Dense retrievers have made significant strides in obtaining state-of-the...
research
12/14/2021

GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval

Dense retrieval approaches can overcome the lexical gap and lead to sign...
research
07/01/2017

An Approach for Weakly-Supervised Deep Information Retrieval

Recent developments in neural information retrieval models have been pro...
research
07/06/2018

On the Equilibrium of Query Reformulation and Document Retrieval

In this paper, we study the interactions between query reformulation and...
research
03/31/2022

IITD-DBAI: Multi-Stage Retrieval with Pseudo-Relevance Feedback and Query Reformulation

Resolving the contextual dependency is one of the most challenging tasks...
