Pseudo Relevance Feedback with Deep Language Models and Dense Retrievers: Successes and Pitfalls

08/25/2021
by   Hang Li, et al.

Pseudo Relevance Feedback (PRF) is known to improve the effectiveness of bag-of-words retrievers. At the same time, deep language models have been shown to outperform traditional bag-of-words rerankers. However, it is unclear how to integrate PRF directly with emergent deep language models. In this article, we address this gap by investigating methods for integrating PRF signals into rerankers and dense retrievers based on deep language models. We consider text-based and vector-based PRF approaches, and investigate different ways of combining and scoring relevance signals. We conducted an extensive empirical evaluation across four datasets and two task settings (retrieval and ranking). The text-based PRF results show that PRF had a mixed effect on deep rerankers across datasets. The best effectiveness was achieved when (i) directly concatenating each PRF passage with the query, searching with the new set of queries, and then aggregating the scores, and (ii) using Borda to aggregate scores from PRF runs. The vector-based PRF results show that PRF enhanced the effectiveness of deep rerankers and dense retrievers over several evaluation metrics. Higher effectiveness was achieved when (i) the query retains either the majority of the weight or the same weight as the feedback passages within the PRF mechanism, and (ii) a shallower PRF signal (i.e., a smaller number of top-ranked passages) was employed, rather than a deeper signal. Our vector-based PRF method is computationally efficient; it therefore offers a general PRF method that others can use with deep rerankers and dense retrievers.
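
The two mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the code below is an assumption rather than the authors' implementation: `borda_fuse` uses a standard Borda count over the ranked lists produced by the per-passage queries, and `vector_prf` uses a Rocchio-style weighted sum of the query embedding and the centroid of the top-`k` passage embeddings. The function names and the parameters `alpha`, `beta`, and `k` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def borda_fuse(rankings: Sequence[Sequence[str]]) -> List[str]:
    """Fuse ranked lists with a Borda count (assumed variant).

    Each list awards len(list) points to its top item, one point less to
    each following item, and nothing to items it does not contain.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        n = len(ranking)
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] += n - rank
    # Highest total first; ties are broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def vector_prf(
    query_vec: Sequence[float],
    passage_vecs: Sequence[Sequence[float]],
    k: int = 3,
    alpha: float = 1.0,
    beta: float = 1.0,
) -> List[float]:
    """Rocchio-style query update in embedding space (assumed formulation).

    new_query = alpha * query + beta * mean(top-k passage vectors)

    Choosing alpha >= beta keeps the query at the same or a larger weight
    than the feedback signal, and a small k gives a shallow PRF signal.
    """
    top = list(passage_vecs[:k])
    if not top:
        return list(query_vec)
    dim = len(query_vec)
    centroid = [sum(p[i] for p in top) / len(top) for i in range(dim)]
    return [alpha * q + beta * c for q, c in zip(query_vec, centroid)]


# Hypothetical inputs: three runs, one per PRF passage concatenated to the query.
runs = [["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]]
fused = borda_fuse(runs)

# Hypothetical 2-d embeddings; only the first k=2 passages are used as feedback.
new_query = vector_prf(
    [1.0, 0.0], [[0.0, 1.0], [0.0, 3.0], [9.0, 9.0]], k=2, alpha=0.5, beta=0.5
)
```

In this sketch the updated query vector would simply be passed back to the dense retriever for a second search, which is why a vector-based update of this kind adds little cost beyond one extra retrieval pass.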

