An Analysis of Fusion Functions for Hybrid Retrieval

10/21/2022
by Sebastian Bruch, et al.

We study hybrid search in text retrieval where lexical and semantic search are fused together with the intuition that the two are complementary in how they model relevance. In particular, we examine fusion by a convex combination (CC) of lexical and semantic scores, as well as the Reciprocal Rank Fusion (RRF) method, and identify their advantages and potential pitfalls. Contrary to existing studies, we find that RRF is sensitive to its parameters; that the learning of a CC fusion is generally agnostic to the choice of score normalization; that CC outperforms RRF in both in-domain and out-of-domain settings; and finally, that CC is sample efficient, requiring only a small set of training examples to tune its only parameter to a target domain.
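To make the two fusion methods named in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the code uses the usual textbook forms, namely a weighted sum `alpha * semantic + (1 - alpha) * lexical` over normalized scores for CC, and a sum of `1 / (eta + rank)` terms for RRF. The names `alpha` and `eta`, the min-max normalization, the commonly used constant 60, the zero score for a document missing from one list, and the toy scores are all assumptions made for this example.

```python
from typing import Dict

Scores = Dict[str, float]  # document id -> retrieval score


def min_max(scores: Scores) -> Scores:
    """Rescale scores to [0, 1]. Min-max is one assumed normalization choice."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def convex_combination(lexical: Scores, semantic: Scores, alpha: float) -> Scores:
    """Fuse normalized scores with a single weight alpha in [0, 1]."""
    lex, sem = min_max(lexical), min_max(semantic)
    docs = set(lex) | set(sem)
    # Assumption: a document absent from one list contributes 0 from that list.
    return {
        doc: alpha * sem.get(doc, 0.0) + (1.0 - alpha) * lex.get(doc, 0.0)
        for doc in docs
    }


def ranks(scores: Scores) -> Dict[str, int]:
    """Map each document to its 1-based rank, best score first."""
    ordered = sorted(scores, key=lambda doc: scores[doc], reverse=True)
    return {doc: position + 1 for position, doc in enumerate(ordered)}


def rrf(lexical: Scores, semantic: Scores, eta: float = 60.0) -> Scores:
    """Reciprocal Rank Fusion: sum 1 / (eta + rank) over the two rankings."""
    fused: Scores = {}
    for ranking in (ranks(lexical), ranks(semantic)):
        for doc, rank in ranking.items():
            fused[doc] = fused.get(doc, 0.0) + 1.0 / (eta + rank)
    return fused


# Toy scores (hypothetical): BM25-like lexical scores and cosine-like semantic scores.
lexical = {"d1": 12.0, "d2": 8.0, "d3": 4.0}
semantic = {"d1": 0.2, "d2": 0.9, "d3": 0.6}

cc = convex_combination(lexical, semantic, alpha=0.5)
rr = rrf(lexical, semantic)
```

In this sketch CC depends on the score values themselves and has one tunable weight, while RRF discards the scores and keeps only rank positions, with `eta` as its parameter. That difference is what the abstract's claims about parameter sensitivity and score normalization refer to.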

research
06/20/2022

A Dense Representation Framework for Lexical and Semantic Matching

Lexical and semantic matching capture different successful approaches to...
research
01/25/2022

Out-of-Domain Semantics to the Rescue! Zero-Shot Hybrid Retrieval Models

The pre-trained language model (eg, BERT) based deep retrieval models ac...
research
04/20/2022

Synthetic Target Domain Supervision for Open Retrieval QA

Neural passage retrieval is a new and promising approach in open retriev...
research
10/02/2020

Leveraging Semantic and Lexical Matching to Improve the Recall of Document Retrieval Systems: A Hybrid Approach

Search engines often follow a two-phase paradigm where in the first stag...
research
09/29/2020

Neural Retrieval for Question Answering with Cross-Attention Supervised Data Augmentation

Neural models that independently project questions and answers into a sh...
research
09/10/2023

Streamlined Data Fusion: Unleashing the Power of Linear Combination with Minimal Relevance Judgments

Linear combination is a potent data fusion method in information retriev...
research
11/30/2018

Learning From Weights: A Cost-Sensitive Approach For Ad Retrieval

Retrieval models such as CLSM is trained on click-through data which tre...
