Resources and Evaluations for Multi-Distribution Dense Information Retrieval

06/21/2023
by Soumya Chatterjee, et al.

We introduce and define the novel problem of multi-distribution information retrieval (IR), in which, given a query, systems must retrieve passages from multiple collections, each drawn from a different distribution. Some of these collections and distributions may not be available at training time. To evaluate methods for multi-distribution retrieval, we construct three benchmarks from existing single-distribution datasets: one based on question answering and two based on entity matching. We propose simple methods for this task that allocate the fixed retrieval budget (the top-k passages) strategically across domains, preventing the known domains from consuming most of the budget. We show that our methods improve Recall@100 by more than 3.8 points on average, and by up to 8.0 points, across the datasets, and that these improvements are consistent when fine-tuning different base retrieval models. Our benchmarks are made publicly available.
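The abstract does not spell out the allocation strategy itself, so the sketch below is only a minimal illustration of the idea, assuming the simplest possible policy: split the top-k budget uniformly across collections, retrieve each collection's quota by dense inner-product score, and merge the pooled results. All names here (retrieve_multi_distribution, the collections dict) are hypothetical and not from the paper.

```python
import numpy as np

def retrieve_multi_distribution(query_vec, collections, k=100):
    """Retrieve the top-k passages across several collections, reserving a
    fixed share of the budget for each collection so that known, high-scoring
    collections cannot crowd out the rest. (Hypothetical sketch, not the
    paper's exact method.)

    collections: dict mapping collection name -> (embeddings, ids), where
    embeddings holds one passage vector per row.
    """
    per_collection = k // len(collections)  # uniform budget split (one simple policy)
    pooled = []
    for name, (embeddings, ids) in collections.items():
        scores = embeddings @ query_vec              # dense inner-product scores
        top = np.argsort(-scores)[:per_collection]   # this collection's quota
        pooled.extend((name, ids[i], float(scores[i])) for i in top)
    # Rank the quota-limited pool by score to produce the final top-k list.
    pooled.sort(key=lambda item: -item[2])
    return pooled[:k]

# Toy usage with random embeddings standing in for real passage encoders.
rng = np.random.default_rng(0)
collections = {
    "wiki":  (rng.normal(size=(1000, 64)), np.arange(1000)),
    "forum": (rng.normal(size=(500, 64)),  np.arange(500)),
}
hits = retrieve_multi_distribution(rng.normal(size=64), collections, k=100)
```

A uniform split is the bluntest instantiation; the paper's strategies presumably allocate the budget per domain more carefully, but the key property is the same: no single known domain can consume the entire top-k budget.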

