On Survivorship Bias in MS MARCO

04/27/2022
by Prashansa Gupta, et al.

Survivorship bias is the tendency to concentrate on the positive outcomes of a selection process and overlook the results that generate negative outcomes. We observe that this bias could be present in the popular MS MARCO dataset, given that annotators could not find answers to 38–45% of the queries, leading to these queries being discarded in training and evaluation processes. Although we find that some discarded queries in MS MARCO are ill-defined or otherwise unanswerable, many are valid questions that could be answered had the collection been annotated more completely (around two thirds using modern ranking techniques). This survivability problem distorts the MS MARCO collection in several ways. We find that it affects the natural distribution of queries in terms of the type of information needed. When used for evaluation, we find that the bias likely yields a significant distortion of the absolute performance scores observed. Finally, given that MS MARCO is frequently used for model training, we train models based on subsets of MS MARCO that simulate more survivorship bias. We find that models trained in this setting are up to 9.9% [...] annotations, and up to 3.5% [...] complementary to other recent suggestions for further annotation of MS MARCO, but with a focus on discarded queries. Code and data for reproducing the results of this paper are available in an online appendix.


