Recall as a Measure of Ranking Robustness

02/22/2023
by Fernando Diaz, et al.

Researchers use recall to evaluate rankings across a variety of retrieval, recommendation, and machine learning tasks. While there is a colloquial interpretation of recall in set-based evaluation, the research community is far from a principled understanding of recall metrics for rankings. This lack of a principled understanding of, or motivation for, recall has led some in the retrieval community to question whether recall is useful as a measure at all. In this light, we reflect on the measurement of recall in rankings from a formal perspective. Our analysis is composed of three tenets: recall, robustness, and lexicographic evaluation. First, we formally define 'recall-orientation' as sensitivity to movement of the bottom-ranked relevant item. Second, we analyze our concept of recall orientation from the perspective of robustness with respect to possible searchers and content providers. Finally, we extend this conceptual and theoretical treatment of recall by developing a practical preference-based evaluation method based on lexicographic comparison. Through extensive empirical analysis across 17 TREC tracks, we establish that our new evaluation method, lexirecall, is correlated with existing recall metrics and exhibits substantially higher discriminative power and stability in the presence of missing labels. Our conceptual, theoretical, and empirical analysis substantially deepens our understanding of recall and motivates its adoption through connections to robustness and fairness.
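
To make the lexicographic comparison concrete, the following Python sketch shows one plausible reading of a lexirecall-style preference: two rankings are compared by the ranks of their relevant items, starting from the bottom-ranked (deepest) relevant item and moving upward. The function names and the example data are illustrative assumptions, not the authors' implementation.

```python
from typing import List, Sequence, Set

def relevant_positions(ranking: Sequence[str], relevant: Set[str]) -> List[int]:
    """Return the 1-based ranks of relevant items, deepest (largest) rank first."""
    positions = [rank for rank, item in enumerate(ranking, start=1) if item in relevant]
    return sorted(positions, reverse=True)

def lexirecall_preference(ranking_a: Sequence[str],
                          ranking_b: Sequence[str],
                          relevant: Set[str]) -> int:
    """Lexicographically compare two rankings by the ranks of their relevant items,
    starting from the bottom-ranked relevant item (an illustrative sketch, not the
    paper's exact procedure). Returns -1 if ranking_a is preferred, 1 if ranking_b
    is preferred, and 0 for a tie."""
    for a, b in zip(relevant_positions(ranking_a, relevant),
                    relevant_positions(ranking_b, relevant)):
        if a != b:
            return -1 if a < b else 1  # shallower rank for the deepest relevant item wins
    return 0

# Both rankings retrieve d1 and d2, but the first places its bottom-ranked
# relevant item (d2) higher, so it is preferred.
relevant = {"d1", "d2"}
run_a = ["d1", "d3", "d2", "d4"]  # relevant ranks, deepest first: [3, 1]
run_b = ["d1", "d3", "d4", "d2"]  # relevant ranks, deepest first: [4, 1]
print(lexirecall_preference(run_a, run_b, relevant))  # -> -1
```

Because the comparison starts at the deepest relevant item, it is maximally sensitive to exactly the movement that the paper uses to define recall-orientation.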


research · 06/13/2023
Best-Case Retrieval Evaluation: Improving the Sensitivity of Reciprocal Rank with Lexicographic Precision
Across a variety of ranking tasks, researchers use reciprocal rank to me...

research · 07/07/2022
On the Metric Properties of IR Evaluation Measures Based on Ranking Axioms
The axiomatic analysis of IR evaluation metrics has contributed to a bet...

research · 07/04/2022
On the Effect of Ranking Axioms on IR Evaluation Metrics
The study of IR evaluation metrics through axiomatic analysis enables a ...

research · 12/01/2022
Principled Multi-Aspect Evaluation Measures of Rankings
Information Retrieval evaluation has traditionally focused on defining p...

research · 04/27/2020
Evaluating Stochastic Rankings with Expected Exposure
We introduce the concept of expected exposure as the average attention r...

research · 03/19/2023
Two Kinds of Recall
It is an established assumption that pattern-based models are good at pr...

research · 01/18/2014
Combining Evaluation Metrics via the Unanimous Improvement Ratio and its Application to Clustering Tasks
Many Artificial Intelligence tasks cannot be evaluated with a single qua...
