Surprise: Result List Truncation via Extreme Value Theory

10/19/2020
by Dara Bahri et al.

Work in information retrieval has largely been centered around ranking and relevance: given a query, return some number of results ordered by relevance to the user. The problem of result list truncation, i.e., where to truncate the ranked list of results, has received less attention despite being crucial in a variety of applications. Such truncation is a balancing act between the overall relevance, or usefulness, of the results and the user cost of processing more results. Result list truncation can be challenging because relevance scores are often not well-calibrated. This is particularly true in large-scale IR systems where documents and queries are embedded in the same metric space and a query's nearest document neighbors are returned during inference. Here, relevance is inversely proportional to the distance between the query and candidate document, but what distance constitutes relevance varies from query to query and changes dynamically as more documents are added to the index. In this work, we propose Surprise scoring, a statistical method that leverages the Generalized Pareto distribution that arises in extreme value theory to produce interpretable and calibrated relevance scores at query time using nothing more than the ranked scores. We demonstrate its effectiveness on the result list truncation task across image, text, and IR datasets and compare it to both classical and recent baselines. We draw connections to hypothesis testing and p-values.
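The core idea described above can be sketched in code. The following is a minimal, hypothetical illustration (not the authors' implementation): it fits a Generalized Pareto distribution to the exceedances of the ranked scores over a threshold, following the standard peaks-over-threshold recipe from extreme value theory, and converts each score into a p-value-like "surprise" via the fitted survival function. The `tail_frac` and `alpha` parameters and the function names are assumptions for illustration.

```python
import numpy as np
from scipy.stats import genpareto

def surprise_scores(scores, tail_frac=0.5):
    """Calibrate ranked retrieval scores via a Generalized Pareto
    distribution (GPD), in the spirit of extreme value theory.

    `scores` are similarity scores in descending rank order. Exceedances
    over a quantile threshold are modeled with a GPD; each score's
    survival probability under that fit acts as a p-value-like surprise
    (lower = more surprising = more likely relevant).
    """
    scores = np.asarray(scores, dtype=float)
    # Peaks-over-threshold: model only the excesses above a threshold.
    threshold = np.quantile(scores, 1.0 - tail_frac)
    excesses = scores[scores > threshold] - threshold
    # Fit GPD shape and scale, pinning location at 0 (standard practice).
    shape, _, scale = genpareto.fit(excesses, floc=0.0)
    # Survival probability of each excess; scores below the threshold
    # are assigned p = 1 (not extreme at all).
    return np.where(
        scores > threshold,
        genpareto.sf(scores - threshold, shape, loc=0.0, scale=scale),
        1.0,
    )

def truncate(scores, alpha=0.05):
    """Cut the ranked list at the first result whose surprise fails
    the significance level `alpha`, echoing a hypothesis test."""
    keep = surprise_scores(scores) < alpha
    return len(scores) if keep.all() else int(np.argmin(keep))
```

Because the surprise values behave like p-values, choosing the cutoff reduces to picking a significance level, which is what makes the scores interpretable and comparable across queries.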


Related research

- 04/26/2020 · Choppy: Cut Transformer For Ranked List Truncation. Work in information retrieval has traditionally focused on ranking and r...
- 03/03/2022 · Do Perceived Gender Biases in Retrieval Results Affect Relevance Judgements? This work investigates the effect of gender-stereotypical biases in the ...
- 02/25/2021 · Learning to Truncate Ranked Lists for Information Retrieval. Ranked list truncation is of critical importance in a variety of profess...
- 01/21/2022 · Less is Less: When Are Snippets Insufficient for Human vs Machine Relevance Estimation? Traditional information retrieval (IR) ranking models process the full t...
- 05/01/2023 · A Blueprint of IR Evaluation Integrating Task and User Characteristics: Test Collection and Evaluation Metrics. Relevance is generally understood as a multi-level and multi-dimensional...
- 04/11/2019 · Investigating Retrieval Method Selection with Axiomatic Features. We consider algorithm selection in the context of ad-hoc information ret...
- 06/01/2022 · Optical character recognition quality affects perceived usefulness of historical newspaper clippings. Introduction. We study the effect of different quality optical character rec...
