New Metrics to Encourage Innovation and Diversity in Information Retrieval Approaches

01/19/2023
by   Mehmet Deniz Türkmen, et al.

In evaluation campaigns, participants often explore variations of popular, state-of-the-art baselines as a low-risk strategy to achieve competitive results. While effective, this can lead to local "hill climbing" rather than more radical and innovative departures from standard methods. Moreover, if many participants build on similar baselines, the overall diversity of approaches considered may be limited. In this work, we propose a new class of IR evaluation metrics intended to promote greater diversity of approaches in evaluation campaigns. Whereas traditional IR metrics focus on user experience, our two "innovation" metrics instead reward exploration of more divergent, higher-risk strategies that find relevant documents missed by other systems. Experiments on four TREC collections show that our metrics do change system rankings by rewarding systems that find such rare, relevant documents. This result is further supported by a controlled, synthetic data experiment and a qualitative analysis. In addition, we show that our metrics achieve higher evaluation stability and discriminative power than the standard metrics we modify. To support reproducibility, we share our source code.
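To make the idea concrete, the sketch below illustrates one plausible way such an "innovation" reward could be computed: a rank-discounted gain in which a relevant document retrieved by few other runs contributes more than a widely retrieved one. This is a minimal illustration under stated assumptions, not the paper's exact formulation; the function name, the `alpha` mixing parameter, and the input layout are all hypothetical.

```python
# Hedged sketch (not the paper's exact metric): rarity-weighted gain for a
# ranked list, where rare relevant documents earn extra credit.
import math
from typing import Dict, List, Set


def rarity_weighted_gain(
    ranking: List[str],            # document ids in rank order for the run being scored
    qrels: Set[str],               # ids of documents judged relevant for the topic
    retrieved_by: Dict[str, int],  # doc id -> number of other runs that retrieved it
    num_other_runs: int,
    alpha: float = 0.5,            # 0 = standard binary gain, 1 = pure rarity reward
    cutoff: int = 10,
) -> float:
    """Discounted gain@cutoff that blends standard gain with a rarity bonus."""
    score = 0.0
    for rank, doc_id in enumerate(ranking[:cutoff], start=1):
        if doc_id not in qrels:
            continue
        # Fraction of the other runs that also found this relevant document.
        coverage = retrieved_by.get(doc_id, 0) / max(num_other_runs, 1)
        # Blend standard gain (1.0) with a rarity bonus (1 - coverage).
        gain = (1 - alpha) * 1.0 + alpha * (1.0 - coverage)
        score += gain / math.log2(rank + 1)  # NDCG-style rank discount
    return score
```

With `alpha = 0` this reduces to an ordinary discounted binary gain; raising `alpha` shifts credit toward systems that surface relevant documents other participants missed, which is the behavior the proposed metrics are designed to reward.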

research
07/07/2022

On the Metric Properties of IR Evaluation Measures Based on Ranking Axioms

The axiomatic analysis of IR evaluation metrics has contributed to a bet...
research
07/04/2022

On the Effect of Ranking Axioms on IR Evaluation Metrics

The study of IR evaluation metrics through axiomatic analysis enables a ...
research
01/05/2022

Atomized Search Length: Beyond User Models

We argue that current IR metrics, modeled on optimizing user experience,...
research
02/01/2018

Correlation and Prediction of Evaluation Metrics in Information Retrieval

Because researchers typically do not have the time or space to present m...
research
09/07/2018

Data Requirements for Evaluation of Personalization of Information Retrieval - A Position Paper

Two key, but usually ignored, issues for the evaluation of methods of pe...
research
07/27/2023

On (Normalised) Discounted Cumulative Gain as an Offline Evaluation Metric for Top-n Recommendation

Approaches to recommendation are typically evaluated in one of two ways:...
research
05/30/2023

The Information Retrieval Experiment Platform

We integrate ir_datasets, ir_measures, and PyTerrier with TIRA in the In...
