An Analysis of Variations in the Effectiveness of Query Performance Prediction

02/13/2022
by Debasis Ganguly, et al.

A query performance predictor estimates the retrieval effectiveness of an IR system for a given query. An important characteristic of query performance prediction (QPP) evaluation is that the ground truth itself is not absolute: the retrieval effectiveness that serves as ground truth can be measured with different metrics, in contrast to other retrieval tasks such as ad-hoc retrieval. Motivated by this observation, this paper investigates how such variations in the ground truth can affect the outcomes of QPP experiments. We consider not only the absolute values of the reported evaluation metrics (e.g. Pearson's r, Kendall's τ), but also the changes in the ranks of different QPP systems when ordered by their QPP metric scores. Our experiments reveal that the observed QPP outcomes can vary considerably, both in the absolute evaluation metric values and in the relative system ranks. Through our analysis, we report the optimal combinations of QPP evaluation metric and experimental settings that are likely to lead to smaller variations in the observed results.
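The effect described in the abstract can be illustrated with a minimal sketch. It is not the paper's experimental setup: the two ground-truth metrics (AP and nDCG@10), the predictor names, and all numbers are made-up assumptions chosen only to show how the same two predictors can swap ranks when the ground-truth metric changes. Pearson's r and Kendall's τ (here the simple tau-a variant, without tie correction) are implemented directly so that the example needs no third-party packages.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson's linear correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical per-query effectiveness of one IR system on five queries,
# measured with two different metrics (toy values, not from the paper).
ground_truth = {
    "AP": [0.10, 0.35, 0.20, 0.60, 0.45],
    "nDCG@10": [0.30, 0.25, 0.50, 0.70, 0.40],
}

# Hypothetical per-query estimates produced by two QPP methods.
predictors = {
    "predictor_A": [1.0, 3.0, 2.0, 5.0, 4.0],
    "predictor_B": [2.0, 1.0, 4.0, 5.0, 3.0],
}


def rank_predictors(corr, target):
    """Score each predictor against one ground truth; return (order, scores)."""
    scores = {name: corr(est, target) for name, est in predictors.items()}
    order = sorted(scores, key=scores.get, reverse=True)
    return order, scores


if __name__ == "__main__":
    for metric_name, target in ground_truth.items():
        for corr in (pearson_r, kendall_tau):
            order, scores = rank_predictors(corr, target)
            rounded = {name: round(value, 3) for name, value in scores.items()}
            print(metric_name, corr.__name__, order, rounded)
```

On this toy data, predictor_A ranks first when AP is the ground truth and predictor_B ranks first when nDCG@10 is, under both correlation measures. Real experiments would use many more queries and established implementations of the correlation measures; the sketch only makes the notion of "variation in relative system ranks" concrete.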

