Relevance Judgment Convergence Degree – A Measure of Inconsistency among Assessors for Information Retrieval

08/08/2022
by Dengya Zhu, et al.

Relevance judgments by human assessors are inherently subjective and dynamic when evaluation datasets are created for Information Retrieval (IR) systems. However, the relevance judgments of a small group of experts are usually taken as ground truth to "objectively" evaluate the performance of IR systems. A recent trend is to employ a group of judges, for example through outsourcing, to alleviate the potential bias that stems from relying on a single expert's judgment. Nevertheless, different judges may hold different opinions and disagree with each other, and this inconsistency in human relevance judgment may affect IR system evaluation results. In this research, we introduce a Relevance Judgment Convergence Degree (RJCD) to measure the quality of queries in evaluation datasets. Experimental results reveal a strong correlation between the proposed RJCD score and the performance differences between the two IR systems.


research
03/01/2021

A Linguistic Study on Relevance Modeling in Information Retrieval

Relevance plays a central role in information retrieval (IR), which has ...
research
08/23/2017

Evaluation Measures for Relevance and Credibility in Ranked Lists

Recent discussions on alternative facts, fake news, and post truth polit...
research
06/28/2018

Impact of the Query Set on the Evaluation of Expert Finding Systems

Expertise is a loosely defined concept that is hard to formalize. Much r...
research
04/13/2023

Perspectives on Large Language Models for Relevance Judgment

When asked, current large language models (LLMs) like ChatGPT claim that...
research
05/01/2023

A Blueprint of IR Evaluation Integrating Task and User Characteristics: Test Collection and Evaluation Metrics

Relevance is generally understood as a multi-level and multi-dimensional...
research
06/15/2023

Prompt Performance Prediction for Generative IR

The ability to predict the performance of a query in Information Retriev...
research
02/13/2022

An Analysis of Variations in the Effectiveness of Query Performance Prediction

A query performance predictor estimates the retrieval effectiveness of a...
