Graded Relevance Assessments and Graded Relevance Measures of NTCIR: A Survey of the First Twenty Years

03/27/2019
by   Tetsuya Sakai, et al.
0

NTCIR was the first large-scale IR evaluation conference to construct test collections with graded relevance assessments: the NTCIR-1 test collections from 1998 already featured relevant and partially relevant documents. In this paper, I first describe a few graded-relevance measures that originated from NTCIR (and a few variants) which are used across different NTCIR tasks. I then provide a survey on the use of graded relevance assessments and of graded relevance measures in the past NTCIR tasks which primarily tackled ranked retrieval. My survey shows that the majority of the past tasks fully utilised graded relevance by means of graded evaluation measures, but not all of them; interestingly, even a few relatively recent tasks chose to adhere to binary relevance measures. I conclude this paper by a summary of my survey in table form, and a brief discussion on what may lie beyond graded relevance.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/01/2020

Cheap IR Evaluation: Fewer Topics, No Relevance Judgements, and Crowdsourced Assessments

To evaluate Information Retrieval (IR) effectiveness, a possible approac...
research
08/23/2017

Evaluation Measures for Relevance and Credibility in Ranked Lists

Recent discussions on alternative facts, fake news, and post truth polit...
research
04/13/2023

Perspectives on Large Language Models for Relevance Judgment

When asked, current large language models (LLMs) like ChatGPT claim that...
research
06/03/2018

Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collections Accurately and Affordably

Crowdsourcing offers an affordable and scalable means to collect relevan...
research
11/20/2021

Effects of context, complexity, and clustering on evaluation for math formula retrieval

There are now several test collections for the formula retrieval task, i...
research
12/03/2018

Modeling Temporal Evidence from External Collections

Newsworthy events are broadcast through multiple mediums and prompt the ...
research
01/17/2018

Efficient Test Collection Construction via Active Learning

To create a new IR test collection at minimal cost, we must carefully se...

Please sign up or login with your details

Forgot password? Click here to reset