Understanding and Predicting the Characteristics of Test Collections

12/24/2020
by Md Mustafizur Rahman, et al.

Shared-task campaigns such as NIST TREC select documents to judge by pooling rankings from many participant systems. The quality of the resulting test collection therefore depends heavily on the number of participants and the quality of their submitted runs. In this work, we investigate (i) how the number of participants, together with other factors, affects the quality of a test collection; and (ii) whether that quality can be inferred before relevance judgments are collected. Experiments on six TREC collections show that the number of participants required to construct a high-quality test collection varies significantly across collections, owing to a variety of factors. The results further suggest that the quality of a test collection can be predicted.
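To make the pooling step concrete, here is a minimal sketch of depth-k pooling for a single topic. It assumes the common depth-k variant, in which the top k documents of every submitted run are merged into one judging pool. The abstract does not state which pooling depth or variant the paper uses, and the function name `depth_k_pool`, the system names, and the document IDs are all hypothetical.

```python
from typing import Dict, List, Set


def depth_k_pool(runs: Dict[str, List[str]], k: int) -> Set[str]:
    """Return the union of the top-k documents from each run (one topic)."""
    pool: Set[str] = set()
    for ranking in runs.values():
        # Only the first k documents of each ranking enter the pool.
        pool.update(ranking[:k])
    return pool


# Hypothetical rankings from three participant systems for one topic.
runs = {
    "sysA": ["d1", "d2", "d3", "d4"],
    "sysB": ["d2", "d5", "d1", "d6"],
    "sysC": ["d7", "d1", "d8", "d2"],
}

# Pool built from all three participants.
pool_all = depth_k_pool(runs, k=2)

# Pool built from only two participants: d7, contributed solely by sysC,
# would never be judged.
pool_two = depth_k_pool({name: runs[name] for name in ("sysA", "sysB")}, k=2)
```

In this toy case, dropping one participant removes a document from the pool. That illustrates, on a small scale, why the number and diversity of submitted runs can affect which documents receive relevance judgments.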


