Cheap IR Evaluation: Fewer Topics, No Relevance Judgements, and Crowdsourced Assessments

11/01/2020
by Kevin Roitero, et al.

To evaluate Information Retrieval (IR) effectiveness, a possible approach is to use test collections, which are composed of a collection of documents, a set of descriptions of information needs (called topics), and a set of documents judged relevant to each topic. Test collections are typically built in a competition scenario: for example, in the well-known TREC initiative, participants run their own retrieval systems over a set of topics and provide a ranked list of retrieved documents for each topic; some of the retrieved documents (usually the top ranked) constitute the so-called pool, and their relevance is evaluated by human assessors; the ranked lists and the relevance judgements are then used to compute effectiveness metrics and rank the participant systems. Private Web Search companies also run their own in-house evaluation exercises; although the details are mostly unknown and the aims are somewhat different, the overall approach shares several issues with the test collection approach. The aim of this work is to: (i) develop and improve some state-of-the-art work on the evaluation of IR effectiveness while saving resources, and (ii) propose a novel, more principled and engineered, overall approach to test collection based effectiveness evaluation. [...]
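As a concrete sketch of the pooling-and-judging workflow described above (this is not the authors' procedure; the run data, the pool depth, and the choice of Average Precision are purely illustrative assumptions), the following Python snippet builds a top-k pool from the runs of two hypothetical systems, applies relevance judgements collected on the pooled documents, and ranks the systems by Mean Average Precision:

```python
# Toy sketch of pool-based IR evaluation (hypothetical data and names):
# systems submit ranked runs, the top-k documents per topic form the pool,
# pooled documents receive relevance judgements, and systems are ranked
# by Mean Average Precision (MAP).

from collections import defaultdict

POOL_DEPTH = 2  # real TREC pools are much deeper, e.g. top 100 per run

# Ranked runs: system -> topic -> ranked list of document ids (toy data).
runs = {
    "sysA": {"t1": ["d1", "d2", "d3"], "t2": ["d5", "d4", "d6"]},
    "sysB": {"t1": ["d2", "d4", "d1"], "t2": ["d6", "d5", "d7"]},
}

# Build the pool: union of the top-POOL_DEPTH documents over all systems.
pool = defaultdict(set)
for topic_runs in runs.values():
    for topic, ranking in topic_runs.items():
        pool[topic].update(ranking[:POOL_DEPTH])

# Relevance judgements, collected only for pooled documents (toy data).
qrels = {"t1": {"d2"}, "t2": {"d5", "d6"}}

def average_precision(ranking, relevant):
    """AP of one ranked list; unjudged documents count as non-relevant."""
    hits, precision_sum = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

# Rank the participant systems by MAP over all topics.
scores = {
    system: sum(average_precision(r, qrels[t]) for t, r in topic_runs.items())
    / len(topic_runs)
    for system, topic_runs in runs.items()
}
for system, map_score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{system}: MAP = {map_score:.3f}")
```

In a real TREC-style evaluation the pool is far deeper, the judgements come from human assessors, and the metrics are typically computed with standard tooling such as trec_eval; the sketch only mirrors the overall flow from runs to pool to judgements to system ranking.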

Related research

11/01/2020
CURE: Collection for Urdu Information Retrieval Evaluation and Ranking
Urdu is a widely spoken language with 163 million speakers worldwide acr...

03/27/2019
Graded Relevance Assessments and Graded Relevance Measures of NTCIR: A Survey of the First Twenty Years
NTCIR was the first large-scale IR evaluation conference to construct te...

04/23/2023
Query-specific Variable Depth Pooling via Query Performance Prediction towards Reducing Relevance Assessment Effort
Due to the massive size of test collections, a standard practice in IR e...

12/06/2021
A Sensitivity Analysis of the MSMARCO Passage Collection
The recent MSMARCO passage retrieval collection has allowed researchers ...

10/05/2018
C-DLSI: An Extended LSI Tailored for Federated Text Retrieval
As the web expands in data volume and in geographical distribution, cent...

05/01/2023
A Blueprint of IR Evaluation Integrating Task and User Characteristics: Test Collection and Evaluation Metrics
Relevance is generally understood as a multi-level and multi-dimensional...

12/01/2022
Principled Multi-Aspect Evaluation Measures of Rankings
Information Retrieval evaluation has traditionally focused on defining p...
