Active Sampling for Large-scale Information Retrieval Evaluation

09/06/2017
by   Dan Li, et al.
0

Evaluation is crucial in Information Retrieval. The development of models, tools and methods has significantly benefited from the availability of reusable test collections formed through a standardized and thoroughly tested methodology, known as the Cranfield paradigm. Constructing these collections requires obtaining relevance judgments for a pool of documents, retrieved by systems participating in an evaluation task; thus involves immense human labor. To alleviate this effort different methods for constructing collections have been proposed in the literature, falling under two broad categories: (a) sampling, and (b) active selection of documents. The former devises a smart sampling strategy by choosing only a subset of documents to be assessed and inferring evaluation measure on the basis of the obtained sample; the sampling distribution is being fixed at the beginning of the process. The latter recognizes that systems contributing documents to be judged vary in quality, and actively selects documents from good systems. The quality of systems is measured every time a new document is being judged. In this paper we seek to solve the problem of large-scale retrieval evaluation combining the two approaches. We devise an active sampling method that avoids the bias of the active selection methods towards good systems, and at the same time reduces the variance of the current sampling approaches by placing a distribution over systems, which varies as judgments become available. We validate the proposed method using TREC data and demonstrate the advantages of this new method compared to past approaches.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/24/2022

HC4: A New Suite of Test Collections for Ad Hoc CLIR

HC4 is a new suite of test collections for ad hoc Cross-Language Informa...
research
04/23/2023

Query-specific Variable Depth Pooling via Query Performance Prediction towards Reducing Relevance Assessment Effort

Due to the massive size of test collections, a standard practice in IR e...
research
04/20/2016

Local Binary Pattern for Word Spotting in Handwritten Historical Document

Digital libraries store images which can be highly degraded and to index...
research
06/21/2023

Resources and Evaluations for Multi-Distribution Dense Information Retrieval

We introduce and define the novel problem of multi-distribution informat...
research
08/18/2023

How Discriminative Are Your Qrels? How To Study the Statistical Significance of Document Adjudication Methods

Creating test collections for offline retrieval evaluation requires huma...
research
02/13/2022

Web-Based File Clustering and Indexing for Mindoro State University

The Web Based File Clustering and Indexing for Mindoro State University ...
research
04/28/2020

On the Reliability of Test Collections for Evaluating Systems of Different Types

As deep learning based models are increasingly being used for informatio...

Please sign up or login with your details

Forgot password? Click here to reset