LongEval-Retrieval: French-English Dynamic Test Collection for Continuous Web Search Evaluation

03/06/2023
by Petra Galuščáková, et al.

LongEval-Retrieval is a Web document retrieval benchmark that focuses on continuous retrieval evaluation. The test collection is intended for studying the temporal persistence of Information Retrieval systems and will serve as the test collection of the Longitudinal Evaluation of Model Performance Track (LongEval) at CLEF 2023. The benchmark simulates an evolving information system environment, such as the one a Web search engine operates in, where the document collection, the query distribution, and relevance all change continuously, while still following the Cranfield paradigm for offline evaluation. To do so, we introduce the concept of a dynamic test collection, composed of successive sub-collections that each represent the state of an information system at a given time step. In LongEval-Retrieval, each sub-collection contains a set of queries, documents, and soft relevance assessments built from click models. The data comes from Qwant, a privacy-preserving Web search engine that primarily focuses on the French market. LongEval-Retrieval also provides a 'mirror' collection: it is first constructed in French to benefit from the majority of Qwant's traffic, and then translated to English. This paper presents the creation process of LongEval-Retrieval and provides baseline runs and analysis.
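To make the idea of a dynamic test collection more concrete, the following is a minimal sketch of how successive sub-collections might be represented and how one system could be scored at each time step. It is not the LongEval-Retrieval data format or evaluation code: the `SubCollection` layout, the `overlap_ranker` toy system, the integer relevance grades, the choice of nDCG with linear gain, and all of the toy data are assumptions made purely for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class SubCollection:
    """State of the information system at one time step (hypothetical layout)."""
    time_step: str
    queries: dict    # query id -> query text
    documents: dict  # doc id -> document text
    qrels: dict      # query id -> {doc id: relevance grade}


def ndcg_at_k(ranking, rels, k=10):
    """nDCG@k with linear gain; unjudged documents count as grade 0."""
    dcg = sum(rels.get(d, 0) / math.log2(i + 2) for i, d in enumerate(ranking[:k]))
    ideal = sorted(rels.values(), reverse=True)[:k]
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


def overlap_ranker(sub, qid):
    """Toy system: rank by number of shared terms, ties broken by doc id."""
    terms = set(sub.queries[qid].lower().split())
    scored = [(len(terms & set(text.lower().split())), doc_id)
              for doc_id, text in sub.documents.items()]
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


def evaluate_over_time(collections, system, k=10):
    """Mean nDCG@k of the same system on each successive sub-collection."""
    scores = {}
    for sub in collections:
        per_query = [ndcg_at_k(system(sub, qid), sub.qrels[qid], k)
                     for qid in sub.qrels]
        scores[sub.time_step] = sum(per_query) / len(per_query) if per_query else 0.0
    return scores


# Invented toy data: documents and relevance both change between time steps.
t0 = SubCollection(
    time_step="t0",
    queries={"q1": "train paris"},
    documents={"d1": "train tickets paris", "d2": "weather lyon"},
    qrels={"q1": {"d1": 1}},
)
t1 = SubCollection(
    time_step="t1",
    queries={"q1": "train paris"},
    documents={"d1": "train tickets paris", "d3": "paris train strike news"},
    qrels={"q1": {"d3": 2, "d1": 1}},
)

scores = evaluate_over_time([t0, t1], overlap_ranker)
for step, score in scores.items():
    print(f"{step}: nDCG@10 = {score:.3f}")
```

Comparing the per-step scores of an unchanged system is one simple way to look at temporal persistence. Here the toy ranker scores lower at `t1` because a newly added document has become the most relevant one, while the ranker still places the older document first.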
