Evaluating the Robustness of Retrieval Pipelines with Query Variation Generators

11/25/2021
by Gustavo Penha, et al.

Heavily pre-trained transformers for language modelling, such as BERT, have been shown to be remarkably effective for Information Retrieval (IR) tasks, where they are typically applied to re-rank the results of a first-stage retrieval model. IR benchmarks evaluate the effectiveness of retrieval pipelines on the premise that a single query is used to instantiate the underlying information need. However, previous research has shown that (I) queries generated by users for a fixed information need are extremely variable and (II) neural models are brittle and often make mistakes when tested with modified inputs. Motivated by these observations, we aim to answer the following question: how robust are retrieval pipelines to variations in queries that do not change the queries' semantics? To obtain queries that are representative of users' querying variability, we first created a taxonomy based on the manual annotation of transformations occurring in a dataset (UQV100) of user-created query variations. For each syntax-changing category of our taxonomy, we employed different automatic methods that, when applied to a query, generate a query variation. Our experimental results across two datasets for two IR tasks reveal that retrieval pipelines are not robust to these query variations, with effectiveness drops of ≈20% on average. The code and datasets are available at https://github.com/Guzpenha/query_variation_generators.
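To make the idea of a syntax-changing query variation concrete, the following is a minimal sketch in Python. It is not the authors' implementation and does not use the API of the linked repository: the function names `swap_adjacent_chars`, `shuffle_word_order`, and `relative_drop` are hypothetical, and the two transformations (a simulated typo and a word reordering) are assumed here only as plausible examples of variations that leave a query's meaning intact.

```python
import random


def swap_adjacent_chars(query: str, rng: random.Random) -> str:
    """Simulate a typo by swapping two adjacent inner characters of one word.

    Hypothetical helper: words shorter than four characters are left alone,
    and the first and last character of the chosen word are never moved.
    """
    words = query.split()
    candidates = [i for i, w in enumerate(words) if len(w) > 3]
    if not candidates:
        return query
    i = rng.choice(candidates)
    w = words[i]
    pos = rng.randrange(1, len(w) - 2)  # pos and pos + 1 are both inner positions
    words[i] = w[:pos] + w[pos + 1] + w[pos] + w[pos + 2:]
    return " ".join(words)


def shuffle_word_order(query: str, rng: random.Random) -> str:
    """Return the query with the same words in a random order."""
    words = query.split()
    rng.shuffle(words)
    return " ".join(words)


def relative_drop(original_score: float, variation_score: float) -> float:
    """Relative effectiveness drop of a variation with respect to the original query."""
    return (original_score - variation_score) / original_score


if __name__ == "__main__":
    rng = random.Random(13)  # fixed seed so the variations are reproducible
    query = "what is the capital of france"
    print(swap_adjacent_chars(query, rng))
    print(shuffle_word_order(query, rng))
    # Illustrative numbers only, not results from the paper.
    print(f"{relative_drop(0.50, 0.40):.0%}")
```

In a robustness evaluation along these lines, each variation would be run through the same retrieval pipeline as the original query, and the per-query scores of an effectiveness metric would be compared with something like `relative_drop`.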


research
06/22/2023

On the Robustness of Generative Retrieval Models: An Out-of-Distribution Perspective

Recently, we have witnessed generative retrieval increasingly gaining at...
research
01/19/2022

Grep-BiasIR: A Dataset for Investigating Gender Representation-Bias in Information Retrieval Results

The provided contents by information retrieval (IR) systems can reflect ...
research
02/20/2023

Query Performance Prediction for Neural IR: Are We There Yet?

Evaluation in Information Retrieval relies on post-hoc empirical procedu...
research
02/07/2018

To Phrase or Not to Phrase - Impact of User versus System Term Dependence Upon Retrieval

When submitting queries to information retrieval (IR) systems, users oft...
research
11/20/2018

Alignment Analysis of Sequential Segmentation of Lexicons to Improve Automatic Cognate Detection

Ranking functions in information retrieval are often used in search engi...
research
07/28/2020

Declarative Experimentation in Information Retrieval using PyTerrier

The advent of deep machine learning platforms such as Tensorflow and Pyt...
research
08/20/2023

Offline Pseudo Relevance Feedback for Efficient and Effective Single-pass Dense Retrieval

Dense retrieval has made significant advancements in information retriev...
