Declarative Experimentation in Information Retrieval using PyTerrier

07/28/2020
by Craig Macdonald, et al.

The advent of deep machine learning platforms such as TensorFlow and PyTorch, developed in expressive high-level languages such as Python, has allowed more expressive representations of deep neural network architectures. We argue that such a powerful formalism is missing in information retrieval (IR), and propose a framework called PyTerrier that allows advanced retrieval pipelines to be expressed, and evaluated, in a declarative manner close to their conceptual design. Like the aforementioned frameworks that compile deep learning experiments into primitive GPU operations, our framework targets IR platforms as backends in order to execute and evaluate retrieval pipelines. Further, we can automatically optimise the retrieval pipelines to increase their efficiency to suit a particular IR platform backend. Our experiments, conducted on TREC Robust and ClueWeb09 test collections, demonstrate the efficiency benefits of these optimisations for retrieval pipelines involving both the Anserini and Terrier IR platforms.
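
To make the idea of a declaratively expressed, automatically optimised retrieval pipeline concrete, the following is a minimal toy sketch in plain Python. It is not PyTerrier's API: the classes `Stage`, `Retrieve`, `Then`, `Cutoff` and `Boost`, the `optimise` function, the in-memory index, and the choice of `>>` (compose) and `%` (rank cutoff) as operators are all assumptions made for illustration. The rewrite shown, which pushes a rank cutoff into the first-stage retriever so that the backend returns fewer results, is only one plausible example of a backend-oriented optimisation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Ranking = List[Tuple[str, float]]  # (docid, score) pairs, best first


class Stage:
    """Hypothetical base class: a stage maps (query, ranking) to a ranking."""

    def run(self, query: str, ranking: Ranking = ()) -> Ranking:
        raise NotImplementedError

    def __rshift__(self, other: "Stage") -> "Stage":
        # a >> b: feed the output of a into b
        return Then(self, other)

    def __mod__(self, k: int) -> "Stage":
        # a % k: keep only the top k results of a
        return Cutoff(self, k)


@dataclass
class Retrieve(Stage):
    """Toy first-stage retriever over an in-memory index (term -> {docid: weight})."""
    index: Dict[str, Dict[str, float]]
    limit: Optional[int] = None  # number of results requested from the "backend"

    def run(self, query, ranking=()):
        totals: Dict[str, float] = {}
        for term in query.split():
            for docid, weight in self.index.get(term, {}).items():
                totals[docid] = totals.get(docid, 0.0) + weight
        # Sort by descending score, breaking ties by docid for determinism.
        ranked = sorted(totals.items(), key=lambda p: (-p[1], p[0]))
        return ranked if self.limit is None else ranked[: self.limit]


@dataclass
class Then(Stage):
    first: Stage
    second: Stage

    def run(self, query, ranking=()):
        return self.second.run(query, self.first.run(query, ranking))


@dataclass
class Cutoff(Stage):
    inner: Stage
    k: int

    def run(self, query, ranking=()):
        return self.inner.run(query, ranking)[: self.k]


@dataclass
class Boost(Stage):
    """Toy re-ranker: adds a bonus to selected documents and re-sorts."""
    bonus: Dict[str, float]

    def run(self, query, ranking=()):
        rescored = [(d, s + self.bonus.get(d, 0.0)) for d, s in ranking]
        return sorted(rescored, key=lambda p: (-p[1], p[0]))


def optimise(stage: Stage) -> Stage:
    """Rewrite Cutoff(Retrieve, k) into Retrieve(limit=k); leave other stages as-is."""
    if isinstance(stage, Then):
        return Then(optimise(stage.first), optimise(stage.second))
    if isinstance(stage, Cutoff):
        inner = optimise(stage.inner)
        if isinstance(inner, Retrieve):
            k = stage.k if inner.limit is None else min(stage.k, inner.limit)
            return Retrieve(inner.index, limit=k)
        return Cutoff(inner, stage.k)
    return stage


# Made-up example data, for illustration only.
index = {
    "terrier": {"d1": 2.0, "d2": 1.0, "d3": 0.5},
    "python": {"d2": 1.5, "d3": 0.25},
}

# The pipeline reads close to its conceptual design:
# retrieve, keep the top 2, then re-rank.
pipeline = Retrieve(index) % 2 >> Boost({"d1": 1.0})
fast = optimise(pipeline)

print(pipeline.run("terrier python"))
print(fast.run("terrier python"))
```

In this sketch the optimised pipeline returns the same ranking as the original, but the cutoff is now a parameter of the retriever rather than a separate post-processing step, which is the kind of rewrite a real backend could exploit to do less work.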
