The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models

01/14/2021
by Ronak Pradeep, et al.

We propose a design pattern for tackling text ranking problems, dubbed "Expando-Mono-Duo", that has been empirically validated for a number of ad hoc retrieval tasks in different domains. At the core, our design relies on pretrained sequence-to-sequence models within a standard multi-stage ranking architecture. "Expando" refers to the use of document expansion techniques to enrich keyword representations of texts prior to inverted indexing. "Mono" and "Duo" refer to components in a reranking pipeline based on a pointwise model and a pairwise model that rerank initial candidates retrieved using keyword search. We present experimental results from the MS MARCO passage and document ranking tasks, the TREC 2020 Deep Learning Track, and the TREC-COVID challenge that validate our design. In all these tasks, we achieve effectiveness that is at or near the state of the art, in some cases using a zero-shot approach that does not exploit any training data from the target task. To support replicability, implementations of our design pattern are open-sourced in the Pyserini IR toolkit and PyGaggle neural reranking library.
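The design pattern maps directly onto a few toolkit calls. Below is a minimal sketch of the three stages using Pyserini and PyGaggle; the prebuilt index name ('msmarco-passage-expanded'), the example query, and the default monoT5/duoT5 checkpoints are assumptions made for illustration, not the authors' exact configuration, so consult each toolkit's documentation for current identifiers.

```python
# Minimal sketch of Expando-Mono-Duo with Pyserini + PyGaggle.
# Index and model names are assumptions; check the toolkits' docs.
from pyserini.search import SimpleSearcher
from pygaggle.rerank.base import Query, hits_to_texts
from pygaggle.rerank.transformer import MonoT5, DuoT5

# "Expando": search an inverted index built over passages that were
# enriched with doc2query-predicted queries before indexing.
searcher = SimpleSearcher.from_prebuilt_index('msmarco-passage-expanded')

query = Query('what treatments exist for paranoid schizophrenia')
hits = searcher.search(query.text, k=1000)  # first-stage BM25 candidates
texts = hits_to_texts(hits)

# "Mono": pointwise reranking; monoT5 scores each (query, passage) pair.
mono = MonoT5()  # defaults to a T5 reranker fine-tuned on MS MARCO
mono_ranked = mono.rerank(query, texts)
mono_ranked.sort(key=lambda x: x.score, reverse=True)

# "Duo": pairwise reranking of the top candidates; duoT5 compares
# passage pairs and aggregates pairwise preferences into a final ranking.
duo = DuoT5()
duo_ranked = duo.rerank(query, mono_ranked[:50])
duo_ranked.sort(key=lambda x: x.score, reverse=True)

for rank, text in enumerate(duo_ranked[:10], start=1):
    print(f'{rank:2} {text.metadata["docid"]} {text.score:.5f}')
```

Restricting the pairwise duo stage to the top 50 pointwise candidates keeps its quadratic number of passage-pair comparisons tractable while still refining the head of the ranking, where it matters most for early-precision metrics.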

Related research

- Document Ranking with a Pretrained Sequence-to-Sequence Model (03/14/2020)
- Neural document expansion for ad-hoc information retrieval (12/27/2020)
- RankT5: Fine-Tuning T5 for Text Ranking with Ranking Losses (10/12/2022)
- TPRM: A Topic-based Personalized Ranking Model for Web Search (08/13/2021)
- LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval (03/11/2022)
- Doc2Query--: When Less is More (01/09/2023)
- Flexible retrieval with NMSLIB and FlexNeuART (10/28/2020)
