CoRT: Complementary Rankings from Transformers

10/20/2020
by Marco Wrzalik, et al.

Recent approaches towards passage retrieval have successfully employed representations from pretrained Language Models (LMs) with large effectiveness gains. However, due to high computational cost, those approaches are usually limited to re-ranking scenarios. The candidates in such a scenario are typically retrieved by scalable bag-of-words retrieval models such as BM25. Although BM25 has proven to be a decent first-stage ranker, it tends to miss relevant passages. In this context, we propose CoRT, a framework and neural first-stage ranking model that leverages contextual representations from transformer-based language models to complement candidates from term-based ranking functions while causing no significant delay. Using the MS MARCO dataset, we show that CoRT significantly increases first-stage ranking quality and recall by complementing BM25 with missing candidates. Consequently, we found that subsequent re-rankers achieve superior results while requiring fewer candidates to saturate ranking quality. Finally, we demonstrate that with CoRT, representation-focused retrieval at web scale can be realized with latencies as low as those of BM25.
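
The abstract does not say how the neural candidates are combined with the BM25 candidates. As a minimal sketch only, assuming the two ranked lists are merged by simple rank-wise interleaving with duplicates skipped, the complementing step could look like the following. The function name `interleave_candidates`, its parameters, and the passage IDs are hypothetical and are not taken from the paper.

```python
from typing import Hashable, List, Sequence


def interleave_candidates(
    term_based: Sequence[Hashable],
    neural: Sequence[Hashable],
    k: int,
) -> List[Hashable]:
    """Merge two ranked candidate lists into one list of at most k items.

    Assumption (not from the abstract): candidates are taken alternately,
    rank by rank, starting with the term-based ranking, and a passage
    already in the merged list is skipped.
    """
    if k <= 0:
        return []
    merged: List[Hashable] = []
    seen = set()
    for rank in range(max(len(term_based), len(neural))):
        for ranking in (term_based, neural):
            if rank < len(ranking) and ranking[rank] not in seen:
                seen.add(ranking[rank])
                merged.append(ranking[rank])
                if len(merged) == k:
                    return merged
    return merged


# Hypothetical passage IDs: the neural ranker contributes "p4",
# which the term-based ranker missed.
bm25_top = ["p1", "p2", "p3"]
neural_top = ["p2", "p4", "p5"]
candidates = interleave_candidates(bm25_top, neural_top, k=4)
```

With a merge of this kind, a downstream re-ranker receives a single candidate list of fixed size, so passages found only by the neural ranker reach the re-ranker without enlarging its workload.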

research
05/10/2020

Transformer-Based Language Models for Similar Text Retrieval and Ranking

Most approaches for similar text retrieval and ranking with long natural...
research
06/15/2023

Ranking and Selection in Large-Scale Inference of Heteroscedastic Units

The allocation of limited resources to a large number of potential candi...
research
05/21/2022

HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking

Deep pre-trained language models (e.g. BERT) are effective at large-scal...
research
07/12/2023

Towards the Better Ranking Consistency: A Multi-task Learning Framework for Early Stage Ads Ranking

Dividing ads ranking system into retrieval, early, and final stages is a...
research
06/07/2021

Pre-trained Language Model for Web-scale Retrieval in Baidu Search

Retrieval is a crucial stage in web search that identifies a small set o...
research
03/08/2021

Semantic Models for the First-stage Retrieval: A Comprehensive Review

Multi-stage ranking pipelines have been a practical solution in modern s...
research
08/18/2022

Adaptive Re-Ranking with a Corpus Graph

Search systems often employ a re-ranking pipeline, wherein documents (or...
