InPars-Light: Cost-Effective Unsupervised Training of Efficient Rankers

01/08/2023
by Leonid Boytsov, et al.

We carried out a reproducibility study of the InPars recipe for unsupervised training of neural rankers. As a by-product of this study, we developed a simple yet effective modification of InPars, which we call InPars-light. Unlike InPars, InPars-light uses only the freely available BLOOM language model and 7x-100x smaller ranking models. On all five English retrieval collections (used in the original InPars study) we obtained substantial (7-30%), statistically significant improvements over BM25 in nDCG or MRR using only a 30M-parameter, six-layer MiniLM ranker. In contrast, in the InPars study only the 100x larger MonoT5-3B model consistently outperformed BM25, whereas their smaller MonoT5-220M model (still 7x larger than our MiniLM ranker) outperformed BM25 only on MS MARCO and TREC DL 2020. In a purely unsupervised setting, our 435M-parameter DeBERTa v3 ranker was roughly on par with the 7x larger MonoT5-3B: in fact, on three out of five datasets it slightly outperformed MonoT5-3B. Finally, these results were achieved by re-ranking only 100 candidate documents, compared to the 1000 used in InPars. We believe InPars-light is the first truly cost-effective prompt-based unsupervised recipe to train and deploy neural ranking models that outperform BM25.
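All of the reported gains are measured relative to a BM25 first-stage retriever, whose top candidates are then re-ranked by the neural model. As a rough illustration of that baseline, here is a minimal, self-contained BM25 scoring sketch; the `bm25_scores` helper, its parameter defaults, and the tokenized-document input format are illustrative assumptions, not code from the paper:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` with BM25.

    docs: list of documents, each a list of tokens.
    Returns one BM25 score per document.
    """
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N  # average document length
    # Document frequency of each distinct query term.
    df = {t: sum(1 for d in docs if t in d) for t in set(query_terms)}
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

In a pipeline such as the one described above, documents would be sorted by these scores and only the top 100 candidates passed to the cross-encoder ranker for re-ranking.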


