Fast Forward Indexes for Efficient Document Ranking

10/12/2021
by   Jurek Leonhardt, et al.

Neural approaches to document ranking, specifically transformer models, have delivered impressive gains in ranking performance. However, query processing with such over-parameterized models is both resource- and time-intensive. Consequently, to keep query-processing costs manageable, trade-offs are made: either the number of documents to be re-ranked is reduced, or leaner models with fewer parameters are used. In this paper, we propose the fast-forward index, a simple vector forward index that facilitates ranking documents using interpolation-based ranking models. Fast-forward indexes pre-compute the dense transformer-based vector representations of documents and passages, enabling fast CPU-based semantic similarity computation during query processing. We propose theoretically grounded index pruning and early stopping techniques to improve the query-processing throughput of fast-forward indexes. We conduct extensive large-scale experiments over the TREC-DL datasets and show up to 75% improvement in query-processing performance over hybrid indexes using only CPUs. Along with the efficiency benefits, we show that fast-forward indexes can deliver superior ranking performance due to the complementary benefits of interpolating between lexical and semantic similarities.
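
The idea of interpolating lexical and semantic scores through a vector forward index can be illustrated with a minimal sketch. This is not the authors' implementation: the linear form alpha * lexical + (1 - alpha) * semantic, the dot-product similarity, the dictionary-based index, and all names (`rerank`, `forward_index`, `alpha`) are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product, used here as the (assumed) semantic similarity."""
    return sum(x * y for x, y in zip(a, b))


def rerank(
    query_vec: Vector,
    lexical_hits: List[Tuple[str, float]],
    forward_index: Dict[str, Vector],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Re-rank lexical candidates by an interpolated score.

    forward_index maps a document id to its pre-computed dense vector,
    so the only work at query time is a look-up and a dot product.
    alpha is a hypothetical interpolation weight, not a value from the paper.
    """
    scored = []
    for doc_id, lexical_score in lexical_hits:
        semantic_score = dot(query_vec, forward_index[doc_id])
        combined = alpha * lexical_score + (1 - alpha) * semantic_score
        scored.append((doc_id, combined))
    # Highest interpolated score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: three documents with 2-dimensional pre-computed vectors.
forward_index = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [0.5, 0.5]}
lexical_hits = [("d1", 3.0), ("d2", 1.0), ("d3", 1.5)]  # e.g. from a sparse retriever
ranking = rerank([0.0, 4.0], lexical_hits, forward_index, alpha=0.5)
print(ranking)
```

In this toy run, d1 has the best lexical score but is semantically unrelated to the query vector, so the interpolated ranking places d2 first. The document vectors are looked up rather than computed, which is what would allow the query-time step to run on a CPU.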
