SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking

07/12/2021
by Thibault Formal, et al.

In neural Information Retrieval, ongoing research is directed towards improving the first retriever in ranking pipelines. Learning dense embeddings to conduct retrieval using efficient approximate nearest neighbors methods has proven to work well. Meanwhile, there has been growing interest in learning sparse representations for documents and queries that could inherit the desirable properties of bag-of-words models, such as exact term matching and the efficiency of inverted indexes. In this work, we present a new first-stage ranker based on explicit sparsity regularization and a log-saturation effect on term weights, leading to highly sparse representations and competitive results with respect to state-of-the-art dense and sparse methods. Our approach is simple and trained end-to-end in a single stage. We also explore the trade-off between effectiveness and efficiency by controlling the contribution of the sparsity regularization.
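The two ingredients named in the abstract can be sketched compactly. The snippet below is a minimal NumPy illustration, not the paper's implementation: `splade_term_weights` applies the log-saturation transform log(1 + ReLU(w)) to per-token vocabulary logits (as would come from an MLM head) and pools over the sequence, and `flops_regularizer` is a FLOPS-style sparsity penalty that squares the mean activation of each vocabulary term over a batch. Both function names and the toy shapes are hypothetical.

```python
import numpy as np

def splade_term_weights(logits):
    """Sparse representation from token-level importance logits.

    logits: array of shape (seq_len, vocab_size), e.g. MLM-head outputs.
    Log-saturation log(1 + ReLU(.)) prevents a few terms from dominating;
    summing over tokens yields one weight per vocabulary term.
    """
    saturated = np.log1p(np.maximum(logits, 0.0))
    return saturated.sum(axis=0)  # shape: (vocab_size,)

def flops_regularizer(weights_batch):
    """FLOPS-style sparsity penalty on a batch of representations.

    weights_batch: array of shape (batch, vocab_size).
    Penalizing the squared mean activation per term pushes the average
    usage of each term toward zero, producing sparse vectors.
    """
    mean_per_term = np.abs(np.asarray(weights_batch)).mean(axis=0)
    return float((mean_per_term ** 2).sum())
```

Ranking then reduces to a dot product between query and document term-weight vectors, which an inverted index can evaluate efficiently because most entries are exactly zero.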


Related research

09/21/2021 · SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval
In neural Information Retrieval (IR), ongoing research is directed towar...

06/29/2023 · Exploring the Representation Power of SPLADE Models
The SPLADE (SParse Lexical AnD Expansion) model is a highly effective ap...

08/09/2022 · Early Stage Sparse Retrieval with Entity Linking
Despite the advantages of their low-resource settings, traditional spars...

12/17/2021 · Sparsifying Sparse Representations for Passage Retrieval by Top-k Masking
Sparse lexical representation learning has demonstrated much progress in...

05/10/2022 · From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective
Neural retrievers based on dense representations combined with Approxima...

04/15/2021 · UHD-BERT: Bucketed Ultra-High Dimensional Sparse Representations for Full Ranking
Neural information retrieval (IR) models are promising mainly because th...

04/12/2020 · Minimizing FLOPs to Learn Efficient Sparse Representations
Deep representation learning has become one of the most widely adopted a...
