Flexible retrieval with NMSLIB and FlexNeuART

10/28/2020
by Leonid Boytsov et al.

Our objective is to introduce to the NLP community the existing k-NN search library NMSLIB, a new retrieval toolkit FlexNeuART, and their integration capabilities. NMSLIB, while being one of the fastest k-NN search libraries, is quite generic and supports a variety of distance/similarity functions. Because the library relies on distance-based, structure-agnostic algorithms, it can be further extended by adding new distances. FlexNeuART is a modular, extensible, and flexible toolkit for candidate generation in IR and QA applications, which supports mixing classic and neural ranking signals. FlexNeuART can efficiently retrieve mixed dense and sparse representations (with weights learned from training data), which is achieved by extending NMSLIB. In contrast, other retrieval systems work with purely sparse representations (e.g., Lucene), purely dense representations (e.g., FAISS and Annoy), or only perform mixing at the re-ranking stage.
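To give a concrete flavor of the kind of k-NN search NMSLIB provides, here is a minimal sketch using its Python bindings to build an HNSW index over dense vectors with cosine similarity. The data and parameter values are placeholders for illustration and are not taken from the paper; FlexNeuART's own indexing of mixed dense/sparse representations goes through its extended NMSLIB spaces rather than this generic setup.

```python
import numpy as np
import nmslib

# Random dense vectors standing in for learned document embeddings (illustrative only).
data = np.random.randn(1000, 128).astype(np.float32)

# Initialize an HNSW index over cosine similarity; NMSLIB supports many other
# spaces (e.g., L2, inner product) and indexing methods.
index = nmslib.init(method='hnsw', space='cosinesimil')
index.addDataPointBatch(data)

# Index-time parameters (M, efConstruction) trade build time for graph quality;
# these values are arbitrary example settings.
index.createIndex({'M': 16, 'efConstruction': 100}, print_progress=False)

# Query-time parameter efSearch trades accuracy for speed.
index.setQueryTimeParams({'efSearch': 100})

# Retrieve the 10 nearest neighbors of a query vector.
query = np.random.randn(128).astype(np.float32)
ids, dists = index.knnQuery(query, k=10)
print(ids, dists)
```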


