A Few Brief Notes on DeepImpact, COIL, and a Conceptual Framework for Information Retrieval Techniques

06/28/2021
by Jimmy Lin, et al.

Recent developments in representational learning for information retrieval can be organized in a conceptual framework that establishes two pairs of contrasts: sparse vs. dense representations and unsupervised vs. learned representations. Sparse learned representations can be further decomposed into expansion and term weighting components. This framework allows us to understand the relationship between recently proposed techniques such as DPR, ANCE, DeepCT, DeepImpact, and COIL. Furthermore, gaps revealed by our analysis point to "low hanging fruit" in terms of techniques that have yet to be explored. We present a novel technique dubbed "uniCOIL", a simple extension of COIL that, to our knowledge, achieves the current state of the art in sparse retrieval on the popular MS MARCO passage ranking dataset. Our implementation, which uses the Anserini IR toolkit, is built on the Lucene search library and is thus fully compatible with standard inverted indexes.
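To make the point about inverted-index compatibility concrete, here is a minimal sketch of how a sparse representation with per-term weights can be scored through an ordinary inverted index. It is an illustration only, not the authors' Anserini/Lucene implementation. The function names (`build_inverted_index`, `search`), the toy documents, and all weights are hypothetical; in a learned sparse model the weights would come from a trained encoder rather than being written by hand.

```python
from collections import defaultdict


def build_inverted_index(doc_vectors):
    """Map each term to a postings list of (doc_id, weight) pairs."""
    index = defaultdict(list)
    for doc_id, vector in doc_vectors.items():
        for term, weight in vector.items():
            index[term].append((doc_id, weight))
    return index


def search(index, query_vector, k=10):
    """Rank documents by the inner product of sparse query and document vectors."""
    scores = defaultdict(float)
    for term, q_weight in query_vector.items():
        # Only documents that share a term with the query are ever touched.
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    # Sort by descending score, breaking ties by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical hand-written term weights (assumption: a learned model
# would assign these; the values here are made up for illustration).
DOCS = {
    "d1": {"sparse": 3, "retrieval": 2, "lucene": 1},
    "d2": {"dense": 4, "retrieval": 1},
    "d3": {"sparse": 1, "index": 2},
}
INDEX = build_inverted_index(DOCS)
QUERY = {"sparse": 2, "retrieval": 1}
RESULTS = search(INDEX, QUERY, k=2)
print(RESULTS)
```

Because scoring reduces to summing weight products over matching terms, the same postings-list structure used for unsupervised term weights can hold learned ones; only the source of the weights changes.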
