COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List

04/15/2021
by Luyu Gao, et al.

Classical information retrieval systems such as BM25 rely on exact lexical match and carry out search efficiently with an inverted list index. Recent neural IR models shift towards soft semantic matching of all query and document terms, but they lose the computational efficiency of exact match systems. This paper presents COIL, a contextualized exact match retrieval architecture that brings semantic lexical matching. COIL scoring is based on the contextualized representations of overlapping query and document tokens. The new architecture stores contextualized token representations in inverted lists, bringing together the efficiency of exact match and the representation power of deep language models. Our experimental results show COIL outperforms classical lexical retrievers and state-of-the-art deep LM retrievers with similar or smaller latency.
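
The idea of scoring only over overlapping tokens, with vectors stored in inverted lists, can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not spell out how matches are aggregated, so the sketch assumes that each query token takes the maximum dot product over document tokens with the same surface form, and that these maxima are summed over query tokens. The function names (`build_index`, `score`) and the two-dimensional toy vectors are hypothetical stand-ins for what a deep language model encoder would produce.

```python
from collections import defaultdict


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def build_index(docs):
    """Build inverted lists: token -> [(doc_id, contextualized vector), ...].

    `docs` maps doc_id -> list of (token, vector) pairs. The vectors are
    assumed to come from some contextual encoder (hypothetical here).
    """
    index = defaultdict(list)
    for doc_id, tokens in docs.items():
        for token, vec in tokens:
            index[token].append((doc_id, vec))
    return index


def score(query, index):
    """Score documents using only tokens that exactly match a query token.

    Assumption: per query token, keep the best-matching occurrence in each
    document (max dot product), then sum over query tokens.
    """
    totals = defaultdict(float)
    for token, q_vec in query:
        best = {}  # doc_id -> best similarity for this query token
        for doc_id, d_vec in index.get(token, []):
            s = dot(q_vec, d_vec)
            if doc_id not in best or s > best[doc_id]:
                best[doc_id] = s
        for doc_id, s in best.items():
            totals[doc_id] += s
    return dict(totals)


# Toy data: "bank" appears with different (made-up) contextual vectors.
docs = {
    "d1": [("bank", [1.0, 0.0]), ("river", [0.0, 1.0])],
    "d2": [("bank", [0.0, 1.0]), ("loan", [1.0, 0.0]), ("bank", [0.5, 0.5])],
}
query = [("bank", [1.0, 0.0]), ("loan", [1.0, 0.0])]

index = build_index(docs)
scores = score(query, index)
print(scores)
```

Only the inverted lists for "bank" and "loan" are visited, which is the source of the efficiency the abstract refers to; the vectors let the same surface token contribute differently depending on its context.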


research
04/29/2020

Complementing Lexical Retrieval with Semantic Residual Embedding

Information retrieval traditionally has relied on lexical matching signa...
research
12/10/2021

Match Your Words! A Study of Lexical Matching in Neural Information Retrieval

Neural Information Retrieval models hold the promise to replace lexical ...
research
09/13/2022

SpaDE: Improving Sparse Representations using a Dual Document Encoder for First-stage Retrieval

Sparse document representations have been widely used to retrieve releva...
research
09/21/2021

SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval

In neural Information Retrieval (IR), ongoing research is directed towar...
research
10/11/2022

Better Than Whitespace: Information Retrieval for Languages without Custom Tokenizers

Tokenization is a crucial step in information retrieval, especially for ...
research
11/08/2016

Getting Started with Neural Models for Semantic Matching in Web Search

The vocabulary mismatch problem is a long-standing problem in informatio...
research
09/09/2023

FaNS: a Facet-based Narrative Similarity Metric

Similar Narrative Retrieval is a crucial task since narratives are essen...
