A Proposed Conceptual Framework for a Representational Approach to Information Retrieval

10/04/2021
by Jimmy Lin, et al.

This paper outlines a conceptual framework that attempts to integrate dense and sparse retrieval methods, as a way of understanding recent developments in information retrieval and natural language processing. I propose a representational approach that breaks the core text retrieval problem into a logical scoring model and a physical retrieval model. The scoring model is defined in terms of encoders, which map queries and documents into a representational space, and a comparison function that computes query-document scores. The physical retrieval model defines how a system produces the top-k scoring documents from an arbitrarily large corpus with respect to a query. The scoring model can be further analyzed along two dimensions: dense vs. sparse representations and supervised (learned) vs. unsupervised approaches. I show that many recently proposed retrieval methods, including multi-stage ranking designs, can be seen as different parameterizations in this framework, and that a unified view suggests a number of open research questions, providing a roadmap for future work. As a bonus, this conceptual framework establishes connections to sentence similarity tasks in natural language processing and information access "technologies" prior to the dawn of computing.
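
The split between a logical scoring model and a physical retrieval model can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in chosen for illustration, not code from the paper: `encode_bow` is an assumed unsupervised sparse encoder (raw term counts), `dot` is an assumed comparison function (inner product), and `top_k` is an assumed brute-force physical retrieval step that scores every document. A real system would likely swap in learned encoders and an inverted index or nearest-neighbor search.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Sparse representation: term -> weight (an assumption made for this sketch).
Vector = Dict[str, float]


def encode_bow(text: str) -> Vector:
    """Hypothetical unsupervised sparse encoder: lowercased term counts."""
    return {term: float(n) for term, n in Counter(text.lower().split()).items()}


def dot(q: Vector, d: Vector) -> float:
    """Comparison function: inner product of two sparse vectors."""
    return sum(weight * d.get(term, 0.0) for term, weight in q.items())


def top_k(
    query: str,
    corpus: Dict[str, str],
    k: int,
    encode_query: Callable[[str], Vector] = encode_bow,
    encode_doc: Callable[[str], Vector] = encode_bow,
    compare: Callable[[Vector, Vector], float] = dot,
) -> List[Tuple[str, float]]:
    """Brute-force physical retrieval: score every document, keep the best k."""
    q_vec = encode_query(query)
    scored = [
        (doc_id, compare(q_vec, encode_doc(text)))
        for doc_id, text in corpus.items()
    ]
    # Sort by descending score; break ties by document id for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


corpus = {
    "d1": "dense retrieval with learned encoders",
    "d2": "sparse retrieval with inverted indexes",
    "d3": "cooking pasta at home",
}
results = top_k("sparse retrieval", corpus, k=2)
print(results)
```

In this sketch the two encoders and the comparison function are parameters of `top_k`, so replacing them with dense or learned variants changes the scoring model without touching the retrieval loop, which is one way to read the "different parameterizations" idea described above.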

research
06/28/2021

A Few Brief Notes on DeepImpact, COIL, and a Conceptual Framework for Information Retrieval Techniques

Recent developments in representational learning for information retriev...
research
11/10/2021

Cross-language Information Retrieval

Two key assumptions shape the usual view of ranked retrieval: (1) that t...
research
12/16/2021

CODER: An efficient framework for improving retrieval through COntextualized Document Embedding Reranking

We present a framework for improving the performance of a wide class of ...
research
02/21/2015

Unified vector space mapping for knowledge representation systems

One of the most significant problems which inhibits further developments...
research
10/13/2020

Pretrained Transformers for Text Ranking: BERT and Beyond

The goal of text ranking is to generate an ordered list of texts retriev...
research
12/14/2022

Explainability of Text Processing and Retrieval Methods: A Critical Survey

Deep Learning and Machine Learning based models have become extremely po...
research
04/05/2020

Natural language processing for word sense disambiguation and information extraction

This research work deals with Natural Language Processing (NLP) and extr...