UnifieR: A Unified Retriever for Large-Scale Retrieval

05/23/2022
by Tao Shen, et al.

Large-scale retrieval aims to recall relevant documents from a huge collection given a query. It relies on representation learning to embed documents and queries into a common semantic encoding space. Depending on the encoding space, recent retrieval methods based on pre-trained language models (PLMs) can be coarsely categorized into dense-vector and lexicon-based paradigms. The two paradigms expose the PLMs' representation capability at different granularities: global sequence-level compression and local word-level contexts, respectively. Inspired by their complementary global-local contextualization and distinct representing views, we propose a new learning framework, UnifieR, which unifies dense-vector and lexicon-based retrieval in one model with a dual-representing capability. Experiments on passage retrieval benchmarks verify its effectiveness in both paradigms. We further present a uni-retrieval scheme that achieves even better retrieval quality. Finally, we evaluate the model on the BEIR benchmark to verify its transferability.
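
To make the two paradigms concrete, the following is a minimal sketch of how a dense-vector score and a lexicon-based score could each be computed for one query-document pair, and how the two might be merged. It is an illustration only: the function names, the toy vectors and term weights, and the weighted-sum combination with parameter `alpha` are all assumptions made here, not the scoring or uni-retrieval scheme described in the paper.

```python
from typing import Dict, List


def dense_score(q_vec: List[float], d_vec: List[float]) -> float:
    """Dense-vector paradigm: one global embedding per sequence,
    relevance taken as the dot product of the two embeddings."""
    return sum(q * d for q, d in zip(q_vec, d_vec))


def lexicon_score(q_weights: Dict[str, float], d_weights: Dict[str, float]) -> float:
    """Lexicon-based paradigm: sparse per-term weights over a vocabulary,
    relevance taken as the sum of weight products over shared terms."""
    return sum(w * d_weights[t] for t, w in q_weights.items() if t in d_weights)


def combined_score(dense: float, lexicon: float, alpha: float = 0.5) -> float:
    """Hypothetical merge of the two views by a weighted sum.
    This is an assumption for illustration, not the paper's scheme."""
    return alpha * dense + (1.0 - alpha) * lexicon


# Toy representations (made-up values, not model outputs).
q_vec, d_vec = [1.0, 0.0, 2.0], [0.5, 1.0, 0.5]
q_lex = {"neural": 1.0, "retrieval": 2.0}
d_lex = {"retrieval": 1.5, "index": 0.5}

dense = dense_score(q_vec, d_vec)      # 1.5
lexicon = lexicon_score(q_lex, d_lex)  # 3.0
print(combined_score(dense, lexicon))  # 2.25
```

In this sketch the dense score depends on every dimension of a sequence-level vector, while the lexicon score depends only on terms the query and document share, which is one way to picture the global versus local views contrasted above.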

research
03/16/2022

Multi-View Document Representation Learning for Open-Domain Dense Retrieval

Dense retrieval has achieved impressive advances in first-stage retrieva...
research
08/29/2022

LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval

Retrieval models based on dense representations in semantic space have b...
research
08/16/2022

ConTextual Mask Auto-Encoder for Dense Passage Retrieval

Dense passage retrieval aims to retrieve the relevant passages of a quer...
research
05/01/2020

Sparse, Dense, and Attentional Representations for Text Retrieval

Dual encoder architectures perform retrieval by encoding documents and q...
research
05/04/2023

RetroMAE-2: Duplex Masked Auto-Encoder For Pre-Training Retrieval-Oriented Language Models

To better support information retrieval tasks such as web search and ope...
research
04/27/2023

Multivariate Representation Learning for Information Retrieval

Dense retrieval models use bi-encoder network architectures for learning...
research
06/04/2023

I^3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval

Passage retrieval is a fundamental task in many information systems, suc...
