Multivariate Representation Learning for Information Retrieval

04/27/2023
by Hamed Zamani, et al.

Dense retrieval models use bi-encoder network architectures to learn query and document representations. These representations are typically vectors, and their similarity is usually computed with the dot product. In this paper, we propose a new representation learning framework for dense retrieval. Instead of learning a vector for each query and document, our framework learns a multivariate distribution and uses the negative multivariate KL divergence to compute the similarity between distributions. For simplicity and efficiency, we assume that the distributions are multivariate normals and train large language models to produce the mean and variance vectors of these distributions. We provide a theoretical foundation for the proposed framework and show that it can be seamlessly integrated into existing approximate nearest neighbor algorithms to perform retrieval efficiently. We conduct an extensive suite of experiments on a wide range of datasets and demonstrate significant improvements over competitive dense retrieval models.
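The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal covariance (one variance per dimension, matching the "mean and variance vectors" wording) and assumes the divergence is taken from the query distribution to the document distribution. The function names are hypothetical. The second half shows one way such a score could be rewritten as a dot product between a query-side vector and a document-side vector, plus a term that depends only on the query and so does not affect the ranking; this is the kind of form an approximate nearest neighbor index can consume.

```python
import math


def neg_kl_similarity(mu_q, var_q, mu_d, var_d):
    """Return -KL(Q || D) for two normals with diagonal covariance.

    Assumption: the divergence direction is query -> document.
    """
    kl = 0.0
    for mq, vq, md, vd in zip(mu_q, var_q, mu_d, var_d):
        kl += 0.5 * (math.log(vd / vq) + (vq + (mq - md) ** 2) / vd - 1.0)
    return -kl


def query_vector(mu_q, var_q):
    """Hypothetical query-side vector: [1, var_q, mu_q^2, mu_q]."""
    return [1.0] + list(var_q) + [m * m for m in mu_q] + list(mu_q)


def doc_vector(mu_d, var_d):
    """Hypothetical document-side vector, computable offline per document."""
    bias = -0.5 * sum(math.log(v) + m * m / v for m, v in zip(mu_d, var_d))
    neg_half_inv = [-0.5 / v for v in var_d]
    mean_over_var = [m / v for m, v in zip(mu_d, var_d)]
    return [bias] + neg_half_inv + neg_half_inv + mean_over_var


def query_constant(var_q):
    """Query-only term; identical for every document, so ranking ignores it."""
    return 0.5 * sum(math.log(v) + 1.0 for v in var_q)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    # Toy 2-dimensional example with made-up numbers.
    mu_q, var_q = [0.2, -0.1], [0.5, 0.8]
    docs = {
        "d1": ([0.3, 0.0], [0.6, 0.9]),
        "d2": ([2.0, 1.5], [0.4, 0.3]),
    }
    q_vec = query_vector(mu_q, var_q)
    for name, (mu_d, var_d) in docs.items():
        direct = neg_kl_similarity(mu_q, var_q, mu_d, var_d)
        via_dot = dot(q_vec, doc_vector(mu_d, var_d)) + query_constant(var_q)
        print(name, direct, via_dot)
```

In this sketch the document-side vectors have 3k + 1 entries for k-dimensional distributions and can be indexed once, while the query-side vector is built at search time. Both printed values agree for each document up to floating-point error.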

research
07/16/2021

More Robust Dense Retrieval with Contrastive Dual Learning

Dense retrieval conducts text retrieval in the embedding space and has s...
research
04/26/2023

A Personalized Dense Retrieval Framework for Unified Information Access

Developing a universal model that can efficiently and effectively respon...
research
10/12/2021

Learning Discrete Representations via Constrained Clustering for Effective and Efficient Dense Retrieval

Dense Retrieval (DR) has achieved state-of-the-art first-stage ranking e...
research
05/25/2022

Refining Query Representations for Dense Retrieval at Test Time

Dense retrieval uses a contrastive learning framework to learn dense rep...
research
05/23/2022

UnifieR: A Unified Retriever for Large-Scale Retrieval

Large-scale retrieval is to recall relevant documents from a huge collec...
research
02/13/2023

Improving Out-of-Distribution Generalization of Neural Rerankers with Contextualized Late Interaction

Recent progress in information retrieval finds that embedding query and ...
research
06/20/2023

Generative Retrieval as Dense Retrieval

Generative retrieval is a promising new neural retrieval paradigm that a...
