Multi-View Document Representation Learning for Open-Domain Dense Retrieval

03/16/2022
by Shunyu Zhang, et al.

Dense retrieval has made impressive advances in first-stage retrieval from large-scale document collections. It is built on a bi-encoder architecture that produces a single vector representation for each query and each document. However, a document can usually answer multiple potential queries from different views, so a single document vector is hard to match against multi-view queries and suffers from a semantic mismatch problem. This paper proposes a multi-view document representation learning framework that produces multi-view embeddings to represent documents and enforces their alignment with different queries. First, we propose a simple yet effective method of generating multiple embeddings through viewers. Second, to prevent the multi-view embeddings from collapsing into the same one, we further propose a global-local loss with annealed temperature that encourages the multiple viewers to better align with different potential queries. Experiments show that our method outperforms recent works and achieves state-of-the-art results.
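
To make the mismatch problem concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that a query is scored against a document by taking the maximum dot product over the document's view embeddings, and that a temperature-scaled softmax over the per-view scores is one way a temperature could sharpen the assignment of a query to a view. The abstract states neither detail. The function names, the toy vectors, and the temperature values are all hypothetical.

```python
import math


def dot(a, b):
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def multi_view_score(query, doc_views):
    """Score a query against a document that has several view embeddings.

    Assumption (not stated in the abstract): the document score is the
    maximum dot product over its views.
    """
    return max(dot(query, view) for view in doc_views)


def view_distribution(query, doc_views, temperature):
    """Softmax over per-view scores, divided by a temperature.

    Assumption: a lower temperature makes the distribution sharper, so
    each query concentrates on its best-matching view.
    """
    scores = [dot(query, view) / temperature for view in doc_views]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data: one document that answers two different kinds of query.
doc_views = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical multi-view embeddings
single_vector = [0.5, 0.5]             # hypothetical single-vector baseline
query_a = [0.9, 0.1]
query_b = [0.1, 0.9]

for name, query in (("query_a", query_a), ("query_b", query_b)):
    print(name,
          "single:", round(dot(query, single_vector), 3),
          "multi-view:", round(multi_view_score(query, doc_views), 3))

# Annealing illustration: the same query at a high and a low temperature.
warm = view_distribution(query_a, doc_views, temperature=1.0)
cold = view_distribution(query_a, doc_views, temperature=0.1)
print("warm:", [round(p, 3) for p in warm])
print("cold:", [round(p, 3) for p in cold])
```

In this toy setup each query matches one of the two views more strongly than it matches the averaged single vector, and lowering the temperature pushes the softmax mass toward the best-matching view.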


research
06/28/2020

RepBERT: Contextualized Text Embeddings for First-Stage Retrieval

Although exact term match between queries and documents is the dominant ...
research
05/23/2022

UnifieR: A Unified Retriever for Large-Scale Retrieval

Large-scale retrieval is to recall relevant documents from a huge collec...
research
05/08/2021

Improving Document Representations by Generating Pseudo Query Embeddings for Dense Retrieval

Recently, the retrieval models based on dense representations have been ...
research
08/11/2022

On the Value of Behavioral Representations for Dense Retrieval

We consider text retrieval within dense representational space in real-w...
research
08/11/2016

Multi-View Product Image Search Using Deep ConvNets Representations

Multi-view product image queries can improve retrieval performance over ...
research
08/29/2022

LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval

Retrieval models based on dense representations in semantic space have b...
research
06/07/2023

Answering Compositional Queries with Set-Theoretic Embeddings

The need to compactly and robustly represent item-attribute relations ar...
