
RepBERT: Contextualized Text Embeddings for First-Stage Retrieval

by   Jingtao Zhan, et al.
Tsinghua University

Although exact term match between queries and documents is the dominant method for first-stage retrieval, we propose a different approach, called RepBERT, which represents documents and queries with fixed-length contextualized embeddings. The inner products of query and document embeddings are treated as relevance scores. On the MS MARCO Passage Ranking task, RepBERT achieves state-of-the-art results among all initial retrieval techniques, with efficiency comparable to bag-of-words methods.
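The scoring scheme described above can be sketched in a few lines: given fixed-length embeddings, relevance is just an inner product, and ranking is a sort over those scores. This is a minimal NumPy illustration with toy random vectors standing in for the BERT-derived embeddings; the function name and embedding dimension are illustrative, not from the paper.

```python
import numpy as np

def rank_by_inner_product(query_emb, doc_embs):
    # RepBERT-style scoring: relevance of each document is the
    # inner product between its embedding and the query embedding.
    scores = doc_embs @ query_emb
    # Rank documents from highest to lowest score.
    order = np.argsort(-scores)
    return order, scores

# Toy fixed-length embeddings (a real system would produce these
# with a contextualized encoder such as BERT).
rng = np.random.default_rng(0)
query_emb = rng.normal(size=8)       # one query vector
doc_embs = rng.normal(size=(3, 8))   # three document vectors

order, scores = rank_by_inner_product(query_emb, doc_embs)
```

Because scoring reduces to dense inner products, the document embeddings can be precomputed offline and searched with standard maximum-inner-product techniques, which is what makes this kind of first-stage retrieval competitive in speed with bag-of-words methods.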



