Aggretriever: A Simple Approach to Aggregate Textual Representation for Robust Dense Passage Retrieval

07/31/2022
by   Sheng-Chieh Lin, et al.

Pre-trained transformers have achieved success in many NLP tasks. One thread of work focuses on training bi-encoder models (i.e., dense retrievers) to effectively encode sentences or passages into single dense vectors for efficient approximate nearest neighbor (ANN) search. However, recent work has demonstrated that transformers pre-trained with masked language modeling (MLM) are not capable of effectively aggregating text information into a single dense vector due to the mismatch between the pre-training and fine-tuning tasks. Therefore, computationally expensive techniques have been adopted to train dense retrievers, such as large batch sizes, knowledge distillation, or post pre-training. In this work, we present a simple approach to effectively aggregate textual representations from a pre-trained transformer into a dense vector. Extensive experiments show that our approach improves the robustness of the single-vector approach under both in-domain and zero-shot evaluations, without any computationally expensive training techniques. Our work demonstrates that MLM pre-trained transformers can be used to effectively encode text information into a single vector for dense retrieval. Code is available at: https://github.com/castorini/dhr
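
The single-vector bi-encoder setup described above can be illustrated with a short sketch: a query and a passage are each encoded into one dense vector, and relevance is scored with an inner product so that passage vectors can be indexed for ANN search. The sketch below is a generic illustration under assumed choices (the bert-base-uncased checkpoint and masked mean pooling); it is not the Aggretriever aggregation proposed in the paper, which is implemented in the linked repository.

```python
# Minimal sketch of a single-vector bi-encoder (dense retriever).
# Assumptions for illustration only: the checkpoint ("bert-base-uncased") and
# the pooling strategy (masked mean pooling) are placeholders, not the paper's
# Aggretriever aggregation; they just show the overall encode-and-score flow.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()


def encode(texts):
    """Encode a list of strings into single dense vectors (one per text)."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state        # (B, L, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()   # (B, L, 1)
    # Mean-pool token representations, ignoring padding positions.
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)    # (B, H)


query_vec = encode(["what is dense passage retrieval"])
passage_vecs = encode([
    "Dense retrieval encodes queries and passages into low-dimensional vectors.",
    "The Shuffle Test evaluates whether NLP models can measure text coherence.",
])

# Relevance is the inner product between query and passage vectors; in practice
# the passage vectors would be stored in an ANN index for efficient search.
scores = query_vec @ passage_vecs.T
print(scores)
```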
