Quick Dense Retrievers Consume KALE: Post Training Kullback Leibler Alignment of Embeddings for Asymmetrical dual encoders

03/31/2023
by Daniel Campos, et al.

In this paper, we consider the problem of improving the inference latency of language model-based dense retrieval systems by introducing structural compression and model size asymmetry between the context and query encoders. First, we investigate the impact of pre- and post-training compression on MSMARCO, Natural Questions, TriviaQA, SQuAD, and SciFact, finding that asymmetry between the dual encoders in dense retrieval can lead to improved inference efficiency. Building on this, we introduce Kullback-Leibler Alignment of Embeddings (KALE), an efficient and accurate method for increasing the inference efficiency of dense retrieval methods by pruning and aligning the query encoder after training. Specifically, KALE extends traditional knowledge distillation after bi-encoder training, allowing for effective query encoder compression without full retraining or index regeneration. Using KALE and asymmetric training, we can generate models that exceed the performance of DistilBERT despite having 3x faster inference.
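The core idea described above, aligning a pruned query encoder to its original counterpart after training so the existing document index can be reused, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the encoder stand-ins, the softmax temperature, and all names here (`teacher`, `student`, `kale_loss`) are assumptions, and the KL term is computed over softmax-normalized embedding dimensions.

```python
# Hedged sketch of KALE-style post-training alignment (all names are
# illustrative assumptions, not the paper's code). A smaller "student"
# query encoder is trained to match a frozen "teacher" query encoder by
# minimizing KL divergence between their softmax-normalized embeddings;
# the context encoder and its index are left untouched.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim = 128

# Stand-ins for encoders: a frozen teacher and a trainable student.
# In practice these would be a full and a pruned transformer query encoder.
teacher = torch.nn.Linear(dim, dim)
student = torch.nn.Linear(dim, dim)
for p in teacher.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(student.parameters(), lr=1e-2)

def kale_loss(student_emb, teacher_emb, temperature=1.0):
    # Treat each embedding as a categorical distribution over its
    # dimensions and minimize KL(teacher || student).
    log_p = F.log_softmax(student_emb / temperature, dim=-1)
    q = F.softmax(teacher_emb / temperature, dim=-1)
    return F.kl_div(log_p, q, reduction="batchmean")

queries = torch.randn(256, dim)  # synthetic query features
first = kale_loss(student(queries), teacher(queries)).item()
for step in range(200):
    loss = kale_loss(student(queries), teacher(queries))
    opt.zero_grad()
    loss.backward()
    opt.step()
last = loss.item()
print(last < first)  # the alignment loss should decrease during training
```

Because only the query encoder is retrained, the (typically expensive) document index built with the original context encoder stays valid, which is what makes this post-training approach cheap relative to retraining the full bi-encoder.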


Related research

04/06/2023
Noise-Robust Dense Retrieval via Contrastive Alignment Post Training
The success of contextual word representations and advances in neural in...

06/05/2023
Query Encoder Distillation via Embedding Alignment is a Strong Baseline Method to Boost Dense Retriever Online Efficiency
The information retrieval community has made significant progress in imp...

06/04/2023
I^3 Retriever: Incorporating Implicit Interaction in Pre-trained Language Models for Passage Retrieval
Passage retrieval is a fundamental task in many information systems, suc...

08/13/2021
PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval
Recently, dense passage retrieval has become a mainstream approach to fi...

03/21/2022
Evaluating Token-Level and Passage-Level Dense Retrieval Models for Math Information Retrieval
With the recent success of dense retrieval methods based on bi-encoders,...

10/31/2021
PIE: Pseudo-Invertible Encoder
We consider the problem of information compression from high dimensional...

07/08/2022
An Efficiency Study for SPLADE Models
Latency and efficiency issues are often overlooked when evaluating IR mo...
