What Are You Token About? Dense Retrieval as Distributions Over the Vocabulary

12/20/2022
by Ori Ram et al.

Dual encoders are now the dominant architecture for dense retrieval. Yet we have little understanding of how they represent text, and why this leads to good performance. In this work, we shed light on these questions via distributions over the vocabulary. We propose to interpret the vector representations produced by dual encoders by projecting them into the model's vocabulary space. We show that the resulting distributions over vocabulary tokens are intuitive and contain rich semantic information. We find that this view can explain some of the failure cases of dense retrievers. For example, their inability to handle tail entities can be traced to a tendency of the token distributions to forget some of the tokens that make up those entities. We leverage this insight and propose a simple way to enrich query and passage representations with lexical information at inference time, and show that this significantly improves performance over the original model in out-of-domain settings.
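To make the projection concrete, here is a minimal sketch of the vocabulary-projection idea, not the paper's released code. It assumes a BERT-based dual encoder whose [CLS] vector can be fed through the backbone's masked-language-modeling head, and uses plain bert-base-uncased as a stand-in for a trained retriever; the query string is likewise only illustrative.

```python
# Minimal sketch: project a dual-encoder query representation onto the
# vocabulary. Assumption (not from the paper's released code): the retriever's
# [CLS] vector lives in the same space as BERT's MLM head; bert-base-uncased
# stands in for a trained dense retriever here.
import torch
from transformers import AutoTokenizer, BertModel, BertForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")             # stand-in dual encoder
mlm_head = BertForMaskedLM.from_pretrained("bert-base-uncased").cls  # projection to vocab space

query = "who wrote the declaration of independence"
inputs = tokenizer(query, return_tensors="pt")
with torch.no_grad():
    q_vec = encoder(**inputs).last_hidden_state[:, 0]  # [CLS] vector, shape (1, hidden)
    logits = mlm_head(q_vec)                           # shape (1, vocab_size)
    dist = logits.softmax(dim=-1)                      # distribution over the vocabulary

# The highest-probability tokens are the "token view" of the query vector.
top = dist[0].topk(10)
print(tokenizer.convert_ids_to_tokens(top.indices.tolist()))
```

Feeding the vector through the pretrained MLM head works because the dual encoder shares its hidden space with the backbone it was fine-tuned from; for retrievers that discard the MLM head during fine-tuning, the head of the original backbone is the natural choice.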

Related research

11/02/2022 · Multi-Vector Retrieval as Sparse Alignment
Multi-vector retrieval models improve over single-vector dual encoders o...

02/06/2023 · LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval
Image-text retrieval (ITR) is a task to retrieve the relevant images/tex...

02/08/2020 · LAVA NAT: A Non-Autoregressive Translation Model with Look-Around Decoding and Vocabulary Attention
Non-autoregressive translation (NAT) models generate multiple tokens in ...

05/01/2020 · Sparse, Dense, and Attentional Representations for Text Retrieval
Dual encoder architectures perform retrieval by encoding documents and q...

09/17/2021 · Simple Entity-Centric Questions Challenge Dense Retrievers
Open-domain question answering has exploded in popularity recently due t...

08/11/2022 · On the Value of Behavioral Representations for Dense Retrieval
We consider text retrieval within dense representational space in real-w...

03/30/2020 · Pruned Wasserstein Index Generation Model and wigpy Package
Recent proposal of Wasserstein Index Generation model (WIG) has shown a ...
