NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders

05/23/2023
by   Livio Baldini Soares, et al.

Neural document rerankers are extremely effective in terms of accuracy. However, the best models require dedicated hardware for serving, which is costly and often not feasible. To avoid this serving-time requirement, we present a method of capturing up to 86% of the gains of a cross-attention model with a lexicalized scoring function that requires only 10^-6% of the cross-attention model's Flops per document and can be served using commodity CPUs. When combined with a BM25 retriever, this approach matches the quality of a state-of-the-art dual encoder retriever, which still requires an accelerator for query encoding. We introduce NAIL (Non-Autoregressive Indexing with Language models) as a model architecture that is compatible with recent encoder-decoder and decoder-only large language models, such as T5, GPT-3 and PaLM. This model architecture can leverage existing pre-trained checkpoints and can be fine-tuned to efficiently construct document representations that do not require neural processing of queries.
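The lexicalized scoring function is what makes CPU-only serving possible: the neural model runs only at indexing time, assigning a weight to salient tokens for each document, while query-time scoring reduces to lexical lookups in an inverted index. The following is a minimal sketch of that serving path under stated assumptions; the names (doc_token_weights, score_query) and the toy weights are illustrative, not taken from the paper.

```python
# Hypothetical sketch of NAIL-style serving. The neural (non-autoregressive
# decoder) step happens offline, once per document; nothing neural runs at
# query time. All identifiers and numbers below are illustrative.

from collections import defaultdict

# Precomputed offline by the document model: doc id -> {token: weight}.
doc_token_weights = {
    "doc1": {"neural": 0.9, "reranker": 1.4, "accuracy": 0.7},
    "doc2": {"bm25": 1.2, "lexical": 1.1, "retrieval": 0.8},
}

# Standard inverted-index layout: token -> [(doc_id, weight)].
inverted_index = defaultdict(list)
for doc_id, weights in doc_token_weights.items():
    for token, weight in weights.items():
        inverted_index[token].append((doc_id, weight))

def score_query(query_tokens):
    """CPU-only scoring: sum the indexed weights of the query's tokens."""
    scores = defaultdict(float)
    for token in query_tokens:
        for doc_id, weight in inverted_index.get(token, []):
            scores[doc_id] += weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(score_query(["lexical", "retrieval", "accuracy"]))
# -> [('doc2', 1.9), ('doc1', 0.7)]
```

Because the query side only touches an inverted index, it runs at BM25-like cost; all transformer compute is paid once per document when the index is built.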


research · 04/25/2022
ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference
State-of-the-art neural models typically encode document-query pairs usi...

research · 09/13/2022
SpaDE: Improving Sparse Representations using a Dual Document Encoder for First-stage Retrieval
Sparse document representations have been widely used to retrieve releva...

research · 12/19/2022
Latent Diffusion for Language Generation
Diffusion models have achieved great success in modeling continuous data...

research · 05/09/2023
An Exploration of Encoder-Decoder Approaches to Multi-Label Classification for Legal and Biomedical Text
Standard methods for multi-label text classification largely rely on enc...

research · 02/22/2022
Improving CTC-based speech recognition via knowledge transferring from pre-trained language models
Recently, end-to-end automatic speech recognition models based on connec...

research · 02/18/2021
Less is More: Pre-training a Strong Siamese Encoder Using a Weak Decoder
Many real-world applications use Siamese networks to efficiently match t...

research · 04/17/2023
Learning to Compress Prompts with Gist Tokens
Prompting is now the primary way to utilize the multitask capabilities o...
