NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders

05/23/2023
by   Livio Baldini Soares, et al.

Neural document rerankers are extremely effective in terms of accuracy. However, the best models require dedicated hardware for serving, which is costly and often not feasible. To avoid this serving-time requirement, we present a method of capturing up to 86% of the gains of a cross-attention model with a lexicalized scoring function that requires only 10^-6% of the cross-attention model's Flops per document and can be served using commodity CPUs. When combined with a BM25 retriever, this approach matches the quality of a state-of-the-art dual encoder retriever, which still requires an accelerator for query encoding. We introduce NAIL (Non-Autoregressive Indexing with Language models) as a model architecture that is compatible with recent encoder-decoder and decoder-only large language models, such as T5, GPT-3 and PaLM. This model architecture can leverage existing pre-trained checkpoints and can be fine-tuned to efficiently construct document representations that do not require neural processing of queries.
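The lexicalized scoring function is what makes CPU-only serving possible: the neural model runs only at indexing time, assigning a weight to salient tokens for each document, while query-time scoring reduces to lexical lookups in an inverted index. The following is a minimal sketch of that serving path under stated assumptions; the names (doc_token_weights, score_query) and the toy weights are illustrative, not taken from the paper.

```python
# Hypothetical sketch of NAIL-style serving. The neural (non-autoregressive
# decoder) step happens offline, once per document; nothing neural runs at
# query time. All identifiers and numbers below are illustrative.

from collections import defaultdict

# Precomputed offline by the document model: doc id -> {token: weight}.
doc_token_weights = {
    "doc1": {"neural": 0.9, "reranker": 1.4, "accuracy": 0.7},
    "doc2": {"bm25": 1.2, "lexical": 1.1, "retrieval": 0.8},
}

# Standard inverted-index layout: token -> [(doc_id, weight)].
inverted_index = defaultdict(list)
for doc_id, weights in doc_token_weights.items():
    for token, weight in weights.items():
        inverted_index[token].append((doc_id, weight))

def score_query(query_tokens):
    """CPU-only scoring: sum the indexed weights of the query's tokens."""
    scores = defaultdict(float)
    for token in query_tokens:
        for doc_id, weight in inverted_index.get(token, []):
            scores[doc_id] += weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(score_query(["lexical", "retrieval", "accuracy"]))
# -> [('doc2', 1.9), ('doc1', 0.7)]
```

Because the query side only touches an inverted index, it runs at BM25-like cost; all transformer compute is paid once per document when the index is built.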


research · 04/25/2022
ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference
State-of-the-art neural models typically encode document-query pairs usi...

research · 09/13/2022
SpaDE: Improving Sparse Representations using a Dual Document Encoder for First-stage Retrieval
Sparse document representations have been widely used to retrieve releva...

research · 12/19/2022
Latent Diffusion for Language Generation
Diffusion models have achieved great success in modeling continuous data...

research · 05/09/2023
An Exploration of Encoder-Decoder Approaches to Multi-Label Classification for Legal and Biomedical Text
Standard methods for multi-label text classification largely rely on enc...

research · 02/22/2022
Improving CTC-based speech recognition via knowledge transferring from pre-trained language models
Recently, end-to-end automatic speech recognition models based on connec...

research · 02/18/2021
Less is More: Pre-training a Strong Siamese Encoder Using a Weak Decoder
Many real-world applications use Siamese networks to efficiently match t...

research · 04/17/2023
Learning to Compress Prompts with Gist Tokens
Prompting is now the primary way to utilize the multitask capabilities o...
