You can't pick your neighbors, or can you? When and how to rely on retrieval in the kNN-LM

10/28/2022
by Andrew Drozdov, et al.

Retrieval-enhanced language models (LMs), which condition their predictions on text retrieved from large external datastores, have recently shown significant perplexity improvements compared to standard LMs. One such approach, the kNN-LM, interpolates any existing LM's predictions with the output of a k-nearest neighbors model and requires no additional training. In this paper, we explore the importance of lexical and semantic matching in the context of items retrieved by the kNN-LM. We find two trends: (1) the presence of large overlapping n-grams between the datastore and evaluation set is an important factor in strong performance, even when the datastore is derived from the training data; and (2) the kNN-LM is most beneficial when retrieved items have high semantic similarity with the query. Based on our analysis, we define a new formulation of the kNN-LM that uses retrieval quality to assign the interpolation coefficient. We empirically measure the effectiveness of our approach on two English language modeling datasets, Wikitext-103 and PG-19. Our re-formulation of the kNN-LM is beneficial in both cases, and leads to nearly 4% improvement in perplexity.
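The mechanism described above can be made concrete. A standard kNN-LM mixes the base LM's next-token distribution with a distribution built from the retrieved neighbors, p(y|x) = lambda * p_kNN(y|x) + (1 - lambda) * p_LM(y|x), with a single fixed lambda; the re-formulation proposed here instead lets retrieval quality set the coefficient. The Python sketch below is a minimal illustration of that idea, not the authors' implementation: knn_probs, adaptive_lambda, and the exp(-distance / scale) mapping (with its made-up scale hyperparameter) are all assumptions chosen for clarity.

import numpy as np

def knn_probs(distances, neighbor_tokens, vocab_size, temperature=1.0):
    # Softmax over negative distances: closer neighbors get more weight.
    weights = np.exp(-np.asarray(distances, dtype=float) / temperature)
    weights /= weights.sum()
    p = np.zeros(vocab_size)
    for w, tok in zip(weights, neighbor_tokens):
        p[tok] += w  # neighbors that share a target token pool their mass
    return p

def adaptive_lambda(distances, scale=10.0):
    # Hypothetical retrieval-quality gate: a very close nearest neighbor
    # (high semantic similarity) pushes lambda toward 1, so the model
    # relies more on retrieval; distant neighbors push it toward 0.
    return float(np.exp(-min(distances) / scale))

def knn_lm_next_token(p_lm, distances, neighbor_tokens):
    # Interpolate the base LM with the retrieval distribution.
    lam = adaptive_lambda(distances)
    p_knn = knn_probs(distances, neighbor_tokens, vocab_size=len(p_lm))
    return lam * p_knn + (1.0 - lam) * p_lm

# Toy usage: a 5-token vocabulary and 3 retrieved neighbors.
p_lm = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
print(knn_lm_next_token(p_lm, distances=[2.0, 3.5, 9.0], neighbor_tokens=[2, 2, 4]))

Setting adaptive_lambda to a constant recovers the original fixed-coefficient kNN-LM, so the retrieval-quality gate is the only moving part in this re-formulation.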



Related research:

05/29/2023  Test-Time Training on Nearest Neighbors for Large Language Models
Many recent efforts aim to augment language models with relevant informa...

05/25/2023  Surface-Based Retrieval Reduces Perplexity of Retrieval-Augmented Language Models
Augmenting language models with a retrieval mechanism has been shown to ...

11/01/2019  Generalization through Memorization: Nearest Neighbor Language Models
We introduce kNN-LMs, which extend a pre-trained neural language model (...

02/27/2017  Distributional Analysis Approaches to Improve Word Sense Disambiguation (original French title: "Approches d'analyse distributionnelle pour améliorer la désambiguïsation sémantique")
Word sense disambiguation (WSD) improves many Natural Language Processin...

08/09/2021  IntenT5: Search Result Diversification using Causal Language Models
Search result diversification is a beneficial approach to overcome under...

03/29/2022  The Inefficiency of Language Models in Scholarly Retrieval: An Experimental Walk-through
Language models are increasingly becoming popular in AI-powered scientif...

02/02/2018  Preserved Structure Across Vector Space Representations
Certain concepts, words, and images are intuitively more similar than ot...
