Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval

01/28/2022
by Uri Alon, et al.

Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time. While effective, a major bottleneck of using these models in practice is the computationally costly datastore search, which can be performed as frequently as every time step. In this paper, we present RetoMaton – retrieval automaton – which approximates the datastore search, based on (1) clustering of entries into "states", and (2) state transitions from previous entries. This effectively results in a weighted finite automaton built on top of the datastore, instead of representing the datastore as a flat list. The creation of the automaton is unsupervised, and a RetoMaton can be constructed from any text collection: either the original training corpus or from another domain. Traversing this automaton at inference time, in parallel to the LM inference, reduces its perplexity, or alternatively saves up to 83% of the nearest neighbor searches over kNN-LM (Khandelwal et al., 2020), without hurting perplexity.
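The pointer-following idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: an exact match on the previous two tokens stands in for the nearest-neighbor search over LM representations, the clustering of entries into states is omitted, and all names (`build_datastore`, `search`, `retrieve_along`, `interpolate`) and the interpolation weight `lam` are assumptions made for illustration. It only shows how following "next entry" pointers can replace a fresh search at most time steps.

```python
from collections import Counter

def build_datastore(tokens, order=2):
    """Entry i stores (key = the `order` preceding tokens, value = next token).
    Entries are created in corpus order, so entry i + 1 acts as entry i's pointer."""
    return [(tuple(tokens[t - order:t]), tokens[t]) for t in range(order, len(tokens))]

def search(entries, context):
    """Toy stand-in for the costly datastore search: exact key match."""
    return [i for i, (key, _) in enumerate(entries) if key == context]

def retrieve_along(entries, text, order=2, min_active=1):
    """Walk over `text`, searching only when too few pointers remain.
    Returns (retrieved value counts per step, number of full searches)."""
    active, searches, retrieved = [], 0, []
    for t in range(order, len(text)):
        if len(active) < min_active:
            active = search(entries, tuple(text[t - order:t]))
            searches += 1
        retrieved.append(Counter(entries[i][1] for i in active))
        # Keep entries whose stored value matches the observed token,
        # then follow their pointers to the next entry.
        active = [i + 1 for i in active
                  if entries[i][1] == text[t] and i + 1 < len(entries)]
    return retrieved, searches

def interpolate(p_lm, counts, lam=0.25):
    """Mix the base LM distribution with the retrieved-token distribution."""
    total = sum(counts.values())
    if total == 0:
        return dict(p_lm)  # nothing retrieved: fall back to the base LM
    vocab = set(p_lm) | set(counts)
    return {w: (1 - lam) * p_lm.get(w, 0.0) + lam * counts[w] / total for w in vocab}

corpus = "the cat sat on the mat . the cat sat on the rug .".split()
text = "the cat sat on the mat".split()
entries = build_datastore(corpus)
retrieved, searches = retrieve_along(entries, text)
print(f"{searches} full search(es) over {len(retrieved)} steps")
print(interpolate({"mat": 0.5, "rug": 0.3, "floor": 0.2}, retrieved[-1]))
```

In this toy run a single search at the first step yields pointers that stay valid for the remaining steps. Raising `min_active` would trigger searches more often, which loosely mirrors a trade-off between retrieval quality and saved searches.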


