REALM: Retrieval-Augmented Language Model Pre-Training

02/10/2020
by Kelvin Guu, et al.

Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring ever-larger networks to cover more facts. To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents. We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA). We compare against state-of-the-art models for both explicit and implicit knowledge storage on three popular Open-QA benchmarks, and find that we outperform all previous methods by a significant margin (4-16% absolute accuracy), while also providing qualitative benefits such as interpretability and modularity.
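
To make the marginalization concrete: REALM models p(y|x) = Σ_z p(y|z,x) p(z|x), where z ranges over retrieved documents, p(z|x) is a softmax over dense inner-product retrieval scores, and p(y|z,x) is the masked-language-model prediction conditioned on the retrieved document. Because the retrieval probability sits inside the sum, the MLM loss backpropagates into the retriever's embeddings. The sketch below illustrates this objective in PyTorch; all names (realm_mlm_loss, query_emb, doc_embs, mlm_logits_per_doc) are hypothetical, and it marginalizes over a precomputed top-k slice rather than the maximum-inner-product-search index the paper uses to scale to millions of documents.

```python
import torch
import torch.nn.functional as F

def realm_mlm_loss(query_emb, doc_embs, mlm_logits_per_doc, target_ids, k=5):
    """Negative log-marginal MLM likelihood, marginalized over retrieved docs.

    query_emb:          (dim,)            dense embedding of the masked input x
    doc_embs:           (num_docs, dim)   dense embeddings of all candidate docs
    mlm_logits_per_doc: (num_docs, M, V)  MLM logits at the M masked positions,
                                          conditioned on each document (in practice
                                          computed only for the top-k docs)
    target_ids:         (M,) long         gold token ids at the masked positions
    """
    # Retrieval scores: inner product between query and document embeddings.
    scores = doc_embs @ query_emb                      # (num_docs,)
    topk_scores, topk_idx = scores.topk(k)
    # log p(z|x): softmax over the top-k candidates; gradients from the MLM
    # loss flow through these scores into the retriever.
    log_p_doc = F.log_softmax(topk_scores, dim=0)      # (k,)
    # log p(y|z,x): per-document log-probabilities of the masked targets.
    log_p_tok = F.log_softmax(mlm_logits_per_doc[topk_idx], dim=-1)  # (k, M, V)
    tgt = target_ids.view(1, -1, 1).expand(k, -1, -1)                # (k, M, 1)
    seq_logp = log_p_tok.gather(-1, tgt).squeeze(-1).sum(-1)         # (k,)
    # Marginalize: log p(y|x) = logsumexp_z [log p(z|x) + log p(y|z,x)].
    return -torch.logsumexp(log_p_doc + seq_logp, dim=0)

# Toy usage with random tensors, just to show the expected shapes.
num_docs, dim, M, V = 1000, 128, 3, 30522
loss = realm_mlm_loss(
    torch.randn(dim),
    torch.randn(num_docs, dim),
    torch.randn(num_docs, M, V),
    torch.randint(V, (M,)),
)
loss.backward()
```

Restricting the sum to the top-k documents is the same approximation the paper makes; the key design point is that selecting documents by inner product keeps the whole objective differentiable, so a document that helps predict the masked tokens receives a higher retrieval score on the next update.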

Related research

CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training (10/14/2021)
With the rise of large-scale pre-trained language models, open-domain qu...

Unsupervised Inflection Generation Using Neural Language Modeling (12/03/2019)
The use of Deep Neural Network architectures for Language Modeling has r...

Studying Strategically: Learning to Mask for Closed-book QA (12/31/2020)
Closed-book question-answering (QA) is a challenging task that requires ...

Large Language Models Struggle to Learn Long-Tail Knowledge (11/15/2022)
The internet contains a wealth of knowledge – from the birthdays of hist...

2x Faster Language Model Pre-training via Masked Structural Growth (05/04/2023)
Acceleration of large language model pre-training is a critical issue in...

Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts (10/30/2022)
Explicit decomposition modeling, which involves breaking down complex ta...

Discern and Answer: Mitigating the Impact of Misinformation in Retrieval-Augmented Models with Discriminators (05/02/2023)
Most existing retrieval-augmented language models (LMs) for question ans...
