How Context Affects Language Models' Factual Predictions

05/10/2020
by   Fabio Petroni, et al.

When pre-trained on large unsupervised textual corpora, language models are able to store and retrieve factual knowledge to some extent, making it possible to use them directly for zero-shot cloze-style question answering. However, storing factual knowledge in a fixed number of weights of a language model clearly has limitations. Previous approaches have successfully provided access to information outside the model weights using supervised architectures that combine an information retrieval system with a machine reading component. In this paper, we go a step further and integrate information from a retrieval system with a pre-trained language model in a purely unsupervised way. We report that augmenting pre-trained language models in this way dramatically improves performance and that the resulting system, despite being unsupervised, is competitive with a supervised machine reading baseline. Furthermore, processing query and context with different segment tokens allows BERT to utilize its Next Sentence Prediction pre-trained classifier to determine whether the context is relevant or not, substantially improving BERT's zero-shot cloze-style question-answering performance and making its predictions robust to noisy contexts.
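The following is a minimal sketch (not the authors' released code) of the idea described in the abstract: a cloze-style factual query is posed to a masked language model, a retrieved passage is prepended as a separate segment, and BERT's Next Sentence Prediction head is used to judge whether that context is relevant. The model name, example query, and context passage are illustrative assumptions.

```python
# Sketch: context-augmented cloze querying with BERT, using the NSP head
# to score context relevance. Query/context strings are hypothetical.
import torch
from transformers import BertTokenizer, BertForPreTraining

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForPreTraining.from_pretrained("bert-base-cased")
model.eval()

context = "Dante was born in Florence in the 13th century."  # retrieved passage (assumed)
query = "Dante was born in [MASK]."                          # cloze-style factual query

# Encode context and query as two segments (different segment tokens),
# so the NSP classifier can judge whether the context fits the query.
inputs = tokenizer(context, query, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# NSP head: index 0 = "is next" (context judged relevant), index 1 = "not next".
nsp_probs = torch.softmax(outputs.seq_relationship_logits, dim=-1)
context_is_relevant = nsp_probs[0, 0] > 0.5

# Fill the [MASK] slot with the top prediction from the masked-LM head.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
top_token = outputs.prediction_logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(top_token), "| context relevant:", bool(context_is_relevant))
```

In this sketch, a context flagged as irrelevant by the NSP head could simply be dropped and the query answered without it, which is how a retrieval-augmented system could stay robust to noisy contexts.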


