Sorting through the noise: Testing robustness of information processing in pre-trained language models

09/25/2021
by Lalchand Pandia, et al.

Pre-trained LMs have shown impressive performance on downstream NLP tasks, but we have yet to establish a clear understanding of their sophistication when it comes to processing, retaining, and applying information presented in their input. In this paper we tackle a component of this question by examining the robustness of models' ability to deploy relevant context information in the face of distracting content. We present models with cloze tasks requiring use of critical context information, and introduce distracting content to test how robustly the models retain and use that critical information for prediction. We also systematically manipulate the nature of these distractors, to shed light on the dynamics of models' use of contextual cues. We find that although in simple contexts models appear to make predictions by understanding and applying relevant facts from prior context, the presence of distracting but irrelevant content has a clear impact, confusing model predictions. Models appear especially susceptible to factors of semantic similarity and word position. The findings are consistent with the conclusion that LM predictions are driven in large part by superficial contextual cues, rather than by robust representations of context meaning.
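As a concrete illustration of the kind of probing setup the abstract describes, the sketch below (not the authors' released code or stimuli) queries an off-the-shelf masked LM with a cloze prompt twice: once with only the critical context, and once with a semantically related but irrelevant distractor sentence inserted, then compares the probability assigned to the target word. The model choice (bert-base-uncased), the example sentences, and the use of the HuggingFace fill-mask pipeline are assumptions made for illustration.

```python
# Minimal sketch of a cloze probe with and without a distractor sentence.
# Assumes the HuggingFace `transformers` library; prompts are illustrative only.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def target_score(context: str, target: str) -> float:
    """Return the model's probability for `target` at the [MASK] position."""
    for candidate in fill(context, targets=[target]):
        if candidate["token_str"].strip() == target:
            return candidate["score"]
    return 0.0

# Critical fact followed directly by the cloze query.
clean = ("John put the keys in his pocket. "
         "John reached into his pocket and took out the [MASK].")

# Same critical fact, with an irrelevant but semantically similar distractor inserted.
distracted = ("John put the keys in his pocket. Mary left her wallet on the table. "
              "John reached into his pocket and took out the [MASK].")

print("clean:     ", target_score(clean, "keys"))
print("distracted:", target_score(distracted, "keys"))
```

A drop in the target's probability under the distracted prompt, relative to the clean one, is the sort of signal the paper uses to argue that predictions lean on superficial contextual cues.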


research
01/11/2023

Counteracts: Testing Stereotypical Representation in Pre-trained Language Models

Language models have demonstrated strong performance on various natural ...
research
09/09/2019

Reverse Transfer Learning: Can Word Embeddings Trained for Different NLP Tasks Improve Neural Language Models?

Natural language processing (NLP) tasks tend to suffer from a paucity of...
research
03/20/2023

Context-faithful Prompting for Large Language Models

Large language models (LLMs) encode parametric knowledge about world fac...
research
09/27/2021

Pragmatic competence of pre-trained language models through the lens of discourse connectives

As pre-trained language models (LMs) continue to dominate NLP, it is inc...
research
11/09/2022

Large Language Models with Controllable Working Memory

Large language models (LLMs) have led to a series of breakthroughs in na...
research
10/05/2022

"No, they did not": Dialogue response dynamics in pre-trained language models

A critical component of competence in language is being able to identify...
research
10/18/2022

On the Information Content of Predictions in Word Analogy Tests

An approach is proposed to quantify, in bits of information, the actual ...
