
"You are grounded!": Latent Name Artifacts in Pre-trained Language Models

by Vered Shwartz et al.
Allen Institute for Artificial Intelligence

Pre-trained language models (LMs) may perpetuate biases originating in their training corpora to downstream models. We focus on artifacts associated with the representation of given names (e.g., Donald), which, depending on the corpus, may be associated with specific entities, as indicated by next-token prediction (e.g., Trump). While helpful in some contexts, such grounding also occurs in under-specified or inappropriate contexts. For example, endings generated for "Donald is a" differ substantially from those for other names, and often carry more negative sentiment than average. We demonstrate the potential effect on downstream tasks with reading comprehension probes in which perturbing a name changes the model's answer. As a silver lining, our experiments suggest that additional pre-training on different corpora may mitigate this bias.
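The reading comprehension probe described above can be sketched as a simple consistency check: swap a given name throughout a passage/question pair and see whether the model's answer changes accordingly. A minimal sketch, with a toy stand-in for the reading-comprehension model (the real probe in the paper used pre-trained LMs; the function names and the stand-in model here are illustrative assumptions, not the authors' code):

```python
import re

def perturb_name(text, old, new):
    """Replace whole-word occurrences of one given name with another."""
    return re.sub(rf"\b{re.escape(old)}\b", new, text)

def answer_consistency(model, passage, question, name_pairs):
    """Fraction of name swaps under which the model's answer is unchanged
    (up to the same swap applied to the answer itself). A name-robust
    model scores 1.0; sensitivity to the name lowers the score."""
    base = model(passage, question)
    consistent = 0
    for old, new in name_pairs:
        perturbed_answer = model(perturb_name(passage, old, new),
                                 perturb_name(question, old, new))
        if perturbed_answer == perturb_name(base, old, new):
            consistent += 1
    return consistent / len(name_pairs)

# Toy stand-in "model": answers with the first capitalized token
# of the passage. It ignores name identity, so it is name-robust.
def toy_model(passage, question):
    for tok in passage.split():
        if tok[0].isupper():
            return tok.strip(".,")
    return ""

passage = "Donald met Sarah at the office. Donald gave her the report."
question = "Who gave the report?"
score = answer_consistency(toy_model, passage, question,
                           [("Donald", "Oliver"), ("Donald", "Amanda")])
print(score)  # → 1.0 for this name-robust toy model
```

An actual pre-trained model with latent name artifacts would score below 1.0 on such probes, which is the effect the paper measures.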

