Invariant Language Modeling

10/16/2021
by Maxime Peyrard, et al.

Modern pretrained language models are critical components of NLP pipelines. Yet, they suffer from spurious correlations, poor out-of-domain generalization, and biases. Inspired by recent progress in causal machine learning, in particular the invariant risk minimization (IRM) paradigm, we propose invariant language modeling, a framework for learning invariant representations that generalize better across multiple environments. In particular, we adapt a game-theoretic implementation of IRM (IRM-games) to language models, where the invariance emerges from a specific training schedule in which all the environments compete to optimize their own environment-specific loss by updating subsets of the model in a round-robin fashion. In a series of controlled experiments, we demonstrate the ability of our method to (i) remove structured noise, (ii) ignore specific spurious correlations without affecting global performance, and (iii) achieve better out-of-domain generalization. These benefits come with a negligible computational overhead compared to standard training, do not require changing the local loss, and can be applied to any language model architecture. We believe this framework holds promise for mitigating spurious correlations and biases in language models.
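To illustrate the round-robin training schedule described above, here is a minimal sketch in PyTorch. It is not the authors' implementation: the model, the names (InvariantLM, round_robin_step), and the choice of a shared LSTM encoder with one output head per environment are illustrative assumptions, and the shared encoder is kept fixed here for brevity. The point it shows is that each environment optimizes its own loss while updating only its own subset of the parameters, and the ensemble average of the heads is used for prediction.

import torch
import torch.nn as nn

class InvariantLM(nn.Module):
    """Toy language model: a shared encoder plus one output head per environment."""
    def __init__(self, vocab_size=1000, hidden=64, num_envs=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)           # shared backbone
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)   # shared backbone
        # One environment-specific head; the ensemble averages their logits.
        self.heads = nn.ModuleList([nn.Linear(hidden, vocab_size) for _ in range(num_envs)])

    def forward(self, tokens):
        states, _ = self.lstm(self.embed(tokens))
        logits = torch.stack([head(states) for head in self.heads])
        return logits.mean(dim=0)  # ensemble prediction shared by all environments

def round_robin_step(model, head_optimizers, env_batches, loss_fn):
    """One round: each environment lowers its own loss, updating only its own head."""
    for env_id, (tokens, targets) in enumerate(env_batches):
        head_optimizers[env_id].zero_grad()
        logits = model(tokens)  # prediction of the full ensemble
        loss = loss_fn(logits.flatten(0, 1), targets.flatten())
        loss.backward()
        head_optimizers[env_id].step()  # touches only the parameters of head env_id

# Example usage with two toy environments of random token sequences.
model = InvariantLM()
head_optimizers = [torch.optim.Adam(h.parameters(), lr=1e-3) for h in model.heads]
loss_fn = nn.CrossEntropyLoss()
env_batches = [(torch.randint(0, 1000, (4, 16)), torch.randint(0, 1000, (4, 16)))
               for _ in range(2)]
round_robin_step(model, head_optimizers, env_batches, loss_fn)

Because every head sees the gradient of the same ensemble prediction but only its own environment's data, a head that exploits an environment-specific shortcut is pulled back by the other environments' updates, which is the intuition behind the emergent invariance.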
