Uninformative Input Features and Counterfactual Invariance: Two Perspectives on Spurious Correlations in Natural Language

04/09/2022
by Jacob Eisenstein, et al.

Spurious correlations are a threat to the trustworthiness of natural language processing systems, motivating research into methods for identifying and eliminating them. Gardner et al. (2021) argue that the compositional nature of language implies that all correlations between labels and individual input features are spurious. This paper analyzes this proposal in the context of a toy example, demonstrating three distinct conditions that can give rise to feature-label correlations in a simple PCFG. Linking the toy example to a structured causal model shows that (1) feature-label correlations can arise even when the label is invariant to interventions on the feature, and (2) feature-label correlations may be absent even when the label is sensitive to interventions on the feature. Because input features will be individually correlated with labels in all but very rare circumstances, domain knowledge must be applied to identify spurious correlations that pose genuine robustness threats.
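The two claims about correlation and intervention can be illustrated with a minimal sketch. This is not the paper's PCFG or its causal model: the binary variables, the hidden variable `z`, the `flip` noise rate, and all function names below are assumptions chosen only for illustration. The first case uses a hidden common cause to produce a feature-label correlation although the label never reads the feature. The second uses an XOR label, which changes under every intervention on the feature yet has zero covariance with it.

```python
from itertools import product


def covariance(joint):
    """Covariance of (x, y) under an exact joint distribution {(x, y): prob}."""
    ex = sum(p * x for (x, _), p in joint.items())
    ey = sum(p * y for (_, y), p in joint.items())
    exy = sum(p * x * y for (x, y), p in joint.items())
    return exy - ex * ey


# Case 1 (assumed setup): a hidden variable z drives both feature and label.
def confounded_label(z, x):
    """The label copies z and ignores the feature x entirely."""
    return z


def confounded_joint(flip=0.1):
    """Joint over (x, y) where x is a noisy copy of z and y equals z."""
    joint = {}
    for z, noise in product((0, 1), repeat=2):
        prob = 0.5 * (flip if noise else 1.0 - flip)
        x = z ^ noise
        y = confounded_label(z, x)
        joint[(x, y)] = joint.get((x, y), 0.0) + prob
    return joint


# Case 2 (assumed setup): the label is the XOR of two independent fair bits.
def xor_label(x1, x2):
    """The label flips whenever x1 is flipped, for any value of x2."""
    return x1 ^ x2


def xor_joint():
    """Joint over (x1, y) with x1 and x2 independent and uniform."""
    joint = {}
    for x1, x2 in product((0, 1), repeat=2):
        y = xor_label(x1, x2)
        joint[(x1, y)] = joint.get((x1, y), 0.0) + 0.25
    return joint


if __name__ == "__main__":
    print("confounded covariance:", covariance(confounded_joint()))
    print("xor covariance:", covariance(xor_joint()))
```

In this sketch, setting the feature by hand in `confounded_label` never changes the label, even though the observational covariance is clearly positive. In `xor_label`, setting `x1` by hand always changes the label, even though the observational covariance is zero.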


