Nuisances via Negativa: Adjusting for Spurious Correlations via Data Augmentation

10/04/2022
by Aahlad Puli, et al.

Some features are related to the label in the same way across different settings for a task; these are semantic features, or semantics. Features whose relationship to the label varies across settings are nuisances. For example, in detecting cows in natural images, the shape of the head is a semantic; the background is a nuisance, because images of cows often have grass backgrounds, but only in certain settings. Because nuisance-label relationships are unstable across settings, models that exploit them suffer performance degradation when these relationships change. Direct knowledge of a nuisance helps build models that are robust to such changes, but it requires extra annotations beyond the label and the covariates. In this paper, we develop an alternative way to produce robust models through data augmentation. These augmentations corrupt semantic information to produce models that identify, and adjust for, the cases where nuisances drive predictions. We study how semantic corruptions power different robust-modeling methods on multiple out-of-distribution (OOD) tasks, such as classifying waterbirds, natural language inference, and detecting cardiomegaly in chest X-rays.
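To make the idea of a semantic corruption concrete, here is a minimal sketch of one possible corruption for images: shuffling non-overlapping patches. The abstract does not specify which corruptions are used, so the choice of patch shuffling, the function name `patch_randomize`, and its parameters are assumptions made for illustration only. The intuition is that shuffling patches destroys global shape (such as the outline of a cow's head) while leaving local texture and color statistics (such as a grassy background) largely intact, so a model that still predicts the label from the corrupted input is likely relying on nuisances.

```python
import random


def patch_randomize(image, patch, seed=0):
    """Shuffle non-overlapping patch x patch blocks of a 2D grid.

    `image` is a list of equal-length rows. This is a hypothetical
    illustration of a semantic corruption, not the paper's code.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    rows, cols = height // patch, width // patch

    # Cut the grid into blocks, listed in row-major order.
    blocks = [
        [row[c * patch:(c + 1) * patch]
         for row in image[r * patch:(r + 1) * patch]]
        for r in range(rows)
        for c in range(cols)
    ]

    # Permute the blocks; a fixed seed keeps the example reproducible.
    random.Random(seed).shuffle(blocks)

    # Stitch the shuffled blocks back into a grid of the original size.
    out = []
    for r in range(rows):
        for i in range(patch):
            line = []
            for c in range(cols):
                line.extend(blocks[r * cols + c][i])
            out.append(line)
    return out


# A toy 4x4 "image" whose pixel values are 0..15.
image = [[4 * r + c for c in range(4)] for r in range(4)]
corrupted = patch_randomize(image, patch=2)
```

The corrupted grid contains exactly the same pixel values as the original, only rearranged in 2x2 blocks. A model trained to predict the label from such inputs could then serve as a stand-in for the nuisance in a robust-modeling method, which is the role the abstract assigns to semantic corruptions.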

research
06/29/2021

Predictive Modeling in the Presence of Nuisance-Induced Spurious Correlations

Deep predictive models often make use of spurious correlations between t...
research
08/13/2021

FlipDA: Effective and Robust Data Augmentation for Few-Shot Learning

Most previous methods for text data augmentation are limited to simple t...
research
12/16/2022

Better May Not Be Fairer: Can Data Augmentation Mitigate Subgroup Degradation?

It is no secret that deep learning models exhibit undesirable behaviors ...
research
07/11/2023

RoPDA: Robust Prompt-based Data Augmentation for Low-Resource Named Entity Recognition

Data augmentation has been widely used in low-resource NER tasks to tack...
research
08/12/2023

Semantic Equivariant Mixup

Mixup is a well-established data augmentation technique, which can exten...
research
10/14/2021

Nuisance-Label Supervision: Robustness Improvement by Free Labels

In this paper, we present a Nuisance-label Supervision (NLS) module, whi...
