Rethinking Counterfactual Data Augmentation Under Confounding

05/29/2023
by Abbavaram Gowtham Reddy, et al.

Counterfactual data augmentation has recently emerged as a method to mitigate confounding biases in the training data for a machine learning model. These biases, such as spurious correlations, arise due to various observed and unobserved confounding variables in the data generation process. In this paper, we formally analyze how confounding biases impact downstream classifiers and present a causal viewpoint on solutions based on counterfactual data augmentation. We explore how removing confounding biases serves as a means to learn invariant features, ultimately aiding generalization beyond the observed data distribution. Additionally, we present a straightforward yet powerful algorithm for generating counterfactual images, which effectively mitigates the influence of confounding effects on downstream classifiers. Through experiments on MNIST variants and the CelebA dataset, we demonstrate the effectiveness and practicality of our approach.
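
To make the idea of a confounding-induced spurious correlation concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm, which generates counterfactual images; it only illustrates the general principle on two binary variables. The function names, the binary "color" attribute, the hidden confounder `u`, and the `strength` parameter are all hypothetical, and the sketch assumes that the spurious attribute has no causal effect on the label, so flipping it leaves the label unchanged.

```python
import numpy as np


def make_confounded_data(n, strength, seed=0):
    """Toy generator (hypothetical): a hidden confounder u drives both
    the label y and a spurious binary attribute 'color'."""
    rng = np.random.default_rng(seed)
    u = rng.integers(0, 2, size=n)  # unobserved confounder
    # Each variable copies u with probability `strength`, else flips it.
    y = np.where(rng.random(n) < strength, u, 1 - u)
    color = np.where(rng.random(n) < strength, u, 1 - u)
    return y, color


def augment_with_counterfactuals(y, color):
    """For every sample, add a copy whose color is flipped while the label
    is kept. Assumes color does not cause the label."""
    y_aug = np.concatenate([y, y])
    color_aug = np.concatenate([color, 1 - color])
    return y_aug, color_aug


def correlation(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


# Small demonstration on synthetic data.
y, color = make_confounded_data(n=10_000, strength=0.95)
y_aug, color_aug = augment_with_counterfactuals(y, color)
print("label/color correlation before augmentation:", correlation(y, color))
print("label/color correlation after augmentation: ", correlation(y_aug, color_aug))
```

In this toy setting every label value appears equally often with both colors after augmentation, so a downstream classifier can no longer use color as a shortcut for the label. With images, the hard part is producing the flipped-attribute copy, which is where a counterfactual generation method is needed.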


