Fair Data Integration

06/10/2020
by Sainyam Galhotra, et al.

The use of machine learning (ML) in high-stakes societal decisions has encouraged the consideration of fairness throughout the ML lifecycle. Although data integration is one of the primary steps in generating high-quality training data, most of the fairness literature ignores this stage. In this work, we consider fairness in the integration component of data management, aiming to identify features that improve prediction without adding any bias to the dataset. We work under the causal interventional fairness paradigm. Without requiring the underlying structural causal model a priori, we propose an approach that identifies a sub-collection of features that ensures the fairness of the dataset by performing conditional independence tests between different subsets of features. We use group testing to reduce the number of tests the approach requires. We prove that the proposed algorithm correctly identifies features that ensure interventional fairness, and we show that a sub-linear number of conditional independence tests suffices to identify these variables. A detailed empirical evaluation on real-world datasets demonstrates the efficacy and efficiency of our technique.
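
The group testing idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it is generic binary-splitting group testing, and it assumes a hypothetical oracle `group_is_clean` that stands in for a conditional independence test on a whole subset of features and returns False whenever at least one feature in the subset is problematic. The names `find_flagged`, `group_is_clean`, `toy_oracle`, and the feature labels are all invented for illustration.

```python
from typing import Callable, List, Sequence


def find_flagged(
    features: Sequence[str],
    group_is_clean: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Return the features the oracle flags, testing groups instead of
    single features (binary-splitting group testing).

    Assumption: group_is_clean(group) is False if and only if at least
    one feature in the group is problematic.
    """
    flagged: List[str] = []
    stack = [list(features)]
    while stack:
        group = stack.pop()
        # One oracle call can clear a whole group at once.
        if not group or group_is_clean(group):
            continue
        if len(group) == 1:
            flagged.append(group[0])
            continue
        # The group contains at least one problematic feature: split it.
        mid = len(group) // 2
        stack.append(group[mid:])
        stack.append(group[:mid])  # popped first, so order is preserved
    return flagged


# Toy usage: a set lookup stands in for the (hypothetical) statistical test.
biased = {"f17", "f42"}
stats = {"calls": 0}


def toy_oracle(group: Sequence[str]) -> bool:
    stats["calls"] += 1
    return not any(f in biased for f in group)


features = [f"f{i}" for i in range(64)]
result = find_flagged(features, toy_oracle)
print(result, "oracle calls:", stats["calls"])
```

When only a few features are problematic, most groups are cleared by a single oracle call, so the number of calls stays well below the number of features. Whether a real conditional independence test behaves like this idealized oracle depends on assumptions that the abstract does not spell out.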

research
08/20/2019

Data Management for Causal Algorithmic Fairness

Fairness is increasingly recognized as a critical component of machine l...
research
12/21/2020

The Importance of Modeling Data Missingness in Algorithmic Fairness: A Causal Perspective

Training datasets for machine learning often have some form of missingne...
research
09/20/2021

Algorithmic Fairness Verification with Graphical Models

In recent years, machine learning (ML) algorithms have been deployed in ...
research
02/16/2023

Individual Fairness Guarantee in Learning with Censorship

Algorithmic fairness, studying how to make machine learning (ML) algorit...
research
03/30/2023

Non-Invasive Fairness in Learning through the Lens of Data Drift

Machine Learning (ML) models are widely employed to drive many modern da...
research
10/24/2019

Fairness Sample Complexity and the Case for Human Intervention

With the aim of building machine learning systems that incorporate stand...
research
05/06/2020

Ensuring Fairness under Prior Probability Shifts

In this paper, we study the problem of fair classification in the presen...
