Fairness-Aware Data Valuation for Supervised Learning

03/29/2023
by José Pombal, et al.

Data valuation is an ML field that studies the value of training instances for a given predictive task. Although data bias is one of the main sources of downstream model unfairness, previous work in data valuation does not consider how training instances may influence both the performance and the fairness of ML models. We therefore propose Fairness-Aware Data valuatiOn (FADO), a data valuation framework that can be used to incorporate fairness concerns into a range of ML-related tasks (e.g., data pre-processing, exploratory data analysis, active learning). We propose an entropy-based data valuation metric suited to our two-pronged goal of maximizing both performance and fairness, and which is more computationally efficient than existing metrics. We then show how FADO can serve as the basis for unfairness-mitigation pre-processing techniques. Our methods achieve promising results (up to a 40 percentage point (p.p.) improvement in fairness at a loss of less than 1 p.p. in performance, compared to a baseline) and promote fairness in a data-centric way, where a deeper understanding of data quality takes center stage.
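The abstract does not define the entropy-based metric, so the following is only a rough sketch of what a score of this general kind could look like, not the FADO metric itself. It assumes two hypothetical inputs per training instance: `p_label`, a model's predicted probability of the positive label, and `p_group`, a model's predicted probability that the instance belongs to a given sensitive group. The mixing weight `alpha` and the function names are likewise illustrative assumptions.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a Bernoulli(p) distribution."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def instance_value(p_label: float, p_group: float, alpha: float = 0.5) -> float:
    """Hypothetical per-instance score (an assumption, not the paper's formula).

    Mixes the entropy of the label prediction with the entropy of the
    sensitive-attribute prediction; alpha weights the label term.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * binary_entropy(p_label) + (1.0 - alpha) * binary_entropy(p_group)


# An instance the label model is sure about, but whose group is hard to infer.
score = instance_value(p_label=1.0, p_group=0.5, alpha=0.25)
```

Under these assumptions, a score of this form needs only predicted probabilities, so it can be computed in one pass over the data without retraining a model once per instance or per subset.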


