Mitigating Bias in Set Selection with Noisy Protected Attributes

by Anay Mehrotra, et al.

Subset selection algorithms are ubiquitous in AI-driven applications, including online recruiting portals and image search engines, so it is imperative that these tools are not discriminatory on the basis of protected attributes such as gender or race. Currently, fair subset selection algorithms assume that the protected attributes are known as part of the dataset. In practice, however, attributes may be noisy due to errors during data collection, or because they are imputed (as is often the case in real-world settings). While a wide body of work addresses the effect of noise on the performance of machine learning algorithms, its effect on fairness remains largely unexamined. We find that, in the presence of noisy protected attributes, attempting to increase fairness without accounting for the noise can in fact decrease the fairness of the result! Toward addressing this, we consider an existing noise model in which there is probabilistic information about the protected attributes (e.g., [19, 32, 56, 44]), and ask whether fair selection is possible under noisy conditions. We formulate a "denoised" selection problem that works for a large class of fairness metrics: given a desired fairness goal, the solution to the denoised problem violates the goal by at most a small multiplicative amount with high probability. Although the denoised problem turns out to be NP-hard, we give a linear-programming-based approximation algorithm for it. We empirically evaluate our approach on both synthetic and real-world datasets. Our results show that this approach can produce subsets that significantly improve fairness metrics despite the presence of noisy protected attributes and, compared to prior noise-oblivious approaches, achieves better Pareto trade-offs between utility and fairness.
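To make the setting concrete, the following is a minimal, hypothetical sketch (not the paper's exact algorithm) of LP-based selection under probabilistic protected attributes: each item carries a utility and a probability `p[i]` of belonging to the protected group, and we solve an LP relaxation whose fairness constraint is stated on the *expected* group count, then round. The function name, the single-group constraint, and the greedy rounding are illustrative assumptions.

```python
# Hypothetical sketch: pick k of n items maximizing utility subject to an
# expected-fairness constraint, using only probabilistic attribute knowledge.
import numpy as np
from scipy.optimize import linprog

def denoised_select(u, p, k, alpha):
    """LP relaxation: max u.x  s.t.  sum(x) = k,  sum(p*x) >= alpha*k,  0 <= x <= 1.

    u[i]  : utility of item i
    p[i]  : probability that item i has the protected attribute
    alpha : desired minimum expected fraction of protected items in the subset
    """
    n = len(u)
    res = linprog(
        c=-np.asarray(u, dtype=float),   # linprog minimizes, so negate utilities
        A_ub=[-np.asarray(p, dtype=float)],  # -sum(p*x) <= -alpha*k
        b_ub=[-alpha * k],
        A_eq=[np.ones(n)],
        b_eq=[float(k)],
        bounds=[(0.0, 1.0)] * n,
    )
    # Round the (mostly integral) LP solution: keep the k largest fractions.
    order = np.argsort(-res.x)
    chosen = np.zeros(n, dtype=bool)
    chosen[order[:k]] = True
    return chosen

# Usage: item 0 has high utility but low protected probability; the
# constraint pushes the selection toward item 2 instead of item 1.
u = [5.0, 4.0, 3.0, 2.0]
p = [0.1, 0.2, 0.9, 0.8]
chosen = denoised_select(u, p, k=2, alpha=0.4)
```

Note that rounding a fractional LP solution can violate the constraint slightly; the paper's guarantee is precisely of this "at most a small multiplicative violation, with high probability" form.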



