1 Introduction
As machine learning (ML) algorithms become more mainstream and embedded into our society, evidence has surfaced questioning whether they produce high-quality predictions for all members of diverse populations. The work on fairness in machine learning aims to understand the extent to which existing ML methods produce equally high-quality predictions for different individuals, and what new methods can remove the discrepancies therein (Barocas et al., 2018). The appropriate formalization of fair or high-quality predictions necessarily varies by domain, leading to a variety of definitions, largely falling into either the category of individual fairness (e.g., Dwork et al., 2012; Dwork and Ilvento, 2018) or group fairness (e.g., Kamishima et al., 2012; Hardt et al., 2016; Kleinberg et al., 2017; Pleiss et al., 2017; Zafar et al., 2017a, b). The former focuses on ensuring some property for every individual (and is usually agnostic to any group membership), while the latter asks that some statistic (e.g., accuracy or false positive rate) be similar for different groups. One key drawback of individual fairness is that it requires a similarity metric over the space of individuals. Group fairness, analogously, usually requires knowledge of group membership (such as gender or race), encoded by a protected attribute. While this is arguably a more reasonable requirement than a similarity metric, perfect knowledge of the protected attribute is still an unrealistic assumption in many practical applications. In this work, we ask to what extent one can guarantee group fairness criteria with only limited information about the protected attribute, generalizing the applicability of such methods. Our work explores the question of when imperfect or perturbed protected attribute information can be substituted for the true protected attribute in an existing algorithmic framework for fairness, with limited harm to the resulting model's fairness and accuracy.
In particular, one would never want to end up in a situation where the "fair" classifier obtained from perturbed protected attribute information has worse fairness guarantees, when tested on the true data distribution, than a classifier that ignores fairness altogether. In this work we explore the question posed above in the context of fair classification. Due to its simplicity and widespread applicability, we study the prominent post-processing method of Hardt et al. (2016) for ensuring equalized odds.
Another motivation for studying the robustness of an existing ML method for fairness comes from the fact that an adversary, with the knowledge that the method incorporates fairness, can easily corrupt the data. For example, the adversary could simply change the protected attribute labels of some fraction of the data points. Such corruptions might not be easily detectable via standard methods such as PCA. Hence, it is important to characterize the robustness of existing methods to such perturbations. We give surprisingly strong theoretical and empirical evidence that the equalized odds post-processing method of Hardt et al. performs well even when based on data with perturbed attribute information.
Our main theoretical result is that as long as the perturbation of the protected attribute in the training data is somewhat moderate (in the balanced case, where all classes and groups have the same size, the attribute of almost half of the data points can be incorrect), the equalized odds post-processing method of Hardt et al. based on the perturbed attribute produces a classifier that is fairer than the original classifier. At the same time, under some natural assumptions, its accuracy will never be worse than that of the classifier obtained from running equalized odds with the true protected attribute. While a similar phenomenon was empirically observed in the recent work of Gupta et al. (2018) (see Section 3 for related work), our work is the first to provide formal guarantees on the effectiveness of a prominent method for fairness in ML even under highly perturbed group information. We further validate our claims empirically under a wide range of settings on both synthetic and real data. We also compare to a group-agnostic approach recently proposed by Hashimoto et al. (2018) in a setting of repeated loss minimization.
2 Equalized odds with a perturbation of the protected attribute
We first review the equalized odds post-processing method of Hardt et al. (2016), assuming the true protected attribute for every data point is known. We then describe our noise model for perturbing the protected attribute and present our analysis of equalized odds under this noise model. Like Hardt et al., and as is common in the literature on fair machine learning (e.g., Pleiss et al., 2017; Hashimoto et al., 2018), we deal with the distributional setting and ignore the effects of estimating probabilities from finite training samples.
2.1 Equalized odds
Let $X$, $Y$, and $A$ be random variables with some joint probability distribution. The variable $X$ represents a data point ($X$ takes values in some suitable set $\mathcal{X}$), $Y \in \{0,1\}$ is the data point's ground-truth label, and $A \in \{0,1\}$ is its protected attribute. Like Hardt et al. (2016), we only consider the case of binary classification and a binary protected attribute. The goal in fair classification is to predict $Y$ from $X$, or from $(X, A)$, such that the prediction is "fair" with respect to the two groups defined by $A = 0$ and $A = 1$. Think of the standard example of hiring: in this case, $X$ would be a collection of features describing an applicant, such as their GPA, work experience, or language skills; $Y$ would encode whether the applicant is a good fit for the job or not; and $A$ could encode the applicant's gender or skin color. There are numerous formulations of what it means for a prediction to be fair in such an example (some of them contradicting each other; see Section 3), among which the notion of equalized odds as introduced by Hardt et al. is one of the most prominent. Denoting the (possibly randomized) prediction by $\hat{Y}$, the prediction satisfies the equalized odds criterion if^1

$$\Pr(\hat{Y} = 1 \mid Y = y, A = 0) = \Pr(\hat{Y} = 1 \mid Y = y, A = 1), \qquad y \in \{0,1\}. \quad (1)$$

^1 Throughout the paper we assume that $\Pr(Y = y, A = a) > 0$ for $y, a \in \{0,1\}$.
Equation (1) for $y = 1$ requires that $\hat{Y}$ has equal true positive rates for the two groups $A = 0$ and $A = 1$, and for $y = 0$ it requires $\hat{Y}$ to have equal false positive rates. In their paper, Hardt et al. propose a simple post-processing method to derive a predictor $\tilde{Y}$ that satisfies the equalized odds criterion from a predictor $\hat{Y}$ that does not, which works as follows: given a data point with $\hat{Y} = \hat{y}$ and $A = a$, the predictor predicts $\tilde{Y} = 1$ with probability $p_{\hat{y}a}$ (note that $\tilde{Y}$ depends on $X$ and $A$ only via $\hat{Y}$ and $A$). The four probabilities $p_{\hat{y}a}$, $\hat{y}, a \in \{0,1\}$, are computed in such a way that (i) $\tilde{Y}$ satisfies the equalized odds criterion, and (ii) the probability of $\tilde{Y}$ not equaling $Y$
is minimized. The former requirement and the latter objective naturally give rise to the following linear program:
$$\min_{p_{00},\, p_{01},\, p_{10},\, p_{11} \in [0,1]} \; \Pr(\tilde{Y} \neq Y) \quad \text{subject to} \quad \Pr(\tilde{Y} = 1 \mid Y = y, A = 0) = \Pr(\tilde{Y} = 1 \mid Y = y, A = 1), \; y \in \{0,1\}, \quad (2)$$

where $\Pr(\tilde{Y} = 1 \mid Y = y, A = a) = p_{1a} \Pr(\hat{Y} = 1 \mid Y = y, A = a) + p_{0a} \Pr(\hat{Y} = 0 \mid Y = y, A = a)$, so that both the objective and the constraints are linear in $p_{00}, p_{01}, p_{10}, p_{11}$.
Note that the linear program (2) does not have a unique solution: rewriting the objective function by exploiting the constraints, it is easy to see that an optimal solution can be perturbed in certain directions to yield further optimal solutions. Hence, the derived predictor $\tilde{Y}$ is not uniquely defined. We will refer to any predictor that is derived via an optimal solution to (2) as a derived equalized odds predictor. Throughout the paper, we will use the terms predictor and classifier interchangeably.
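To make the construction concrete, the following sketch sets up and solves a linear program of the form (2) numerically. The function name, the particular probabilities, and the use of SciPy's `linprog` as an off-the-shelf LP solver are our own illustration, not part of the original method.

```python
import numpy as np
from scipy.optimize import linprog

def derive_equalized_odds(q, r):
    """Solve an equalized odds post-processing LP in the spirit of (2).

    q[y][a] = Pr(Y=y, A=a)           -- base rates of the joint distribution
    r[y][a] = Pr(Yhat=1 | Y=y, A=a)  -- group-conditional rates of the given classifier

    Returns (p, err) with p[yhat][a] = Pr(Ytilde=1 | Yhat=yhat, A=a) and err = Pr(Ytilde != Y).
    """
    q, r = np.asarray(q, float), np.asarray(r, float)
    # Pr(Ytilde=1 | Y=y, A=a) = p[1][a]*r[y][a] + p[0][a]*(1 - r[y][a]) is linear
    # in the four probabilities, hence so are the objective and the constraints.
    # Variable order: x = [p_00, p_01, p_10, p_11] (first index = yhat, second = a).
    c = np.zeros(4)
    for a in (0, 1):
        c[a] = q[0, a] * (1 - r[0, a]) - q[1, a] * (1 - r[1, a])   # coefficient of p_0a
        c[2 + a] = q[0, a] * r[0, a] - q[1, a] * r[1, a]           # coefficient of p_1a
    # Equalized odds: the group-0 and group-1 rates must agree for y = 0 and y = 1.
    A_eq = [[1 - r[y, 0], -(1 - r[y, 1]), r[y, 0], -r[y, 1]] for y in (0, 1)]
    res = linprog(c, A_eq=A_eq, b_eq=[0.0, 0.0], bounds=[(0.0, 1.0)] * 4, method="highs")
    p = res.x.reshape(2, 2)               # p[yhat, a]
    return p, float(res.fun + q[1].sum())  # objective constant is Pr(Y=1)

# Hypothetical toy distribution: q sums to one, r differs across the two groups.
q = [[0.3, 0.2], [0.25, 0.25]]
r = [[0.2, 0.4], [0.8, 0.6]]
p, err = derive_equalized_odds(q, r)
```

The LP is always feasible, since any constant choice $p_{\hat{y}a} \equiv c$ satisfies the equalized odds constraints, so the solver is guaranteed to return a solution here.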
2.2 Noise model for perturbing the protected attribute
When deriving an equalized odds predictor $\tilde{Y}$ from a given classifier $\hat{Y}$, one has to estimate the probabilities $\Pr(\hat{Y} = 1 \mid Y = y, A = a)$ and $\Pr(Y = y, A = a)$ that appear in the linear program (2) from training data and then solve the resulting linear program for some optimal probabilities $p_{\hat{y}a}$. This is the training phase in the equalized odds procedure. In the test phase, when applying $\tilde{Y}$ in order to predict the ground-truth label of a test point, one just needs to toss a coin and output a label estimate of $1$ with probability $p_{\hat{y}a}$, or $0$ with probability $1 - p_{\hat{y}a}$, if $\hat{Y} = \hat{y}$ and $A = a$ for the test point.
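The test phase thus reduces to a single biased coin flip per test point. A minimal sketch (the function name and the example probability tables are ours):

```python
import random

def apply_derived_predictor(p, yhat, a, rng=random):
    """Test phase: given Yhat = yhat and the (true) attribute A = a,
    output 1 with probability p[yhat][a] and 0 otherwise."""
    return 1 if rng.random() < p[yhat][a] else 0

# With deterministic entries the coin flip degenerates to a fixed decision:
p_identity = [[0.0, 0.0], [1.0, 1.0]]   # p[yhat][a] = yhat reproduces Yhat itself
p_const = [[1.0, 1.0], [1.0, 1.0]]      # always predicts 1
```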
The noise model that we consider captures the scenario that the protected attribute in the training data has been corrupted. More specifically, we assume that in the training phase the probabilities $\Pr(\hat{Y} = 1 \mid Y = y, A = a)$ and $\Pr(Y = y, A = a)$ are replaced by $\Pr(\hat{Y} = 1 \mid Y = y, A_c = a)$ and $\Pr(Y = y, A_c = a)$, respectively. The random variable $A_c$ denotes the perturbed, or corrupted, protected attribute. We assume that given the ground-truth label $Y$ and the true protected attribute $A$, the prediction $\hat{Y}$ and the corrupted attribute $A_c$ are conditionally independent; that is, for all $\hat{y}, y, a, a' \in \{0,1\}$,

$$\Pr(\hat{Y} = \hat{y}, A_c = a' \mid Y = y, A = a) = \Pr(\hat{Y} = \hat{y} \mid Y = y, A = a) \cdot \Pr(A_c = a' \mid Y = y, A = a). \quad (3)$$
In contrast to the training phase, in the test phase we assume that we have access to the true protected attribute without any corruption. Hence, the probabilities $p_{\hat{y}a}$ of a derived equalized odds predictor for predicting $\tilde{Y} = 1$ depend upon the perturbed protected attribute, but the predictions themselves depend on the true protected attribute. Our noise model applies to scenarios in which a classifier is trained on unreliable data (e.g., crowdsourced data, data obtained from a third party, or when a classifier predicts the unavailable protected attribute) and then applied to test data for which the protected attribute can easily be verified (for example, when performing in-person hiring).
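Under this noise model, the quantities entering the training phase can be computed in closed form from the clean quantities and the flipping rates; the conditional-independence assumption (3) is exactly what makes the second formula below valid. The naming is our own sketch:

```python
import numpy as np

def corrupt_distribution(q, r, m):
    """Replace the true attribute A by the corrupted attribute A_c in the
    quantities used by the equalized odds training phase.

    q[y][a] = Pr(Y=y, A=a),  r[y][a] = Pr(Yhat=1 | Y=y, A=a),
    m[y][a] = Pr(A_c != a | Y=y, A=a)   (flipping rates of the noise model).

    Returns (q_c, r_c) with q_c[y][a] = Pr(Y=y, A_c=a) and
    r_c[y][a] = Pr(Yhat=1 | Y=y, A_c=a), using that Yhat and A_c are
    conditionally independent given (Y, A).
    """
    q, r, m = (np.asarray(x, float) for x in (q, r, m))
    q_c = np.empty((2, 2))
    num = np.empty((2, 2))
    for y in (0, 1):
        for a in (0, 1):
            # Mass arriving at A_c = a: kept from A = a, plus flipped from A = 1-a.
            q_c[y, a] = q[y, a] * (1 - m[y, a]) + q[y, 1 - a] * m[y, 1 - a]
            num[y, a] = (q[y, a] * (1 - m[y, a]) * r[y, a]
                         + q[y, 1 - a] * m[y, 1 - a] * r[y, 1 - a])
    return q_c, num / q_c
```

With zero flipping rates the corrupted quantities coincide with the clean ones, and with flipping rates of 1/2 the corrupted attribute is pure noise, so the group-conditional rates coincide across the two groups.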
2.3 Bias and error of a derived equalized odds predictor under perturbation
We define the bias for the class $y \in \{0,1\}$ of a predictor $\tilde{Y}$ as the absolute error in the equalized odds condition (1) for this class, that is

$$\text{Bias}_y(\tilde{Y}) = \bigl| \Pr(\tilde{Y} = 1 \mid Y = y, A = 0) - \Pr(\tilde{Y} = 1 \mid Y = y, A = 1) \bigr|. \quad (4)$$
Similarly, we define $\text{Bias}_y(\hat{Y})$ for the given predictor. The error of $\tilde{Y}$ or $\hat{Y}$ is simply $\text{Err}(\tilde{Y}) = \Pr(\tilde{Y} \neq Y)$ and $\text{Err}(\hat{Y}) = \Pr(\hat{Y} \neq Y)$, respectively. Note that $\text{Bias}_y(\tilde{Y})$ and $\text{Err}(\tilde{Y})$ refer to the bias and error of $\tilde{Y}$ in the test phase, and recall from Section 2.2 that in the test phase, according to our noise model, a derived equalized odds predictor always makes its prediction based on $\hat{Y}$ and the true protected attribute $A$, regardless of whether the attribute has been corrupted in the training phase.
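These definitions can be evaluated directly from the group-conditional rates; a small sketch with our own naming:

```python
import numpy as np

def rates(p, r):
    """t[y][a] = Pr(Ytilde=1 | Y=y, A=a) for the derived predictor given by
    p[yhat][a] = Pr(Ytilde=1 | Yhat=yhat, A=a) and r[y][a] = Pr(Yhat=1 | Y=y, A=a)."""
    p, r = np.asarray(p, float), np.asarray(r, float)
    return np.array([[p[1, a] * r[y, a] + p[0, a] * (1 - r[y, a]) for a in (0, 1)]
                     for y in (0, 1)])

def bias(t):
    """Bias_y = |Pr(Ytilde=1 | Y=y, A=0) - Pr(Ytilde=1 | Y=y, A=1)| for y = 0, 1."""
    return np.abs(t[:, 0] - t[:, 1])

def error(q, t):
    """Err = Pr(Ytilde != Y) under base rates q[y][a] = Pr(Y=y, A=a)."""
    q = np.asarray(q, float)
    return float(sum(q[0, a] * t[0, a] + q[1, a] * (1 - t[1, a]) for a in (0, 1)))
```

Plugging in the identity choice p[yhat][a] = yhat recovers the bias and error of the given classifier itself.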
Let now $\tilde{Y}_c$ be a derived equalized odds predictor for which the protected attribute in the equalized odds training phase has been corrupted, that is, it is based on the linear program (2) with $A$ replaced by $A_c$, and let $\tilde{Y}$ be a derived equalized odds predictor without any corruption. The main claim of our paper is that, under some mild assumptions, we have

$$\text{Bias}_y(\tilde{Y}_c) \leq \text{Bias}_y(\hat{Y}), \qquad y \in \{0,1\}, \quad (5)$$

where $\hat{Y}$ is the given predictor from which $\tilde{Y}_c$ and $\tilde{Y}$ are derived, and that

$$\text{Err}(\tilde{Y}_c) \leq \text{Err}(\tilde{Y}). \quad (6)$$
Our claim states a beneficial property of the equalized odds method: if one is willing to pay the price of fairness (i.e., a loss in prediction accuracy) and would run equalized odds when guaranteed to observe the true protected attribute, one should also run it when the protected attribute in the training phase might have been corrupted. By running the equalized odds method, one still reduces the bias of $\hat{Y}$ while increasing its error by no more than what one is willing to pay for fairness. In fact, our proofs and experiments show that the bias and the error of $\tilde{Y}_c$ interpolate nicely between those of $\hat{Y}$ and $\tilde{Y}$.
We begin by relating the bias of $\tilde{Y}_c$ to the bias of $\hat{Y}$ in the following theorem. Recall from Section 2.1 that a derived equalized odds predictor is not uniquely defined.
Theorem 1 (Bias of $\tilde{Y}_c$ vs. bias of $\hat{Y}$).
Assume that $\Pr(Y = y, A_c = a) > 0$ for $y, a \in \{0,1\}$, and let $m_{ya} = \Pr(A_c \neq A \mid Y = y, A = a)$. We distinguish two cases depending on whether

$$m_{y0} + m_{y1} \neq 1 \quad (7)$$

for $y \in \{0,1\}$, holds or not.

Assume that (7) holds for $y \in \{0,1\}$. Then, for $y \in \{0,1\}$, any derived equalized odds predictor $\tilde{Y}_c$ satisfies

$$\text{Bias}_y(\tilde{Y}_c) \leq f(m_{y0}, m_{y1}) \cdot \text{Bias}_y(\hat{Y}), \quad (8)$$

where $f$ is some differentiable function that is strictly increasing both in $m_{y0}$ and in $m_{y1}$, with $f(0, 0) = 0$ and $f(m_{y0}, m_{y1}) < 1$ for all $(m_{y0}, m_{y1})$ with $m_{y0} + m_{y1} < 1$.

In the degenerate case, with (7) not being true for some $y \in \{0,1\}$, one derived equalized odds predictor is the constant predictor $\tilde{Y}_c \equiv 0$ or $\tilde{Y}_c \equiv 1$, with $\text{Bias}_y(\tilde{Y}_c) = 0$ for $y \in \{0,1\}$.
The proof of Theorem 1 can be found in Section A.2 in the appendix. Note that in the non-degenerate case we have $\text{Bias}_y(\tilde{Y}_c) < \text{Bias}_y(\hat{Y})$ whenever the corruption of the protected attribute in the class $y$ is moderate in the sense that $m_{y0} + m_{y1} < 1$. If $m_{y0} = m_{y1}$, this condition is equivalent to $m_{y0} < 1/2$.
Next, we analyze the error of $\tilde{Y}_c$ and relate it to the error of $\tilde{Y}$. We will assume that the given predictor $\hat{Y}$ is correlated with the ground-truth label $Y$ in the sense that

$$\Pr(\hat{Y} = 1 \mid Y = 1, A = a) > \Pr(\hat{Y} = 1 \mid Y = 0, A = a), \qquad a \in \{0,1\}. \quad (9)$$
In our experiments in Section 4.1 we will see that assumption (9) is necessary for our claim (6) to hold. For our theoretical analysis we also make two simplifying assumptions. First, we assume a balanced case in which $\Pr(Y = y, A = a) = 1/4$, $y, a \in \{0,1\}$. Second, we assume that $\Pr(A_c \neq A \mid Y = y, A = a)$ does not depend on the values of $y$ and $a$ (to give an example, this is the case if every protected attribute is flipped independently with the same probability). However, our experiments in Section 4.1 show that our claim (6) also holds in the unbalanced case and when $\Pr(A_c \neq A \mid Y = y, A = a)$ does depend on the values of $y$ and $a$. Note that in the balanced case the assumption (9) is equivalent to $\Pr(\hat{Y} = Y \mid A = a) > 1/2$, $a \in \{0,1\}$, that is, to $\hat{Y}$ being a weak learner for both of the groups $A = 0$ and $A = 1$. We have the following theorem:
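To spell out the equivalence claimed for the balanced case, in which $\Pr(Y = 1 \mid A = a) = 1/2$, a one-line computation suffices:

```latex
% Balanced case: Pr(Y = 1 | A = a) = 1/2 for a in {0, 1}, hence
\Pr(\hat{Y} = Y \mid A = a)
  = \tfrac{1}{2}\,\Pr(\hat{Y} = 1 \mid Y = 1, A = a)
  + \tfrac{1}{2}\bigl(1 - \Pr(\hat{Y} = 1 \mid Y = 0, A = a)\bigr)
  > \tfrac{1}{2}
\;\iff\;
\Pr(\hat{Y} = 1 \mid Y = 1, A = a) > \Pr(\hat{Y} = 1 \mid Y = 0, A = a),
```

and the right-hand side is exactly assumption (9) for group $a$.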
Theorem 2 (Error of $\tilde{Y}_c$ vs. error of $\tilde{Y}$).
Assume that $\Pr(Y = y, A = a) = 1/4$, $y, a \in \{0,1\}$, and that the given classifier $\hat{Y}$ is a weak learner for both of the groups $A = 0$ and $A = 1$. Furthermore, assume that $\Pr(A_c \neq A \mid Y = y, A = a) < 1/2$ and that this probability does not depend on $y$ and $a$. Then we have for any derived equalized odds predictors $\tilde{Y}_c$ and $\tilde{Y}$ that

$$\text{Err}(\tilde{Y}_c) \leq \text{Err}(\tilde{Y}),$$

where the equality holds if and only if the given classifier $\hat{Y}$ is unbiased, that is $\text{Bias}_y(\hat{Y}) = 0$ for $y \in \{0,1\}$.
3 Related work
By now, there is a huge body of work on fairness in ML, mainly in supervised learning
(e.g., Kamishima et al., 2012; Kamiran and Calders, 2012; Zemel et al., 2013; Feldman et al., 2015; Hardt et al., 2016; Kleinberg et al., 2017; Pleiss et al., 2017; Woodworth et al., 2017; Zafar et al., 2017a, b; Agarwal et al., 2018; Donini et al., 2018; Menon and Williamson, 2018; Xu et al., 2018), but more recently also in unsupervised learning
(e.g., Chierichetti et al., 2017; Celis et al., 2018; Samadi et al., 2018; Schmidt et al., 2018; Kleindessner et al., 2019a, b). All of these papers assume knowledge of the true value of the protected attribute for each data point. We will discuss some papers not making this assumption below. First we discuss work related to the fairness notion of equalized odds, which is central to our paper and one of the most prominent fairness notions (see Verma and Rubin, 2018, for a summary of the various notions and a citation count).

Equalized odds
Our paper builds upon the equalized odds post-processing method of Hardt et al. (2016) as described in Section 2.1. Hardt et al. also show how to derive an optimal predictor satisfying the equalized odds criterion based on a biased score function (with values in $[0,1]$ expressing the likelihood of $Y = 1$) rather than a binary classifier $\hat{Y}$. However, in this case the resulting optimization problem is no longer a linear program and it is unclear how to extend our analysis to it. Concurrently with the paper by Hardt et al., the fairness notion of equalized odds was also proposed by Zafar et al. (2017b) under the name of disparate mistreatment. Zafar et al. incorporate a proxy for the equalized odds criterion into the training phase of a decision boundary-based classifier, which leads to a convex-concave optimization problem and does not come with any theoretical guarantees. Kleinberg et al. (2017) show that, except for trivial cases, a classifier cannot simultaneously satisfy the equalized odds criterion and the fairness notion of calibration within groups. Subsequently, Pleiss et al. (2017) show how to achieve calibration within groups and a relaxed form of the equalized odds constraints at the same time. The work of Woodworth et al. (2017) shows that for certain loss functions, post-processing a Bayes optimal unfair classifier does not necessarily lead to a Bayes optimal fair classifier (fair / unfair with respect to the fairness notion of equalized odds). They propose a two-stage procedure in which approximate fairness constraints are incorporated into the empirical risk minimization framework to obtain a classifier that is fair to a non-trivial degree, and the equalized odds post-processing method is then used to obtain the final classifier.
Fairness without protected attributes
Dwork et al. (2012) formulated the notion of individual fairness already mentioned in Section 1, according to which similar data points (as measured by a given metric) should be treated similarly by a randomized classifier. Only recently have there been works studying how to satisfy group fairness criteria when having only limited information about the protected attribute. Most important to mention is the work of Gupta et al. (2018). Their paper empirically shows that when the protected attribute is not known, improving a fairness metric for a proxy of the true protected attribute might be a valuable strategy to improve the fairness metric for the true attribute. Also important to mention are the works by Lamy et al. (2019) and Hashimoto et al. (2018). Lamy et al. (2019) study a scenario related to ours and consider training a fair classifier when the protected attribute is corrupted. Similarly to our Theorem 1, they show that the bias of a classifier trained with the corrupted attribute grows in a certain way with the amount of corruption (where the bias is defined according to the fairness notion of equalized odds or demographic parity). However, they do not investigate the error / accuracy of such a classifier. Importantly, Lamy et al. only consider classifiers that do not use the protected attribute when making a prediction for a test point and pose it as an open question to extend their results to classifiers that do use the protected attribute when making a prediction. In our paper, we study the bias and the error of a derived equalized odds predictor under a perturbation of the protected attribute in the training phase of the equalized odds method. A derived equalized odds predictor crucially depends on the protected attribute when making a prediction, and hence our paper addresses the question raised by Lamy et al. The paper by Hashimoto et al.
(2018) uses distributionally robust optimization in order to minimize the worst-case misclassification risk in a ball around the data-generating distribution. In doing so, under the assumption that the resulting non-convex optimization problem is solved exactly (compare with Section 4.3), one provably controls the risk of each protected group without knowing which group a data point belongs to. Hashimoto et al. also show that their approach helps to avoid disparity amplification in a sequential classification setting in which a group's fraction in the data decreases as its misclassification risk increases. As an application of our results, in Section 4.3 we experimentally compare the approach of Hashimoto et al. to the equalized odds method with perturbed protected attribute information in such a sequential setting. The paper by Kilbertus et al. (2018) provides an approach to fair classification when users to be classified are not willing to share their protected attribute but only an encrypted version of it. Their approach assumes the existence of a regulator with fairness aims and is based on secure multi-party computation. Chen et al. (2019) study the problem of assessing the demographic disparity of a classifier when the protected attribute is unknown and has to be estimated from data. Finally, Coston et al. (2019) study fair classification in a covariate shift setting where the protected attribute is only available in the source domain but not in the target domain (or vice versa).
4 Experiments
In this section, we present a number of experiments. First, we study the bias and the error of the equalized odds predictor as a function of the perturbation level in extensive simulations. Next, we show some experiments on real data. Finally, we consider the repeated loss minimization setting of Hashimoto et al. (2018) and demonstrate that the equalized odds method achieves the same goal as the strategy proposed by Hashimoto et al., even when the protected attribute is highly perturbed.
4.1 Simulations of bias and error
For various choices of the problem parameters $\Pr(Y = y, A = a)$ and $\Pr(\hat{Y} = 1 \mid Y = y, A = a)$, we study how the bias and the error of a derived equalized odds predictor $\tilde{Y}_c$ change as the perturbation probabilities $m_{ya} = \Pr(A_c \neq A \mid Y = y, A = a)$, with which the protected attribute in the training phase is perturbed, increase. To do so, we solve the linear program (2), where in all probabilities the random variable $A$ is replaced by $A_c$ (see Section A.3 in the appendix for details). We compare the bias and the error of $\tilde{Y}_c$ to the bias and the error of $\hat{Y}$, and we also compare the bias of $\tilde{Y}_c$ to our theoretical bound provided in (8) in Theorem 1. Figure 1 shows the quantities of interest as a function of the perturbation level, where the probabilities $m_{ya}$ grow with the perturbation level in a certain way, in various scenarios (the perturbation probabilities can be read from the titles of the plots, and the other parameters are provided in Table 1 in Section A.1 in the appendix). For clarity, we only show the bias for the class $y = 1$, but note that for a corresponding choice of the parameters the bias for the class $y = 0$ behaves in the same way. As suggested by our upper bound (8) in Theorem 1, the bias of $\tilde{Y}_c$ increases as the perturbation level increases, and we can see that our upper bound is quite tight in most cases. For a moderate perturbation level with $m_{y0} + m_{y1} < 1$, the bias of $\tilde{Y}_c$ is smaller than the bias of $\hat{Y}$, as claimed by Theorem 1. Although all plots show a non-balanced case, which is not captured by Theorem 2, our claim (6) is still shown to be true: except for the bottom right plot, in which assumption (9) is not satisfied, the error of $\tilde{Y}_c$ decreases as the perturbation level increases, up to the point where the error of $\tilde{Y}_c$ equals the error of $\hat{Y}$. The bottom right plot shows that assumption (9) is indeed necessary. We make similar observations in a number of further experiments of this type, presented in the appendix in Section A.4. Our findings empirically validate the main claims of our paper.
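As an illustration of the kind of computation behind these simulations, the following self-contained sketch (with our own toy numbers, not the parameters of Figure 1) trains the LP on a corrupted distribution with a uniform flipping probability and evaluates bias and error on the true distribution:

```python
import numpy as np
from scipy.optimize import linprog

def solve_eo(q, r):
    # Equalized odds LP (2); variables x = [p_00, p_01, p_10, p_11].
    c = np.zeros(4)
    for a in (0, 1):
        c[a] = q[0, a] * (1 - r[0, a]) - q[1, a] * (1 - r[1, a])
        c[2 + a] = q[0, a] * r[0, a] - q[1, a] * r[1, a]
    A_eq = [[1 - r[y, 0], -(1 - r[y, 1]), r[y, 0], -r[y, 1]] for y in (0, 1)]
    res = linprog(c, A_eq=A_eq, b_eq=[0, 0], bounds=[(0, 1)] * 4, method="highs")
    return res.x.reshape(2, 2)  # p[yhat, a]

def rates(p, r):  # Pr(Ytilde=1 | Y=y, A=a)
    return np.array([[p[1, a] * r[y, a] + p[0, a] * (1 - r[y, a]) for a in (0, 1)]
                     for y in (0, 1)])

def corrupt(q, r, m):  # uniform flipping with probability m
    q_c, num = np.empty((2, 2)), np.empty((2, 2))
    for y in (0, 1):
        for a in (0, 1):
            q_c[y, a] = q[y, a] * (1 - m) + q[y, 1 - a] * m
            num[y, a] = q[y, a] * (1 - m) * r[y, a] + q[y, 1 - a] * m * r[y, 1 - a]
    return q_c, num / q_c

# Balanced toy distribution; the given classifier is a weak learner for both
# groups but biased (bias 0.2 for both classes).
q = np.full((2, 2), 0.25)
r = np.array([[0.2, 0.4], [0.8, 0.6]])

def evaluate(m):
    q_c, r_c = corrupt(q, r, m)
    p = solve_eo(q_c, r_c)               # training phase uses the corrupted attribute
    t = rates(p, r)                      # test phase uses the true attribute
    b = np.abs(t[:, 0] - t[:, 1])
    e = sum(q[0, a] * t[0, a] + q[1, a] * (1 - t[1, a]) for a in (0, 1))
    return b, float(e)

bias0, err0 = evaluate(0.0)   # no corruption: this is the derived predictor Ytilde
bias2, err2 = evaluate(0.2)   # moderate corruption: the derived predictor Ytilde_c
```

In this toy instance the bias under corruption level 0.2 stays well below the bias 0.2 of the given classifier, and the error under corruption comes out below the error without corruption, in line with Theorems 1 and 2.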
4.2 Experiments on real data
We run the equalized odds method on two real data sets, perturbing the protected attribute in one of two ways: either we set each protected attribute to its complementary value with some probability, independently of each other, or we (deterministically) flip the protected attribute of every data point whose score lies in a certain interval. The score of a data point is the likelihood predicted by a classifier for the data point to belong to the class $Y = 1$ and is related to the given predictor $\hat{Y}$ in that $\hat{Y}$ predicts $1$ whenever the score is greater than $1/2$ and $0$ otherwise. We build upon the data provided by Pleiss et al. (2017). It contains the ground-truth labels, the true protected attributes and the predicted scores for the UCI Adult data set (Dua and Graff, 2019) and the COMPAS criminal recidivism risk assessment data set (Dieterich et al., 2016). The scores for the Adult data set are obtained from a multilayer perceptron; the scores for the COMPAS data set are the actual scores from the COMPAS risk assessment tool. We randomly split the data sets into a training and a test set of equal size (we report several statistics, such as the sizes of the original data sets, in the appendix in Section A.5). Figure 2 shows the bias and the error of the given predictor $\hat{Y}$ and a derived equalized odds predictor $\tilde{Y}_c$ for the two data sets and in the two perturbation scenarios as a function of the perturbation level. The shown curves are obtained by averaging the results of 100 runs of the experiment. They look quite similar to the ones that we obtained in the experiments of Section 4.1 and again validate the main claims of our paper.

4.3 Repeated loss minimization
We compare the equalized odds method to the method of Hashimoto et al. (2018), discussed in Section 3, in the sequential classification setting studied by Hashimoto et al. In this setting, at each time step a classifier is trained on a data set that comprises several protected groups. The fraction of a group at time step $t + 1$ depends on the group's fraction and the classifier's accuracy on the group at time step $t$. Hashimoto et al. show that in such a setting standard empirical risk minimization can lead to disparity amplification, with a group having a very small fraction, and thus very small classification accuracy, after some time, while their proposed method helps to avoid this situation.
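To give a feeling for why such dynamics amplify disparity, consider the following deliberately simplified caricature (our own toy update rule, not the exact user-retention model of Hashimoto et al.), in which a group's mass at the next step is proportional to its current mass times the classifier's accuracy on it:

```python
def simulate_fractions(acc, steps=100, frac=0.5):
    """Toy sequential dynamics: returns the fraction of group 0 after `steps`
    steps, when a group's next-step mass is proportional to
    (current mass) x (classifier accuracy on that group).
    acc = (accuracy on group 0, accuracy on group 1), held constant over time."""
    for _ in range(steps):
        m0, m1 = frac * acc[0], (1 - frac) * acc[1]
        frac = m0 / (m0 + m1)   # renormalize the masses into fractions
    return frac

equal = simulate_fractions((0.9, 0.9))    # equal accuracy: the fractions stay put
skewed = simulate_fractions((0.9, 0.7))   # the less accurately served group shrinks
```

Under this caricature any persistent accuracy gap drives the disadvantaged group's fraction toward zero, which is the feedback loop that both DRO and the equalized odds constraints counteract.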
In Figure 3 we present an experiment that reproduces and extends the experiment shown in Figure 5 in Hashimoto et al. (2018).^2 Figure 3 shows the classification accuracy (left plot) and the fraction (right plot) of the minority group over time for various classification strategies. In this experiment, there are only two groups, which initially have the same size; by minority group we mean the group that has a smaller fraction on average over time (hence, at some time steps the fraction of the minority group can be greater than one half). The classification strategies that we consider are all based on logistic regression. ERM refers to a logistic regression classifier trained with empirical risk minimization and DRO to a logistic regression classifier trained with distributionally robust optimization (the method proposed by Hashimoto et al.; see their paper for details). EO refers to the ERM strategy with equalized odds post-processing. We consider EO using the true protected attribute $A$ and using a perturbed attribute $A_c$, which is obtained by flipping $A$ to its complementary value independently for each data point, at two different noise levels. We can see from the plots that EO achieves the same goal as DRO, namely avoiding disparity amplification, even when the protected attribute is highly perturbed (orange and magenta curves). DRO achieves a slightly higher accuracy, at least in this experiment, and, unlike EO, it does not require knowledge about the protected attribute at all. However, the underlying optimization problem for DRO is non-convex, and as a result the algorithm does not come with per-step theoretical guarantees. Hence, we believe that in situations where one has access to a perturbed version of the protected attribute, the equalized odds method is a more sensible alternative.

^2 We used the code provided by Hashimoto et al. and extended it without changing any parameters.

5 Discussion
In this paper, we studied the equalized odds method of Hardt et al. (2016) for fair classification when the protected attribute is perturbed. We gave strong theoretical and empirical evidence that as long as the perturbation is somewhat moderate, one should still run the equalized odds method with the perturbed attribute. In doing so, one still reduces the bias of the original classifier while not suffering too much in terms of accuracy. We believe that without such a property the practical applicability of a "fair" machine learning method is limited. While there is some empirical work demonstrating the usefulness of a proxy for the protected attribute when the protected attribute is not available (Gupta et al., 2018; see Section 3), our paper is the first to provide a rigorous theoretical analysis for such a claim. This opens up a new line of research in fairness in ML, asking which methods are robust to a perturbation of the protected attribute, and to what extent.
References
 Agarwal et al. (2018) A. Agarwal, A. Beygelzimer, M. Dudík, J. Langford, and H. Wallach. A reductions approach to fair classification. In International Conference on Machine Learning (ICML), 2018.
 Barocas et al. (2018) S. Barocas, M. Hardt, and A. Narayanan. Fairness and Machine Learning. fairmlbook.org, 2018. http://www.fairmlbook.org.
 Celis et al. (2018) L. E. Celis, V. Keswani, D. Straszak, A. Deshpande, T. Kathuria, and N. K. Vishnoi. Fair and diverse DPPbased data summarization. In International Conference on Machine Learning (ICML), 2018.
 Chen et al. (2019) J. Chen, N. Kallus, X. Mao, G. Svacha, and M. Udell. Fairness under unawareness: Assessing disparity when protected class is unobserved. In Conference on Fairness, Accountability, and Transparency (FAT*), 2019.
 Chierichetti et al. (2017) F. Chierichetti, R. Kumar, S. Lattanzi, and S. Vassilvitskii. Fair clustering through fairlets. In Neural Information Processing Systems (NIPS), 2017.

 Coston et al. (2019) A. Coston, K. N. Ramamurthy, D. Wei, K. Varshney, S. Speakman, Z. Mustahsan, and S. Chakraborty. Fair transfer learning with missing protected attributes. In AAAI / ACM Conference on Artificial Intelligence, Ethics, and Society, 2019.
 Dieterich et al. (2016) W. Dieterich, C. Mendoza, and T. Brennan. COMPAS risk scales: Demonstrating accuracy equity and predictive parity. Technical report, Northpointe Inc., 2016. https://www.equivant.com/response-to-propublica-demonstrating-accuracy-equity-and-predictive-parity/.
 Donini et al. (2018) M. Donini, L. Oneto, S. BenDavid, J. ShaweTaylor, and M. Pontil. Empirical risk minimization under fairness constraints. In Neural Information Processing Systems (NeurIPS), 2018.
 Dua and Graff (2019) D. Dua and C. Graff. UCI machine learning repository, 2019. https://archive.ics.uci.edu/ml/datasets/adult.
 Dwork and Ilvento (2018) C. Dwork and C. Ilvento. Individual fairness under composition. In Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML), 2018.
 Dwork et al. (2012) C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel. Fairness through awareness. In Innovations in Theoretical Computer Science Conference (ITCS), 2012.
 Feldman et al. (2015) M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian. Certifying and removing disparate impact. In ACM International Conference on Knowledge Discovery and Data Mining (KDD), 2015.
 Gupta et al. (2018) M. Gupta, A. Cotter, M. M. Fard, and S. Wang. Proxy fairness. arXiv:1806.11212 [cs.LG], 2018.
 Hardt et al. (2016) M. Hardt, E. Price, and N. Srebro. Equality of opportunity in supervised learning. In Neural Information Processing Systems (NIPS), 2016.
 Hashimoto et al. (2018) T. Hashimoto, M. Srivastava, H. Namkoong, and P. Liang. Fairness without demographics in repeated loss minimization. In International Conference on Machine Learning (ICML), 2018. Code available on https://bit.ly/2sFkDpE.
 Kamiran and Calders (2012) F. Kamiran and T. Calders. Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1):1–33, 2012.
 Kamishima et al. (2012) T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma. Fairness-aware classifier with prejudice remover regularizer. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD), 2012.
 Kilbertus et al. (2018) N. Kilbertus, A. Gascón, M. Kusner, M. Veale, K. P. Gummadi, and A. Weller. Blind justice: Fairness with encrypted sensitive attributes. In International Conference on Machine Learning (ICML), 2018.
 Kleinberg et al. (2017) J. Kleinberg, S. Mullainathan, and M. Raghavan. Inherent tradeoffs in the fair determination of risk scores. In Innovations in Theoretical Computer Science Conference (ITCS), 2017.
 Kleindessner et al. (2019a) M. Kleindessner, P. Awasthi, and J. Morgenstern. Fair k-center clustering for data summarization. In International Conference on Machine Learning (ICML), 2019a.

 Kleindessner et al. (2019b) M. Kleindessner, S. Samadi, P. Awasthi, and J. Morgenstern. Guarantees for spectral clustering with fairness constraints. In International Conference on Machine Learning (ICML), 2019b.
 Lamy et al. (2019) A. L. Lamy, Z. Zhong, A. K. Menon, and N. Verma. Noise-tolerant fair classification. arXiv:1901.10837 [cs.LG], 2019.
 Menon and Williamson (2018) A. K. Menon and R. C. Williamson. The cost of fairness in binary classification. In Conference on Fairness, Accountability, and Transparency, 2018.
 Pleiss et al. (2017) G. Pleiss, M. Raghavan, F. Wu, J. Kleinberg, and K. Q. Weinberger. On fairness and calibration. In Neural Information Processing Systems (NIPS), 2017. Code and data available on https://github.com/gpleiss/equalized_odds_and_calibration.
 Samadi et al. (2018) S. Samadi, U. Tantipongpipat, J. Morgenstern, M. Singh, and S. Vempala. The price of fair PCA: One extra dimension. In Neural Information Processing Systems (NeurIPS), 2018.
 Schmidt et al. (2018) M. Schmidt, C. Schwiegelshohn, and C. Sohler. Fair coresets and streaming algorithms for fair k-means clustering. arXiv:1812.10854 [cs.DS], 2018.
 Verma and Rubin (2018) S. Verma and J. Rubin. Fairness definitions explained. In International Workshop on Software Fairness (FairWare), 2018.
 Woodworth et al. (2017) B. Woodworth, S. Gunasekar, M. I. Ohannessian, and N. Srebro. Learning nondiscriminatory predictors. In Conference on Learning Theory (COLT), 2017.
 Xu et al. (2018) D. Xu, S. Yuan, L. Zhang, and X. Wu. FairGAN: Fairness-aware generative adversarial networks. In IEEE International Conference on Big Data, 2018.
 Zafar et al. (2017a) M. B. Zafar, I. Valera, M. G. Rodriguez, and K. P. Gummadi. Fairness constraints: Mechanisms for fair classification. In International Conference on Artificial Intelligence and Statistics (AISTATS), 2017a.
 Zafar et al. (2017b) M. B. Zafar, I. Valera, M. G. Rodriguez, and K. P. Gummadi. Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In International Conference on World Wide Web (WWW), 2017b.
 Zemel et al. (2013) R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork. Learning fair representations. In International Conference on Machine Learning (ICML), 2013.
Appendix A Appendix
A.1 Problem parameters for the experiments of Figure 1
Plot  
top left  0.9  0.8  0.4  0.1  
top right  0.9  0.6  0.7  0.1  
bottom left  0.9  0.6  0.4  0.1  
bottom right  0.9  0.6  0.3  0.8 
A.2 Proofs
We require a simple technical lemma.
Lemma 1.
Let and consider with
We have:


for all

for all

for all with

for all with

for all

and for all
Proof.
First note that, on the stated domain, both denominators are greater than zero, so the expression is well-defined. Both fractions are not smaller than zero and not greater than one, which implies (i). Statement (ii) is trivial. We have
from which (iii), (iv), and (v) follow. Finally, we have
We have
for all and hence
This shows . It follows from (v) that also for all . ∎
Now we can prove Theorem 1.
Proof of Theorem 1:
Let
(10) 
Then
(11) 
When computing the probabilities for , we have to replace and by and , respectively, in the linear program (2). Note that the assumption for and implies that for and . It is
and, for ,
Hence, we end up with the new linear program
(12) 
Some elementary calculations yield that the objective function in (12) equals
(13) 
and that the constraints are equivalent to
(14) 
Let
(15)  
(16)  
(17)  
(18) 
Then the constraints are
(19)  
Because of the constraints we have
(20) 
where
(21) 
It is straightforward to verify that condition (7) for is equivalent to and for it is equivalent to . If or , one optimal solution to (12) is or , depending on whether or . These probabilities correspond to the constant predictor or with , , and the proof for the degenerate case is complete.
So let us assume that and . Let . Because of
we have
(22) 