A Neural Network Framework for Fair Classifier

by   P Manisha, et al.
IIIT Hyderabad

Machine learning models are used extensively in decision making, especially for prediction tasks. These models can be biased or unfair towards a specific sensitive group defined by race, gender or age. Researchers have put effort into characterizing particular definitions of fairness and enforcing them in the models. In this work, we are mainly concerned with the following three definitions: Disparate Impact, Demographic Parity and Equalized Odds. Researchers have shown that Equalized Odds cannot be satisfied by calibrated classifiers unless the classifier is perfect. Hence the primary challenge is to ensure a degree of fairness while retaining as much accuracy as possible. Fairness constraints are complex and need not be convex, so incorporating them into a machine learning algorithm is a significant challenge. Hence, many researchers have tried to come up with a convex surrogate loss in order to build fair classifiers. Besides, certain papers try to build fair representations by preprocessing the data, irrespective of the classifier used. Such methods not only require many unrealistic assumptions but also require human-engineered analytical solutions to build a machine learning model. We instead propose an automated solution which generalizes over any fairness constraint. We use a neural network which is trained on batches and directly enforces the fairness constraint as the loss function without modifying it further. We have also experimented with other complex performance measures such as H-mean loss, Q-mean loss and F-measure, without the need for any surrogate loss functions. Our experiments show that the network achieves performance similar to the state of the art. Thus, one can simply plug in an appropriate loss function for the required fairness constraint and performance measure of the classifier, and train a neural network to achieve it.



1 Introduction

In recent years, machine learning models have been popularized as prediction models to supplement the process of decision making. Such models have been used for criminal risk assessment, credit approvals and online advertisement. Recent observations have revealed that these algorithms unknowingly introduce a societal bias through their predictions [Barocas and Selbst, 2016], [Berk et al., 0], [Chouldechova, 2017]. This notion was popularized after ProPublica conducted its study of a risk assessment tool that was widely used by the judiciary system in the USA. It was observed that the risk values for recidivism estimated for African-American defendants were on average higher than those for Caucasian defendants. Since then, much work has been dedicated to quantifying the notion of fairness concerning a sensitive feature such as race, sex or age.

Broadly, fairness can be divided into two categories. Individual fairness [Dwork et al., 2012] entails similar decision outcomes for two individuals who belong to different groups with respect to the sensitive feature yet share similar non-sensitive features. Group fairness [Zemel et al., 2013] requires different sensitive groups to receive beneficial outcomes in similar proportions. We are concerned with group fairness, and specifically with three types of it: Demographic Parity (DP) [Dwork et al., 2012], Disparate Impact (DI) [Feldman et al., 2015] and Equalized Odds (EO) [Dawid, 1982]. DP ensures that the positive outcome is given to the groups at the same rate. DI expresses the same requirement through the ratio of these rates rather than their difference. However, both fail when the base rate itself differs between groups; hence EO, which ensures an even distribution of false positive rates and false negative rates among the groups, is the more useful notion of fairness. These definitions are meaningful only when the classifier is well calibrated. [Pleiss et al., 2017] and [Chouldechova, 2017] show that it is impossible to achieve calibration together with EO unless we have perfect classifiers. So ensuring fairness would hurt the accuracy of the algorithm and vice versa. Hence the major challenge is to devise an algorithm which guarantees the best predictive accuracy while satisfying the constraints to a certain degree.

In designing an algorithm, it may seem natural to remove the sensitive feature during prediction to resolve the issue. However, this information may be hidden in other correlated features within the data, which defeats the purpose. There are approaches which preprocess the data to make it fair and then train a classifier. [Madras et al., 2018] and [Beutel et al., 2017] are a few of the papers which learn fair representations using neural networks. They propose a two-stage process where learning the representation is independent of the prediction phase. The major challenge is to have a model which guarantees fairness at the time of prediction. EO and DP are complicated measures, and we cannot directly implement them as constraints in the optimization problem solved by the machine learning algorithm. Hence, researchers have introduced surrogate measures to replace these constraints [Bilal Zafar et al., 2015], [Kamishima et al., 2011]. Coming up with a good surrogate is analytically challenging, and surrogates do not work efficiently in all cases. The work in [Zhang et al., 2018] has explored the notion of having fair classifiers when pitted against an adversary which tries to make the model unfair. The adversarial loss proposed by the authors does not directly implement the fairness constraints as defined but serves as an upper bound. The authors in [Narasimhan, 2018] and [Agarwal et al., 2018] propose a reductionist approach in which a common framework is set for minimizing loss under fairness constraints. These papers use optimization techniques that break down the complex constraints at every step into a cost-sensitive loss and train many base classifiers.

All the existing techniques to achieve fairness, whether they design surrogate loss functions or generate fair representations that are not specific to a particular task, are analytically challenging. Our primary objective is to automate and generalize this entire process. In doing so, we do not aim to propose a new fairness measure or a new optimization technique. Rather, as opposed to the above approaches, we propose to use neural networks for implementing such complex measures.

Neural networks provide us with a generalized framework to learn and implement different kinds of tasks. Mini-batch Stochastic Gradient Descent is a widely used and efficient optimizer for training a neural network; it shares the benefits of Gradient Descent as well as Stochastic Gradient Descent. Fairness measures cannot be evaluated for a single sample; they make sense only when calculated across a batch which contains data points from all the sensitive groups. Given that at every iteration the network processes a mini-batch of data, we can approximate the fairness measure given an appropriate batch size. Hence, we take advantage of the batch processing used for optimizing the models to implement the complex constraints. The network directly implements DP, EO and DI in its loss and manages to produce fair classification. Our neural network has better accuracy than the state of the art in some cases and matching accuracy in most of the others. We also experiment on other complex loss measures like H-mean loss, Q-mean loss and F-measure, and complex constraints like Coverage and KLD Error, which are not directly implementable in a machine learning framework.

Given the motivation, we now discuss how we implement these complex measures in a simple feed-forward neural network optimized with batch processing. In the next section, Related Work, we discuss the state of the art. Then in the section Preliminaries and Background we describe the problem setup and basic definitions. Following it, in the section Neural Network and Loss Functions we discuss how we implement the constraints as the loss. Then in the section Experiments and Results, we give the training details and the results from various experiments on real-world datasets. In the final section, Conclusion, we conclude and provide a brief discussion of the major results.

2 Related Work

The notion of group fairness has been discussed in [Zemel et al., 2013], and DP, EO and DI are a few of its types. It is a major challenge to enforce these in any general machine learning framework.

Broadly, there are three primary approaches to deal with the issue.

i) The first body of work focuses on pre-processing, i.e. coming up with fair representations as opposed to fair classification. The work in [Feldman et al., 2015]; [Dwork et al., 2012]; [Kamiran and Calders, 2009]; [Kamiran and Calders, 2010] typically involves transforming the training data into a space where the dependency between the sensitive attribute and the label is lost. Neural networks have been used extensively in this pursuit. For example, [Louizos et al., 2015] give a method for learning fair representations with a variational auto-encoder by using maximum mean discrepancies between the two sensitive groups. [Edwards and Storkey, 2016], [Madras et al., 2018] and [Beutel et al., 2017] explore the notion of adversarially learning a classifier that achieves DP, EO or DI.

ii) The second approach focuses on analytically designing convex surrogates for the fairness definitions. These surrogates are easily implementable as the loss function while simultaneously serving as weaker counterparts of fairness. [Calders and Verwer, 2010] design discrimination-free naive Bayes classifiers. [Kamishima et al., 2011]; [Bechavod and Ligett, 2017] introduce penalty functions to penalize unfairness. [Bilal Zafar et al., 2015] define a relaxed convex-concave version of the fairness constraints, and the resulting optimization problem is solved for a convex loss. [Zhang et al., 2018] use neural network based adversarial learning, in which an adversary attempts to predict the sensitive attribute from the classifier output, to learn an equal opportunity classifier. They prove that the adversarial loss is an upper bound on the actual definition of equal opportunity.

iii) The third is the reductionist approach followed in [Narasimhan, 2018], and [Agarwal et al., 2018].

There are major drawbacks to the existing approaches. The pre-processing approach treats the classifier as a black box, hence its fair representations may lead to a loss of accuracy. On the other hand, using a surrogate loss guarantees good results only under strong assumptions. Moreover, some surrogates are specific to a particular kind of classifier. Besides, coming up with such surrogate losses and optimization frameworks is challenging. Our goal is to have a generalized framework wherein we train a neural network which includes the non-convex form of DI or EO in its objective function. In the next section, we state the definitions of the constraints (including fairness) and the losses that we have implemented in our network.

3 Preliminaries and Background

In this section, we introduce the notation used and state the definitions of the fairness measures and the performance measures that we have analyzed.

We consider a general multi-class classification problem ($n$ different classes) with no assumption on the ($d$-dimensional) instance space. $\mathcal{X} \subseteq \mathbb{R}^d$ is our instance space and $\mathcal{Y} = \{1, \dots, n\}$ is the output space. We also have a protected attribute $\mathcal{A}$ associated with each individual instance, which for example could be age, sex or caste information. Each $a \in \mathcal{A}$ is a particular category of the sensitive attribute. We have a classifier $h: \mathcal{X} \to \mathcal{Y}$, and we have access to a labeled dataset of instances $(x_i, a_i, y_i)$ with $x_i \in \mathcal{X}$, $a_i \in \mathcal{A}$ and $y_i \in \mathcal{Y}$.

Definition 1 (Demographic Parity (DP))

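A classifier $h$ satisfies demographic parity under a distribution over $(X, A, Y)$ if its prediction $h(X)$ is independent of the protected attribute $A$. That is, for all $a$ and all $p \in \{0, 1\}$,

$$P[h(X) = p \mid A = a] = P[h(X) = p]$$

Given that $p \in \{0, 1\}$, we can say that for all $a$,

$$\mathbb{E}[h(X) \mid A = a] = \mathbb{E}[h(X)]$$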

Definition 2 (Equalized Odds (EO))

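A classifier $h$ satisfies equalized odds under a distribution over $(X, A, Y)$ if its prediction $h(X)$ is independent of the protected attribute $A$ given the label $Y$. That is, for all $a$, all $y$ and all $p \in \{0, 1\}$,

$$P[h(X) = p \mid A = a, Y = y] = P[h(X) = p \mid Y = y]$$

Given that $p \in \{0, 1\}$, we can say that for all $a$ and $y$,

$$\mathbb{E}[h(X) \mid A = a, Y = y] = \mathbb{E}[h(X) \mid Y = y]$$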

Definition 3 (Disparate Impact (DI))

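Disparate impact arises when the outcomes of a classifier disproportionately hurt people with certain sensitive attributes. For a binary protected attribute, the following is the definition for completely removing DI,

$$\min\left(\frac{P[h(X) = 1 \mid A = 1]}{P[h(X) = 1 \mid A = 0]},\; \frac{P[h(X) = 1 \mid A = 0]}{P[h(X) = 1 \mid A = 1]}\right) = 1$$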

[Pleiss et al., 2017] strongly claim that the above-mentioned measures are rendered useless if the classifier is not calibrated, in which case the probability estimate of the classifier could carry different meanings for the different groups.

Definition 4 (Calibration)

A classifier $h$ that outputs a probability estimate is perfectly calibrated if, for all $p \in [0, 1]$,

$$P[Y = 1 \mid h(X) = p] = p$$

Given the definition the authors prove the following impossibility of calibration with equalized odds.

Theorem 1 (Impossibility Result)

Let $h_1$, $h_2$ be two classifiers for Groups $G_1$ and $G_2$ with base rates $\mu_1 \neq \mu_2$. Then $h_1$ and $h_2$ satisfy the equalized odds and calibration constraints if and only if $h_1$ and $h_2$ are perfect predictors.

Given the above result, we cannot guarantee to ensure the fairness constraints perfectly, hence we relax the conditions while setting up our optimization problem as follows,

Problem Framework

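In the multi-class classification task as defined above, we also have a loss function, primarily the cross-entropy loss, which is minimized over training under the additional constraints of fairness. For a small tolerance $\epsilon$, the relaxed versions of the constraints require, for all $a$ and $y$,

  • DP: $\left|\mathbb{E}[h(X) \mid A = a] - \mathbb{E}[h(X)]\right| \le \epsilon$   (1)

  • EO: $\left|\mathbb{E}[h(X) \mid A = a, Y = y] - \mathbb{E}[h(X) \mid Y = y]\right| \le \epsilon$   (2)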

  • DI: It is not possible to completely remove DI but one has to ensure least possible DI specified by the ,
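As an illustration of how the relaxed DP and EO conditions can be checked on a finite sample, here is a minimal sketch in plain Python. It assumes hard binary predictions, and the function names (`positive_rate`, `dp_gap`, `eo_gap`) are hypothetical helpers introduced only for this example; they are not part of the paper.

```python
def positive_rate(preds, mask):
    """Fraction of positive predictions among the samples selected by mask."""
    selected = [p for p, m in zip(preds, mask) if m]
    if not selected:
        raise ValueError("selected group is empty; the rate is undefined")
    return sum(selected) / len(selected)


def dp_gap(preds, groups):
    """Largest |rate(A = a) - overall rate| over the groups present."""
    overall = sum(preds) / len(preds)
    return max(
        abs(positive_rate(preds, [g == a for g in groups]) - overall)
        for a in set(groups)
    )


def eo_gap(preds, groups, labels):
    """Largest |rate(A = a, Y = y) - rate(Y = y)| over groups a and labels y."""
    gaps = []
    for y in set(labels):
        rate_y = positive_rate(preds, [l == y for l in labels])
        for a in set(groups):
            in_ay = [g == a and l == y for g, l in zip(groups, labels)]
            gaps.append(abs(positive_rate(preds, in_ay) - rate_y))
    return max(gaps)


# Toy sample: eight individuals, two groups, binary labels.
preds = [1, 1, 0, 0, 1, 0, 0, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
labels = [1, 1, 0, 0, 1, 1, 0, 0]
print(dp_gap(preds, groups), eo_gap(preds, groups, labels))
```

A classifier would satisfy the relaxed DP (or EO) condition on this sample when the corresponding gap is at most the chosen tolerance.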


Apart from fairness, we have analyzed two other complex constraints which are not related to fairness, yet share a similar form.

  • COVERAGE: A constraint on a binary classifier which ensures that the proportion of positive predictions is within a budget

  • KLD error:
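The two constraints above can be made concrete with a short sketch. It assumes hard binary predictions, takes coverage to be the fraction of positive predictions, and takes KLD error to be the KL divergence between the true and the predicted class proportions; the latter is an assumption about the intended definition, and the function names are hypothetical.

```python
import math


def coverage(preds):
    """Proportion of samples predicted positive."""
    return sum(preds) / len(preds)


def kld_error(labels, preds, smooth=1e-12):
    """KL divergence between true and predicted class proportions (binary).

    `smooth` guards against division by zero when a predicted proportion is 0.
    """
    p_true = sum(labels) / len(labels)
    p_pred = sum(preds) / len(preds)
    kl = 0.0
    for pt, pp in ((p_true, p_pred), (1.0 - p_true, 1.0 - p_pred)):
        if pt > 0:
            kl += pt * math.log(pt / max(pp, smooth))
    return kl


labels = [1, 1, 0, 0]
preds = [1, 0, 0, 0]
print(coverage(preds), kld_error(labels, preds))
```

Both quantities are aggregate statistics of a set of predictions, which is why they share the form of the fairness constraints.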


It is common to ensure high accuracy while maintaining fairness, although we may have to work with other complex loss functions sometimes. Below are the loss functions that we have analyzed apart from the cross entropy loss,

  • H-mean loss:

  • Q-mean loss:

  • F-measure (Binary):
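For readers unfamiliar with these measures, the sketch below computes them from a confusion matrix. It assumes the commonly used definitions: H-mean loss as one minus the harmonic mean of the per-class recalls, Q-mean loss as the root mean square of the per-class miss rates, and the binary F-measure loss as one minus the F1 score with class 1 as the positive class. The matrix layout (`conf[i][j]` counts samples of true class `i` predicted as class `j`) and the function names are assumptions made for this example.

```python
import math


def recalls(conf):
    """Per-class recall: correct predictions divided by the row total."""
    return [row[i] / sum(row) for i, row in enumerate(conf)]


def h_mean_loss(conf):
    """1 - harmonic mean of the per-class recalls."""
    r = recalls(conf)
    if min(r) == 0:
        return 1.0  # the harmonic mean is 0 when any class is never recalled
    return 1.0 - len(r) / sum(1.0 / ri for ri in r)


def q_mean_loss(conf):
    """Root mean square of the per-class miss rates (1 - recall)."""
    r = recalls(conf)
    return math.sqrt(sum((1.0 - ri) ** 2 for ri in r) / len(r))


def f_measure_loss(conf):
    """1 - F1 for a binary problem, with class 1 as the positive class."""
    tp, fn, fp = conf[1][1], conf[1][0], conf[0][1]
    return 1.0 - 2.0 * tp / (2.0 * tp + fp + fn)


conf = [[40, 10], [5, 45]]  # rows: true class, columns: predicted class
print(h_mean_loss(conf), q_mean_loss(conf), f_measure_loss(conf))
```

None of these measures decomposes into a sum of per-sample terms, which is what makes them awkward for standard training objectives.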


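Let $\ell_\theta$ be our loss function parameterized by $\theta$. We have the following generic optimization framework, in which both the loss and the constraints can be replaced according to the need,

$$\min_{\theta} \; \ell_\theta \quad \text{s.t.} \quad \mathrm{const}^{k}_\theta \le \epsilon \qquad (9)$$

where $\mathrm{const}^{k}_\theta$ denotes the value of the chosen constraint $k$ and $\epsilon$ is its tolerance.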


To solve the above problem, three approaches have been taken, as summarized in the Related Work section.

Surrogate losses

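In [Bilal Zafar et al., 2015], the authors have introduced the following decision boundary covariance to replace DI, for a sensitive attribute $a$ and signed distance $d_\theta(x)$ from the decision boundary,

$$\mathrm{Cov}(a, d_\theta(x)) \approx \frac{1}{N} \sum_{i=1}^{N} (a_i - \bar{a})\, d_\theta(x_i) \qquad (10)$$

Unlike DI, this covariance is convex with respect to $\theta$ for all linear, convex margin-based classifiers. The above is one such example of a surrogate loss.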
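To make the surrogate concrete, the following sketch computes the empirical decision boundary covariance for a linear classifier. It assumes the signed distance is the inner product of a weight vector `theta` with the features, as is usual for linear margin-based models; the function name and the toy data are hypothetical.

```python
def boundary_covariance(theta, xs, sensitive):
    """Empirical covariance between the sensitive attribute and theta . x."""
    mean_sensitive = sum(sensitive) / len(sensitive)
    total = 0.0
    for x, s in zip(xs, sensitive):
        distance = sum(t * xi for t, xi in zip(theta, x))  # signed distance theta . x
        total += (s - mean_sensitive) * distance
    return total / len(xs)


# Toy data: the first feature is perfectly aligned with the sensitive attribute.
xs = [[1.0, 0.0], [2.0, 0.0], [-1.0, 0.0], [-2.0, 0.0]]
sensitive = [1, 1, 0, 0]
print(boundary_covariance([1.0, 0.0], xs, sensitive))
print(boundary_covariance([0.0, 1.0], xs, sensitive))
```

Because the expression is linear in `theta`, bounding its absolute value gives a convex constraint, which is the property the surrogate approach relies on.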

Fair representations

The other popular approach among the recent works is to learn fair representations using a neural network. In [Madras et al., 2018] authors have designed an adversarial network which tries to make the representations of the data independent of the sensitive attribute. The authors define adversarial losses for both DP and EO and show that these serve as upper bounds for the original DP and EO respectively. Such an approach again restricts the adversarial setting to few definitions of fairness and besides it involves minimizing the upper bound which may or may not be a good bound on the original definitions.

Reductionist approach

Finally, in the reductionist approach, the authors apply different optimization methods to solve the problem in Equation 9. [Narasimhan, 2018] uses Frank-Wolfe optimization, and [Agarwal et al., 2018] solve for a saddle point equilibrium as in a two-player zero-sum game. Both are iterative processes in which, at each step, the loss is reduced to a cost-sensitive form and then any classifier can be used.

All the above approaches call for considerable human intervention. Moreover, new definitions of fairness may be coined as the community delves into further depth. Hence, our goal is to have a generic neural network based framework which is adaptable to any definition, such that the definition can be integrated into the neural network without further modification. In the next section, we discuss the details of the network we used and its objective function to implement the optimization problem stated in Equation 9.

4 Neural Network and Loss Functions

In this section, we discuss how we use the neural network for solving the optimization problem framework in Equation 9. Our network is a two-layered feed-forward neural network. Let $f_\theta$ be the function that the neural network learns, parameterized by $\theta$. The last layer of this network is a softmax function which gives the prediction probabilities $p_i$, where $p_{ij}$ is the predicted probability that the $i$-th data sample belongs to the $j$-th class. We directly use the softmax outputs for the different losses and constraints that we have, including fairness. Below are the combinations of loss and constraints that we have experimented with,

4.1 Fairness constraints with cross entropy loss:

Fairness has many definitions, but we work with the following three, DP as in the Definition 1 is given by , ,

Hence the constraint can be defined as follows,

For the next constraint EO, we first define the difference in false positive rate between the two sensitive attributes,

The difference in false negative rate between the two sensitive attributes,

Following similar argument as before the empirical version of EO as defined by Equation 2 and also used by [Madras et al., 2018] in the experiments is,

EO as defined in [Agarwal et al., 2018] is,

Empirical version of DI as defined in Equation 3 as a constraint for binary is given by,

The degree to which the constraints are satisfied is given by , and for , . The loss due to the violation of is,


Given our aim is to maximize the prediction accuracy we use cross-entropy loss. is the batch size,

The overall loss is given by,
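The construction described in this subsection can be sketched on a single batch as follows. This is an illustrative reading under stated assumptions rather than the exact implementation: the protected attribute and label are binary, `p` holds the softmax probability of the positive class for each sample, group rates are averaged within each (group, label) cell, the penalty is a hinge on the amount by which the constraint exceeds its tolerance `eps`, and the weight `lam` and all function names are hypothetical.

```python
import math


def masked_mean(values, mask):
    """Mean of the values selected by mask; fails loudly on an empty selection."""
    selected = [v for v, m in zip(values, mask) if m]
    if not selected:
        raise ValueError("empty group in batch; a larger batch is needed")
    return sum(selected) / len(selected)


def dp_constraint(p, a):
    """|mean predicted probability of group 1 - that of group 0|."""
    return abs(masked_mean(p, [ai == 1 for ai in a])
               - masked_mean(p, [ai == 0 for ai in a]))


def eo_constraint(p, a, y):
    """Soft false positive rate gap plus soft false negative rate gap."""
    def rate(values, group, label):
        return masked_mean(values, [ai == group and yi == label
                                    for ai, yi in zip(a, y)])
    miss = [1.0 - pi for pi in p]  # probability mass on the negative class
    fpr_gap = abs(rate(p, 1, 0) - rate(p, 0, 0))
    fnr_gap = abs(rate(miss, 1, 1) - rate(miss, 0, 1))
    return fpr_gap + fnr_gap


def cross_entropy(p, y, tiny=1e-12):
    """Mean binary cross-entropy over the batch."""
    return -sum(yi * math.log(max(pi, tiny))
                + (1 - yi) * math.log(max(1.0 - pi, tiny))
                for pi, yi in zip(p, y)) / len(p)


def penalized_loss(p, y, constraint_value, eps, lam):
    """Cross-entropy plus a weighted hinge penalty on the constraint violation."""
    return cross_entropy(p, y) + lam * max(0.0, constraint_value - eps)


p = [0.9, 0.8, 0.3, 0.2, 0.7, 0.4, 0.1, 0.2]
a = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 1, 0, 0, 1, 1, 0, 0]
print(dp_constraint(p, a), eo_constraint(p, a, y))
print(penalized_loss(p, y, dp_constraint(p, a), eps=0.1, lam=2.0))
```

Every term is a smooth function of the softmax outputs except at the kinks of the absolute value and the hinge, so an automatic differentiation framework can backpropagate through the same expressions.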


4.2 Coverage constraint with H-mean loss:

Coverage constraint for binary setting i.e. as in Equation 4 is given by,

Hence, the loss with is,


The empirical version of H-mean loss as defined in Equation 6 is,

The overall loss,


4.3 Satisfying DP with Q-mean loss:

The loss due to DP as already defined before is ,


The empirical version of Q-mean loss as defined in Equation 7 is,

Hence, the overall loss,


4.4 KLD error as constraint with F-measure:

The constraint defined in Equation 5 is implemented as,

The loss due to the above constraint is defined as,


We have the following implementation of F-measure as defined in Equation 8,

The overall loss is given by,


The combinations of losses and constraints mentioned above are not exhaustive. The generic definition of the loss can thus be given by,


Any combination can be tried by changing the loss and the constraint, for example as defined in Equations 12, 14, 16 and 18. The weighting hyper-parameter in all these equations is set according to the need. Larger values of this hyper-parameter ensure that the constraints are satisfied strictly, and vice versa. However, large values may hinder convergence when training a neural network, hence it is crucial to find a suitable value. We discuss further details of training and the corresponding results in the following section.
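The plug-in nature of this objective can be sketched as a small higher-order function. The names `make_objective`, `soft_error` and `coverage_gap`, the hinge form of the penalty, the weight `lam` and the 0.25 coverage target are all hypothetical choices made for illustration only.

```python
def make_objective(loss_fn, constraint_fn, eps, lam):
    """Combine any batch loss with any batch constraint into one objective."""
    def objective(batch):
        violation = max(0.0, constraint_fn(batch) - eps)
        return loss_fn(batch) + lam * violation
    return objective


def soft_error(batch):
    """Mean absolute gap between labels and predicted probabilities."""
    return sum(abs(y - p) for p, y in zip(batch["p"], batch["y"])) / len(batch["p"])


def coverage_gap(batch, target=0.25):
    """Distance of the mean predicted probability from a target coverage."""
    return abs(sum(batch["p"]) / len(batch["p"]) - target)


batch = {"p": [0.9, 0.1, 0.6, 0.4], "y": [1, 0, 0, 0]}
objective = make_objective(soft_error, coverage_gap, eps=0.05, lam=2.0)
print(objective(batch))
```

Swapping in a different loss or constraint only requires passing a different function; a larger `lam` pushes the optimizer to satisfy the constraint more strictly at the cost of the primary loss.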

5 Experiments and Results

In this section, we first explain the details of the neural network architecture and training. Then we mention the particulars of the real world datasets on which we trained our network. Finally, we compare our results with the five most relevant papers.

5.1 Architecture and Optimizer

The architecture that we used is a simple two-layered feed-forward network. The number of hidden neurons in both layers was chosen from a fixed set of values, selected based solely on experiments. Because the complex losses and constraints that we implemented are estimated from the samples in a batch, the batch size should be as large as possible. As a fairness constraint has no meaning for a single sample, single-sample stochastic gradient descent cannot be used. Hence we use batch sampling. We fix the batch size to one of two values depending on the dataset, to get proper estimates of the loss while training. Note that the architecture itself is standard; the only requirement is that batch processing is used for the network to work efficiently.

The generic form of the softmax function in the output is given by $p_j = \exp(z_j / T) / \sum_{k} \exp(z_k / T)$, where $z_j$ is the pre-softmax score for class $j$ and $T$ is the temperature parameter. The lower the temperature, the closer the prediction probabilities are to either $0$ or $1$. For the complex constraints and losses, we would ideally like each value to be either $0$ or $1$, but hard thresholding would make the objective non-differentiable. So for our experiments, we set the temperature so as to retain useful gradients while simultaneously ensuring a better approximation to the losses or constraints. In our experiments the value of $T$ is varied over a range. For training, we have used the inbuilt Adam optimizer with one of two learning rates, and the training typically continues for a fixed maximum number of epochs for each experiment before convergence. The results are averaged over 70-30, 5-fold cross-validation performance on the data.
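The effect of the temperature can be seen in a few lines of code. This is a minimal sketch of a temperature-scaled softmax; the function name and the example logits are chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; lower temperatures sharpen the output."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [1.0, 0.0]
p_default = softmax(logits, temperature=1.0)
p_sharp = softmax(logits, temperature=0.1)
print(p_default, p_sharp)
```

With a lower temperature the larger logit receives almost all of the probability mass, so batch statistics computed from these outputs approximate those of hard predictions more closely while remaining differentiable.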

The main challenge in this approach is that it involves many hyper-parameters, and tuning all of them is an exhaustive task. Hyper-parameters are sometimes specific to the dataset used. They also change with the degree of strictness of the constraint enforced.

5.2 Datasets

The datasets that we have used are as follows

  • Adult: It contains a total of 45,222 samples, each with 14 features. The label is binary which indicates if the income of the sample is above or below 50K USD. Gender is the binary sensitive attribute.

  • Bank: This dataset has 41,188 data points, each with 20 attributes. The label is binary and indicates whether that person (data point) has subscribed to a term deposit or not. Age is considered as the binary sensitive variable: individuals between 25 and 60 years of age form one group, and the rest form the other group.

  • Crimes: There is data for a total of 2215 individuals, each having 147 features. The label indicates, for each individual, whether the crime rate is above average or not. Majority race is the sensitive attribute.

  • Default: There are a total of 30000 samples with 24 features. Each data sample is a credit user and the task is to predict if the sample would default on payment or not. Gender is considered to be the sensitive attribute.

  • Compass: The Correctional Offender Management Profiling for Alternative Sanctions, which was made available online by ProPublica. The goal is to predict recidivism within two years. There is data of 5278 individuals each having 9 features. The majority race is used as the protected attribute.

5.3 Comparative Results

We compare our results with that of the following papers,

5.3.1 [Bilal Zafar et al., 2015]

In this paper, the authors try to maintain DI as the fairness constraint while maximizing accuracy. To this end, they introduce a different formulation to replace DI, as given in Equation 10. They propose C-SVM and C-LR by using the proposed formulation with SVM and logistic regression classifiers respectively. R-RL, or regularized logistic regression, is proposed by [Kamishima et al., 2011]. They report results on the Adult and Bank datasets, as can be observed in Figure 1. For obtaining the results we train our network using the loss given in Equation 11 with the DI constraint. We get better results upto , for , the accuracy reduces by 2 %.

Figure 1: Accuracy vs comparison of results with [Bilal Zafar et al., 2015] on Adult dataset in the left subplot and Bank dataset in the right subplot

5.3.2 [Madras et al., 2018]

In this work, the authors follow a different approach, where their aim is to learn fair representations as opposed to a fair classifier. They use an adversarial model to make the representations fair. Their procedure is named LAFTR (Learning Adversarially Fair and Transferable Representation). In this paper, the authors try to ensure DP and EO while maximizing accuracy on the Adult dataset. We have compared our results with theirs in Figure 2. For this, we have used the loss defined in Equation 11 with the DP and EO constraints.

Figure 2: Accuracy vs ( is tolerance for DP and EO respectively) and compare with [Madras et al., 2018] on Adult dataset
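
| Method | Metric | Female | Male |
|---|---|---|---|
| [Beutel et al., 2017] | FPR | 0.0308 | 0.1778 |
| [Beutel et al., 2017] | FNR | 0.0822 | 0.1520 |
| [Zhang et al., 2018] | FPR | 0.0647 | 0.0701 |
| [Zhang et al., 2018] | FNR | 0.4458 | 0.4349 |
| Our approach | FPR | 0.1228 | 0.1132 |
| Our approach | FNR | 0.0797 | 0.0814 |

Table 1: False Positive Rate (FPR) and False Negative Rate (FNR) for income prediction for the two sex groups in Adult dataset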

5.3.3 [Zhang et al., 2018]

This work is aimed at learning an unbiased classifier by making the prediction independent of the sensitive attribute. The authors resort to adversarial training. The aim is to maximize the predictor's ability to predict the label while simultaneously minimizing the adversary's ability to predict the sensitive attribute from the output predictions. They term this approach adversarial debiasing. They show that adversarial debiasing theoretically guarantees EO and DP. They have results for EO on the Adult dataset, as can be found in Table 1. We get an FPR of 0.1228 (female) and 0.1132 (male), and FNR values for female and male of 0.0797 and 0.0814 respectively. The accuracy of the classifier remains at

Figure 3: Error rate vs $\epsilon$, comparing our results with [Agarwal et al., 2018]. In the two plots above, $\epsilon$ refers to the degree of DP on the Adult and Compass datasets. In the plots below, $\epsilon$ is the degree of EO.

5.3.4 [Agarwal et al., 2018]

In this work, the authors give an approach for achieving fairness in binary classification. They focus on DP and EO. In their optimization problem, the objective is to minimize the error, and DP or EO are formalized as linear inequality constraints. It is then set up as a saddle point problem which is solved using a standard scheme of Freund and Schapire developed for solving for an equilibrium in two-player zero-sum games. This involves sequentially optimizing over the Lagrangian, where the problem is reduced to cost-sensitive classification problems that can be easily handled. The base classifiers they have used are either logistic regression or gradient boosted decision trees. We compare our results with theirs on the Adult and Compass datasets, for both DP and EO, as given in Figure 3. On observing the plots we find that our performance is better on the Compass dataset but worse on the Adult dataset.

5.3.5 [Narasimhan, 2018]

In this work, a reductionist approach is followed. The authors propose the COCO algorithm for convex losses with convex constraints and the FRACO algorithm for fractional convex losses with convex constraints. LogReg and LinCon form the baseline methods. The authors have results for different kinds of losses and constraints apart from fairness. In Table 2 we compare our results with theirs for H-mean loss under the coverage constraint, using the loss in Equation 14. In Table 4 we have results for Q-mean loss under DP as the constraint, whose loss function is given by Equation 16. Finally, we have results for F-measure under KLD error in Table 3, where the loss function is given by Equation 18. In Figure 5 we study the change of Q-mean loss with respect to the DP constraint on the Default dataset. It is observed that as the constraint is relaxed, the loss reduces. We also see that the value of the tolerance $\epsilon$ truly reflects the degree of constraint we are allowing. Similarly, in Figure 4 we observe the relationship between H-mean loss and the coverage constraint on the Crimes dataset. Here too we verify the relation between $\epsilon$ and the amount of coverage allowed.

| Dataset | Threshold | Our | COCO | LogReg |
|---|---|---|---|---|
| adult | 0.25 | 0.23 (0.246) | 0.23 (0.244) | 0.27 (0.196) |
| compass | 0.25 | 0.44 (0.248) | 0.38 (0.242) | 0.67 (0.196) |
| crimes | 0.25 | 0.29 (0.239) | 0.27 (0.250) | 0.18 (0.331) |
| default | 0.25 | 0.34 (0.212) | 0.40 (0.242) | 0.61 (0.075) |

Table 2: Minimizing H-mean loss s.t. coverage

Figure 4: Plot of H-mean loss and coverage values achieved by different values of $\epsilon$ on Crimes dataset

| Dataset | Threshold | Our | FRACO | LogReg |
|---|---|---|---|---|
| adult | 0.001 | 0.30 (0.001) | 0.31 (0.001) | 0.34 (0.007) |
| compass | 0.001 | 0.32 (0.001) | 0.64 (0.003) | 0.70 (0.105) |
| crimes | 0.001 | 0.04 (0.0002) | 0.21 (0.001) | 0.22 (0.002) |
| default | 0.001 | 0.45 (0.001) | 0.49 (0.001) | 0.64 (0.107) |

Table 3: Minimizing F-measure loss s.t. KLD error

| Dataset | Threshold | Our | COCO | LinCon |
|---|---|---|---|---|
| adult | 0.05 | 0.27 (0.044) | 0.33 (0.035) | 0.39 (0.027) |
| compass | 0.20 | 0.32 (0.150) | 0.41 (0.206) | 0.57 (0.107) |
| crimes | 0.20 | 0.28 (0.195) | 0.32 (0.197) | 0.52 (0.190) |
| default | 0.05 | 0.29 (0.006) | 0.37 (0.032) | 0.54 (0.015) |

Table 4: Minimizing Q-mean loss s.t. Demographic Parity

Figure 5: Plot of Q-mean loss and DP values achieved by different values of $\epsilon$ on Default dataset

6 Conclusion

Since the observation by ProPublica that machine learning algorithms, in particular classification algorithms, may introduce biases towards certain sections of society, researchers have been invested in making machine learning models fair. However, there are challenges in defining what is fair and what the performance metric of the classification algorithm should be. Researchers have explored different combinations and have come up with frameworks that incorporate the different definitions of fairness, such as demographic parity (DP), equalized odds (EO) and disparate impact (DI), along with performance measures. However, designing such a classification algorithm needs a lot of human analysis and ingenuity. In this paper, we study fairness in machine learning with the following goals:

  • To define a simple machine learning framework to achieve fairness.

  • It must be generic and adaptable to any definition of fairness.

  • It must involve least human analysis.

The neural network implementation discussed, along with the losses, achieves all three goals. With our results with respect to DP, EO and DI, along with losses such as cross entropy, H-mean, Q-mean and F-measure, we show that the neural network matches the performance of the state of the art, if not better. With this success, we believe the neural network based approach is going to significantly ease the process of designing fair classification algorithms.


  • [Agarwal et al., 2018] Agarwal, A., Beygelzimer, A., Dudik, M., Langford, J., and Wallach, H. (2018). A reductions approach to fair classification. In Dy, J. and Krause, A., editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 60–69, Stockholmsmässan, Stockholm Sweden. PMLR.
  • [Barocas and Selbst, 2016] Barocas, S. and Selbst, A. D. (2016). Big data’s disparate impact. Cal. L. Rev., 104:671.
  • [Bechavod and Ligett, 2017] Bechavod, Y. and Ligett, K. (2017). Learning fair classifiers: A regularization-inspired approach. CoRR, abs/1707.00044.
  • [Berk et al., 0] Berk, R., Heidari, H., Jabbari, S., Kearns, M., and Roth, A. (0). Fairness in criminal justice risk assessments: The state of the art. Sociological Methods & Research, 0(0):0049124118782533.
  • [Beutel et al., 2017] Beutel, A., Chen, J., Zhao, Z., and Chi, E. H. (2017). Data decisions and theoretical implications when adversarially learning fair representations. CoRR, abs/1707.00075.
  • [Bilal Zafar et al., 2015] Bilal Zafar, M., Valera, I., Gomez Rodriguez, M., and Gummadi, K. P. (2015). Fairness Constraints: Mechanisms for Fair Classification. ArXiv e-prints.
  • [Calders and Verwer, 2010] Calders, T. and Verwer, S. (2010). Three naive bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery, 21:277–292.
  • [Chouldechova, 2017] Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, 5(2):153–163.
  • [Dawid, 1982] Dawid, A. P. (1982). The well-calibrated bayesian. Journal of the American Statistical Association, 77(379):605–610.
  • [Dwork et al., 2012] Dwork, C., Hardt, M., Pitassi, T., Reingold, O., and Zemel, R. (2012). Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, ITCS ’12, pages 214–226, New York, NY, USA. ACM.
  • [Edwards and Storkey, 2016] Edwards, H. and Storkey, A. (2016). Censoring representations with an adversary. In International Conference in Learning Representations (ICLR2016).
  • [Feldman et al., 2015] Feldman, M., Friedler, S. A., Moeller, J., Scheidegger, C., and Venkatasubramanian, S. (2015). Certifying and removing disparate impact. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’15, pages 259–268, New York, NY, USA. ACM.
  • [Kamiran and Calders, 2009] Kamiran, F. and Calders, T. (2009). Classifying without discriminating. In 2009 2nd International Conference on Computer, Control and Communication, pages 1–6.
  • [Kamiran and Calders, 2010] Kamiran, F. and Calders, T. (2010). Classification with no discrimination by preferential sampling. In Informal proceedings of the 19th Annual Machine Learning Conference of Belgium and The Netherlands (Benelearn’10, Leuven, Belgium, May 27-28, 2010), pages 1–6.
  • [Kamishima et al., 2011] Kamishima, T., Akaho, S., and Sakuma, J. (2011). Fairness-aware learning through regularization approach. In 2011 IEEE 11th International Conference on Data Mining Workshops, pages 643–650.
  • [Louizos et al., 2015] Louizos, C., Swersky, K., Li, Y., Welling, M., and Zemel, R. S. (2015). The variational fair autoencoder. CoRR, abs/1511.00830.
  • [Madras et al., 2018] Madras, D., Creager, E., Pitassi, T., and Zemel, R. S. (2018). Learning adversarially fair and transferable representations. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, pages 3381–3390.
  • [Narasimhan, 2018] Narasimhan, H. (2018). Learning with complex loss functions and constraints. In International Conference on Artificial Intelligence and Statistics, AISTATS 2018, 9-11 April 2018, Playa Blanca, Lanzarote, Canary Islands, Spain, pages 1646–1654.
  • [Pleiss et al., 2017] Pleiss, G., Raghavan, M., Wu, F., Kleinberg, J., and Weinberger, K. Q. (2017). On fairness and calibration. In Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R., editors, Advances in Neural Information Processing Systems 30, pages 5680–5689. Curran Associates, Inc.
  • [Zemel et al., 2013] Zemel, R., Wu, Y., Swersky, K., Pitassi, T., and Dwork, C. (2013). Learning fair representations. In Dasgupta, S. and McAllester, D., editors, Proceedings of the 30th International Conference on Machine Learning, volume 28 of Proceedings of Machine Learning Research, pages 325–333, Atlanta, Georgia, USA. PMLR.
  • [Zhang et al., 2018] Zhang, B. H., Lemoine, B., and Mitchell, M. (2018). Mitigating unwanted biases with adversarial learning. CoRR, abs/1801.07593.