I Introduction
The rapid increase in the automation of decision-making systems using machine learning has raised significant concerns about the fairness of such models. Several studies have shown that machine learning models designed to support decision-making are not immune to social biases [3, 7]. There is a significant shift towards employing machine learning techniques in many sensitive real-world applications such as credit approval, loan applications, criminal risk assessment, university admissions, and online advertisement. With this trend, it becomes crucial to assess a model on aspects and metrics beyond its accuracy. Among those aspects, fairness has garnered close attention in the community as we strive to build socially responsible and inclusive systems. Recently, many machine learning algorithms have been proposed to address this problem and make the predictions of learning algorithms fairer [11, 19, 2, 28].
The naive approach to the fairness problem in machine learning would be to remove or ignore protected attributes such as sex, gender, and age. However, this approach is impractical in many real-world applications for two main reasons: 1) proxy features, or correlations between other features and the sensitive attributes, may reveal them, and 2) some degree of bias already exists in the labels of the training data. On the other hand, in many applications unlabeled data is abundant and, if appropriately leveraged, holds less bias than labeled data, since it carries no potentially biased label information. Likewise, paradigms such as unsupervised and semi-supervised learning can be less sensitive to these biases in the data. Additionally, the lack of adequate labeled data poses a major challenge to many machine-learning-based applications, and in some of them creating a labeled dataset is expensive and time-consuming. Therefore, leveraging unlabeled data is a potential solution to both the lack of labeled data and the fairness problem.
Semi-supervised learning approaches have shown promising results in tackling the aforementioned challenges by exploiting unlabeled data to improve the accuracy of a classifier. Unlabeled data carry no label information, which can be a significant source of bias in training machine learning systems. The success of semi-supervised approaches in improving a model's performance through unlabeled data inspired us to study the effect of unlabeled data on learning a fair classifier. In this paper, we propose a semi-supervised classification algorithm based on neural networks to tackle fairness in machine learning. To the best of our knowledge, we are the first to propose and study the effect of semi-supervised learning on the fairness of a classifier using neural networks. Our proposed model, called SSFair, utilizes the pseudo-labeling approach, one of the most common techniques in semi-supervised learning, to exploit unlabeled data and increase both the accuracy and the fairness of a classifier.
The proposed model is built with neural networks and can support any fairness measurement that can be defined or approximated as a differentiable function. Different criteria exist to measure fairness in machine learning; we have incorporated three of the most common ones, demographic parity, equalized opportunity, and equalized odds, into SSFair. We have evaluated SSFair on these measurements of fairness in semi-supervised settings and shown the effectiveness of the proposed algorithm in exploiting unlabeled data. We show experimentally that SSFair can benefit from unlabeled data to improve not just the accuracy but also the fairness of the classifier.
II Related Works
There are three main approaches to tackling the fairness problem in machine learning: 1) pre-processing, 2) in-processing, and 3) post-processing.
In the pre-processing approach, the goal is to learn a new representation of the data which is uncorrelated with the protected attributes [10, 5, 19, 1]. This new representation can be used for any downstream task, such as classification or ranking, and with any machine learning technique of choice. The main advantage of the pre-processing approach is that it eliminates the need to modify the machine learning algorithms themselves and is therefore very straightforward to use.
The second approach, in-processing, consists of techniques that incorporate the fairness constraints into the training process. Most of the work on fairness in machine learning belongs to this category [2, 27, 12]. In-processing algorithms usually address the problem by adding the fairness criterion to the learning algorithm's main objective function as a regularizer. This category is more flexible in optimizing different fairness constraints, and solutions using this approach are considered the most robust. Moreover, this category of approaches has shown promising results in terms of both accuracy and fairness.
The third approach is post-processing, which adjusts the output of the classifier in order to satisfy the fairness constraint. One simple form of it is to find a threshold specific to each protected group and use it to control the fairness objective. Although this approach does not require any changes to the classifier, it is not very flexible in optimizing the trade-off between fairness and accuracy.
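As a minimal sketch of the per-group thresholding idea (the function name and the quantile-based rule are illustrative assumptions, not taken from any specific post-processing paper), a threshold can be chosen per group so that both groups end up with the same selection rate:

```python
import numpy as np

def group_thresholds(outputs, protected, target_rate):
    """Pick a decision threshold per group so that each group's selection
    rate (fraction of samples scored at or above the threshold) matches
    target_rate. A simple post-processing sketch."""
    thresholds = {}
    for g in (0, 1):
        scores = np.sort(outputs[protected == g])
        # Threshold at the (1 - target_rate) quantile of the group's scores.
        k = int(np.floor((1.0 - target_rate) * len(scores)))
        k = min(k, len(scores) - 1)
        thresholds[g] = scores[k]
    return thresholds
```

Applying `outputs >= thresholds[g]` within each group then equalizes the selection rates, at the cost of using group-dependent decision rules.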
Our proposed model, formulated as semi-supervised learning based on neural networks, falls under the second category of in-processing approaches: it optimizes the fairness constraint while training the classifier. To the best of our knowledge, SSFair is the first semi-supervised algorithm based on neural networks introduced for tackling the fairness problem.
There are a few works that employ neural networks to optimize the trade-off between fairness and accuracy. Most of these approaches employ adversarial optimization, inspired by Generative Adversarial Networks (GANs), to train a model that produces a fair representation or an output which is indistinguishable among the protected groups [20, 6, 26, 30, 22]. However, these methods are not capable of optimizing an arbitrary fairness constraint, at least not explicitly. Alternatively, Manisha et al. address the fairness problem by incorporating the fairness constraints explicitly into the optimization of the neural network during training: several fairness constraints are added to the loss function of the neural network as regularization terms. This algorithm only handles the fully supervised setting and thus cannot benefit from unlabeled data.
III Fairness Measurements
Defining the concept of fairness for a machine learning algorithm is not trivial, and a variety of definitions exist to measure and quantify fairness [11, 13]. Such definitions fall into two main groups: individual fairness and group fairness.
The term individual fairness was first introduced to refer to a fairness constraint focused on treating similar individuals as similarly as possible. The fairness measurements or metrics in this category are based on the expectation that similar individuals should be treated similarly, i.e. the output of the machine learning algorithm should be close for similar inputs [13, 29]. The main drawback of such constraints is the difficulty of defining the similarity metric: an appropriate similarity function should be capable of ignoring proxy features that may reveal an individual's sensitive information. For this reason, individual fairness cannot be applied widely in real-world problems.
The second group, called group fairness or statistical fairness, is the most commonly used in the literature. These definitions divide the individuals or samples into unprotected and protected (or privileged and unprivileged) sets based on sensitive attributes like race, gender, or age. They then try to equalize some statistical measure of the classifier's performance (e.g. classification error, true positive rate, or false positive rate) between the protected and unprotected groups. The three most common definitions in this category are demographic parity, equalized opportunity, and equalized odds. Our SSFair approach can optimize for all three of these fairness objectives. These measurements are defined in Section IV in detail.
There is no consensus on the best definition of fairness, and the choice is highly task-dependent. In some cases there exists a trade-off between these fairness constraints: it has been shown that some of them cannot be satisfied simultaneously except in degenerate or highly constrained special cases [15, 25].
IV Proposed Model
In semi-supervised settings, the training data consists of a collection of labeled and unlabeled samples. Assume $D = \{(x_i, y_i, a_i)\}_{i=1}^{n}$ is the training set consisting of $n$ samples. For each sample $i$, $x_i$ denotes the feature set, $y_i$ denotes the label, and $a_i$ is the protected attribute, which shows whether that sample belongs to the protected set ($a_i = 1$) or not ($a_i = 0$). Assume the valid values for labels are $y_i = 0$ for the non-advantaged outcome, $y_i = 1$ for the advantaged outcome, or a special "unknown" value for the unlabeled samples.
Our goal is to learn a binary classifier function $f_\theta$, parameterized by $\theta$, that optimizes two main objectives: classification accuracy and fairness. We model the function $f_\theta$ by a neural network. To achieve this goal, we define the loss function of the model as:

$$\mathcal{L}(\theta) = \mathcal{L}_c(\theta) + \lambda \, \mathcal{L}_f(\theta) + \beta \, \lVert \theta \rVert_2^2 \qquad (1)$$

where $\mathcal{L}_c$ indicates the classification loss and $\mathcal{L}_f$ is the fairness loss, which imposes fairness on the output of the model. Parameter $\lambda$ controls the trade-off between the fairness and accuracy losses. Parameter $\beta$ controls the regularization term imposed on all of the network's weights. Regularization is very important to prevent overfitting, especially since only limited labeled samples are available.
IV-A Classification Loss
The first part of the loss function, the classification accuracy loss $\mathcal{L}_c$, is defined over the training samples as:

$$\mathcal{L}_c = \frac{1}{\sum_{i} \gamma_i} \sum_{i=1}^{n} \gamma_i \, \ell_i$$

where $\ell_i$ indicates the classification loss for sample $i$ and is defined as the cross-entropy between the output of the learned function and the target label:

$$\ell_i = -\tilde{y}_i \log f_\theta(x_i) - (1 - \tilde{y}_i) \log\bigl(1 - f_\theta(x_i)\bigr)$$

where $f_\theta(x_i)$ indicates the output of the learned function for sample $i$ and $\tilde{y}_i$ is its corresponding target label. The target label $\tilde{y}_i$ is defined as the ground-truth label $y_i$ if sample $i$ is labeled, while it is defined as the pseudo-label predicted by the classifier for unlabeled samples. The indicator $\gamma_i \in \{0, 1\}$, defined below, shows whether sample $i$ should be considered in the learning process and zero-outs the samples that should not.
We follow the pseudo-label approach to handle the unlabeled samples. For all labeled samples, $\gamma_i$ is set to $1$. For unlabeled samples, only the ones with high-confidence outputs should get their $\gamma_i$ set to $1$ and remain in the learning process. With a binary classifier, the output value can be utilized to obtain the confidence of the prediction for sample $i$, namely $\max\bigl(f_\theta(x_i), 1 - f_\theta(x_i)\bigr)$. Therefore $\gamma_i$ is defined as:

$$\gamma_i = \begin{cases} 1 & \text{if sample } i \text{ is labeled, or } \max\bigl(f_\theta(x_i), 1 - f_\theta(x_i)\bigr) \geq t \\ 0 & \text{otherwise} \end{cases}$$

where $t$ defines a threshold which controls the degree of confidence needed to consider a predicted label in the learning process.
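The sample-selection rule and the weighted cross-entropy above can be sketched in a few lines of NumPy. The function names and the default threshold are illustrative assumptions, not taken verbatim from the paper:

```python
import numpy as np

def pseudo_label_mask(outputs, labeled, t=0.9):
    """Compute the per-sample selection indicator and target labels.

    outputs: sigmoid outputs f(x_i) in [0, 1]
    labeled: boolean array, True where a ground-truth label exists
    t:       confidence threshold (illustrative default)
    """
    # Pseudo-labels are the hard predictions of the current classifier.
    pseudo = (outputs >= 0.5).astype(int)
    # Confidence of a binary prediction: probability of the predicted class.
    confidence = np.maximum(outputs, 1.0 - outputs)
    # Labeled samples always participate; unlabeled ones only when confident.
    mask = np.where(labeled, 1, (confidence >= t).astype(int))
    return mask, pseudo

def classification_loss(outputs, targets, mask):
    """Mask-weighted mean binary cross-entropy over the retained samples."""
    eps = 1e-7  # avoid log(0)
    o = np.clip(outputs, eps, 1.0 - eps)
    ce = -(targets * np.log(o) + (1 - targets) * np.log(1.0 - o))
    return float(np.sum(mask * ce) / max(np.sum(mask), 1))
```

Note that the pseudo-labels and the mask depend on the current model, so in training they would be recomputed as the classifier improves.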
IV-B Fairness Loss
The second term in the loss imposes fairness on the learned function. As discussed in Section III, a variety of definitions of fairness exist and there is no consensus on which is best. Our approach is quite flexible in that it can work with any fairness objective as long as it is a differentiable function. This capacity to handle and optimize different definitions is a major advantage for a fairness algorithm, since it enables adopting the appropriate fairness definition for each application. In this paper, the following three most common objectives in group fairness are studied with the proposed model.
IV-B1 Demographic Parity
Demographic parity, also referred to as statistical parity, is one of the most common criteria for fairness. It measures the difference between the probabilities of predicting the advantaged output for the protected and unprotected groups, and requires the decision of a classifier to be independent of the protected attribute. Its corresponding loss function, denoted by $\mathcal{L}_{DP}$, is defined as:

$$\mathcal{L}_{DP} = \left| \frac{1}{|S_1|} \sum_{i \in S_1} f_\theta(x_i) \;-\; \frac{1}{|S_0|} \sum_{i \in S_0} f_\theta(x_i) \right|$$

where $S_a$ defines the subset of samples whose protected attribute equals $a$.
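A differentiable relaxation of this gap, using the classifier's soft outputs in place of hard decisions, could look like the following sketch (the function name is illustrative):

```python
import numpy as np

def demographic_parity_loss(outputs, protected):
    """|mean f(x) over a=1  -  mean f(x) over a=0|, on soft outputs."""
    p1 = outputs[protected == 1].mean()
    p0 = outputs[protected == 0].mean()
    return float(abs(p1 - p0))
```

Because each group mean is a linear function of the soft outputs, this term is differentiable almost everywhere and can be minimized jointly with the classification loss.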
Demographic parity is backed by the "four-fifths rule", which recommends that the selection rate for the protected group should not be less than four-fifths of that of the unprotected group unless there exists some business necessity. A selection rate below this ratio is regarded as having an adverse impact on the protected group.
IV-B2 Equalized Opportunity
This measurement focuses on fairness for the advantaged outcome. It measures the difference between the probabilities of predicting the advantaged output for the protected and unprotected groups, restricted to samples whose ground-truth label is advantaged ($y_i = 1$). Its corresponding loss function, denoted by $\mathcal{L}_{EOp}$, is defined as:

$$\mathcal{L}_{EOp} = \left| \frac{1}{|S_1^1|} \sum_{i \in S_1^1} f_\theta(x_i) \;-\; \frac{1}{|S_0^1|} \sum_{i \in S_0^1} f_\theta(x_i) \right|$$

where $S_a^y$ defines the subset of $S_a$ with label attribute $y_i = y$.
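In the same soft-output style as the demographic-parity sketch, the equalized-opportunity gap restricts the comparison to advantaged-label samples (names are illustrative):

```python
import numpy as np

def equalized_opportunity_loss(outputs, protected, labels):
    """Gap in mean soft output between groups, restricted to y = 1."""
    adv = labels == 1
    p1 = outputs[adv & (protected == 1)].mean()
    p0 = outputs[adv & (protected == 0)].mean()
    return float(abs(p1 - p0))
```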
IV-B3 Equalized Odds
This constraint requires the outcome and the protected attribute to be independent conditional on the label. Its corresponding loss, denoted by $\mathcal{L}_{EOd}$, is defined as the following:

$$\mathcal{L}_{EOd} = \sum_{y \in \{0, 1\}} \left| \frac{1}{|S_1^y|} \sum_{i \in S_1^y} f_\theta(x_i) \;-\; \frac{1}{|S_0^y|} \sum_{i \in S_0^y} f_\theta(x_i) \right|$$

It is a stricter criterion than equalized opportunity, as it requires the gap to vanish for both $y = 0$ and $y = 1$. It enforces the accuracy to be equally high for all outcomes, whereas equalized opportunity focuses on the advantaged outcome only.
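The same relaxation extends to equalized odds by summing the per-group gap over both label values (a sketch; the function name is illustrative):

```python
import numpy as np

def equalized_odds_loss(outputs, protected, labels):
    """Sum of the between-group gaps for both label values y = 0 and y = 1."""
    total = 0.0
    for y in (0, 1):
        sel = labels == y
        p1 = outputs[sel & (protected == 1)].mean()
        p0 = outputs[sel & (protected == 0)].mean()
        total += abs(p1 - p0)
    return float(total)
```

The equalized-opportunity term is exactly the `y = 1` summand of this loss, which makes the "stricter criterion" relationship explicit.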
IV-C Model and Training
The classifier function $f_\theta$ is modeled by a multi-layer perceptron (MLP) neural network. The whole model is trained using backpropagation with respect to the loss function in Equation (1). Given a set of samples, we optimize the model using the Adam optimization technique over shuffled mini-batches of the data.
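As an executable illustration of training against a combined accuracy-plus-fairness objective, the following is a minimal sketch, not the paper's implementation: a one-layer logistic model stands in for the MLP, plain gradient descent for Adam, the demographic-parity gap serves as the fairness term, and all samples are treated as labeled (so pseudo-labeling is omitted):

```python
import numpy as np

def train_fair_logistic(X, y, a, lam=1.0, lr=0.5, epochs=500):
    """Minimize (mean cross-entropy) + lam * (demographic-parity gap on
    soft outputs) by full-batch gradient descent on a logistic model."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.01, size=X.shape[1])
    b = 0.0
    n1, n0 = (a == 1).sum(), (a == 0).sum()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # soft outputs f(x)
        # d(mean cross-entropy)/d(logit_i) = (p_i - y_i) / n
        g = (p - y) / len(y)
        # d|mean_1(p) - mean_0(p)|/d(logit_i), using dp/dlogit = p(1 - p)
        gap = p[a == 1].mean() - p[a == 0].mean()
        g_fair = np.sign(gap) * np.where(a == 1, 1.0 / n1, -1.0 / n0) * p * (1 - p)
        g_total = g + lam * g_fair
        w -= lr * (X.T @ g_total)
        b -= lr * g_total.sum()
    return w, b
```

In a neural-network setting the same gradient flow is obtained automatically by backpropagating through the combined loss; only the hand-derived gradients change.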
V Experiments
We evaluate and study our proposed model on the fairness problem, and provide experimental results to support our claim that our semi-supervised approach based on neural networks improves both accuracy and fairness for the classification task.
V-A Dataset
We use the UCI Adult Income Dataset (ADULT) [17, 16] and study the task of predicting whether a person makes more than $50K a year or not. This is one of the most commonly used benchmarks for evaluating classification approaches for fairness. The proportion of high-income individuals differs between men and women, and therefore there is no demographic parity in the dataset. The dataset has 12 features, including both categorical and continuous ones.
Categorical features are encoded using one-hot encoding. The age feature is bucketized at the boundaries [18, 25, 30, 35, 40, 45, 50, 55, 60, 65]. The "sex" feature is considered the protected feature. We have also filtered out the samples with missing values. The remaining samples, with their encoded features, form the post-processed dataset. We randomly chose a portion of the samples for the training set and left the rest for the test set.
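The age bucketization and one-hot encoding described above can be reproduced with `np.digitize` and an identity-matrix lookup; the helper names are illustrative:

```python
import numpy as np

AGE_BOUNDARIES = [18, 25, 30, 35, 40, 45, 50, 55, 60, 65]

def bucketize_age(ages):
    """Map each age to its bucket index: 0 for age < 18, 1 for [18, 25),
    ..., 10 for age >= 65."""
    return np.digitize(ages, AGE_BOUNDARIES)

def one_hot(indices, depth):
    """One-hot encode integer category indices, as done for the
    categorical features of the ADULT dataset."""
    return np.eye(depth)[indices]
```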
V-B Experimental Setting
We compare our results with the model proposed by Manisha et al., a neural-network-based model addressing the fairness problem. To the best of our knowledge, it is the only prior work on the trade-off between fairness and accuracy using neural networks. This model is fully supervised and is trained only on the labeled samples.
The hyperparameters of our proposed algorithm are tuned by validation on a randomly selected portion of the training data. After setting the hyperparameters, the model is trained on the full training set, and the results on the test data are reported in the experiments.
In our experiments, for both SSFair and Manisha et al., a multilayer perceptron (MLP) neural network with a single hidden layer is used to model the function $f_\theta$. Rectified Linear Unit (ReLU) activation is used for the outputs of the hidden layer. Since the task is binary classification, a sigmoid activation is applied at the last layer and its result is taken as the final output. A dropout layer is used after the hidden layer. The regularization parameter $\beta$ is selected for each experiment based on the results of the validation process. Finally, the confidence threshold $t$ is fixed for SSFair.
V-C Experimental Results
In this section, we present the results of our experiments on the ADULT dataset to demonstrate the effectiveness of our semi-supervised learning approach for the fairness problem.
V-C1 The effect of unlabeled data on accuracy and fairness
We would like to verify that unlabeled data can help our algorithm improve both accuracy and fairness. We performed experiments in which we increased the number of unlabeled samples while keeping the number of labeled samples fixed, to investigate the effect of adding unlabeled data. We experimented with three different values of the trade-off parameter $\lambda$.
The plot of fairness loss versus the number of unlabeled samples is illustrated in Figure 1. For calculating the fairness loss, the output of the classifier $f_\theta(x)$ is binarized with a fixed threshold to provide a binary outcome, and demographic parity is selected as the fairness loss in this part. As these plots suggest, the fairness loss improves as we increase the size of the unlabeled set (note that lower fairness loss corresponds to higher fairness). This experiment verifies that fairness in our model can benefit from unlabeled data, and that our approach is therefore successful in utilizing unlabeled data to improve fairness. Moreover, we paid special attention to the existing trade-off between accuracy and fairness: in particular, we were interested in whether the improvement in fairness from increasing the amount of unlabeled data comes at the cost of accuracy. To understand this effect, the plot of accuracy versus the number of unlabeled samples is illustrated in Figure 2. As the plots show, the accuracy of the classifier also increases as we increase the number of unlabeled samples. This result validates that our approach can use additional unlabeled data to improve both accuracy and fairness.
V-C2 Comparison against fully supervised approach
We demonstrate the benefit of our semi-supervised learning approach for the fairness problem versus a fully supervised model. We experimented with varying numbers of labeled samples. For each experiment, we randomly chose the required number of samples from the training set and kept their ground-truth labels, while marking the labels of the remaining samples as unknown.
The results are illustrated in Figures 3, 4, and 5. Different points on the curves are obtained by varying the parameter $\lambda$ over a range of values to impose different levels of fairness on the classifier. Generally, there exists a trade-off between accuracy and fairness, and $\lambda$ controls this trade-off: increasing $\lambda$ imposes more fairness at the cost of accuracy. For each value of $\lambda$, the experiment is repeated five times and the averaged results are reported.
When comparing two algorithms, the one that produces higher accuracy while maintaining the same level of fairness loss is considered superior. As is evident from the results, SSFair provides higher accuracy at the same level of fairness loss compared to the approach of Manisha et al. This conclusion is consistent for all three fairness measurements, suggesting the effectiveness of exploiting unlabeled data through semi-supervised learning for the fairness problem. It is worth noting that the effect of using unlabeled data is more evident with fewer labeled samples, indicating that this approach is most helpful in scenarios with scarce labeled data.
Our understanding of this behavior is that, since unlabeled data does not include any label information, it does not carry the biases held in labels either. Therefore, it can be beneficial not only to the accuracy but also to the fairness of the classifier. Our experiments show that SSFair is capable of exploiting the structure and information of unlabeled data to increase accuracy and fairness compared to a fully supervised model.
VI Conclusion
We proposed a classifier based on neural networks for semi-supervised learning to tackle the fairness problem. The proposed model, SSFair, employs the pseudo-labeling approach to exploit the information in the unlabeled data. We studied the effect of using unlabeled data on learning a fair classifier, and showed experimentally that unlabeled data can be beneficial not just for accuracy but also for fairness. SSFair is evaluated on three fairness measurements: demographic parity, equalized opportunity, and equalized odds. The experiments demonstrate that semi-supervised learning can achieve higher fairness and accuracy than approaches that use labeled data only.
-  (2018) Auditing black-box models for indirect influence. Knowledge and Information Systems 54 (1), pp. 95–122. Cited by: §II.
-  (2018) A reductions approach to fair classification. In International Conference on Machine Learning, pp. 60–69. Cited by: §I, §II.
-  (2016) Big data’s disparate impact. Calif. L. Rev. 104, pp. 671. Cited by: §I, §IV-B1.
-  (2004) The four-fifths rule for assessing adverse impact: an arithmetic, intuitive, and logical analysis of the rule and implications for future research and practice. In Research in personnel and human resources management, pp. 177–198. Cited by: §IV-B1.
-  (2017) Optimized pre-processing for discrimination prevention. In Advances in Neural Information Processing Systems, pp. 3992–4001. Cited by: §II.
-  (2019) Improved adversarial learning for fair classification. arXiv preprint arXiv:1901.10443. Cited by: §II.
-  (2017) Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. Big data 5 (2), pp. 153–163. Cited by: §I.
-  (2012) Fairness through awareness. In Proceedings of the 3rd innovations in theoretical computer science conference, pp. 214–226. Cited by: §III, §III.
-  (2014) Generative adversarial nets. In Advances in neural information processing systems, pp. 2672–2680. Cited by: §II.
-  (2019) Obtaining fairness using optimal transport theory. In International Conference on Machine Learning, pp. 2357–2365. Cited by: §II.
-  (2016) Equality of opportunity in supervised learning. In Advances in neural information processing systems, pp. 3315–3323. Cited by: §I, §I, §III.
-  (2012) Decision theory for discrimination-aware classification. In 2012 IEEE 12th International Conference on Data Mining, pp. 924–929. Cited by: §II.
-  (2018) Fairness through computationally-bounded awareness. In Advances in Neural Information Processing Systems, pp. 4842–4852. Cited by: §III, §III.
-  (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980. Cited by: §IV-C, §V-B.
-  (2017) Inherent trade-offs in the fair determination of risk scores. In 8th Innovations in Theoretical Computer Science Conference (ITCS 2017), Vol. 67, pp. 43. Cited by: §III.
-  (1996) . In Kdd, Vol. 96, pp. 202–207. Cited by: §V-A.
-  Adult data set. Note: UCI machine learning repository Cited by: §V-A.
-  (2013) Pseudo-label: the simple and efficient semi-supervised learning method for deep neural networks. In ICML Workshop on Challenges in Representation Learning, Vol. 3, pp. 2. Cited by: §I, §IV-A.
-  The variational fair autoencoder. stat 1050, pp. 4. Cited by: §I, §II.
-  (2017) Learning to pivot with adversarial networks. In Advances in neural information processing systems, pp. 981–990. Cited by: §II.
-  (2016) A statistical framework for fair predictive algorithms. stat 1050, pp. 25. Cited by: §I.
-  (2018) Learning adversarially fair and transferable representations. In International Conference on Machine Learning, pp. 3381–3390. Cited by: §II.
-  (2018) A neural network framework for fair classifier. arXiv preprint arXiv:1811.00247. Cited by: §II, Fig. 3, Fig. 4, Fig. 5, §V-B, §V-B.
-  (2018) Realistic evaluation of deep semi-supervised learning algorithms. In Advances in Neural Information Processing Systems, pp. 3235–3246. Cited by: §I.
-  (2017) On fairness and calibration. In Advances in Neural Information Processing Systems, pp. 5680–5689. Cited by: §III.
-  (2018) Achieving fairness through adversarial learning: an application to recidivism prediction. arXiv preprint arXiv:1807.00199. Cited by: §II.
-  (2017) Fairness beyond disparate treatment & disparate impact: learning classification without disparate mistreatment. In Proceedings of the 26th International Conference on World Wide Web, pp. 1171–1180. Cited by: §II.
-  (2017) Fairness constraints: mechanisms for fair classification. In Artificial Intelligence and Statistics, pp. 962–970. Cited by: §I.
-  (2013) Learning fair representations. In International Conference on Machine Learning, pp. 325–333. Cited by: §III.
-  (2018) Mitigating unwanted biases with adversarial learning. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 335–340. Cited by: §II.