Since many applications of machine learning models have far-reaching consequences for people (credit approval, recidivism scores, etc.), there is growing concern about their potential to reproduce discrimination against particular groups based on sensitive characteristics such as gender, race, religion, or others. Many bias mitigation strategies for machine learning have been proposed in recent years; however, most of them focus on neural networks. Ensemble methods combining several decision tree classifiers have proven very efficient for various applications. In practice, for tabular data sets, actuaries and data scientists therefore prefer gradient tree boosting over neural networks due to its generally higher accuracy. Our field of interest is the development of fair classifiers based on decision trees. In this paper, we propose a novel approach that combines the strength of gradient tree boosting with an adversarial fairness constraint. The contributions of this paper are as follows:
We apply adversarial learning for fair classification on decision trees;
We empirically compare our proposal and its variants with several state-of-the-art approaches for two different fairness metrics. Experiments show the strong performance of our approach.
II Fair Machine Learning
II-A Definitions of Fairness
Throughout this document, we consider a classical supervised classification problem with training examples $\{(x_i, s_i, y_i)\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^d$ is the feature vector with $d$ predictors of the $i$-th example, $s_i \in \{0, 1\}$ is its binary sensitive attribute and $y_i \in \{0, 1\}$ its binary label.
In order to achieve fairness, it is essential to establish a clear understanding of its formal definition. First, there is information sanitization, which limits the data used for training the classifier. Then, there is individual fairness, which binds at the individual level and suggests that similar individuals should be treated similarly. Finally, there is statistical or group fairness. This kind of fairness partitions the world into groups defined by one or several high-level sensitive attributes and requires that a specific relevant statistic about the classifier be equal across those groups. In the following, we focus on this family of fairness measures and explain the most popular definitions of this type used in recent research.
II-A1 Demographic Parity
Based on this definition, a classifier is considered fair if the prediction $\hat{y}$ from the features $x$ is independent of the protected attribute $s$ [Dwork2011]: $P(\hat{y} = 1 \mid s = 1) = P(\hat{y} = 1 \mid s = 0)$.
There are multiple ways to assess this objective. The p-rule assessment requires that the ratio of the positive rates between the unprivileged and the privileged group be no less than a fixed threshold $p/100$:
$$\min\left(\frac{P(\hat{y}=1 \mid s=1)}{P(\hat{y}=1 \mid s=0)}, \frac{P(\hat{y}=1 \mid s=0)}{P(\hat{y}=1 \mid s=1)}\right) \geq \frac{p}{100}.$$
The classifier is considered totally fair when this ratio satisfies a 100%-rule. Conversely, a 0%-rule indicates a completely unfair model.
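As an illustration, the p-rule can be computed from model outputs as in the following minimal sketch (the function name and the 0.5 decision threshold are our own choices, not prescribed by the metric):

```python
def p_rule(y_pred, s, threshold=0.5):
    """Ratio of positive rates between the two demographic groups.

    y_pred: iterable of predicted probabilities; s: binary sensitive attribute.
    Returns a value in [0, 1]; 1.0 corresponds to a 100%-rule (total fairness).
    """
    pos = [p >= threshold for p in y_pred]
    rate = {}
    for group in (0, 1):
        members = [pos[i] for i in range(len(s)) if s[i] == group]
        rate[group] = sum(members) / len(members)  # positive rate per group
    if rate[0] == 0 or rate[1] == 0:
        return 0.0  # one group never receives the positive outcome
    return min(rate[1] / rate[0], rate[0] / rate[1])
```

For instance, a model that assigns the positive class to half of group $s=1$ but to all of group $s=0$ satisfies only a 50%-rule.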
II-A2 Equalized Odds
An algorithm is considered fair if, across both demographics $s=0$ and $s=1$, the predictor has equal true positive rates for the outcome $y=1$, and equal false positive rates for $y=0$ [Hardt2016]. This constraint enforces that accuracy is equally high in all demographics, since the rate of positive and negative classification is equal across the groups.
A metric to assess this objective is disparate mistreatment (DM) [Zafar2017]. It computes the absolute difference between the false positive rates (FPR) and between the false negative rates (FNR) of the two demographics:
$$D_{FPR} = \left|P(\hat{y}=1 \mid y=0, s=1) - P(\hat{y}=1 \mid y=0, s=0)\right|,$$
$$D_{FNR} = \left|P(\hat{y}=0 \mid y=1, s=1) - P(\hat{y}=0 \mid y=1, s=0)\right|.$$
The closer the values of $D_{FPR}$ and $D_{FNR}$ are to 0, the lower the degree of disparate mistreatment of the classifier.
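These two quantities can be computed directly from predictions; a small sketch, with names of our own choosing:

```python
def disparate_mistreatment(y_true, y_pred, s, threshold=0.5):
    """Return (D_FPR, D_FNR) between groups s=0 and s=1."""
    def rates(group):
        idx = [i for i in range(len(s)) if s[i] == group]
        fp = sum(1 for i in idx if y_pred[i] >= threshold and y_true[i] == 0)
        fn = sum(1 for i in idx if y_pred[i] < threshold and y_true[i] == 1)
        neg = sum(1 for i in idx if y_true[i] == 0)  # actual negatives
        pos = sum(1 for i in idx if y_true[i] == 1)  # actual positives
        return fp / neg, fn / pos
    fpr0, fnr0 = rates(0)
    fpr1, fnr1 = rates(1)
    return abs(fpr0 - fpr1), abs(fnr0 - fnr1)
```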
II-B Related Work
Recent research in fair machine learning has made considerable progress in quantifying and mitigating undesired bias. Three different types of mitigation strategies exist: the family of “pre-processing” algorithms, which ensure that the input data is fair; “in-processing” methods, where the undesired bias is mitigated directly during the training phase; and finally “post-processing” algorithms, where the output of the trained classifier is modified.
We propose an “in-processing” algorithm, where undesired bias is mitigated directly during the training phase. A straightforward approach to achieve this goal is to integrate a fairness penalty directly into the loss function. One such algorithm integrates a decision boundary covariance constraint for logistic regression or linear SVM [Zafar2015]. In another approach, a meta-algorithm takes the fairness metric as part of the input and returns a new classifier optimized towards that fairness metric [DBLP:journals/corr/abs-1806-06055]. Furthermore, the emergence of generative adversarial networks (GANs) [NIPS2014_5423] provided the required underpinning for fair classification using adversarial debiasing. In this field, a neural network classifier is trained to predict the label $y$, while simultaneously minimizing the ability of an adversarial neural network to predict the sensitive attribute $s$ [Zhang2018, Wadsworth2018, Louppe2016].
III Fair Adversarial Gradient Tree Boosting (FAGTB)
Our aim is to learn a classifier that is both effective for predicting true labels and fair, in the sense that it optimizes the metrics defined in Section II-A for demographic parity or equalized odds. The idea is to leverage the strong performance of gradient tree boosting (GTB) for classification, while adapting it for fair machine learning via adversarial learning.
GTB proceeds sequentially, by gradient iteration, to define a prediction function of the form:
$$F_M(x) = \sum_{m=1}^{M} \alpha_m h_m(x),$$
where $x$ is a predictor vector, $M$ is the total number of iterations, and $h_m$ corresponds to a weak learner at step $m$, in the form of a greedy CART prediction, weighted by $\alpha_m$. Given a loss function $\mathcal{L}(y_i, F(x_i))$ to minimize for all $(x_i, y_i)$ from the training set, GTB calculates at each step $m$ the so-called “pseudo residuals”:
$$r_{i,m} = -\left[\frac{\partial \mathcal{L}(y_i, F(x_i))}{\partial F(x_i)}\right]_{F = F_{m-1}}.$$
Then, at each step, GTB fits a new weak learner $h_m$ to those pseudo residuals and adds it to the current model. This step is repeated until the algorithm converges.
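For the binary cross-entropy loss used below, the pseudo residuals have a simple closed form; a minimal sketch, assuming the binomial deviance loss on raw scores $F(x_i)$:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def pseudo_residuals(y, F_vals):
    """Negative gradient of the log-loss w.r.t. the current scores F(x_i).

    For L = -y log p - (1-y) log(1-p) with p = sigmoid(F), the pseudo
    residual simplifies to y - p.
    """
    return [y_i - sigmoid(f_i) for y_i, f_i in zip(y, F_vals)]
```

At initialization (all scores zero, so $p_i = 0.5$), the residuals are simply $\pm 0.5$ depending on the label.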
This architecture allows us to apply the concept of adversarial learning, which corresponds to a two-player game between two contradictory components, as in generative adversarial networks (GANs) [NIPS2014_5423], to fair classification with decision tree algorithms.
III-A Min-Max Formulation
In the vein of [Zhang2018, Louppe2016, Wadsworth2018] for fair classification, we consider a predictor function $F$, which outputs the probability of an input vector $x$ being labeled $y=1$, and an adversarial model $A$, which tries to predict the sensitive attribute $s$ from the output of $F$. Depending on the accuracy of the adversarial algorithm, we penalize the gradient of the GTB at each iteration. The goal is to obtain a classifier whose outputs do not allow the adversarial function to reconstruct the value of the sensitive attribute. If this objective is achieved, the data bias in favor of some demographics has disappeared from the output prediction.
The predictor and the adversarial classifiers are optimized simultaneously in a min-max game defined as:
$$\arg\min_{F} \max_{\theta} \sum_{i=1}^{n} \mathcal{L}_F\big(F(x_i), y_i\big) - \lambda \, \mathcal{L}_A\big(A_\theta(F(x_i)), s_i\big),$$
where $\mathcal{L}_F$ and $\mathcal{L}_A$ are respectively the predictor and the adversary loss for the training sample $i$ given $F(x_i)$, which refers to the output of the GTB predictor for input $x_i$. The hyperparameter $\lambda$ controls the impact of the adversarial loss.
The targeted classifier outputs the label $\hat{y}$ which maximizes the posterior probability $P(y \mid x)$. Thus, for a given sample $x_i$, we get:
$$\hat{y}_i = \arg\max_{y \in \{0,1\}} P(y \mid x_i),$$
where $P(y=1 \mid x_i) = \sigma(F(x_i))$, with $\sigma$ denoting the sigmoid function. Therefore, $\mathcal{L}_F$ is defined as the negative log-likelihood of the predictor for the training sample $i$:
$$\mathcal{L}_F\big(F(x_i), y_i\big) = -\mathbb{1}_{\{y_i=1\}} \log \sigma(F(x_i)) - \mathbb{1}_{\{y_i=0\}} \log\big(1 - \sigma(F(x_i))\big),$$
where $\mathbb{1}_{\{cond\}}$ equals 1 if $cond$ is true, 0 otherwise.
The adversary $A_\theta$ corresponds to a neural network with parameters $\theta$, which takes as input the sigmoid of the predictor’s output for any sample $i$ (i.e., $p_i = \sigma(F(x_i))$), and outputs the probability $\hat{s}_i$ for the sensitive attribute $s_i$ to equal 1.
For the demographic parity task, $p_i$ is the only input given to the adversary for the prediction of the sensitive attribute $s_i$. In that case, the network outputs the conditional probability $\hat{s}_i = P_\theta(s_i = 1 \mid p_i)$.
For the equalized odds task, the label $y_i$ is concatenated to $p_i$ to form the input vector of the adversary, $z_i = (p_i, y_i)$, so that the function $A_\theta$ is able to output different conditional probabilities depending on the label $y_i$ of sample $i$.
The adversary loss is defined for any training sample $i$ as:
$$\mathcal{L}_A\big(A_\theta(z_i), s_i\big) = -s_i \log A_\theta(z_i) - (1 - s_i) \log\big(1 - A_\theta(z_i)\big),$$
with the adversary input $z_i$ defined according to the task as detailed above ($z_i = p_i$ for demographic parity, $z_i = (p_i, y_i)$ for equalized odds).
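The adversary input construction and its loss can be sketched as follows (function names and the small `eps` guard are our own implementation choices):

```python
import math

def adversary_input(p, y=None):
    """z_i for the adversary: [p_i] for demographic parity,
    [p_i, y_i] for equalized odds."""
    return [p] if y is None else [p, y]

def adversary_loss(a_out, s_true):
    """Negative log-likelihood of the adversary's prediction a_out = A(z)
    for a sample whose true sensitive attribute is s_true."""
    eps = 1e-12  # numerical guard against log(0)
    return -(s_true * math.log(a_out + eps)
             + (1 - s_true) * math.log(1 - a_out + eps))
```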
Note that, for the case of demographic parity, if there exists a classifier $F^*$ such that $\sigma(F^*(x_i)) = y_i$ for every sample of the training set, and $\hat{P}(s \mid \sigma(F^*(x))) = \hat{P}(s)$, with $\hat{P}(s \mid \sigma(F^*(x)))$ and $\hat{P}(s)$ the corresponding empirical distributions on the training set, then $F^*$, together with the corresponding optimal adversary, is a global optimum of our min-max problem. In that case, we have both a perfect classifier in training and a completely fair model, since the best possible adversary is not able to predict $s$ more accurately than the estimated prior distribution. Similar observations can easily be made for the equalized odds task (by replacing $\hat{P}(s)$ by $\hat{P}(s \mid y)$ and using the corresponding definition of $z_i$ in the previous assertion). While such a perfect setting does not always exist in the data, it shows that the model is able to identify such a solution when it reaches one. If a perfect solution does not exist in the data, the optimum of our min-max problem is a trade-off between prediction accuracy and fairness, controlled by the hyperparameter $\lambda$.
The learning process is outlined as pseudo code in Algorithm 1. The algorithm first initializes the classifier with a constant value for all inputs, as done for classical GTB. Additionally, it initializes the parameters $\theta$ of the adversarial neural network (a Xavier initialization is used in our experiments). Then, at each iteration $m$, beyond calculating the pseudo residuals for any training sample $i$ w.r.t. the targeted prediction loss $\mathcal{L}_F$, it computes pseudo residuals for the adversarial loss $\mathcal{L}_A$ too. Both residuals are combined in $r_{i,m}$, where $\lambda$ controls the impact of the adversarial network. The algorithm then fits a new weak regressor $h_m$ (a decision tree in our work) to those residuals using the training set. This pseudo-residuals regressor is supposed to correct both prediction and adversarial biases of the old classifier $F_{m-1}$. It is added to it after a line search step, which determines the best weight $\alpha_m$ to assign to $h_m$ in the new classifier $F_m$. Finally, the adversary adapts its weights $\theta$ according to the new outputs $\sigma(F_m(x_i))$, by gradient backpropagation. A schematic representation of our approach can be found in Figure 1.
Algorithm 1: Fair Adversarial Gradient Tree Boosting (FAGTB)
Input: a training set $\{(x_i, s_i, y_i)\}_{i=1}^{n}$, a number of iterations $M$, an adversarial learning rate $\eta$, a differentiable loss function $\mathcal{L}_F$ for the output classifier and $\mathcal{L}_A$ for the adversarial classifier.
Initialize: calculate the constant value $F_0 = \arg\min_{\gamma} \sum_{i=1}^{n} \mathcal{L}_F(\gamma, y_i)$
Initialize the parameters $\theta$ of the neural network
for $m = 1$ to $M$ do
Calculate the pseudo residuals of the predictor: $r^{pred}_{i,m} = -\left[\partial \mathcal{L}_F(F_{m-1}(x_i), y_i) / \partial F_{m-1}(x_i)\right]$
Calculate the pseudo residuals of the adversary from the input $\sigma(F_{m-1}(x_i))$: $r^{adv}_{i,m} = -\left[\partial \mathcal{L}_A(A_\theta(\sigma(F_{m-1}(x_i))), s_i) / \partial F_{m-1}(x_i)\right]$
Calculate the combined training loss derivative: $r_{i,m} = r^{pred}_{i,m} - \lambda \, r^{adv}_{i,m}$
Fit a classifier $h_m$ to the pseudo residuals using the training set $\{(x_i, r_{i,m})\}_{i=1}^{n}$
Compute the multiplier $\alpha_m$ by solving the following one-dimensional optimization problem: $\alpha_m = \arg\min_{\alpha} \sum_{i=1}^{n} \mathcal{L}_F(F_{m-1}(x_i) + \alpha h_m(x_i), y_i)$
Update the learned model: $F_m(x) = F_{m-1}(x) + \alpha_m h_m(x)$
Fit the adversary $A_\theta$ to the sensitive attributes using the new outputs (i.e., using the training set $\{(\sigma(F_m(x_i)), s_i)\}_{i=1}^{n}$)
end for
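To make the training loop concrete, here is a self-contained sketch of the simplest variant (a single logistic unit as adversary, in the demographic parity setting), using depth-1 regression stumps as weak learners and a fixed learning rate in place of the line-search step as a simplification. All function and hyperparameter names (`fit_stump`, `lam`, `lr`, `adv_lr`, etc.) are illustrative, not from the paper:

```python
import math
import random

def sigmoid(z):
    z = max(min(z, 35.0), -35.0)  # clamp for numerical stability
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Weak learner h_m: a depth-1 regression tree fitted to pseudo residuals."""
    n, d = len(X), len(X[0])
    best = None  # (sse, feature, threshold, left_value, right_value)
    for j in range(d):
        for t in sorted(set(x[j] for x in X)):
            left = [residuals[i] for i in range(n) if X[i][j] <= t]
            right = [residuals[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lv) ** 2 for r in left)
                   + sum((r - rv) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    _, j, t, lv, rv = best
    return lambda x: lv if x[j] <= t else rv

def train_fagtb(X, y, s, M=20, lam=1.0, lr=0.3, adv_lr=0.5):
    """Sketch of a FAGTB-1-Unit-style loop (demographic parity setting):
    the adversary is one logistic unit predicting s from sigmoid(F(x))."""
    n = len(y)
    F0 = math.log(sum(y) / (n - sum(y)))  # constant init: empirical log-odds
    F = [F0] * n
    w, b = random.uniform(-0.1, 0.1), 0.0  # adversary parameters
    trees = []
    for _ in range(M):
        p = [sigmoid(f) for f in F]
        a = [sigmoid(w * pi + b) for pi in p]  # adversary outputs
        resid = []
        for i in range(n):
            r_pred = y[i] - p[i]                           # -dL_F/dF(x_i)
            r_adv = (s[i] - a[i]) * w * p[i] * (1 - p[i])  # -dL_A/dF(x_i)
            resid.append(r_pred - lam * r_adv)             # combined residual
        h = fit_stump(X, resid)
        trees.append(h)
        F = [F[i] + lr * h(X[i]) for i in range(n)]
        # Adversary takes one gradient step on its own loss L_A.
        p = [sigmoid(f) for f in F]
        a = [sigmoid(w * pi + b) for pi in p]
        w -= adv_lr * sum((a[i] - s[i]) * p[i] for i in range(n)) / n
        b -= adv_lr * sum(a[i] - s[i] for i in range(n)) / n
    return lambda x: sigmoid(F0 + lr * sum(h(x) for h in trees))
```

With `lam=0` this reduces to plain gradient tree boosting; increasing `lam` trades accuracy for fairness, as discussed below.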
IV Experiments
For our experiments, we use four popular data sets often used in fair classification: the Adult UCI Income data set [Dua:2019], the COMPAS data set [angwin2016machine], the Default data set [Yeh:2009:CDM:1464526.1465163], and the Bank Marketing data set [Moro2014].
For all data sets, we repeat 10 experiments by randomly sampling two subsets, 80% for the training set and 20% for the test set. Finally, we report the average of the accuracy and the fairness metrics from the test set.
Because different optimization objectives result in different algorithms, we run separate experiments for the two fairness metrics of interest: demographic parity (Table I) and equalized odds (Table II). More specifically, for demographic parity we aim at a p-rule of 90% for all algorithms and then compare accuracy. When optimizing for equalized odds, results are more difficult to compare. In order to be able to compare accuracy, we have done our best to obtain, each time, a disparate mistreatment level below 0.03.
As a baseline, we use a classical, “unfair” gradient tree boosting algorithm, Standard GTB, and a deep neural network, Standard NN.
Further, to evaluate whether the complexity of the adversarial network has an impact on the quality of the results, we compare a simple logistic regression adversary, FAGTB-1-Unit, with a complex deep neural network, FAGTB-NN.
In addition to the algorithms mentioned above, we evaluate the following fair state-of-the-art in-processing algorithms: Wadsworth2018 [Wadsworth2018] (https://github.com/equialgo/fairness-in-ml), Zhang2018 [Zhang2018] (https://github.com/IBM/AIF360), Kamishima [Kamishima2012], Feldman [Feldman2014], Zafar-DI [zafar2017parity] and Zafar-DM [Zafar2017] (the latter four from https://github.com/algofairness/fairness-comparison).
For each algorithm and each data set, we obtain the best hyperparameters by grid search in 5-fold cross validation (specific to each of them). As a reminder, for FAGTB the value $\lambda$ is used to balance the two cost functions during the training phase. This value depends exclusively on the main objective: for example, to reach the demographic parity objective with a 90% p-rule, we choose a lower, and thus less weighty, $\lambda$ than for a 100% p-rule objective. In order to better understand this hyperparameter, we illustrate its impact on accuracy and the p-rule metric in Figure 2 for the Adult UCI data set. For that, we train the FAGTB-NN algorithm with 10 different values of $\lambda$ and run each experiment 10 times. In the graph, we report the accuracy and the p-rule fairness metric, and plot a second-order polynomial regression to show the general effect.
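The selection criterion implied by this protocol (best accuracy among runs meeting the fairness target) can be sketched as follows; the tuple layout `(lam, accuracy, p_rule)` is our own convention:

```python
def select_lambda(results, target_prule=0.90):
    """results: list of (lam, accuracy, p_rule) tuples from cross-validation.
    Pick the run with the best accuracy among those meeting the fairness
    target; if none qualifies, fall back to the fairest run."""
    feasible = [r for r in results if r[2] >= target_prule]
    if not feasible:
        return max(results, key=lambda r: r[2])  # fairest run available
    return max(feasible, key=lambda r: r[1])     # most accurate feasible run
```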
For Standard GTB, we parameterize the number of trees and the maximum tree depth; for the Bank data set, for example, a small tree depth with a moderate number of trees is sufficient. For the Standard NN, we parameterize the number of hidden layers and units, with ReLU activations, and apply a specific dropout regularization to avoid overfitting. Further, we use Adam optimization with a binary cross-entropy loss. For the Adult UCI data set, for example, the architecture consists of 2 hidden layers with 16 and 8 units, respectively, and ReLU activations. The output layer comprises one single node with sigmoid activation.
For FAGTB, to accelerate the learning phase, we decided to sacrifice some performance by replacing the one-dimensional optimization of $\alpha_m$ with a specific fixed learning rate for the classifier predictor. All hyperparameters mentioned above, for trees and neural networks, are selected jointly. Notice that those choices impact the speed of convergence of each component. For example, if the classifier predictor converges too quickly, this may result in biased prediction probabilities during the first iterations, which are difficult for the adversary to correct afterwards. For FAGTB-NN, in order to achieve better results, we execute several training iterations of the adversarial NN for each gradient boosting iteration. This produces a more persistent adversary; otherwise, the predictor classifier GTB could dominate it. At the first iteration, we begin by fitting a biased GTB and then fit the adversarial NN on those biased predictions. This approach allows a better weight initialization of the adversarial NN, more suitable for the specific bias in the data set. Without this specific initialization, we encountered cases where the predictor classifier surpasses the adversary too quickly and tends to dominate from the beginning. Compared to FAGTB-NN, the adversary of FAGTB-1-Unit is simpler: its two parameters are initialized randomly, and for each gradient boosting iteration only one training iteration is performed for the adversarial unit.
For demographic parity (Table I), Standard GTB and Standard NN achieve the highest accuracy, as expected. However, they are also the most biased. For example, the classical gradient tree boosting algorithm achieves a 32.6% p-rule on the Adult UCI data set: in this particular case, the predicted probability of earning a salary above $50,000 is on average more than three times higher for men than for women. Comparing the mitigation algorithms, FAGTB-NN achieves the best result, with the highest accuracy while maintaining a reasonably high p-rule equality (90%). The choice of a neural network architecture for the adversary proved consistently better than a simple logistic regression. This is particularly true for the COMPAS data set where, for a similar p-rule, the difference in accuracy is considerable (2.7 points). Recall that for demographic parity the adversarial classifier has one single input feature, the output of the prediction classifier; it seems necessary to be able to segment this input in several ways to better capture information relevant to predicting the sensitive attribute. The sacrifice of accuracy is less important for the Bank and the Default data sets: the dependence between the sensitive attribute and the target label is thus weaker than for the COMPAS data set. To achieve a p-rule of 90%, we sacrifice 4.6 points of accuracy (comparing GTB and FAGTB-NN) for COMPAS, 0.7 points for Default, and 0.6 points for Bank.
In Figure 3, we plot the distribution of the predicted probabilities for each value of the sensitive attribute for 3 different models: an unfair model with $\lambda = 0$, and 2 fair FAGTB models with increasingly large values of $\lambda$. For the unfair model, the distributions differ most for the lower probabilities. The second graph shows an improvement, but some differences remain. For the final one, the distributions are practically aligned.
Zhang2018 [Zhang2018] introduced a projection term which ensures that the predictor never moves in a direction that could help the adversary. While this is an interesting approach, we noticed that this term does not improve the results for demographic parity. In fact, the Wadsworth2018 [Wadsworth2018] algorithm follows the same approach but without the projection term and obtains similar results.
Table I: Comparison of our approach with different common fair algorithms in terms of accuracy and fairness (p-rule metric) for the Adult UCI, COMPAS, Default and Bank data sets.
Table II: Comparison of our approach with different common fair algorithms in terms of accuracy and fairness ($D_{FPR}$, $D_{FNR}$) for the Adult UCI, COMPAS, Default and Bank data sets.
For equalized odds, the min-max optimization is more difficult than for demographic parity. The fairness metrics $D_{FPR}$ and $D_{FNR}$ are not exactly comparable across methods, so we did not succeed in obtaining the same level of fairness for all of them. However, we notice that FAGTB-NN achieves better accuracy with a reasonable level of fairness. Concretely, across the 4 data sets and for both metrics, we achieve values of 0.02 or less, except for the Bank data set, where $D_{FNR}$ is equal to 0.07. For this data set, most of the state-of-the-art algorithms result in a $D_{FNR}$ between 0.06 and 0.08. The reason why it proves hard to achieve a low false negative rate (FNR) is that the total share of the positive target class is very low, at 11.7%. A possible way to handle this imbalanced target class could be to add a specific weight directly in the loss function. We also notice that the difference in results between FAGTB-1-Unit and FAGTB-NN is much more significant here; one possible reason is that a single logistic regression unit cannot retain a sufficient amount of information to predict the sensitive attribute.
V Conclusion
In this work, we developed a new approach to produce fair gradient tree boosting algorithms. Compared with other state-of-the-art algorithms, our method proves more efficient in terms of accuracy while obtaining a similar level of fairness.