Large-scale algorithmic decision making, often driven by machine learning on consumer data, has increasingly run afoul of various social norms, laws and regulations. A prominent concern is when a learned model exhibits discrimination against some demographic group, perhaps based on race or gender. Concerns over such algorithmic discrimination have led to a recent flurry of research on fairness in machine learning, which includes both new tools and methods for designing fair models, and studies of the tradeoffs between predictive accuracy and fairness [ACM, 2019].
At the same time, both recent and longstanding laws and regulations often restrict the use of “sensitive” or protected attributes in algorithmic decision-making. U.S. law prevents the use of race in the development or deployment of consumer lending or credit scoring models, and recent provisions in the E.U. General Data Protection Regulation (GDPR) restrict or prevent even the collection of racial data for consumers. These two developments — the demand for non-discriminatory algorithms and models on the one hand, and the restriction on the collection or use of protected attributes on the other — present technical conundrums, since the most straightforward methods for ensuring fairness generally require knowing or using the attribute being protected. It seems difficult to guarantee that a trained model is not discriminating against, say, a racial group if we cannot even identify members of that group in the data.
A recent paper [Kilbertus et al., 2018] made these cogent observations, and proposed an interesting solution employing the cryptographic tool of secure multiparty computation (commonly abbreviated MPC). In their model, we imagine a commercial entity with access to consumer data that excludes race, but this entity would like to build a predictive model for, say, commercial lending, under the constraint that the model be non-discriminatory by race with respect to some standard fairness notion (e.g. equality of false rejection rates). In order to do so, the company engages in MPC with a trusted party, such as a regulatory agency, who does have access to the race data for the same consumers. Together the company and the regulator apply standard fair machine learning techniques in a distributed fashion. In this way the company never directly accesses the race data, but still manages to produce a fair model, which is the output of the MPC. The guarantee provided by this solution is the standard one of MPC — namely, the company learns nothing more than whatever is implied by its own consumer data, and the fair model returned by the protocol.
Our point of departure stems from our assertion that MPC is the wrong guarantee to give if our motivation is ensuring that data about an individual’s race does not “leak” to the company via the model. In particular, MPC implies nothing about what individual information can already be inferred from the learned model itself. The guarantee we would prefer is that the company’s data and the fair model do not leak anything about an individual’s race beyond what can be inferred from “population level” correlations. That is, the fair model should not leak anything beyond inferences that could be carried out even if the individual in question had declined to provide her racial identity. This is exactly the type of promise made by differential privacy [Dwork et al., 2006b], but not by MPC.
The insufficiency of MPC for protecting privacy. To emphasize the fact that concerns over leakage of protected attributes under the guarantee of MPC are more than hypothetical, we describe a natural example where this leakage would actually occur.
Example. Let $X$ denote the (non-protected) data on some group of consumers held by the company, and let $A$ denote the data indicating the race of each consumer represented in $X$. Importantly, in the example we can think of race as being uncorrelated with anything else, including the company data $X$. In particular, the reader can imagine that the race of each consumer is determined by an unbiased independent coin flip. Nevertheless the output of the MPC will allow the company to infer race.
Suppose that the model and learning algorithm used by the company and regulator are Support Vector Machines (SVM), and that the learned model uses the attributes in both $X$ and $A$ as an input; that is, the learned model has the form $h(x, a)$ where $x$ are the unprotected attributes and $a$ is race. An SVM model is represented by the underlying support vectors, which will be the labeled instances in $(X, A)$ that specify the maximum margin separating hyperplane in the kernel space used. From the MPC the company thus learns the race of the support vectors, regardless of any fairness constraint, and despite the fact that $A$ is uncorrelated with $X$. We note that there exist differentially private implementations of SVMs that would avoid such leakage.
The reader might object that, in this example, the algorithm is trained to use racial data at test time, and so the output of the algorithm is directly affected by race. But there are also natural examples in which the same problems with MPC can arise even when race is not an input to the learned model, and race is again uncorrelated with the company’s data. We also note that SVMs are just an extreme case of a learned model fitting its training data, and thus potentially revealing this data. For example, it is also well-known that neural networks trained with confidence intervals will naturally exhibit tight confidence intervals for their predictions on points in the training data. Thus simply having the model allows one to “read off” specific instances in the training set.
Our approach: differential privacy. These examples show that cryptographic approaches to “locking up” sensitive information during a training process are woefully insufficient as a privacy mechanism — we need to explicitly reason about what can be inferred from the output of a learning algorithm, not simply say that we cannot learn more than such inferences. In this paper we thus instead consider the problem of designing fair learning algorithms that also promise differential privacy with respect to consumer race, and thus give strong guarantees about what can be inferred from the learned model.
We note that the guarantee of differential privacy is somewhat subtle, and does not promise that the company will be unable to infer race. For example, it might be that a feature that the company already has, such as zip codes, is perfectly correlated with race, and a computation that is differentially private might reveal this correlation. In this case, the company will be able to infer racial information about its customers. Informally, this is because it was possible to predict race only from the information already available to the company, and the differentially private computation revealed this fact about the world. However, differential privacy prevents leakage of individual secrets beyond what can be inferred about those secrets from population-level correlations. For instance, in the example above, since the company’s data is uncorrelated with race, a differentially private implementation of SVMs would necessarily return a learned model that is nearly independent of any individual’s race.
Like [Kilbertus et al., 2018], our approach can be viewed as a collaboration between a company holding non-race consumer data and a regulator holding race (or other sensitive) data. Our algorithms allow the regulator to build fair models from the combined data set in a way that ensures the company, or any other party with access to the model or its decisions, cannot infer the race of any consumer in the data much more accurately than they could do from population-level statistics alone. In this way we comply with the spirit of laws and regulations asking that sensitive attributes not be leaked, while still allowing them to be used to enforce fairness.
1.1 Our Results
We study the problem of learning classifiers with data containing protected attributes. More specifically, we are given a class of classifiers $\mathcal{H}$ and we output a randomized classifier in $\Delta(\mathcal{H})$ (the set of distributions over $\mathcal{H}$). The training data consists of $m$ individual data points of the form $(X, A, Y)$. Here $X$ is the vector of unprotected attributes, $A$ is the protected attribute and $Y$ is the binary label. As discussed above, our algorithms achieve three goals simultaneously:
Differential privacy: Our learning algorithms satisfy differential privacy [Dwork et al., 2006b] with respect to protected attributes. (They need not be differentially private with respect to the unprotected attributes $X$.)
Fairness: Our learning algorithms guarantee approximate notions of statistical fairness across the groups specified by the protected attribute. The particular statistical fairness notion we focus on is Equalized Odds [Hardt et al., 2016], which in the binary classification case reduces to asking that false positive rates and false negative rates be approximately equal, conditional on all values of the protected attribute (but our techniques apply to other notions of statistical fairness as well, including statistical parity).
Accuracy: Our output classifier has error rate comparable to the optimal classifier in $\Delta(\mathcal{H})$ consistent with the fairness constraints.
Our treatment evaluates fairness and error as in-sample quantities. Out-of-sample generalization for both error and fairness violations follows from standard sample complexity bounds in learning theory, and so we elide this complication for clarity.
We start with a simple extension of the post-processing approach of [Hardt et al., 2016]. Their algorithm starts with a possibly unfair classifier $\hat{Y}$ and derives a fair classifier by mixing $\hat{Y}$ with classifiers which are solely based on protected attributes. This involves solving a linear program which takes quantities $\hat{q}_{\hat{y}ay}$ as input. Here $\hat{q}_{\hat{y}ay}$ is the fraction of data points with $\hat{Y} = \hat{y}$, $A = a$, and $Y = y$. To make this approach differentially private with respect to protected attributes, we start with $\hat{Y}$ which is learned without using protected attributes and we use standard techniques to perturb the $\hat{q}_{\hat{y}ay}$’s before feeding them into the linear program, in a way that guarantees differential privacy. We analyze the additional error and fairness violation that results from the perturbation. Detailed results can be found in Section 3.
Although having the virtue of being exceedingly simple, this first approach has two significant drawbacks. First, even without privacy, this post-processing approach does not in general produce classifiers with error that is comparable to that of the best fair classifiers, and our privacy preserving modification inherits this limitation. Second, and often more importantly, this post-processing approach crucially requires that protected attributes can be used at test time, and this isn’t feasible (or legal) in certain applications.
We then consider the approach of [Agarwal et al., 2018] (see also a follow-up work [Kearns et al., 2018]). We refer to it as in-processing, to distinguish it from post-processing. Their approach does not have either of the above drawbacks: it does not require that protected features be available at test time, and it is guaranteed to produce the approximately optimal fair classifier. The algorithm is correspondingly more complicated. The main idea of their approach (here we follow the presentation of [Kearns et al., 2018]) is to show that the optimal fair classifier can be found as the equilibrium of a zero-sum game between a “Learner” who selects classifiers in and an “Auditor” who finds fairness violations. This equilibrium can be approximated by iterative play of the game, in which the “Auditor” plays exponentiated gradient descent and the “Learner” plays best responses (which can be computed given access to an efficient cost-sensitive classification oracle). To make this approach private, while simulating this play dynamic, we add Laplace noise to the gradients used by the Auditor and we let the Learner run the exponential mechanism (or some other private learning oracle) to compute approximate best responses. Our technical contribution is to show that the Learner and the Auditor still converge to an approximate equilibrium given the additional noise introduced for privacy. Detailed results can be found in Section 4.
One of the most interesting things to fall out of our results is an inherent tradeoff that arises between privacy, accuracy, and fairness, that doesn’t arise when any two of these desiderata are considered alone. This manifests itself as the parameter “$B$” in our in-processing result (see Table 1) which mediates the tradeoff between error, fairness and privacy. This parameter also appears in the (non-private) algorithm of [Agarwal et al., 2018], but there it serves only to mediate a (polynomial) tradeoff between fairness and running time. At a high level, the reason for this difference is that without the need for privacy, we can increase the number of iterations of the algorithm to decrease the error to any desired level. However, when we also need to protect privacy, there is an additional tradeoff, and increasing the number of iterations also requires increasing the scale of the gradient perturbations, which may not always decrease error.
This tradeoff exhibits an additional interesting feature. Recall that as we discussed above, the in-processing approach works even if we cannot use protected attributes at test time. But if we are allowed to use protected attributes at test time, we are able to obtain a better tradeoff between these quantities, essentially eliminating the role of the variable $B$ that would otherwise mediate this tradeoff. We give details of this improvement in Section 4.3 (for this result, we also need to relax the fairness requirement from Equalized Odds to Equalized False Positive Rates). The main step in the proof is to show that, for small constant $B$ and $\mathcal{H}$ containing certain “maximally discriminatory” classifiers which make decisions solely on the basis of group membership, we can give a better characterization of the Learner’s strategy at the approximate equilibrium of the zero-sum game.
Finally, we provide evidence that using protected attributes at test time is necessary for obtaining this better tradeoff. In Section 4.4, we consider the sensitivity of computing the error of the optimal classifier subject to fairness constraints. We show that this sensitivity can be substantially higher when the classifier cannot use protected attributes at test time, which shows that higher error must be introduced to estimate this error privately.
[Table 1: the table body did not survive extraction; only the cells “Yes” and “No” remain.]
Footnote 1 (Table 1): This error bound is relative to the non-private post-processing algorithm, which does not necessarily return the optimal fair classifier. All other error bounds in this table are relative to the optimal fair classifier.
1.2 Related Work
The literature on algorithmic fairness is growing rapidly, and is by now far too extensive to exhaustively cover here. See [Chouldechova and Roth, 2018] for a recent survey. The most closely related pieces of work from this literature are [Hardt et al., 2016], [Agarwal et al., 2018], and [Kearns et al., 2018], upon which we directly build. In particular, [Hardt et al., 2016] introduces the “equalized odds” definition that we take as our primary fairness goal, and gives a simple post-processing algorithm that we modify to make differentially private. [Agarwal et al., 2018] derives an “oracle efficient” algorithm which can optimally solve the fair empirical risk minimization problem (for a variety of statistical fairness constraints, including equalized odds) given oracles (implemented with heuristics) for the unconstrained learning problem. [Kearns et al., 2018] recast this algorithm into a game theoretic framework, and substantially generalize it to be able to handle infinitely many protected groups. We give a differentially private version of this algorithm as well.
Our paper is directly inspired by [Kilbertus et al., 2018], who study how to train fair machine learning models by encrypting sensitive attributes and applying secure multiparty computation (MPC). We share the goal of [Kilbertus et al., 2018]: we want to train fair classifiers without leaking information about an individual’s race through their participation in the training. Our starting point is the observation that differential privacy, rather than secure multiparty computation, is the right tool for this.
We use differential privacy [Dwork et al., 2006b] as our notion of individual privacy, which has become the gold standard “solution concept” for data privacy in the last decade. See [Dwork and Roth, 2014] for a survey. We make use of standard tools from this literature, including the Laplace mechanism [Dwork et al., 2006b], the exponential mechanism [McSherry and Talwar, 2007] and composition theorems [Dwork et al., 2006a, Dwork et al., 2010].
2 Model and Preliminaries
Suppose we are given a data set of $m$ individuals drawn from an unknown distribution $\mathcal{P}$ where each individual is described by a tuple $(X, A, Y)$. $X \in \mathcal{X}$ forms a vector of unprotected attributes, $A \in \mathcal{A}$ is the protected attribute where $|\mathcal{A}| < \infty$, and $Y \in \mathcal{Y}$ is a binary label. Without loss of generality, we write $\mathcal{A} = \{0, 1, \ldots, |\mathcal{A}| - 1\}$ and let $\mathcal{Y} = \{0, 1\}$. Let $\hat{\mathcal{P}}$ denote the empirical distribution of the observed data. Our primary goal is to develop an algorithm to learn a (possibly randomized) fair classifier $\hat{Y}$, with an algorithm that guarantees the privacy of the sensitive attributes $A$. By privacy, we mean differential privacy, and by fairness, we mean (approximate versions of) the Equalized Odds condition of [Hardt et al., 2016]. Both of these notions are parameterized: differential privacy has a parameter $\epsilon$, and the approximate fairness constraint is parameterized by $\gamma$. Our main interest is in characterizing the tradeoff between $\epsilon$, $\gamma$, and classification error.
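Definition 2.1 ($\gamma$-Equalized Odds Fairness).
We say a classifier $\hat{Y}$ satisfies the $\gamma$-Equalized Odds condition with respect to the attribute $A$, if for all $a, a' \in \mathcal{A}$, the false and true positive rates of $\hat{Y}$ in the subpopulations $\{A = a\}$ and $\{A = a'\}$ are within $\gamma$ of one another. In other words, for all $a, a' \in \mathcal{A}$,
$$\left|\Pr[\hat{Y} = 1 \mid A = a, Y = 0] - \Pr[\hat{Y} = 1 \mid A = a', Y = 0]\right| \le \gamma, \qquad \left|\Pr[\hat{Y} = 1 \mid A = a, Y = 1] - \Pr[\hat{Y} = 1 \mid A = a', Y = 1]\right| \le \gamma,$$
where probabilities are taken with respect to $\mathcal{P}$. The above constraint involves quadratically many inequalities in $|\mathcal{A}|$. It will be more convenient to instead work with a slightly different formulation of $\gamma$-Equalized Odds in which we constrain the difference between false and true positive rates in the subpopulation $\{A = a\}$ and the corresponding rates for $\{A = 0\}$ to be at most $\gamma$ for all $a \neq 0$. The choice of group $\{A = 0\}$ as an anchor is arbitrary and without loss of generality. The result is a set of only linearly many constraints. For all $a \neq 0$:
$$\left|\Pr[\hat{Y} = 1 \mid A = a, Y = 0] - \Pr[\hat{Y} = 1 \mid A = 0, Y = 0]\right| \le \gamma, \qquad \left|\Pr[\hat{Y} = 1 \mid A = a, Y = 1] - \Pr[\hat{Y} = 1 \mid A = 0, Y = 1]\right| \le \gamma.$$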
Since the underlying distribution $\mathcal{P}$ is not known, we will work with empirical versions of the above quantities, in which all the probabilities appearing above will be taken with respect to the empirical distribution of the observed data $\hat{\mathcal{P}}$. Since we will generally be dealing with this definition of fairness, we will use the shortened term “$\gamma$-fair” throughout the paper to refer to “$\gamma$-Equalized Odds fair”. We now introduce some notation that will appear throughout the paper.
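We will use notation $\mathrm{FPR}_a(\hat{Y})$ and $\mathrm{TPR}_a(\hat{Y})$ to refer to the false and true positive rates of $\hat{Y}$ on the subpopulation $\{A = a\}$. $\widehat{\mathrm{FPR}}_a(\hat{Y})$ and $\widehat{\mathrm{TPR}}_a(\hat{Y})$ are used to refer to the empirical false and true positive rates which are calculated based on the empirical distribution of the data.
Let $\hat{q}_{\hat{y}ay}$ be the empirical fraction of the data with $\hat{Y} = \hat{y}$, $A = a$, and $Y = y$. With slight abuse of notation, we will use $\hat{q}_{ay}$ to denote the empirical fraction of the data with $A = a$ and $Y = y$. We will see that $\min_{a,y} \hat{q}_{ay}$ shows up in our analyses and plays a role in the performance of our algorithms.
Observe that using the introduced notation, given a classifier $\hat{Y}$, the empirical rates are ratios of these fractions: $\widehat{\mathrm{FPR}}_a(\hat{Y}) = \hat{q}_{1a0} / \hat{q}_{a0}$ and $\widehat{\mathrm{TPR}}_a(\hat{Y}) = \hat{q}_{1a1} / \hat{q}_{a1}$.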
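The notation above can be made concrete with a small sketch. It computes the empirical cell fractions (the fraction of points with a given prediction, group, and label), the per-group empirical false and true positive rates, and the smallest fairness parameter for which the anchored constraints (group 0 as anchor) hold. The function names are hypothetical, and the sketch assumes every (group, label) cell is non-empty so that no rate divides by zero.

```python
from collections import Counter


def cell_fractions(y_hat, a, y):
    """Return q[(yh, g, lab)]: fraction of points with Y_hat=yh, A=g, Y=lab."""
    m = len(y)
    counts = Counter(zip(y_hat, a, y))
    return {cell: c / m for cell, c in counts.items()}


def group_rates(q, group):
    """Empirical (FPR, TPR) of the classifier on the subpopulation A = group."""
    def rate(label):
        pos = q.get((1, group, label), 0.0)  # predicted 1
        neg = q.get((0, group, label), 0.0)  # predicted 0
        return pos / (pos + neg)             # assumes the (group, label) cell is non-empty
    return rate(0), rate(1)


def equalized_odds_gap(q, groups):
    """Smallest gamma satisfying the constraints anchored at group 0."""
    fpr0, tpr0 = group_rates(q, 0)
    gap = 0.0
    for g in groups:
        if g == 0:
            continue
        fpr, tpr = group_rates(q, g)
        gap = max(gap, abs(fpr - fpr0), abs(tpr - tpr0))
    return gap


# Toy data: 8 individuals, two groups.
y_hat = [1, 0, 1, 1, 0, 0, 1, 0]
a     = [0, 0, 0, 0, 1, 1, 1, 1]
y     = [1, 1, 0, 0, 1, 1, 0, 0]
q = cell_fractions(y_hat, a, y)
gap = equalized_odds_gap(q, groups=[0, 1])
```

A classifier is empirically fair at a given level exactly when `gap` is at most that level.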
2.1 Differential Privacy
Let $\mathcal{D}$ be a data universe from which a database $D$ of size $m$ is drawn and let $M: \mathcal{D}^m \to \mathcal{O}$ be an algorithm that takes the database $D$ as input and outputs $M(D) \in \mathcal{O}$. Informally speaking, differential privacy requires that the addition or removal of a single data entry should have little (distributional) effect on the output of the mechanism. In other words, for every pair of neighboring databases $D \sim D' \in \mathcal{D}^m$ that differ in at most one entry, differential privacy requires that the distributions of $M(D)$ and $M(D')$ are “close” to each other, where closeness is measured by the privacy parameters $\epsilon$ and $\delta$.
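Definition 2.2 ($(\epsilon, \delta)$-Differential Privacy (DP) [Dwork et al., 2006b]).
A randomized algorithm $M: \mathcal{D}^m \to \mathcal{O}$ is said to be $(\epsilon, \delta)$-differentially private if for all pairs of neighboring databases $D \sim D' \in \mathcal{D}^m$ and all $O \subseteq \mathcal{O}$,
$$\Pr[M(D) \in O] \le e^{\epsilon} \Pr[M(D') \in O] + \delta.$$
If $\delta = 0$, $M$ is said to be $\epsilon$-differentially private.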
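Recall that our data universe is $\mathcal{D} = (\mathcal{X}, \mathcal{A}, \mathcal{Y})$, which will be convenient to partition as $(\mathcal{X}, \mathcal{Y}) \times \mathcal{A}$. Given a dataset $D$ of size $m$, we will write it as a pair $D = (D_I, D_A)$ where $D_I \in (\mathcal{X}, \mathcal{Y})^m$ represents the insensitive attributes and $D_A \in \mathcal{A}^m$ represents the sensitive attributes. We will sometimes incidentally guarantee differential privacy over the entire data universe (see Table 1), but our main goal will be to promise differential privacy only with respect to the sensitive attributes. Write $D_A \sim D_A'$ to denote that $D_A$ and $D_A'$ differ in exactly one coordinate (i.e. in one person’s group membership). An algorithm $M$ is $(\epsilon, \delta)$-differentially private in the sensitive attributes if for all $D_I \in (\mathcal{X}, \mathcal{Y})^m$, for all $D_A \sim D_A' \in \mathcal{A}^m$, and for all $O \subseteq \mathcal{O}$, we have:
$$\Pr[M(D_I, D_A) \in O] \le e^{\epsilon} \Pr[M(D_I, D_A') \in O] + \delta.$$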
Differentially private mechanisms usually work by deliberately injecting perturbations into quantities computed from the sensitive data set, and used as part of the computation. The injected perturbation is sometimes “explicitly” in the form of a (zero-mean) noise sampled from a known distribution, say Laplace or Gaussian, where the scale of noise is calibrated to the sensitivity of the query function to the input data. However, in some other cases, the noise is “implicitly” injected by maintaining a distribution over a set of possible outcomes for the algorithm and outputting a sample from that distribution. The Laplace or Gaussian mechanisms which are two standard techniques to achieve differential privacy follow the former approach by adding Laplace or Gaussian noise of appropriate scale to the outcome of computation, respectively. The Exponential mechanism instead falls into the latter case and is often used when an object, say a classifier, with optimal utility is to be chosen privately. In the setting of this paper, to guarantee the privacy of the sensitive attribute in our algorithms, we will be using the Laplace and the Exponential Mechanisms which are briefly reviewed below. See [Dwork and Roth, 2014] for a more detailed discussion and analysis.
Let’s start with the Laplace mechanism which, as stated before, perturbs the given query function $f$ with zero-mean Laplace noise calibrated to the $\ell_1$-sensitivity of the query function. The $\ell_1$-sensitivity of a function is essentially how much a function would change in $\ell_1$ norm if one changed at most one entry of the database.
Definition 2.3 ($\ell_1$-sensitivity of a function).
The $\ell_1$-sensitivity of $f: \mathcal{D}^m \to \mathbb{R}^k$ is
$$\Delta f = \max_{D \sim D' \in \mathcal{D}^m} \|f(D) - f(D')\|_1.$$
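Definition 2.4 (Laplace Mechanism [Dwork et al., 2006b]).
Given a query function $f: \mathcal{D}^m \to \mathbb{R}^k$, a database $D \in \mathcal{D}^m$, and a privacy parameter $\epsilon$, the Laplace mechanism outputs:
$$\tilde{f}_{\epsilon}(D) = f(D) + (W_1, \ldots, W_k),$$
where the $W_i$’s are i.i.d. random variables drawn from $\mathrm{Lap}(\Delta f / \epsilon)$.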
Keep in mind that besides having privacy, we would like the privately computed query to have some reasonable accuracy. The following theorem which uses standard tail bounds for a Laplace random variable formalizes the trade-off between privacy and accuracy for the Laplace mechanism.
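Theorem 2.1 (Privacy vs. Accuracy of the Laplace Mechanism [Dwork et al., 2006b]).
The Laplace mechanism guarantees $\epsilon$-differential privacy and that with probability at least $1 - \beta$,
$$\|\tilde{f}_{\epsilon}(D) - f(D)\|_{\infty} \le \ln\left(\frac{k}{\beta}\right) \cdot \frac{\Delta f}{\epsilon}.$$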
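As an illustration of the Laplace mechanism and its accuracy guarantee, here is a minimal standard-library sketch. The function names and the confidence parameter `beta` are choices made for this example; the noise is drawn as the difference of two independent unit-rate exponentials, which is a standard way to sample a Laplace variable.

```python
import math
import random


def sample_laplace(scale, rng):
    """Draw from Lap(scale): difference of two i.i.d. Exp(1) draws, rescaled."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def laplace_mechanism(answers, sensitivity, epsilon, rng):
    """Add independent Lap(sensitivity / epsilon) noise to each coordinate."""
    scale = sensitivity / epsilon
    return [v + sample_laplace(scale, rng) for v in answers]


def laplace_error_bound(k, sensitivity, epsilon, beta):
    """Max-coordinate error that holds with probability at least 1 - beta."""
    return math.log(k / beta) * sensitivity / epsilon


# Example: a 3-dimensional query with l1-sensitivity 0.002, epsilon = 1.
rng = random.Random(0)
true_answers = [0.2, 0.3, 0.5]
bound = laplace_error_bound(k=3, sensitivity=0.002, epsilon=1.0, beta=0.05)

# Empirically estimate how often the max error exceeds the bound.
trials = 2000
failures = 0
for _ in range(trials):
    noisy = laplace_mechanism(true_answers, 0.002, 1.0, rng)
    max_err = max(abs(n - t) for n, t in zip(noisy, true_answers))
    failures += max_err > bound
failure_rate = failures / trials
```

The bound follows from a union bound over the `k` coordinates, so the observed `failure_rate` should sit at or slightly below `beta`.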
While the Laplace mechanism is often used when the task at hand is to calculate a bounded numeric query (e.g. mean, median), the Exponential mechanism is used when the goal is to output an object (e.g. a classifier) with maximum utility (i.e. minimum loss). To formalize the exponential mechanism, let $\ell: \mathcal{D}^m \times \mathcal{R} \to \mathbb{R}$ be a loss function that, given an input database $D \in \mathcal{D}^m$ and $r \in \mathcal{R}$, specifies the loss of $r$ on $D$ by $\ell(D, r)$. Without a privacy constraint, the goal would be to output $\arg\min_{r \in \mathcal{R}} \ell(D, r)$ for the given database $D$, but when privacy is required, the private algorithm must output $r \in \mathcal{R}$ with some “perturbation” which is formalized in the following definition. Let $\Delta \ell$ be the sensitivity of the loss function $\ell$ with respect to the database argument $D$. In other words,
$$\Delta \ell = \max_{r \in \mathcal{R}} \max_{D \sim D' \in \mathcal{D}^m} |\ell(D, r) - \ell(D', r)|.$$
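Definition 2.5 (Exponential Mechanism [McSherry and Talwar, 2007]).
Given a database $D \in \mathcal{D}^m$ and a privacy parameter $\epsilon$, output $r \in \mathcal{R}$ with probability proportional to $\exp\left(-\epsilon \, \ell(D, r) / (2 \Delta \ell)\right)$.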
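Theorem 2.2 (Privacy vs. Accuracy of the Exponential Mechanism [McSherry and Talwar, 2007]).
Let $r^\star = \arg\min_{r \in \mathcal{R}} \ell(D, r)$ and $\tilde{r} \in \mathcal{R}$ be the output of the Exponential mechanism. We have that $\tilde{r}$ is $\epsilon$-DP and that with probability at least $1 - \beta$,
$$\ell(D, \tilde{r}) \le \ell(D, r^\star) + \frac{2 \Delta \ell}{\epsilon} \ln\left(\frac{|\mathcal{R}|}{\beta}\right).$$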
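The exponential mechanism over a finite candidate set can be sketched in a few lines. This is an illustrative standard-library version with hypothetical function names; it samples a candidate with probability proportional to the exponential of minus epsilon times its loss, divided by twice the loss sensitivity. Subtracting the minimum loss before exponentiating is only a numerical-stability step and does not change the sampling distribution.

```python
import math
import random


def exponential_mechanism_probs(losses, sensitivity, epsilon):
    """Sampling probabilities proportional to exp(-epsilon * loss / (2 * sensitivity))."""
    lowest = min(losses)  # shift for numerical stability; cancels after normalizing
    weights = [math.exp(-epsilon * (l - lowest) / (2.0 * sensitivity)) for l in losses]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(candidates, losses, sensitivity, epsilon, rng):
    """Privately select one candidate, favoring those with low loss."""
    probs = exponential_mechanism_probs(losses, sensitivity, epsilon)
    return rng.choices(candidates, weights=probs, k=1)[0]


# Three candidate classifiers with empirical losses; loss sensitivity 0.01.
names = ["h1", "h2", "h3"]
losses = [0.1, 0.2, 0.4]
probs = exponential_mechanism_probs(losses, sensitivity=0.01, epsilon=1.0)
choice = exponential_mechanism(names, losses, 0.01, 1.0, random.Random(0))
```

Smaller `epsilon` flattens `probs` toward the uniform distribution, which is the price of stronger privacy.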
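An important property of differential privacy is that it is robust to post-processing. The post-processing of an $(\epsilon, \delta)$-DP algorithm output remains $(\epsilon, \delta)$-DP.
Lemma 2.3 (Post-Processing [Dwork et al., 2006b]).
Let $M: \mathcal{D}^m \to \mathcal{O}$ be an $(\epsilon, \delta)$-DP algorithm and let $f: \mathcal{O} \to \mathcal{R}$ be any randomized function. We have that the algorithm $f \circ M: \mathcal{D}^m \to \mathcal{R}$ is $(\epsilon, \delta)$-DP.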
Another important property of differential privacy is that DP algorithms can be composed adaptively with a graceful degradation in their privacy parameters.
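Theorem 2.4 (Composition [Dwork et al., 2010]).
Let $M_t$ be an $(\epsilon_t, \delta_t)$-DP algorithm for $t \in [T]$. We have that the composition $M = (M_1, \ldots, M_T)$ is $(\epsilon, \delta)$-DP where $\epsilon = \sum_{t=1}^{T} \epsilon_t$ and $\delta = \sum_{t=1}^{T} \delta_t$.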
Following the Composition Theorem 2.4, if for instance, an iterative algorithm that runs in $T$ iterations is to be made private with target privacy parameters $\epsilon$ and $\delta$, each iteration must be made $(\epsilon/T, \delta/T)$-DP. This may lead to a huge amount of per iteration noise if $T$ is too large. The Advanced Composition Theorem 2.5 instead allows the privacy parameter at each step to scale with $O(\epsilon/\sqrt{T})$.
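Theorem 2.5 (Advanced Composition [Dwork et al., 2010]).
Suppose $\epsilon < 1$ and $\delta > 0$ are target privacy parameters. Let $M_t$ be an $(\epsilon', \delta')$-DP algorithm for all $t \in [T]$. We have that the composition $M = (M_1, \ldots, M_T)$ is $(\epsilon, T\delta' + \delta)$-DP where $\epsilon' = \epsilon / \left(2\sqrt{2T \ln(1/\delta)}\right)$.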
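To compare the two composition theorems numerically, the following sketch computes the per-iteration privacy parameter each one permits for a fixed overall target. The function names are hypothetical, and the advanced-composition formula used is the standard corollary form in which each step gets the target epsilon divided by $2\sqrt{2T \ln(1/\delta)}$.

```python
import math


def per_step_basic(epsilon, delta, T):
    """Per-step (epsilon, delta) under basic composition: split the budget evenly."""
    return epsilon / T, delta / T


def per_step_advanced(epsilon, delta, T):
    """Per-step epsilon' such that T steps compose to (epsilon, T*delta' + delta)-DP."""
    return epsilon / (2.0 * math.sqrt(2.0 * T * math.log(1.0 / delta)))


target_eps, target_delta = 0.5, 1e-6
basic_small, _ = per_step_basic(target_eps, target_delta, T=100)
adv_small = per_step_advanced(target_eps, target_delta, T=100)
basic_large, _ = per_step_basic(target_eps, target_delta, T=10_000)
adv_large = per_step_advanced(target_eps, target_delta, T=10_000)
```

Comparing the two formulas shows that advanced composition gives the larger per-step budget only once the number of iterations exceeds $8 \ln(1/\delta)$ (about 110 iterations for $\delta = 10^{-6}$); below that, the even split of basic composition is at least as generous.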
3 Differentially Private Fair Learning: Post-processing
In this section we will present and analyze our first differentially private fair learning algorithm which will be called DP-postprocessing. The DP-postprocessing algorithm is based on the fair learning model introduced in [Hardt et al., 2016] where decisions made by an arbitrary base classifier have their false and true positive rates equalized across different groups in a post-processing step. Due to the desire for privacy of the sensitive attribute $A$, we assume the base classifier is trained only on the unprotected attributes $X$ and that $A$ is used only for the post-processing step. We will see the fair learning problem can be written as a linear program whose coefficients depend only on the $\hat{q}_{\hat{y}ay}$ introduced in Remark 2.2, and thus privacy will be achieved if these quantities are calculated privately using the Laplace mechanism. While the approach is straightforward and simple to implement, the privately learned classifier will need to take as input the sensitive attribute $A$ at test time, which is not feasible (or legal) in all applications.
We will first review the basic approach of [Hardt et al., 2016] in Subsection 3.1. We will then introduce the DP-postprocessing algorithm in Subsection 3.2 which is followed by its analysis including the tradeoffs between accuracy, fairness, and privacy.
3.1 Fair Learning
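Following the model presented in [Hardt et al., 2016], suppose there is an arbitrary base classifier $\hat{Y}$ which is trained on the set of training examples $\{(X_i, Y_i)\}_{i=1}^{m}$. Notice the protected attribute $A$ is excluded from the training set, and so $\hat{Y}$ is trivially $0$-DP in the protected attribute. The goal for now is to make the classifications of the base classifier $\gamma$-fair with respect to the sensitive attribute $A$ by post-processing the predictions given by $\hat{Y}$. With slight abuse of notation, let $\tilde{Y}_p$ denote the derived optimal $\gamma$-fair randomized classifier where $p = (p_{\hat{y}a})_{\hat{y}, a}$ is a vector of probabilities describing $\tilde{Y}_p$ and that $p_{\hat{y}a} := \Pr[\tilde{Y}_p = 1 \mid \hat{Y} = \hat{y}, A = a]$. Define $p^\star$ to be the solution to the optimization problem LP (1) where,
$$\mathrm{FPR}_a(\tilde{Y}_p) = \mathrm{FPR}_a(\hat{Y}) \, p_{1a} + \left(1 - \mathrm{FPR}_a(\hat{Y})\right) p_{0a}, \qquad \mathrm{TPR}_a(\tilde{Y}_p) = \mathrm{TPR}_a(\hat{Y}) \, p_{1a} + \left(1 - \mathrm{TPR}_a(\hat{Y})\right) p_{0a},$$
and $\mathrm{err}(\tilde{Y}_p)$ is the expected loss of $\tilde{Y}_p$, i.e.,
$$\mathrm{err}(\tilde{Y}_p) = \Pr[\tilde{Y}_p \neq Y].$$
Once $p^\star$ is found by solving LP (1), one would then use this vector of probabilities, along with the estimate given by the base classifier $\hat{Y}$ and the sensitive attribute $A$, to make further predictions. See Fig. 1 for a visual presentation of the adopted model in this section.
LP (1): Linear Program
$$\begin{aligned} \min_{p} \quad & \mathrm{err}(\tilde{Y}_p) \\ \text{s.t.} \quad & \left|\mathrm{FPR}_a(\tilde{Y}_p) - \mathrm{FPR}_0(\tilde{Y}_p)\right| \le \gamma \quad \forall a \neq 0, \\ & \left|\mathrm{TPR}_a(\tilde{Y}_p) - \mathrm{TPR}_0(\tilde{Y}_p)\right| \le \gamma \quad \forall a \neq 0, \\ & 0 \le p_{\hat{y}a} \le 1 \quad \forall \hat{y} \in \{0, 1\}, \; a \in \mathcal{A}. \end{aligned}$$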
Since the true underlying distribution $\mathcal{P}$ is not known, in practice the empirical distribution $\hat{\mathcal{P}}$ is used to estimate the quantities appearing in LP (1). Using simple probability techniques, one can expand the empirical quantities $\widehat{\mathrm{FPR}}_a(\tilde{Y}_p)$, $\widehat{\mathrm{TPR}}_a(\tilde{Y}_p)$, and $\widehat{\mathrm{err}}(\tilde{Y}_p)$ in a linear form in $p$ with coefficients being a function of the $\hat{q}_{\hat{y}ay}$ and $\hat{q}_{ay}$ quantities introduced in Remark 2.2. We have the expanded empirical version of LP (1) written in (2).
LP (2): Empirical Linear Program
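The linear expansion mentioned above can be sketched directly: given the empirical cell fractions and a vector of post-processing probabilities, the empirical error and the per-group rates of the derived classifier are weighted sums that are linear in the probabilities. This is an illustrative reconstruction from the definitions, not the paper's exact program (2); the dictionary layouts and function names are assumptions of this example, and every (group, label) cell is assumed non-empty.

```python
def derived_error(q, p):
    """Empirical error of the post-processed classifier.

    q[(yh, a, y)]: fraction of data with Y_hat=yh, A=a, Y=y.
    p[(yh, a)]:    probability of predicting 1 given Y_hat=yh, A=a.
    """
    err = 0.0
    for (yh, a, y), mass in q.items():
        prob_one = p[(yh, a)]
        # A true label of 0 is misclassified when we output 1, and vice versa.
        err += mass * (prob_one if y == 0 else 1.0 - prob_one)
    return err


def derived_rates(q, p, group):
    """Empirical (FPR, TPR) of the post-processed classifier on A = group."""
    def rate(label):
        num = sum(q.get((yh, group, label), 0.0) * p[(yh, group)] for yh in (0, 1))
        den = sum(q.get((yh, group, label), 0.0) for yh in (0, 1))
        return num / den  # assumes the (group, label) cell is non-empty
    return rate(0), rate(1)


# Cell fractions for a toy data set of 8 points and two groups.
q = {(1, 0, 1): 1/8, (0, 0, 1): 1/8, (1, 0, 0): 2/8,
     (0, 1, 1): 2/8, (1, 1, 0): 1/8, (0, 1, 0): 1/8}

# p that simply copies the base classifier, and p that ignores it entirely.
identity = {(1, 0): 1.0, (0, 0): 0.0, (1, 1): 1.0, (0, 1): 0.0}
coin_flip = {key: 0.5 for key in identity}
```

With `identity` the derived classifier reproduces the base classifier's error and rates; with `coin_flip` all groups have identical rates, so the fairness constraints hold trivially at the cost of error one half. The linear program searches between such extremes.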
3.2 A Differentially Private Algorithm: Design and Analysis
In order to guarantee privacy of the protected attribute $A$, we simply need to compute the empirical quantities appearing in (2) in a differentially private manner: once we do this, the differential privacy guarantees of the algorithm will follow from the post-processing property. In particular, we need to compute a private estimate of $\hat{q} = (\hat{q}_{\hat{y}ay})_{\hat{y}, a, y}$. The first thing to do is to find the $\ell_1$-sensitivity of $\hat{q}$ to the sensitive attribute $A$.
Lemma 3.1 ($\ell_1$-Sensitivity of the Empirical Distribution to $A$).
We have that $\Delta \hat{q} = 2/m$: changing one individual’s group membership moves mass $1/m$ from one cell of $\hat{q}$ to another.
Let $\tilde{q}_{\hat{y}ay} = \hat{q}_{\hat{y}ay} + W_{\hat{y}ay}$ be the perturbed version of $\hat{q}_{\hat{y}ay}$ where the $W_{\hat{y}ay}$’s are i.i.d. draws from the $\mathrm{Lap}(2/(m\epsilon))$ distribution. Once $\tilde{q}$ is computed privately with privacy guarantee of $\epsilon$, any post-processing of the private $\tilde{q}$ would still be $\epsilon$-differentially private by the Post-processing Lemma 2.3. As a consequence, one may instead feed the privately computed empirical distribution $\tilde{q}$ to the linear program (2) to ensure privacy of the sensitive attribute $A$. With an inevitable modification to the constraints of the linear program (2), we now introduce the $\epsilon$-differentially private linear program (3) which is used in the DP-postprocessing Algorithm 1 to obtain an optimal $\epsilon$-DP $\gamma$-fair classifier. Note that in (3), $\beta$ is the confidence parameter, $m$ is the training sample size, $\widetilde{\mathrm{FPR}}_a(\hat{Y})$ and $\widetilde{\mathrm{TPR}}_a(\hat{Y})$ are the false and true positive rates of the classifier $\hat{Y}$ in the subpopulation $\{A = a\}$ calculated using the private $\tilde{q}$, and $\tilde{q}_{ay}$ refers to the noisy version of $\hat{q}_{ay}$, or in other words, $\tilde{q}_{ay} = \tilde{q}_{0ay} + \tilde{q}_{1ay}$. We will provide high probability guarantees on the accuracy and fairness violation of the classifier given by the DP-postprocessing Algorithm 1 in Theorem 3.2. The proof of Theorem 3.2 relies on some facts which are stated in Lemma A.1. All the proofs are given in Appendix A.
LP (3): $\epsilon$-Differentially Private Linear Program
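The private estimation step can be sketched as follows. The code computes the empirical cell fractions from the base predictions, groups, and labels, then adds independent Laplace noise to every cell. It assumes the sensitivity is 2/m (changing one person's group moves mass 1/m out of one cell and into another), so the noise scale is 2/(m·epsilon); the function name and data layout are hypothetical. Noise is added to empty cells as well, since skipping them would reveal which cells are empty.

```python
import random
from collections import Counter


def private_cell_fractions(y_hat, a, y, groups, epsilon, rng):
    """Laplace-perturbed fractions of data in each (Y_hat, A, Y) cell."""
    m = len(y)
    counts = Counter(zip(y_hat, a, y))
    scale = 2.0 / (m * epsilon)  # assumed l1-sensitivity 2/m, divided by epsilon
    q_tilde = {}
    for yh in (0, 1):
        for g in groups:
            for label in (0, 1):
                # Laplace noise as a difference of two unit-rate exponentials.
                noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
                q_tilde[(yh, g, label)] = counts[(yh, g, label)] / m + noise
    return q_tilde


y_hat = [1, 0, 1, 1, 0, 0, 1, 0]
a     = [0, 0, 0, 0, 1, 1, 1, 1]
y     = [1, 1, 0, 0, 1, 1, 0, 0]
q_tilde = private_cell_fractions(y_hat, a, y, groups=[0, 1], epsilon=1.0,
                                 rng=random.Random(0))
```

The noisy fractions need not be non-negative or sum to one, which is one reason the constraints of the private linear program have to be modified relative to the empirical one.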
Theorem 3.2 (Error-Privacy, Fairness-Privacy Tradeoffs).
We emphasize that the accuracy guarantee stated in Theorem 3.2 is relative to the non-private post-processing algorithm, not relative to the optimal fair classifier. This is because the non-private post-processing algorithm itself has no such optimality guarantees: its main virtue is simplicity. In the next section, we analyze a more complicated algorithm that is competitive with the optimal fair classifier.
4 Differentially Private Fair Learning: In-processing
In this section we will introduce our second differentially private fair learning algorithm which will be called DP-oracle-learner and is based on the algorithm presented in [Agarwal et al., 2018]. The method developed by [Agarwal et al., 2018], in the language of [Kearns et al., 2018], gives a reduction from finding an optimal fair classifier to finding the equilibrium of a two-player zero-sum game played between a “Learner” who needs to solve an unconstrained learning problem (given access to an efficient cost-sensitive classification oracle which will be described later in Assumption 4.1) and an “Auditor” who finds fairness violations. Having the learner play its best response and the auditor play a no-regret learning algorithm (we use exponentiated gradient descent, or “multiplicative weights”) guarantees convergence of the average plays to the equilibrium. Our differentially private extension achieves privacy by having the learner play its best response using the exponential mechanism. This is the differentially private equivalent of assuming access to a perfect oracle, as is done in [Agarwal et al., 2018, Kearns et al., 2018]. In practice, the exponential mechanism would be replaced by a computationally efficient private learner with heuristic accuracy guarantees. The auditor is made private by the Laplace mechanism where the Laplace perturbations are added to the gradients.
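The Auditor's side of this dynamic can be illustrated with a rough sketch of one noisy exponentiated-gradient step. It is not the paper's DP-oracle-learner: the function names, the step size, and the noise scale are placeholders chosen for this example, and the calibration of the Laplace noise to the privacy budget is deliberately left as a parameter. The mapping from scores to multipliers follows the usual exponentiated-gradient form in which the multipliers are non-negative and their sum stays below a bound `bound`.

```python
import math
import random


def auditor_weights(theta, bound):
    """Map scores to non-negative multipliers whose sum stays below `bound`.

    Computes bound * exp(theta_k) / (1 + sum_j exp(theta_j)), with a shift
    by the largest exponent for numerical stability.
    """
    shift = max(0.0, max(theta))
    expd = [math.exp(t - shift) for t in theta]
    denom = math.exp(-shift) + sum(expd)
    return [bound * e / denom for e in expd]


def noisy_auditor_step(theta, violations, step_size, noise_scale, rng):
    """Move each score along its (Laplace-perturbed) fairness violation."""
    updated = []
    for t, v in zip(theta, violations):
        noise = noise_scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
        updated.append(t + step_size * (v + noise))
    return updated


# Two constraints: the first is violated (positive slack), the second is not.
theta = noisy_auditor_step([0.0, 0.0], [0.3, -0.1], step_size=1.0,
                           noise_scale=0.0, rng=random.Random(0))
weights = auditor_weights(theta, bound=2.0)
```

In a full simulation the Learner would respond to `weights` with an approximate best response (for instance via the exponential mechanism), the violations would be recomputed, and the step repeated; with `noise_scale` set to zero the update reduces to the non-private dynamic.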
We first review the fair learning problem in section 4.1 and briefly give the reduction discussed above. The DP-oracle-learner algorithm and its analysis follow in section 4.2, where we study the tradeoffs among accuracy, fairness, and privacy of the classifier output by the algorithm. In section 4.3 we consider a scenario where only equalized false positive rates are required and improve the tradeoffs, assuming that access to the sensitive attribute at test time is allowed. Finally, in section 4.4, we consider the sensitivity of computing the error of the optimal classifier subject to fairness constraints. We show that this sensitivity can be substantially higher when the classifier cannot use protected attributes at test time, which implies that more error must be introduced to estimate this quantity privately. This demonstrates an interesting interaction between the error achievable under the equalized odds fairness constraints and the ability to use protected attributes explicitly in classification (i.e. requiring “disparate treatment”), an interaction that does not arise without the constraint of differential privacy.
4.1 Fair Learning
Suppose that, given a class of binary classifiers $\mathcal{H}$, the task is to find the optimal $\gamma$-fair classifier in $\Delta(\mathcal{H})$, where $\Delta(\mathcal{H})$ is the set of all randomized classifiers that can be obtained from functions in $\mathcal{H}$. In our main analysis, we will not necessarily assume that the protected attribute is available to the classifiers; that is, we will allow them to be “$A$-blind” at test time. We discuss in subsection 4.3 how we can get better accuracy/fairness guarantees if we allow classifiers in $\mathcal{H}$ to have access to the protected attribute $A$. [Agarwal et al., 2018] provided a reduction of the learning problem with only the fairness constraint to a two-player zero-sum game and introduced an algorithm that achieves the lowest empirical error. In this section we mainly discuss their reduction approach, which forms the basis of our differentially private fair learning algorithm introduced in subsection 4.2. Although [Agarwal et al., 2018] considers a general form of constraint that captures many existing notions of fairness, in this paper we focus on the Equalized Odds notion of fairness described in Definition 2.1. Our techniques, however, generalize beyond this.
To begin with, the $\gamma$-fair classification task can be modeled as the constrained optimization problem 4, where the constraint terms form the differences between the false and true positive rates of the classifier $Q$ in each group and those of a reference subpopulation, and the objective is the expected error of $Q$ over the distribution on $\mathcal{H}$.
[title=Fair Learning Problem]
Once again, as the data generating distribution is unknown, we will be dealing with the Fair Empirical Risk Minimization (ERM) problem 5. In this empirical version, all the probabilities and expectations are taken with respect to the empirical distribution of the data.
[title=Fair ERM Problem]
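Toward deriving a fair classification algorithm, the above fair ERM problem 5 will be rewritten as a two-player zero-sum game whose equilibrium is the solution to the problem. Let $\hat{r}(Q)$ store all constraints of 5, with $\gamma$ moved to the other side of the inequalities, in one single vector.
For dual variable $\lambda$, let
$$L(Q, \lambda) = \widehat{\mathrm{err}}(Q) + \lambda^{\top} \hat{r}(Q)$$
be the Lagrangian of the optimization problem. We therefore have that the Fair ERM Problem 5 is equivalent to
$$\min_{Q \in \Delta(\mathcal{H})} \; \max_{\lambda \ge 0} \; L(Q, \lambda).$$
In order to guarantee convergence, we further constrain the norm of $\lambda$ to be bounded. So let $\Lambda = \{\lambda \ge 0 : \|\lambda\|_1 \le B\}$ be the feasible space of the dual variable for some constant $B$. Hence, the primal and the dual problems are as follows.
$$\text{primal:} \quad \min_{Q \in \Delta(\mathcal{H})} \; \max_{\lambda \in \Lambda} \; L(Q, \lambda), \qquad \text{dual:} \quad \max_{\lambda \in \Lambda} \; \min_{Q \in \Delta(\mathcal{H})} \; L(Q, \lambda).$$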
The above primal and dual problems can be shown to have solutions that coincide at a point which is the saddle point of $L$. From a game-theoretic perspective, the saddle point can be viewed as an equilibrium of a zero-sum game between a Learner ($Q$-player) and an Auditor ($\lambda$-player), where $L(Q, \lambda)$ is how much the Learner must pay to the Auditor. Algorithm 3, developed by [Agarwal et al., 2018], proceeds iteratively according to a no-regret dynamic where, in each iteration, the Learner plays the best response ($\mathrm{BEST}_h$) to the given play of the Auditor and the Auditor plays exponentiated gradient descent. The average play of both players over $T$ rounds is then taken as the output of the algorithm, which can be shown to converge to the saddle point ([Freund and Schapire, 1996]). [Agarwal et al., 2018] shows how $\mathrm{BEST}_h$ can be solved efficiently given access to the cost-sensitive classification oracle for $\mathcal{H}$ ($CSC(\mathcal{H})$), and their reduction for our Equalized Odds notion of fairness is written in Subroutine 2.
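The no-regret dynamic described above can be sketched on a toy instance. This is an illustrative, non-private sketch and not Algorithm 3 itself: the hypothesis class is a small finite set whose empirical errors `err` and constraint values `R` are assumed to be precomputed, the best response is found by brute force instead of an oracle, and the values of `B`, `eta` and `T` are arbitrary choices.

```python
import numpy as np

def fair_game(err, R, B=10.0, eta=0.5, T=500):
    """Best-response Learner versus exponentiated-gradient Auditor.

    err[h]  : empirical error of hypothesis h.
    R[h, k] : k-th constraint value of h with gamma already subtracted,
              so R[h, k] <= 0 means constraint k is satisfied.
    Returns the averaged Learner distribution and the averaged dual variable.
    """
    n_h, n_k = R.shape
    theta = np.zeros(n_k)
    q_avg = np.zeros(n_h)
    lam_avg = np.zeros(n_k)
    for _ in range(T):
        # Auditor: lam_k = B * exp(theta_k) / (1 + sum_j exp(theta_j)),
        # computed stably; this keeps lam >= 0 and ||lam||_1 <= B.
        m = max(theta.max(), 0.0)
        w = np.exp(theta - m)
        lam = B * w / (np.exp(-m) + w.sum())
        # Learner: the Lagrangian is linear in Q, so a single hypothesis
        # minimising err + lam . R is a best response.
        h = int(np.argmin(err + R @ lam))
        q_avg[h] += 1.0 / T
        lam_avg += lam / T
        # The gradient of the Lagrangian with respect to lam is R[h].
        theta += eta * R[h]
    return q_avg, lam_avg

# Toy usage: hypothesis 0 is accurate but unfair, hypothesis 1 is fair but worse.
err = np.array([0.1, 0.3])
R = np.array([[0.2], [-0.1]])
q_avg, lam_avg = fair_game(err, R)
```

In this toy instance neither hypothesis alone is optimal; the averaged play mixes the two, which is why the output is a randomized classifier in $\Delta(\mathcal{H})$ rather than a single member of $\mathcal{H}$.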
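Assumption 4.1 (Cost-Sensitive Classification Oracle for $\mathcal{H}$).
It is assumed that the proposed algorithm has access to $CSC(\mathcal{H})$, the cost-sensitive classification oracle for $\mathcal{H}$. This oracle takes as input a set of individual-level attributes and costs $\{(X_i, C_i^0, C_i^1)\}_{i=1}^{n}$, and outputs $\arg\min_{h \in \mathcal{H}} \sum_{i=1}^{n} \left[ h(X_i) C_i^1 + (1 - h(X_i)) C_i^0 \right]$. In practice, these oracles are implemented using learning heuristics.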
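For intuition, a cost-sensitive classification oracle over a small finite class can be written by brute force. This is only a sketch under assumptions: the function name and the representation of the class as a matrix of precomputed predictions are hypothetical, and practical oracles are learning heuristics rather than exhaustive searches.

```python
import numpy as np

def csc_oracle(predictions, cost0, cost1):
    """Brute-force cost-sensitive classification over a finite class.

    predictions[j, i] in {0, 1}: label that hypothesis j assigns to example i.
    cost0[i], cost1[i]: cost of predicting 0 or 1 on example i.
    Returns the index of the hypothesis with the smallest total cost.
    """
    predictions = np.asarray(predictions, dtype=float)
    cost0 = np.asarray(cost0, dtype=float)
    cost1 = np.asarray(cost1, dtype=float)
    total = predictions @ cost1 + (1.0 - predictions) @ cost0
    return int(np.argmin(total))

# Toy usage: three hypotheses evaluated on three examples (made-up costs).
best = csc_oracle([[0, 0, 0], [1, 1, 1], [1, 0, 1]],
                  cost0=[1.0, 0.0, 1.0], cost1=[0.0, 1.0, 0.0])
```

The Learner's best response reduces to one such call once the dual variable has been folded into the per-example costs.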
Note that the Learner finds $\arg\min_{Q \in \Delta(\mathcal{H})} L(Q, \lambda)$ for a given $\lambda$ of the Auditor and, since the Lagrangian is linear in $Q$, the minimizer can be chosen to put all the probability mass on a single classifier $h \in \mathcal{H}$. Additionally, our reduction in Subroutine 2 looks different from the one derived in Example 4 of [Agarwal et al., 2018] because our Equalized Odds fairness constraints are formulated slightly differently from theirs.
that corresponds to a $\nu$-approximate equilibrium of the game, which implies that neither player can gain more than $\nu$ by changing their strategy (see Theorem 1 of [Agarwal et al., 2018]). They further show that any $\nu$-approximate equilibrium of the game achieves an error close to the best error one could hope for, and that the amount by which it violates the fairness constraints is reasonably small (see Theorem 2 of [Agarwal et al., 2018]).
4.2 A Differentially Private Algorithm: Design and Analysis
We now introduce a differentially private fair classification algorithm to solve the Fair ERM Problem 5, which can be seen as an extension of Algorithm 3 that also guarantees privacy of the protected attribute $A$. In this differentially private version, the Learner and the Auditor are made private in each iteration of the algorithm by the exponential and Laplace mechanisms, respectively. In particular, in the $t$-th iteration of the algorithm,
the private Auditor ($\lambda$-player) perturbs the gradient of Algorithm 3 with appropriately calibrated Laplace noise to ensure $\epsilon'$-differential privacy of its play, for some value of $\epsilon'$ specified later on;
and the private Learner ($Q$-player) plays its best response to a given $\lambda$ using a subroutine which is made