
Model-Agnostic Characterization of Fairness Trade-offs

by   Joon Sik Kim, et al.
JPMorgan Chase & Co.
Carnegie Mellon University

There exist several inherent trade-offs in designing a fair model, such as those between the model's predictive performance and fairness, or even among different notions of fairness. In practice, exploring these trade-offs requires significant human and computational resources. We propose a diagnostic that enables practitioners to explore these trade-offs without training a single model. Our work hinges on the observation that many widely-used fairness definitions can be expressed via the fairness-confusion tensor, an object obtained by splitting the traditional confusion matrix according to protected data attributes. Optimizing accuracy and fairness objectives directly over the elements in this tensor yields a data-dependent yet model-agnostic way of understanding several types of trade-offs. We further leverage this tensor-based perspective to generalize existing theoretical impossibility results to a wider range of fairness definitions. Finally, we demonstrate the usefulness of the proposed diagnostic on synthetic and real datasets.





1 Introduction

Machine learning continues to be more widely used for applications with important societal consequences, such as credit decisioning, predictive policing, healthcare, and employment applicant screening. In these applications, model developers face regulatory, ethical, and legal challenges to prove whether or not their models are fair [8]. To provide quantitative tests of model fairness, practitioners further need to choose between multiple definitions of fairness that exist in the machine learning literature [3, 32, 22]. Furthermore, these various notions of fairness have been shown to conflict with one another, and in some situations, enforcement of fairness comes with a necessary cost in loss of accuracy [16, 7]. Such considerations complicate the practical development and assessment of fair machine learning models, as applications do not always clearly point to a specific definition of fairness as appropriate for the use case, and the conditions under which performance loss must necessarily occur can be too abstract to test in practice.

As an example, suppose an engineer is responsible for training a loan prediction model from a large user dataset, subject to mandatory fairness requirements as set by internal bank policy, which itself is shaped by regulatory and business concerns. She has many choices for how to train a fair model, with fairness enforced before model training [15, 30, 20, 24, 25, 26], during model training [29, 28], or after model training [11, 12, 13]. However, she currently must resort to trial and error to determine which of these myriad approaches, if any, will produce a compliant model with sufficient performance to satisfy business needs. (In this paper, performance refers to classical metrics derived from confusion matrix analysis, like accuracy, precision, and recall.) It may even turn out that despite her best efforts, the fairness constraints are impossible to satisfy, due to limitations intrinsic to the prediction task and data at hand.

Motivated by such practical considerations, we propose the FACT (FAirness-Confusion Tensor) diagnostic for exploring these fairness–performance trade-offs systematically before building any model. The fairness-confusion tensor, the underlying object considered by our proposed diagnostic, is simply the ordinary confusion matrix augmented with an additional axis for each protected attribute. Our diagnostic builds upon the observation that many metrics for model performance and many fairness definitions are simple rational functions of the elements of this tensor. The diagnostic solves an optimization problem defined over this tensor (and not over model parameters), which we call the performance–fairness optimality problem (PFOP), thus enabling a data-dependent, yet model-agnostic analysis of the trade-offs, simply by considering the geometry of valid fairness–confusion tensors for all models that satisfy a specified set of performance and/or fairness conditions. In particular, by noting that many settings involve only linear notions of fairness, we focus on the least-squares accuracy–fairness optimality problem (LAFOP) in this work, a specific convex instantiation of PFOP.

Our contributions are:

  1. to identify many definitions of fairness that can be expressed as functions of the fairness–confusion tensor,

  2. to formulate the FACT diagnostic as a PFOP/LAFOP over the fairness–confusion tensor that explores model-agnostic fairness trade-offs,

  3. to derive general fairness incompatibility results, and

  4. to demonstrate the practical use of the FACT diagnostic on synthetic and real datasets.

2 Related Work

Quantitative definitions of fairness exist in many different variations [22, 16, 7, 11, 13, 4, 2]; however, less work exists to categorize these notions. [27] categorized the existing fairness definitions based on entries and rates derived from the confusion matrix, but did not fully explore trade-offs and incompatibilities among different notions. Our work extends this perspective and provides a simple geometric formalism to study these trade-offs. Some other definitions of fairness defy quantification using confusion matrix notions, such as individual-level fairness [11] or representation-level fairness [20, 24, 25, 26], and many models can be trained to satisfy them using different algorithms. Nevertheless, because the FACT diagnostic is model-agnostic, it can be applied even to models trained to satisfy these definitions of fairness, as long as the final criteria for evaluation involve fairness definitions that rely on the fairness–confusion tensor.

The fairness–confusion tensor is not a completely new notion: several works have mentioned it in passing, most often treating it as a simple computational tool that eases the computation of several definitions of fairness at the implementation level [1, 5]. It is also a natural object considered in several post-processing methods in fairness [13, 23], a group of algorithms that fine-tune a trained model to mitigate unfairness while keeping the performance change minimal. Unlike the previous approaches, we take a closer look at the fairness–confusion tensor itself and study how this object brings together several notions of fairness, simplifying and generalizing the analysis of inherent trade-offs in fairness.

Fairness–performance trade-offs have been studied in many specific cases [3, 32, 15, 12, 21, 19, 31], for some definition of fairness, some definition of performance, and for some model. To our knowledge, these trade-offs have not been studied in the general, model-agnostic way we present below. [29, 28] presented an optimization-based analysis of the trade-offs, albeit over the parameter space of a particular model.

Fairness–fairness trade-offs are less well-studied; most existing results describe the impossibility of satisfying multiple notions of fairness simultaneously. The result presented in [16, 2, 23] is perhaps the most well-known, which states that no classifier can satisfy class balance and calibration simultaneously, except under some strong assumptions about the data and the model. We prove below a more general form of this theorem, which yields many related generalizations.

[7] also showed a similar type of incompatibility between predictive parity and class balance; we re-derive this result using the fairness–confusion tensor perspective with a more detailed set of conditions for compatibility. Nevertheless, to the best of our knowledge, our work is the first to provide a general approach to diagnose both fairness–fairness and fairness–performance trade-offs together under the same formalism.

3 The fairness–confusion tensor

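Our key insight is that the elements of the fairness–confusion tensor encode all the information needed to study many notions of performance and fairness. The fairness–confusion tensor is simply the stack of confusion matrices for each protected attribute $a$. We focus on the simplest case, with one binary protected class $a \in \{0, 1\}$, and a binary classifier $\hat{y} \in \{0, 1\}$ for a binary label $y \in \{0, 1\}$, which is shown in Table 1. (The arguments naturally generalize to multiple and non-binary protected attributes with higher-dimensional tensors.)

Table 1: The fairness–confusion tensor, showing the two planes corresponding to the confusion matrix for each of the favored ($a = 1$) and disfavored ($a = 0$) groups.

By convention, we call $a = 1$ the favored group and $a = 0$ the disfavored group. We augment the conventional abbreviations TP, FP, FN, and TN with a subscript indicating the group variable $a$.

Let $N$ be the number of data points, $N_a$ be the number of data points in each group $a$, and $M_a$ be the number of positive-class instances ($y = 1$) for each group. Assume $N$, $N_a$, and $M_a$ are known constants. Unraveling the fairness–confusion tensor into an 8-dimensional vector, we denote it as

$$z = \frac{1}{N} \left( \mathrm{TP}_1, \mathrm{FN}_1, \mathrm{FP}_1, \mathrm{TN}_1, \mathrm{TP}_0, \mathrm{FN}_0, \mathrm{FP}_0, \mathrm{TN}_0 \right)^\top,$$

normalized and constrained to lie on a simplicial complex $\mathcal{K} = \{ z \ge 0 : A_{\mathrm{const}} z = b_{\mathrm{const}} \}$, where $A_{\mathrm{const}}$ and $b_{\mathrm{const}}$ encode marginal sum constraints of the dataset (e.g., $\mathrm{TP}_a + \mathrm{FN}_a = M_a$) in matrix notation:

$$A_{\mathrm{const}} = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \end{pmatrix}, \qquad b_{\mathrm{const}} = \frac{1}{N} \begin{pmatrix} M_1 \\ N_1 - M_1 \\ M_0 \\ N_0 - M_0 \end{pmatrix}.$$

The following examples show how some typical fairness definitions can be reformulated as functions of $z$.

Demographic parity (DP) states that each protected group should receive positive prediction at an equal rate:

$$\Pr(\hat{y} = 1 \mid a = 1) = \Pr(\hat{y} = 1 \mid a = 0).$$

DP is equivalent to

$$\frac{\mathrm{TP}_1 + \mathrm{FP}_1}{N_1} = \frac{\mathrm{TP}_0 + \mathrm{FP}_0}{N_0},$$

or also the linear system $A_{\mathrm{DP}} z = 0$, where

$$A_{\mathrm{DP}} = \frac{1}{N} \left( N_0, 0, N_0, 0, -N_1, 0, -N_1, 0 \right). \tag{3}$$

The choice of normalization, $1/N$, ensures that the matrix coefficients are in $[-1, 1]$. This choice does not affect the solution $z$, but will be important for having a uniform choice of “units” for the fairness gap $\epsilon$ and the performance gap $\delta$.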
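As a concrete illustration, the following minimal sketch builds the normalized 8-vector from labels, predictions, and a binary group attribute, then evaluates the demographic parity row on it. The entry ordering (TP, FN, FP, TN for the favored group, then the same for the disfavored group) is an assumption of this sketch, and the function names and toy data are hypothetical.

```python
import numpy as np

def fairness_confusion_vector(y_true, y_pred, group):
    """Return z = (TP1, FN1, FP1, TN1, TP0, FN0, FP0, TN0) / N (assumed ordering)."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    counts = []
    for a in (1, 0):  # favored group first, then disfavored group
        in_group = group == a
        tp = np.sum(in_group & (y_true == 1) & (y_pred == 1))
        fn = np.sum(in_group & (y_true == 1) & (y_pred == 0))
        fp = np.sum(in_group & (y_true == 0) & (y_pred == 1))
        tn = np.sum(in_group & (y_true == 0) & (y_pred == 0))
        counts += [tp, fn, fp, tn]
    return np.array(counts, dtype=float) / len(y_true)

def dp_row(n1, n0):
    """Row vector whose product with z is zero exactly when both groups
    receive positive predictions at the same rate."""
    return np.array([n0, 0, n0, 0, -n1, 0, -n1, 0], dtype=float) / (n1 + n0)

# Toy data: four points per group, both groups predicted positive at rate 1/2.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
group = [1, 1, 1, 1, 0, 0, 0, 0]

z = fairness_confusion_vector(y_true, y_pred, group)
gap = dp_row(n1=4, n0=4) @ z  # zero here, since the positive rates match
print(z * len(y_true), gap)
```

Only the confusion counts per group enter the computation, which is why the same vector can be analyzed without reference to any particular model.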

We now introduce a graphical notation to help visualize which components of the fairness–confusion tensor participate in the fairness definition. Depict the fairness–confusion tensor as a pair of 2×2 grids, with the left matrix for the favored class ($a = 1$) and the right matrix for the disfavored class ($a = 0$). Since each component of $z$ corresponds to some element of the fairness–confusion tensor, we shade each component that appears in the equation. Blue shading denotes the favored class, while red shading denotes the disfavored class. We further distinguish two kinds of dependencies. Components that have a nonzero coefficient in the matrix $A$ are shaded fully. However, the values of these coefficients themselves can depend on other components, albeit implicitly. In this example, the group size $N_a$ implicitly touches the entire confusion matrix for group $a$, so we shade these implicit components in a lighter shade. Putting this all together, (3) is represented graphically with the TP and FP cells of both groups shaded fully and the remaining cells shaded lightly.

Name of fairness / Definition and linear system / Terms in fairness–confusion tensor

  Demographic parity (DP) [4]
  Equalized opportunity (EOp) [11, 13]
  Predictive equality (PE) [7]
  Equalized odds (EOd) [13]: EOp ∧ PE
  Calibration within groups (CG) [16]
  Positive class balance (PCB) [16]
  Negative class balance (NCB) [16]
  Predictive parity (PP) [7]
  Equal false omission rate (EFOR) (note 1)
  Conditional accuracy equality (CA) [2]: PP ∧ EFOR
  Equal false positive rate (EFPR) (note 2)
  Equal false negative rate (EFNR) (note 2)

  • Note 1: To our knowledge, EFOR has not been described in the literature in isolation, but is used in the definition of conditional accuracy equality (CA) [2].

  • Note 2: Defined implicitly in [7].

Table 2: Some common fairness definitions in terms of linear functions or quadratic functions that appear in the performance–fairness optimality problem (8). There are two groups separated by the horizontal line: those that are specified by linear functions (above), or quadratic functions (below). The graphical notation is described in Section 3.

is the probability produced by a model (parameterized by

) of . The fairness functions are uniquely defined only up to a normalization factor and overall sign.

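Predictive parity (PP) [7] states that the likelihood of being in the positive class given the positive prediction is the same for each protected class:

$$\Pr(y = 1 \mid \hat{y} = 1, a = 1) = \Pr(y = 1 \mid \hat{y} = 1, a = 0),$$

which is equivalent to

$$\frac{\mathrm{TP}_1}{\mathrm{TP}_1 + \mathrm{FP}_1} = \frac{\mathrm{TP}_0}{\mathrm{TP}_0 + \mathrm{FP}_0}.$$

Unlike for DP, the marginal sum constraints do not relate $\mathrm{TP}_a$ and $\mathrm{FP}_a$, so this notion of fairness is not linear in the fairness–confusion tensor. Nevertheless, cross-multiplying gives $\mathrm{TP}_1 \mathrm{FP}_0 - \mathrm{TP}_0 \mathrm{FP}_1 = 0$, so PP can be expressed using the quadratic form

$$z^\top B_{\mathrm{PP}} z = 0,$$

where $B_{\mathrm{PP}}$ is the symmetric matrix with entries $\tfrac{1}{2}$ coupling $\mathrm{TP}_1$ with $\mathrm{FP}_0$, entries $-\tfrac{1}{2}$ coupling $\mathrm{TP}_0$ with $\mathrm{FP}_1$, and zeros elsewhere.

In the corresponding graphical notation, a superscript 2 denotes the quadratic order of the term.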
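The quadratic characterization of predictive parity can be checked numerically. The sketch below assumes the entry ordering (TP, FN, FP, TN for the favored group, then for the disfavored group) and uses the cross-multiplied condition TP_1 · FP_0 = TP_0 · FP_1; the matrix name `pp_matrix` and the example counts are hypothetical, and the form is only defined up to normalization and sign.

```python
import numpy as np

# Positions in z, assuming the ordering (TP1, FN1, FP1, TN1, TP0, FN0, FP0, TN0).
TP1, FN1, FP1, TN1, TP0, FN0, FP0, TN0 = range(8)

def pp_matrix():
    """Symmetric B such that z @ B @ z = TP1*FP0 - TP0*FP1 (in normalized units)."""
    B = np.zeros((8, 8))
    B[TP1, FP0] = B[FP0, TP1] = 0.5
    B[TP0, FP1] = B[FP1, TP0] = -0.5
    return B

def pp_gap(z):
    """Zero when both groups have the same precision (both denominators nonzero)."""
    return z @ pp_matrix() @ z

# Precision is 2/3 in both groups, so the quadratic form vanishes.
z_fair = np.array([2, 1, 1, 2, 4, 0, 2, 1], dtype=float) / 13
# Precision is 3/4 in the favored group and 1/2 in the disfavored group.
z_unfair = np.array([3, 0, 1, 0, 1, 0, 1, 2], dtype=float) / 8

print(pp_gap(z_fair), pp_gap(z_unfair))
```

Because the form couples entries from both groups, no single row vector can replace it, which is the sense in which PP is quadratic rather than linear.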

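Calibration within groups (CG) [16], when specialized to binary classifiers and binary protected classes, can be written as the system of equations

$$\frac{\mathrm{FN}_a}{\mathrm{FN}_a + \mathrm{TN}_a} = v_0, \qquad \frac{\mathrm{TP}_a}{\mathrm{TP}_a + \mathrm{FP}_a} = v_1, \qquad a \in \{0, 1\},$$

where the $v_i$s are scores satisfying $0 \le v_0 < v_1 \le 1$ and have no implicit dependence on any entries of the fairness–confusion tensor. We can rewrite this linear system explicitly as the matrix equation $A_{\mathrm{CG}} z = 0$ with

$$A_{\mathrm{CG}} = \begin{pmatrix} 1 - v_1 & 0 & -v_1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 - v_0 & 0 & -v_0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 - v_1 & 0 & -v_1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 - v_0 & 0 & -v_0 \end{pmatrix}.$$

This form makes clear that calibration is a notion that touches all eight entries of the fairness–confusion tensor, but only two at a time.

Equalized odds (EOd) [13] can be expressed as two simultaneous linear equations, namely those of EOp and PE.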

The graphical notation gives us a simple, intuitive, yet powerful way to organize many different fairness definitions in the literature, as summarized in Table 2.

4 Optimization problems over the fairness-confusion tensor

As presented in the previous section, the fairness–confusion tensor allows for a succinct linear and quadratic characterization of multiple fairness definitions in the literature. We can then naturally consider the following family of optimization problems over $z$, where the objective function is constructed in a way that the solution space reflects potential trade-offs between fairness and performance.

Definition 1 (Performance–fairness optimality problem (PFOP)).

Let $f_i$ be performance metrics (indexed by $i$) with best performance 0 and worst performance 1, $g_j$ be fairness constraints (indexed by $j$) with $g_j(z) = 0$ when the constraint is satisfied, $\phi_i$ and $\psi_j$ be two series of loss functions specific to each performance metric and fairness constraint respectively, and $\mu_i$, $\lambda_j$ be real constants with $\mu_i, \lambda_j \ge 0$. Then, the performance–fairness optimality problem (PFOP) is a class of optimization problems of the form:

$$\arg\min_{z \in \mathcal{K}} \; \sum_i \mu_i \, \phi_i(f_i(z)) + \sum_j \lambda_j \, \psi_j(g_j(z)). \tag{8}$$

PFOP is a general optimization problem that contains two groups of terms: the first quantifies performance loss; the second quantifies unfairness. The restriction $z \in \mathcal{K}$ is necessary to ensure that $z$ is a valid fairness–confusion tensor that obeys the requisite marginal sums.

In our discussion below, it will be convenient to consider solutions with explicit bounds on their optimality.

Definition 2.

Let $\epsilon \ge 0$ and $\delta \in [0, 1]$. Then, an $(\epsilon, \delta)$-solution to the PFOP is a $z$ that satisfies (8) such that $\sum_j \psi_j(g_j(z)) \le \epsilon$ and $\sum_i \phi_i(f_i(z)) \le \delta$.

The parameters $\epsilon$ and $\delta$ therefore represent the sum total of deviation from perfect fairness and perfect predictive performance respectively.

Unless otherwise stated, the rest of this paper is dedicated to analyzing one of the simplest instantiations of PFOP, defined below.

Definition 3.

The least-squares accuracy–fairness optimality problem (LAFOP) is a PFOP with accuracy as the performance function, and $K$ linear fairness constraints $g_j(z) = A_j z$ with

$$\phi(x) = x, \qquad \psi_j(x) = x^2.$$

In other words, LAFOP is the problem

$$\arg\min_{z \in \mathcal{K}} \; c^\top z + \lambda \lVert A z \rVert_2^2, \qquad c = (0, 1, 1, 0, 0, 1, 1, 0)^\top. \tag{10}$$

Here $c^\top z$ encodes the usual notion of classification error. We consider only a single hyperparameter, $\lambda$, which specifies the relative importance of satisfying the fairness constraints while optimizing classification performance. This parameter can be interpreted as a regularization strength, with $\lambda = 0$ considering only performance and disabling all fairness constraints, and $\lambda \to \infty$ imposing fairness constraints without regard to accuracy. Also, we can stack linear fairness functions together into one linear system (encoded in the matrix $A$) as the regularizer for considering multiple linear notions of fairness.

LAFOP is a convex optimization problem which is straightforward to analyze and solve. Despite its simplicity, LAFOP still encompasses many situations involving linear notions of fairness, allowing us to reason about multiple fairness constraints as well as fairness–accuracy trade-offs, presented in more detail in Section 6.
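A minimal sketch of solving a LAFOP numerically is given below, using SciPy's general-purpose SLSQP solver. This is not necessarily how the authors solved it. The sketch assumes the entry ordering (TP, FN, FP, TN per group, favored group first), an error vector that picks out the FN and FP entries, and hypothetical group sizes (60 points with 30 positives in the favored group, 40 points with 10 positives in the disfavored group).

```python
import numpy as np
from scipy.optimize import minimize

# Assumed ordering: z = (TP1, FN1, FP1, TN1, TP0, FN0, FP0, TN0) / N
C_ERR = np.array([0, 1, 1, 0, 0, 1, 1, 0], dtype=float)  # error = FN + FP

def marginal_constraints(n1, m1, n0, m0):
    """Equality constraints fixing the group sizes and base rates of the dataset."""
    A = np.kron(np.eye(4), np.ones((1, 2)))  # sums consecutive pairs of z
    b = np.array([m1, n1 - m1, m0, n0 - m0], dtype=float) / (n1 + n0)
    return A, b

def solve_lafop(A_fair, lam, n1, m1, n0, m0):
    """Minimize error + lam * ||A_fair z||^2 over valid tensors; return (z, eps, delta)."""
    A_const, b_const = marginal_constraints(n1, m1, n0, m0)
    A_fair = np.atleast_2d(A_fair)

    def objective(z):
        r = A_fair @ z
        return C_ERR @ z + lam * (r @ r)

    def gradient(z):
        return C_ERR + 2.0 * lam * (A_fair.T @ (A_fair @ z))

    z0 = np.repeat(b_const, 2) / 2.0  # feasible start: split each marginal evenly
    res = minimize(
        objective, z0, jac=gradient, method="SLSQP",
        bounds=[(0.0, 1.0)] * 8,
        constraints=[{"type": "eq",
                      "fun": lambda z: A_const @ z - b_const,
                      "jac": lambda z: A_const}],
        options={"ftol": 1e-12, "maxiter": 1000},
    )
    z = res.x
    return z, float(np.sum((A_fair @ z) ** 2)), float(C_ERR @ z)

sizes = dict(n1=60, m1=30, n0=40, m0=10)  # hypothetical dataset summary
A_dp = np.array([40, 0, 40, 0, -60, 0, -60, 0], dtype=float) / 100  # demographic parity row

z_acc, eps_acc, delta_acc = solve_lafop(A_dp, lam=0.0, **sizes)
z_fair, eps_fair, delta_fair = solve_lafop(A_dp, lam=100.0, **sizes)
print(eps_acc, delta_acc)
print(eps_fair, delta_fair)
```

With the regularization switched off, the solver recovers the perfect classifier (zero error); with a positive regularization strength, the fairness residual shrinks at the cost of a nonzero error. Sweeping `lam` traces out one trade-off curve for the chosen fairness matrix.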

5 Incompatible fairness definitions

In this section, we show how PFOPs (usually LAFOPs) can be used to derive existing and new fairness incompatibility results. Proofs are given in the Appendix.

5.1 The incompatibility of {CG, PCB, NCB}, and generalizations thereof

Sets of fairness definitions Necessary conditions
{CG, PP, DP, and any of EOp, PE, PCB, NCB, EFOR} and
{CG, DP, and any of EOp, PE, PCB, NCB, EFOR} EBR only
or EBR
or EBR
or EBR
Table 3: Some example sets of fairness definitions containing CG, which are rank-0 compatible (i.e. incompatible) in the sense of Definition 4 (left column), together with their necessary conditions to be compatible (right column). EBR is the equal base rate condition, $M_1 / N_1 = M_0 / N_0$. These are all special cases of Theorem 1, though the list is not exhaustive.

The well-known impossibility theorem of [16] can be proved simply by asking if and when solutions exist to its corresponding LAFOP. The proof uses only elementary linear algebra.

Definition 4.

Let $\mathcal{F}$ be a set of linear fairness functions, encoded by the stacked matrix $A_{\mathcal{F}}$, and $d$ be the dimensionality of the solution space $\{ z \in \mathcal{K} : A_{\mathcal{F}} z = 0 \}$. Then, $\mathcal{F}$ is a set of rank-$d$ compatible fairnesses. If $d > 0$, then $\mathcal{F}$ is a set of compatible fairnesses. Otherwise, $\mathcal{F}$ is termed a set of incompatible notions of fairness. An incompatible set is termed rank-0 compatible if its solution space is not empty.

The notion of rank-0 compatibility captures situations when the set of fairness notions is compatible but only under exceptional, data-dependent constraints (i.e. incompatible).

For linear fairness functions $\mathcal{F}$, compatibility is equivalent to having infinitely many solutions to

$$A_{\mathrm{const}} z = b_{\mathrm{const}}, \qquad A_{\mathcal{F}} z = 0, \qquad z \ge 0.$$
Below, we introduce a general version of the impossibility theorem of [16] that leads to many other results.

Theorem 1.

Let be the number of bins in the definition of calibration within groups fairness (CG) [16], and , be the scores, with , and with . Then, the corresponding PFOP (8) has the only solution


and only when


Otherwise, no solution exists.

Corollary 1 ([16]).

Consider a classifier that satisfies CG, PCB and NCB fairness simultaneously. Then, at least one of the following statements is true:

  1. the data have equal base rates for each class $a$, i.e. $M_1 / N_1 = M_0 / N_0$, or

  2. the classifier has perfect prediction, i.e. $\mathrm{FP}_a = 0$ and $\mathrm{FN}_a = 0$ for each class $a$.

In other words, {CG, PCB, NCB} is rank-0 compatible.

Theorem 1 yields many similar results regarding the incompatibility of CG and other notions of fairness.

Corollary 2.

Consider a classifier that satisfies CG and DP fairness simultaneously. Then, the data have equal base rates for each group $a$. In other words, {CG, DP} is rank-0 compatible.

The same approach can be extended easily to quadratic notions of fairness.

Corollary 3.

Consider a classifier that satisfies CG and PP fairness simultaneously. Then, at least one of the following is true:

  1. .

  2. .

In other words, {CG, PP} is rank-0 compatible.

Similar constructions yield many other sets of fairness definitions in Table 2 that have similar impossibility criteria, as summarized in Table 3. To our knowledge, all cases other than {CG, PCB, NCB} are new.

5.2 The incompatibility of {EFPR, EFNR, PP}

In this section, we re-derive an impossibility result described in [7] using the LAFOP formulation, as in the previous section, and provide more precise conditions for compatibility. Details of the proof are in the Appendix.

Theorem 2 ([7]).

{EFPR, EFNR, PP} is a rank-0 compatible set of fairness notions. For a classifier that satisfies EFPR, EFNR, and PP simultaneously, at least one of these statements must be true:

  1. The classifier has no true positives.

  2. The classifier has no false positives.

  3. Each protected class has the same base rate.

Theorem 2 shows that equal false positive rates, equal false negative rates, and predictive parity are compatible only under exceptional, data-dependent circumstances.

6 Experiments

In this section, we show how the FACT diagnostic can be useful in practice by considering various scenarios. First we compare the FACT Pareto frontier of equalized odds to existing published results of fair models (Section 6.2). The FACT Pareto frontier characterizes a model’s achievable accuracy for a given set of fairness constraints, and helps us to assess the suboptimality of existing methods and to contextualize the FACT diagnostic in terms of recent work. Next, we further explore the cases when multiple fairness constraints are imposed in LAFOP, both in exact (Section 6.3) and relaxed (Sections 6.4, 6.5) settings. By observing these cases we draw a clear picture of how several notions of fairness and accuracy interact with one another, and demonstrate the relative impact of individual notions of fairness when imposing multiple notions simultaneously.

6.1 Datasets

We study a synthetic dataset similar to that in [29], consisting of two-dimensional features along with a single binary protected attribute that is either sampled from an independent Bernoulli distribution (“unbiased” variant, denoted S(U)), or sampled dependent on the features (“biased” variant, denoted S(B)). More details on the sampling distribution are in Section H.1. We also study the UCI Adult dataset [10], a census dataset with protected attributes often used to predict whether or not an individual has high income. We consider sex as the only protected attribute.

6.2 Interpreting the FACT Pareto frontier

Figure 1: Model-agnostic FACT Pareto frontier of accuracy and equalized odds fairness for the Adult dataset. Three fair models (FGP, Eq.Odd., Op.) are plotted by varying the strength of the fairness criteria imposed, along with the baseline models (LR, SVM, RF). The FACT Pareto frontier here is plotted by taking the reasonable empirical estimate of the Bayes error into account.

With LAFOP, one can naturally consider a FACT Pareto frontier by plotting the values of $(\epsilon, \delta)$-solutions obtained under a certain fairness constraint. In this section, we want to highlight the importance of this frontier in the context of several published fair models in the literature as well as its implications.

We consider three fair models: FGP [26], Op. [29], and Eq.Odd. [13], individually representing three different approaches one can take in training fair models (imposing fairness before, during, or after training). We also train some baseline models (logistic regression, SVM, random forest) for reference.

The FACT Pareto frontier can be interpreted as characterizing the model’s achievable accuracy relative to the Bayes error (i.e., the degree to which the added fairness constraints adversely impact the Bayes error). A wide range of ML models have been applied to the UCI Adult dataset [6], and thus we have a reasonable empirical estimate of the Bayes error (around 0.12). Combining this empirical estimate of the Bayes error with the output of the FACT diagnostic, we obtain the FACT Pareto frontier shown in Figure 1 for the Adult dataset with respect to Equalized Odds. Indeed, we observe that the FACT Pareto frontier provides a reasonable upper bound for the three existing fair models considered, and moreover shows that accuracy starts to suffer only for quite small fairness gaps $\epsilon$. In the following sections and figures, because reasonable estimates of the Bayes error are not available for most datasets, $\delta$ should be interpreted in reference to the Bayes error, i.e. $\delta = 0$ means that the upper bound of the best-achievable accuracy is the accuracy of the Bayes classifier, not 1.

6.3 Multiple exact fairness constraints

We are now interested in a more complicated setting, when a group of fairness constraints are required to be exactly satisfied. In the context of LAFOP, we are interested in the $\delta$ values of $(0, \delta)$-solutions. The optimal $\delta$ values are shown in Table 4 in terms of the best attainable accuracy ($1 - \delta$) under several fairness constraints, which are imposed as strict equality constraints of the form $A z = 0$. Again, as in Section 6.2, the reported values indicate the relative accuracy when we set 1.0 to be the accuracy of the Bayes classifier. For instance, in the unbiased synthetic dataset (first column), while {EOd, DP} are compatible, the best attainable accuracy under these fairness constraints drops by over 60 percent relative to the accuracy of the Bayes classifier. Another observation is that under EOd and DP, we can additionally have CB, PE, and PCB “for free”, with no additional drop in the best attainable accuracy. Similar trends hold for the biased synthetic and the Adult dataset as well (middle, right columns), where {EOd, DP} determines $\delta$ for the group of fairness definitions in color blue. As expected, the bottom two rows of the table verify the impossibility results from Section 5 for all three datasets. Note that this table is generalizable to an arbitrary number of fairness constraints imposed on the optimization.
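When the fairness constraints are linear and imposed exactly, the problem reduces to a linear program, so an entry of such a table can be sketched with SciPy's `linprog`. The code below is an illustrative sketch under assumptions: the entry ordering (TP, FN, FP, TN per group, favored group first), hypothetical group sizes, and, for CG, two assumed score bins `v0 = 0.2` and `v1 = 0.8`. Accuracy is reported relative to a perfect reference classifier, and `None` plays the role of a dash (no feasible tensor).

```python
import numpy as np
from scipy.optimize import linprog

# Assumed ordering: z = (TP1, FN1, FP1, TN1, TP0, FN0, FP0, TN0) / N
C_ERR = np.array([0, 1, 1, 0, 0, 1, 1, 0], dtype=float)  # error = FN + FP

def marginal_constraints(n1, m1, n0, m0):
    A = np.kron(np.eye(4), np.ones((1, 2)))  # sums consecutive pairs of z
    b = np.array([m1, n1 - m1, m0, n0 - m0], dtype=float) / (n1 + n0)
    return A, b

def fairness_rows(n1, m1, n0, m0, v0=0.2, v1=0.8):
    """Linear fairness definitions as matrices A with A @ z == 0 (v0, v1 are assumed)."""
    n = n1 + n0
    rows = {
        "DP": np.array([[n0, 0, n0, 0, -n1, 0, -n1, 0]], dtype=float) / n,
        "EOp": np.array([[m0, 0, 0, 0, -m1, 0, 0, 0]], dtype=float) / n,
        "PE": np.array([[0, 0, n0 - m0, 0, 0, 0, -(n1 - m1), 0]], dtype=float) / n,
    }
    rows["EOd"] = np.vstack([rows["EOp"], rows["PE"]])
    cg = np.zeros((4, 8))
    for g, off in enumerate((0, 4)):  # offsets of the two groups inside z
        cg[2 * g, off + 0], cg[2 * g, off + 2] = 1 - v1, -v1          # (1-v1) TP = v1 FP
        cg[2 * g + 1, off + 1], cg[2 * g + 1, off + 3] = 1 - v0, -v0  # (1-v0) FN = v0 TN
    rows["CG"] = cg
    return rows

def best_accuracy(names, n1, m1, n0, m0):
    """Largest 1 - delta over tensors satisfying all named constraints, or None."""
    A_const, b_const = marginal_constraints(n1, m1, n0, m0)
    rows = fairness_rows(n1, m1, n0, m0)
    A_fair = np.vstack([rows[k] for k in names])
    A_eq = np.vstack([A_const, A_fair])
    b_eq = np.concatenate([b_const, np.zeros(len(A_fair))])
    res = linprog(C_ERR, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return 1.0 - res.fun if res.success else None

sizes = dict(n1=60, m1=30, n0=40, m0=10)  # hypothetical; base rates 0.5 and 0.25
results = {tuple(s): best_accuracy(s, **sizes)
           for s in (["EOd"], ["DP"], ["EOd", "DP"], ["CG"], ["CG", "DP"])}
for names, acc in results.items():
    print(names, acc)
```

For these toy sizes the base rates differ, so combining CG with DP has no feasible tensor, consistent with Corollary 2, while {EOd, DP} remains feasible but only at the accuracy of the trivial all-negative classifier.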

Fairnesses               S(U)    S(B)    Adult   Color
PCB, CB                  1.000   1.000   1.000   magenta
PE, NCB                  1.000   1.000   1.000   magenta
PCB, DP                  0.979   0.793   0.935   green
EOd, DP                  0.392   0.392   0.763   blue
EOd, PCB, DP             0.392   0.392   0.763   blue
EOd, CB, PE, DP          0.392   0.392   0.763   blue
EOd, CB, PE, EOp, DP     0.392   0.392   0.763   blue
PCB, NCB, CG             -       -       -       black
CG, CB, EOp, DP          -       -       -       red

Table 4: The best accuracy achievable (1.0 being the accuracy of the Bayes classifier) for any classifier that satisfies multiple fairnesses exactly on the synthetic datasets (unbiased, denoted as S(U), and biased, denoted as S(B)) and the Adult dataset (Section 6.1). The numbers are the largest $1 - \delta$ (and hence the smallest $\delta$) attainable in a $(0, \delta)$-solution to the performance–fairness optimality problem, as in Definition 2. A dash indicates that the optimization could not find feasible solutions, i.e. the set of fairness definitions is incompatible. This table shows which fairness comes “for free”, in the sense that additional fairness can be satisfied without an extra drop in accuracy (increase in $\delta$). The groups of fairnesses are colored according to their maximum attainable $1 - \delta$ values (from top to bottom: magenta, green, blue, black, and red).
Figure 2: $(\epsilon, \delta)$ curves for the unbiased synthetic dataset (Section 6.1, left), biased synthetic dataset (Section 6.1, middle), and Adult dataset (Section 6.1, right). Accuracy axes are equal on all plots. Note that the $\epsilon$-axis is inverted ($\epsilon$ decreases from left to right), and the $\delta$ values should be considered in reference to the Bayes error (i.e. $1 - \delta = 1$ represents the accuracy of the Bayes classifier rather than a perfect accuracy). The colors are consistent with the groups of fairness definitions in Table 4: groups that demonstrate similar behavior are plotted under the same color. The solid line is a trajectory of $(\epsilon, \delta)$-solutions when directly varying $\epsilon$ and optimizing (14), and crosses are $(\epsilon, \delta)$-solutions obtained from (10) with varying $\lambda$s. All other groups of definitions except for those in red and black (which are incompatible in the sense of Definition 4) converge to the value reported in Table 4 as $\epsilon \to 0$. Incompatible groups (red, black) have halted trajectories before hitting smaller values of $\epsilon$, meaning the optimization is not feasible.

6.4 Multiple approximate fairness constraints

In practice, we may have situations where fairness constraints need not be satisfied exactly, either because the application does not require exact fairness, or because exact fairness is impossible to attain. To account for approximate solutions with $\epsilon > 0$, we examine how $(\epsilon, \delta)$-solutions for different groups of fairness constraints change with varying amounts of relaxation. Figure 2 shows this in two different ways:

  1. $(\epsilon, \delta)$-solutions obtained when we impose fairness conditions as inequality constraints instead of as regularizers, i.e. solving

     $$\arg\min_{z \in \mathcal{K}} \; c^\top z \quad \text{subject to} \quad \lVert A z \rVert_2^2 \le \epsilon \tag{14}$$

    while varying $\epsilon$ (drawn as solid lines), and

  2. $(\epsilon, \delta)$-solutions obtained from the LAFOP (10) while varying $\lambda$s (drawn as crosses).

It confirms that the regularization approach works just like hard constraints on the optimization.
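For a single linear fairness row, bounding the squared residual by a tolerance is the same as bounding the absolute residual by the square root of that tolerance, so each point on the relaxed frontier can be sketched as a linear program. The code below assumes the entry ordering (TP, FN, FP, TN per group, favored group first), uses demographic parity as the only constraint, and relies on hypothetical group sizes; with several stacked rows a quadratic-constraint solver would be needed instead.

```python
import numpy as np
from scipy.optimize import linprog

# Assumed ordering: z = (TP1, FN1, FP1, TN1, TP0, FN0, FP0, TN0) / N
C_ERR = np.array([0, 1, 1, 0, 0, 1, 1, 0], dtype=float)  # error = FN + FP

def marginal_constraints(n1, m1, n0, m0):
    A = np.kron(np.eye(4), np.ones((1, 2)))  # sums consecutive pairs of z
    b = np.array([m1, n1 - m1, m0, n0 - m0], dtype=float) / (n1 + n0)
    return A, b

def min_error_relaxed_dp(eps, n1, m1, n0, m0):
    """Smallest error delta subject to (A_DP z)^2 <= eps over valid tensors."""
    A_const, b_const = marginal_constraints(n1, m1, n0, m0)
    row = np.array([[n0, 0, n0, 0, -n1, 0, -n1, 0]], dtype=float) / (n1 + n0)
    t = np.sqrt(eps)  # |A_DP z| <= sqrt(eps), written as two one-sided inequalities
    res = linprog(C_ERR,
                  A_ub=np.vstack([row, -row]), b_ub=[t, t],
                  A_eq=A_const, b_eq=b_const, bounds=(0, None))
    return res.fun if res.success else None

sizes = dict(n1=60, m1=30, n0=40, m0=10)  # hypothetical dataset summary
eps_grid = [0.0, 1e-4, 9e-4, 3.6e-3]
frontier = [(eps, min_error_relaxed_dp(eps, **sizes)) for eps in eps_grid]
for eps, delta in frontier:
    print(eps, delta)
```

On this toy example the error falls linearly in the allowed residual and reaches zero once the tolerance admits the perfect classifier's own demographic parity gap.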

In Figure 2, we show $(\epsilon, \delta)$ curves for the groups of fairness definitions used in Table 4 with corresponding colors in all three datasets. Looking side by side, the curves in Figure 2 all converge to the values reported in Table 4 for each group, and our findings from Table 4 regarding the dominance of EOd and DP in the blue group are verified, as the blue curves overlap with each other and show similar behaviors.

Comparing the blue and green groups, we can see how imposing EOd fairness instead of PCB fairness jointly with DP fairness can introduce a comparably larger drop in accuracy (EOd indeed is a stronger definition of fairness than PCB). Groups that contain incompatible fairness definitions, like {PCB, NCB, CG} and {CG, CB, EOp, DP}, have halted trajectories before hitting much smaller $\epsilon$ values for all three datasets. The trajectories confirm the theoretical impossibility results ($\epsilon = 0$ is unattainable), while also showing how much relaxation is needed for these definitions to be approximately compatible. Because the unbiased synthetic dataset has approximately equal base rates compared to the Adult or the biased synthetic dataset, the incompatibility is less strict (black line). All such analyses on trade-offs can be performed before training any models on these datasets. Through the lens of LAFOP, we are able to quantify and visualize the trade-offs using changes in $(\epsilon, \delta)$ values for different groups of fairness definitions, which is otherwise not captured in such a model-agnostic manner.

6.5 Multi-way fairness–accuracy trade-offs

Figure 3: Fairness–fairness–accuracy trade-off analysis using contour plots of accuracy with varying regularization strengths of Demographic Parity (DP) and Equalized Odds (EOd) for the unbiased synthetic dataset (left), biased synthetic dataset (middle), and Adult dataset (right). The contours show how the regularization strength of each fairness individually influences the accuracy ($1 - \delta$) given the other (accuracy of 1.0 being the accuracy of the Bayes classifier). For the unbiased synthetic data, the accuracy change along the vertical axis (DP) is practically nonexistent given EOd, while along the horizontal axis (EOd) the change is drastic. Other datasets demonstrate more complex relationships.

Until now, we have only considered situations where zero or one parameter is sufficient to simultaneously specify the fairness strength for every fairness function, i.e. $\lambda_1 = \dots = \lambda_K = \lambda$. In this section, we generalize this and allow each regularization parameter to vary freely. It is then natural to consider the multi-linear least-squares accuracy–fairness optimality problem (MLAFOP):

$$\arg\min_{z \in \mathcal{K}} \; c^\top z + \sum_{j=1}^{K} \lambda_j \lVert A_j z \rVert_2^2,$$

where the regularization parameters $\lambda_j$ now take different values across each of the fairness constraints. This allows for a general inspection of the individual effect of fairness constraints in a group.
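One way to sketch the MLAFOP numerically is to note that a weighted sum of squared residuals equals a single squared norm after scaling each constraint block by the square root of its weight. The code below uses that observation together with SciPy's SLSQP solver; the entry ordering (TP, FN, FP, TN per group, favored group first), the group sizes, and the small grid of regularization strengths are all assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Assumed ordering: z = (TP1, FN1, FP1, TN1, TP0, FN0, FP0, TN0) / N
C_ERR = np.array([0, 1, 1, 0, 0, 1, 1, 0], dtype=float)  # error = FN + FP

def marginal_constraints(n1, m1, n0, m0):
    A = np.kron(np.eye(4), np.ones((1, 2)))  # sums consecutive pairs of z
    b = np.array([m1, n1 - m1, m0, n0 - m0], dtype=float) / (n1 + n0)
    return A, b

def solve_mlafop(blocks, lams, n1, m1, n0, m0):
    """blocks: list of (k x 8) fairness matrices; lams: one weight >= 0 per block.
    Returns (delta, objective value) at the computed minimizer."""
    A_const, b_const = marginal_constraints(n1, m1, n0, m0)
    # sum_j lam_j ||A_j z||^2 == ||W z||^2 with W stacking sqrt(lam_j) * A_j
    W = np.vstack([np.sqrt(l) * np.atleast_2d(A) for A, l in zip(blocks, lams)])

    def objective(z):
        return C_ERR @ z + np.sum((W @ z) ** 2)

    def gradient(z):
        return C_ERR + 2.0 * (W.T @ (W @ z))

    z0 = np.repeat(b_const, 2) / 2.0  # feasible start: split each marginal evenly
    res = minimize(objective, z0, jac=gradient, method="SLSQP",
                   bounds=[(0.0, 1.0)] * 8,
                   constraints=[{"type": "eq",
                                 "fun": lambda z: A_const @ z - b_const,
                                 "jac": lambda z: A_const}],
                   options={"ftol": 1e-12, "maxiter": 1000})
    return float(C_ERR @ res.x), float(res.fun)

sizes = dict(n1=60, m1=30, n0=40, m0=10)  # hypothetical dataset summary
A_dp = np.array([[40, 0, 40, 0, -60, 0, -60, 0]], dtype=float) / 100
A_eod = np.array([[10, 0, 0, 0, -30, 0, 0, 0],             # equalized opportunity
                  [0, 0, 30, 0, 0, 0, -30, 0]], dtype=float) / 100  # predictive equality

grid = [0.0, 100.0]
table = {(l_dp, l_eod): solve_mlafop([A_dp, A_eod], [l_dp, l_eod], **sizes)
         for l_dp in grid for l_eod in grid}
for key, (delta, obj) in table.items():
    print(key, delta, obj)
```

Evaluating the error on a denser grid of the two strengths gives the data for a contour plot over the two regularization axes.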

As an example, a three-way trade-off between EOd, DP, and accuracy can be visualized as a contour plot, similar to the ones shown in Figure 3. Such contours allow us to interpolate multiple points in the plane and observe the changes in $\delta$ more effectively for different settings of the fairness constraints. For the unbiased synthetic dataset (left), EOd dominates DP in terms of the sensitivity of $\delta$ to the changes in regularization strength, as the variation of $\delta$ is the strongest along the horizontal axis. On the other hand, the relationship is relatively less one-sided for the biased synthetic and the Adult dataset.

For general multi-way trade-offs involving several fairness constraints and accuracy, it may be convenient to visualize two-dimensional slices of the general four-dimensional surface. We plot the accuracy values obtained from solving the MLAFOP while varying the regularization strength of one fairness constraint and keeping the other strengths fixed. By observing how accuracy changes in these slices, it is possible to implicitly rank the fairness measures in terms of their influence on accuracy by considering the sensitivity of accuracy to changes in the corresponding regularization strength.

Figure 4: The four-way trade-off between accuracy, PCB, EOd, and DP in the biased synthetic dataset (Section 6.1). Shown here is the accuracy as a function of the regularization strength of one fairness function, while holding all other regularization strengths constant (accuracy of 1.0 being the accuracy of the Bayes classifier). The value next to each colored line in the legend represents the constant value of the fixed regularization strengths. Sweeping through PCB while keeping DP and EOd fixed (left) does not change the accuracy, whereas the other plots show multiple levels of variation. For EOd (right), the accuracy levels converge quickly to the limiting value of 0.392 as reported in Table 4, suggesting that the accuracy is more sensitive to changes in EOd constraint strength than to the others.

For example, consider a four-way trade-off between a group of three fairness definitions (DP, EOd, PCB) and accuracy. Table 4 already showed that imposing PCB given (DP, EOd) does not affect accuracy, which may imply that PCB is the weakest in terms of its influence on accuracy. To get more information, for the biased synthetic dataset, we show in Figure 4 three cases of sweeping through the regularization strength of each fairness constraint while keeping those of the others fixed. Sweeping through the PCB condition (left) does not affect accuracy at fixed EOd and DP levels, confirming the observation from Table 4. Sweeping through the DP condition while keeping the PCB and EOd strengths fixed (middle) results in a slight drop, but not one big enough to make all levels converge to the value reported in Table 4 (0.392). Sweeping through EOd while keeping the PCB and DP strengths fixed (right), on the other hand, results in significant changes for all levels and convergence to the value 0.392, suggesting that EOd is stronger than DP in terms of its influence on accuracy. This notion of the relative influence of fairness definitions deserves further investigation, to see whether these preliminary results are robust across other slices and datasets. Nonetheless, such analysis allows practitioners to better understand the limitations of any models they train or evaluate, while also giving a clear picture of how different notions of fairness interact with one another when they are imposed together.

7 Conclusions

The FACT diagnostic facilitates model-agnostic reasoning about different kinds of trade-offs involving arbitrarily many notions of performance and fairness that can be expressed as functions of the fairness–confusion tensor. In our formalism, many fairness definitions in the literature are in fact linear or quadratic, and are thus easy to impose as constraints in the PFOP. The FACT formalism further allows us to use elementary linear algebra and convex optimization theory to derive many existing and new theoretical results regarding fairness–fairness trade-offs and fairness–performance trade-offs. We have also empirically validated the practical use of the FACT diagnostic in several scenarios. Many of the presented results require only linear fairness functions, as in the LAFOP setting. Nevertheless, it is easy to extend this to quadratic fairness functions with more varied performance metrics depending on different use cases. Furthermore, analyzing multi-way trade-offs suggests additional data-dependent properties to investigate, such as the relative ordering of fairness notions by sensitivity.


This work was supported in part by DARPA FA875017C0141, the National Science Foundation grants IIS1705121 and IIS1838017, an Okawa Grant, a Google Faculty Award, an Amazon Web Services Award, a JP Morgan A.I. Research Faculty Award, and a Carnegie Bosch Institute Research Award. JK is supported in part by Kwanjeong Fellowship. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of DARPA, the National Science Foundation, or any other funding agency.

This paper was prepared for information purposes by the Artificial Intelligence Research group of JPMorgan Chase & Co and its affiliates (“JP Morgan”), and is not a product of the Research Department of JP Morgan. JP Morgan makes no representation and warranty whatsoever and disclaims all liability, for the completeness, accuracy or reliability of the information contained herein. This document is not intended as investment research or investment advice, or a recommendation, offer or solicitation for the purchase or sale of any security, financial instrument, financial product or service, or to be used in any way for evaluating the merits of participating in any transaction, and shall not constitute a solicitation under any jurisdiction or to any person, if such solicitation under such jurisdiction or to such person would be unlawful. © 2020 JPMorgan Chase & Co. All rights reserved.


References

  • [1] R. K. Bellamy, K. Dey, M. Hind, S. C. Hoffman, S. Houde, K. Kannan, P. Lohia, J. Martino, S. Mehta, A. Mojsilovic, et al. (2018) AI fairness 360: an extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias. arXiv preprint arXiv:1810.01943.
  • [2] R. Berk, H. Heidari, S. Jabbari, M. Kearns, and A. Roth (2018) Fairness in criminal justice risk assessments: the state of the art. Sociological Methods & Research. arXiv:1703.09207.
  • [3] T. Calders, F. Kamiran, and M. Pechenizkiy (2009) Building classifiers with independency constraints. In 2009 IEEE International Conference on Data Mining Workshops.
  • [4] T. Calders and S. Verwer (2010) Three naive Bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery 21 (2).
  • [5] L. E. Celis, L. Huang, V. Keswani, and N. K. Vishnoi (2019) Classification with fairness constraints: a meta-algorithm with provable guarantees. In Proceedings of the Conference on Fairness, Accountability, and Transparency.
  • [6] N. Chakrabarty and S. Biswas (2018) A statistical approach to adult census income level prediction. In 2018 International Conference on Advances in Computing, Communication Control and Networking (ICACCCN).
  • [7] A. Chouldechova (2017) Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. Big Data 5 (2).
  • [8] K. Crawford, R. Dobbe, T. Dryer, G. Fried, B. Green, E. Kaziunas, A. Kak, V. Mathur, E. McElroy, A. N. Sánchez, D. Raji, J. L. Rankin, R. Richardson, J. Schultz, S. M. West, and M. Whittaker (2019) AI now 2019 report. AI Now Institute, New York.
  • [9] G. B. Dantzig (1963) Linear programming and extensions. Technical Report R-366-PR, RAND Corporation, Santa Monica, California.
  • [10] D. Dua and C. Graff (2017) UCI machine learning repository. University of California, Irvine, School of Information and Computer Sciences.
  • [11] C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel (2012) Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference.
  • [12] M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian (2015) Certifying and removing disparate impact. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’15.
  • [13] M. Hardt, E. Price, and N. Srebro (2016) Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems.
  • [14] E. Jones, T. Oliphant, P. Peterson, et al. (2001) SciPy: open source scientific tools for Python.
  • [15] F. Kamiran, T. Calders, and M. Pechenizkiy (2010) Discrimination aware decision tree learning. In 2010 IEEE International Conference on Data Mining.
  • [16] J. Kleinberg, S. Mullainathan, and M. Raghavan (2017) Inherent trade-offs in the fair determination of risk scores. In Proceedings of the 8th Innovations in Theoretical Computer Science Conference.
  • [17] D. Kraft (1988) A software package for sequential quadratic programming. Technical Report DFVLR-FB 88-28, Institut für Dynamik der Flugsysteme, Deutsche Forschungs- und Versuchsanstalt für Luft- und Raumfahrt (DFVLR).
  • [18] D. Kraft (1994) Algorithm 733: TOMP–Fortran modules for optimal control calculations. ACM Transactions on Mathematical Software 20 (3).
  • [19] L. T. Liu, M. Simchowitz, and M. Hardt (2019) The implicit fairness criterion of unconstrained learning. In Proceedings of the 36th International Conference on Machine Learning.
  • [20] D. Madras, E. Creager, T. Pitassi, and R. Zemel (2018) Learning adversarially fair and transferable representations. In Proceedings of the 35th International Conference on Machine Learning.
  • [21] A. K. Menon and R. C. Williamson (2018) The cost of fairness in binary classification. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency.
  • [22] A. Narayanan (2018) Translation tutorial: 21 fairness definitions and their politics. In Proceedings of the Conference on Fairness, Accountability and Transparency, New York, USA.
  • [23] G. Pleiss, M. Raghavan, F. Wu, J. Kleinberg, and K. Q. Weinberger (2017) On fairness and calibration. In Advances in Neural Information Processing Systems.
  • [24] S. Samadi, U. Tantipongpipat, J. H. Morgenstern, M. Singh, and S. Vempala (2018) The price of fair PCA: one extra dimension. In Advances in Neural Information Processing Systems.
  • [25] J. Song, P. Kalluri, A. Grover, S. Zhao, and S. Ermon (2019) Learning controllable fair representations. In Proceedings of Machine Learning Research.
  • [26] Z. Tan, S. Yeom, M. Fredrikson, and A. Talwalkar (2020) Learning fair representations for kernel models. In Proceedings of The 23rd International Conference on Artificial Intelligence and Statistics. arXiv:1906.11813.
  • [27] S. Verma and J. Rubin (2018) Fairness definitions explained. In Proceedings of the International Workshop on Software Fairness.
  • [28] M. B. Zafar, I. Valera, M. Gomez Rodriguez, and K. P. Gummadi (2017) Fairness beyond disparate treatment & disparate impact: learning classification without disparate mistreatment. In Proceedings of the 26th International Conference on World Wide Web.
  • [29] M. B. Zafar, I. Valera, M. G. Rodriguez, and K. P. Gummadi (2015) Fairness constraints: mechanisms for fair classification. In Proceedings of the Second Workshop on Fairness, Accountability, and Transparency in Machine Learning. arXiv:1507.05259.
  • [30] R. Zemel, Y. Wu, K. Swersky, T. Pitassi, and C. Dwork (2013) Learning fair representations. In Proceedings of the 30th International Conference on Machine Learning.
  • [31] H. Zhao and G. J. Gordon (2019) Inherent tradeoffs in learning fair representations. In Advances in Neural Information Processing Systems. arXiv:1906.08386.
  • [32] I. Žliobaitė (2015) On the relation between accuracy and fairness in binary classification. In Proceedings of the Second Workshop on Fairness, Accountability, and Transparency in Machine Learning. arXiv:1505.05723.

Appendix A Proof of Theorem 1

A useful strategy is to solve (11) for a set of solutions, then ask whether any of these solutions satisfies an additional fairness constraint. This proof, as well as many of the ones below, illustrates this strategy in practice.
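The strategy can be sketched numerically for a generic linear system; this is an illustration under stated assumptions and does not reproduce the system (11). The sketch writes the solutions of A z = b as a particular solution plus the null space of A, then checks how an extra linear constraint g · z = 0 behaves on that family. The toy matrices (label marginals plus EOd as the base system, DP as the extra constraint), the group counts, the function names, and the three return labels are hypothetical, and the check ignores the requirement that the entries of z lie in [0, 1].

```python
import numpy as np
from scipy.linalg import null_space


def toy_system(n1_pos, n1_neg, n0_pos, n0_neg):
    """Marginal + EOd equations (A, b) and a DP row g for given group counts.

    Assumed ordering: z = (TP1, FN1, FP1, TN1, TP0, FN0, FP0, TN0) / N.
    """
    n1, n0 = n1_pos + n1_neg, n0_pos + n0_neg
    n = n1 + n0
    marg = np.kron(np.eye(4), np.ones((1, 2)))
    eod = n * np.array([
        [1 / n1_pos, 0, 0, 0, -1 / n0_pos, 0, 0, 0],   # TPR gap
        [0, 0, 1 / n1_neg, 0, 0, 0, -1 / n0_neg, 0],   # FPR gap
    ])
    A = np.vstack([marg, eod])
    b = np.concatenate([np.array([n1_pos, n1_neg, n0_pos, n0_neg]) / n,
                        [0.0, 0.0]])
    g = n * np.array([1 / n1, 0, 1 / n1, 0, -1 / n0, 0, -1 / n0, 0])
    return A, b, g


def classify_extra_constraint(A, b, g, tol=1e-9):
    """Classify g @ z = 0 against the solution family of A @ z = b."""
    z_p = np.linalg.lstsq(A, b, rcond=None)[0]   # one particular solution
    if not np.allclose(A @ z_p, b):
        raise ValueError("A z = b has no solution")
    Z = null_space(A)                            # directions that keep A z = b
    if np.all(np.abs(g @ Z) < tol):
        # g is constant over the whole family: satisfied by all or by none.
        return "implied" if abs(g @ z_p) < tol else "incompatible"
    return "restricts"                           # satisfied only on a subfamily


equal_rates = classify_extra_constraint(*toy_system(20, 30, 20, 30))
unequal_rates = classify_extra_constraint(*toy_system(30, 20, 10, 40))
print(equal_rates, unequal_rates)
```

With equal base rates in the two groups, every solution that satisfies EOd also satisfies DP, so the extra constraint comes at no additional cost; with unequal base rates it cuts the family down to a smaller one.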


First, set and in (11). Since , the matrix is full rank and therefore admits the solution (12). Considering yields immediately the condition (13).

Next, set . Then either is a solution (which is the case when all other fairnesses are linear and linearly dependent on ), or otherwise no solution exists to both (11) and simultaneously. ∎

This theorem states that {CG} is incompatible when , since it is a singleton set of rank-0 compatible fairness.

The condition is necessary in Theorem 1, which is reasonable to assume as we would expect the positive class to have a higher score than the negative class in the definition of CG. We can prove the necessity of this condition by contradiction. In the degenerate case , {CG} is a set of rank-2 compatible fairnesses. It turns out that (11) with has rank only 6. Denoting the i-th row of the matrix by ⓘ, we have two linear dependencies, and . There is no longer a unique solution to the LAFOP; instead, we have a two-parameter family of solutions,


Furthermore, this family of solutions satisfies if and only if , i.e. the base rates are equal and furthermore the score for both bins is equal to the base rate.

Appendix B Proof of Corollary 1


Consider the product


This product equals the zero vector (and hence satisfies both PCB and NCB) if and only if either of the conditions of the Corollary hold. (The last solution, and , is inadmissible since by assumption.) ∎

Appendix C Proof of Corollary 2


The result follows from solving


Appendix D Proof of Corollary 3


The result follows from solving


which is true if and only if either condition in the Corollary is true. (The last case, , is inadmissible by assumption.) ∎

Appendix E Proof of Corollary 4

In addition, here is a situation of fairness “for free”, in the sense that one notion of fairness automatically implies another.

Corollary 4.

Consider a classifier that satisfies CG fairness. Then, the classifier also satisfies EFOR fairness. In other words, {CG, EFOR} is rank-0 compatible.


vanishes identically. ∎

Appendix F Proof of Theorem 2


Finding the solution to and also the linear system yields the three conditions of the Theorem. ∎

Appendix G CG–accuracy trade-offs

In the main text, we have only considered the LAFOP in the case where the fairness criteria are satisfied exactly, which yields several fairness–fairness trade-off results without regard to the accuracy of the classifiers. Nonetheless, recall that the LAFOP allows us to express both fairness–accuracy and fairness–fairness trade-offs by introducing an accuracy objective along with a fairness regularizer. In this section, we show how the LAFOP can be used to theoretically analyze a simple fairness–accuracy trade-off. We now describe a small result that is relevant to the CG–accuracy trade-off considered in [19].

Theorem 3.

Let be the base rate. Consider a classifier that satisfies CG with . Then, perfect accuracy is attained if and only if


The case of necessity () follows immediately from solving , where is defined in Theorem 1. The inequality conditions follow immediately from the constraint . The case of sufficiency () follows immediately from Theorem 1 and substituting the equality condition. ∎

The condition of this theorem relates the scores and to the base rate of the data, thus providing simple, explicit data dependencies that are necessary and sufficient.

Appendix H Experiment Details

H.1 Dataset

The synthetic dataset consists of two-dimensional data that follow the Gaussian distributions


We further introduce two cases for the protected attribute.

Unbiased synthetic dataset

The protected attribute value is independent of and , and is instead distributed according to the Bernoulli distribution . This notion of fairness was described in [3].

Biased synthetic dataset

The protected attribute value is assigned as . This dataset models a situation when some features (but not all) encode a protected attribute.
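As a rough illustration of the two cases, the sketch below generates class-conditional Gaussian data and assigns the protected attribute either independently or from a single feature. The distribution parameters and the assignment rule used in the experiments are not reproduced here; the class means, unit covariance, Bernoulli parameter of 0.5, and the thresholding rule on the first feature are all hypothetical stand-ins chosen only to show the qualitative difference between the two datasets.

```python
import numpy as np

# Hypothetical class-conditional means (not the values used in the paper).
MEAN_POS = np.array([1.0, 1.0])
MEAN_NEG = np.array([-1.0, -1.0])


def make_synthetic(n=4000, biased=False, seed=0):
    """Return features x, labels y, and protected attribute a."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    means = np.where(y[:, None] == 1, MEAN_POS, MEAN_NEG)
    x = means + rng.standard_normal((n, 2))      # unit covariance (assumed)
    if biased:
        # Assumed rule: one feature (but not both) encodes the attribute.
        a = (x[:, 0] > 0).astype(int)
    else:
        # Attribute drawn independently of x and y; p = 0.5 is assumed.
        a = rng.binomial(1, 0.5, size=n)
    return x, y, a


def base_rates(y, a):
    """Fraction of positive labels in group a=1 and in group a=0."""
    return y[a == 1].mean(), y[a == 0].mean()


for biased in (False, True):
    x, y, a = make_synthetic(biased=biased)
    print("biased" if biased else "unbiased", base_rates(y, a))
```

Under these assumptions the unbiased variant has nearly equal base rates across the two groups, while the biased variant has clearly different ones, which is the property the trade-off analysis depends on.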

H.2 Optimization

To solve the optimization problems, we used the solvers implemented in the scipy package for Python [14]. For linear fairness constraints, we used the simplex algorithm [9], and for other constrained optimization forms, we used the sequential least-squares programming (SLSQP) solver [17, 18].
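For the linear case, a minimal sketch of such a solver call is shown below; it is not the authors' code. It maximizes accuracy over the fairness–confusion tensor subject to exact linear fairness constraints using `scipy.optimize.linprog`. The group counts, tensor ordering, and constraint matrices are hypothetical, and the sketch uses `method="highs"` rather than the simplex method cited above, because recent SciPy releases no longer provide the legacy simplex option.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical label counts (positives, negatives) for groups a=1 and a=0.
N1_POS, N1_NEG, N0_POS, N0_NEG = 30, 20, 10, 40
N1, N0 = N1_POS + N1_NEG, N0_POS + N0_NEG
N = N1 + N0

# Assumed ordering: z = (TP1, FN1, FP1, TN1, TP0, FN0, FP0, TN0) / N.
ACC = np.array([1.0, 0, 0, 1, 1, 0, 0, 1])
DP = N * np.array([[1 / N1, 0, 1 / N1, 0, -1 / N0, 0, -1 / N0, 0]])
EOD = N * np.array([
    [1 / N1_POS, 0, 0, 0, -1 / N0_POS, 0, 0, 0],   # TPR gap
    [0, 0, 1 / N1_NEG, 0, 0, 0, -1 / N0_NEG, 0],   # FPR gap
])
MARG = np.kron(np.eye(4), np.ones((1, 2)))          # label marginals
MARG_RHS = np.array([N1_POS, N1_NEG, N0_POS, N0_NEG]) / N


def max_accuracy(fairness_mats):
    """Best attainable accuracy when every row of each matrix must vanish."""
    A_eq = np.vstack([MARG] + list(fairness_mats))
    b_eq = np.concatenate([MARG_RHS, np.zeros(A_eq.shape[0] - 4)])
    # linprog minimizes, so negate the accuracy vector to maximize it.
    res = linprog(-ACC, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun


acc_none = max_accuracy([])
acc_dp = max_accuracy([DP])
acc_eod = max_accuracy([EOD])
acc_both = max_accuracy([DP, EOD])
print(acc_none, acc_dp, acc_eod, acc_both)
```

Because the objective and all constraints are linear in the tensor entries, each call is a small linear program that runs without training any model; the regularized and quadratic variants would instead go through a general-purpose solver such as SLSQP via `scipy.optimize.minimize`.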