Recovering from Biased Data: Can Fairness Constraints Improve Accuracy?

Multiple fairness constraints have been proposed in the literature, motivated by a range of concerns about how demographic groups might be treated unfairly by machine learning classifiers. In this work we consider a different motivation: learning from biased training data. We posit several ways in which training data may be biased, including having a more noisy or negatively biased labeling process on members of a disadvantaged group, or a decreased prevalence of positive or negative examples from the disadvantaged group, or both. Given such biased training data, Empirical Risk Minimization (ERM) may produce a classifier that not only is biased but also has suboptimal accuracy on the true data distribution. We examine the ability of fairness-constrained ERM to correct this problem. In particular, we find that the Equal Opportunity fairness constraint (Hardt, Price, and Srebro 2016) combined with ERM will provably recover the Bayes Optimal Classifier under a range of bias models. We also consider other recovery methods including reweighting the training data, Equalized Odds, and Demographic Parity. These theoretical results provide additional motivation for considering fairness interventions even if an actor cares primarily about accuracy.




1 Introduction

Machine learning (typically supervised learning) systems are automating decisions that affect individuals in sensitive and high-stakes domains such as credit scoring [scoredsociety] and bail assignment [machinebias, flores]. This trend toward greater automation of decisions has produced concerns that learned models may reflect and amplify existing social bias or disparities in the training data. Examples of possible bias in learning systems include the ProPublica investigation of COMPAS (an actuarial risk instrument) [machinebias], accuracy disparities in computer vision systems, and gender bias in word vectors.

In order to address observed disparities in learning systems, a significant body of work has developed around adding demographic constraints to the learning problem, constraints that encode criteria a fair classifier ought to satisfy.

Multiple constraints have been proposed in the literature [eodds, dwork12], each encoding a different type of unfairness one might be concerned about, and there has been substantial work on understanding their relationships to each other, including incompatibilities between the fairness requirements [costfairness, chouldechova, inherent, pleiss2017fairness].

In this work, we take a different angle on the question of fairness. Rather than argue whether or not these demographic constraints encode intrinsically desirable properties of a classifier, we instead consider their ability to help a learning algorithm to recover from biased training data and to produce a more accurate classifier.

In particular, adding a constraint (such as a fairness constraint) to an optimization problem (such as ERM) typically results in a lower-quality solution. However, if the objective being optimized is skewed (e.g., because the training data is corrupted or not drawn from the correct distribution), then such constraints can actually prevent the optimizer from being led astray, yielding a higher-quality solution when accuracy is measured on the true distribution.

More specifically, we consider a binary classification setting in which data points correspond to individuals, some of whom are members of an advantaged Group A while the rest are members of a disadvantaged Group B. We want to make a decision such as whether to offer a candidate a loan or admission to college. We have access to labeled training data consisting of pairs (x, y), where x is a set of features corresponding to an individual and y is the label we want to predict for new individuals.

The concern is that the training data is potentially biased against Group B, in that the training data systematically misrepresents the true distribution over features and labels in Group B, while the training data for Group A is drawn from the true distribution for Group A.

We consider several natural ways this might occur. One way is that members of the disadvantaged group might show up in the training data at a lower rate than their true prevalence in the population, and worse, this rate might depend on their true label.

For instance, if the positive examples of Group B appear at a much lower rate in the training data than the negative examples of Group B (which might occur for cultural reasons or due to other options available to them), then ERM might learn a rule that classifies all or most members of Group B as negative.

A second form of bias in the training data we consider is bias in the labeling process. Human labelers might have inherent biases causing some positive members of Group B in the training data to be mislabeled as negative, which again could cause unconstrained ERM to be more pessimistic than it should be. Alternatively, both processes might occur together.

We examine the ability of fairness constraints to help an ERM learning method recover from these problems.

1.1 Summary of Results

Our main result is that ERM subject to the Equal Opportunity fairness constraint [eodds] recovers the true Bayes Optimal hypothesis under a wide range of bias models, making it an attractive choice even for decision makers whose overall concern is purely about accuracy on the true data distribution.

In particular, we assume that under the true data distribution, the Bayes-Optimal classifiers h*_A and h*_B classify the same fraction of their respective populations as positive (we allow the classifiers to make decisions based on group membership, or alternatively assume we have sufficiently rich data to implicitly infer the group attribute), have the same error rate η on their respective populations, and that these errors are uniformly distributed.

However, during the training process we do not have access to the true distribution. We only have access to a distribution that is biased in a way that implicates the distinct social groups and causes the learned classifier to be overly pessimistic on individuals from Group B.

We prove that, subject to the above conditions on h*_A and h*_B, even with substantially corrupted training data, whether due to under-representation of positive examples from Group B, a substantial fraction of positive examples from Group B mislabeled as negative, or both, the Equality of Opportunity fairness constraint enables ERM to learn the Bayes-Optimal classifier, subject to a pair of inequalities ensuring that the labels are not too noisy and that Group A has large mass.

Expressed another way, this means that the lowest error classifier on the biased data satisfying Equality of Opportunity is the Bayes Optimal Classifier on the un-corrupted data.

Other related fairness notions such as Equalized Odds and Demographic Parity do not succeed in recovering the Bayes Optimal classifier under such broad conditions. These results provide additional motivation for considering fairness interventions, and in particular Equality of Opportunity, even if one cares primarily about accuracy.

Our results are in the infinite-sample limit, and we suppress issues of sample complexity in order to focus on the core phenomenon of the data source being unreliable. (Our notion of sample complexity is the standard one: given sufficiently many samples from the biased training distribution, empirical error estimates are accurate with high probability.)

1.2 Related Work

This paper is directly motivated by a model of implicit bias in ranking [implicit]. In that paper, the training data for a hiring process is systematically corrupted against minority candidates and a method to correct this bias increases both the quality of the accepted candidate and the fraction of hired minority candidates. However, that fairness intervention, the Rooney Rule, does not immediately translate to a general learning setting.

Our results avoid triggering the known impossibility results between high accuracy and satisfying fairness criteria [chouldechova, inherent] by assuming we have equal base rates across groups. This assumption may not be realistic in all settings; however, there are settings where bias concerns arise and there is empirical evidence that base rates are equivalent across the relevant demographic groups, e.g., highly differential arrest rates for some alleged crimes that have similar occurrence rates across groups [predictandserve, dirtydata].

Within the fairness literature there are several approaches similar to ours. In particular, our concern with positive examples not appearing in the training data is similar in effect to a selective labels problem [kleinbergselective]. [chouldechovaselectivelabels] uses data augmentation to experimentally improve generalization under selective label bias.

[impossibility, discriminative] also consider the gap between the training and test data distributions that arises in our model, and posit differing interpretations of fairness constraints under different worldviews. While we do not explicitly use the terminology of these papers, we believe our view of the gap between the true distribution and the training-time distribution is aligned with Friedler et al.'s concept of the gap between the construct space and the observed space.

Our second bias model, Labeling Bias, is similar to [labelbias]. In that paper, a biased labeler makes poor decisions on the disadvantaged group, and the authors intervene with a reweighting technique more complex than our Re-Weighting intervention. However, that paper does not consider the interaction of biased labels with different groups appearing in the data at different rates as a function of their labels.

2 Model

In this section we describe our learning model, how bias enters the data-set, and the fairness interventions we consider.

We assume the data lies in some instance space X, such as R^d. There are two demographic groups in the population, Group A and Group B, with fixed proportions of the population belonging to each group. We write x_g for an individual x in demographic group g. Group B is the disadvantaged group that suffers the effects of the bias model.

Assume there is a special coordinate of the feature vector that denotes group membership. The data distribution is a pair of distributions (D_A, D_B), with D_A determining how the features of Group A are distributed and D_B determining how the features of Group B are distributed.

2.1 True Label Generation:

Now we describe how the true labels for individuals are generated. Assume there exists a pair of Bayes-Optimal classifiers h*_A and h*_B, with h*_A, h*_B ∈ H for some hypothesis class H.

We assume that the Bayes-Optimal classifier for Group B may be different from the Bayes-Optimal classifier for Group A. If h*_A were also optimal for Group B, then we could just learn h*_A for both groups using data only from Group A, and biased-data concerns would fade away. Thus we are learning a pair of classifiers, one for each demographic group.

When generating samples, we first draw a data-point x: with probability equal to Group A's share of the population, x is drawn from Group A's feature distribution (and thus x belongs to Group A); otherwise x is drawn from Group B's feature distribution (so x belongs to Group B).

Once we have drawn a data-point x, we model the true labels as being produced as follows: evaluate h*(x), using the classifier h* corresponding to the demographic group of x, and tentatively assign the label h*(x). However, we assume that h* is not perfect: independently with probability η, the true label of x does not correspond to the prediction h*(x).

The labels after this flipping process are the true labels of the training data. (This label model is equivalent to the Random Classification Noise model [angluin]; the key interpretive difference is that in RCN the prediction h*(x) is the correct label and the flipped labels are noise, whereas in our case the flipped labels are the true labels and h* is merely the Bayes-Optimal classifier given the observable features.) We assume that η < 1/2. This, combined with the assumption that η is the same for the classifiers of both groups, implies that the two groups have equal base rates (fraction of positive examples).
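The label-generation process above can be sketched as follows. This is a minimal Python sketch, not code from the paper: the per-group feature sampler `draw_x`, the threshold classifiers, and all parameter values in the usage below are hypothetical.

```python
import random

def draw_labeled_example(p_B, eta, h_A, h_B, draw_x, rng):
    """One (x, group, y) draw under the Section 2.1 label model:
    pick a group, sample features, apply that group's Bayes-optimal
    classifier, then flip the label independently with probability eta."""
    group = 'B' if rng.random() < p_B else 'A'
    x = draw_x(group, rng)
    y = (h_B if group == 'B' else h_A)(x)
    if rng.random() < eta:  # uniform label noise, eta < 1/2
        y = 1 - y
    return x, group, y
```

With `eta = 0` the draws reproduce the Bayes-Optimal predictions exactly; for `eta > 0` a point's true label disagrees with h* an η fraction of the time, uniformly over the input space.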

We denote this label model by the triple (h*_A, h*_B, η), where h*_A, h*_B ∈ H and H is some hypothesis class with finite VC dimension.

2.2 Biased Training Data

Now we consider how bias enters the data-set. Consider the example of hiring where the main failure mode will be a classifier that is too negative on the disadvantaged group. We explore several different bias models to capture potential ways the data-set could become biased.

The first bias model we call Under-Representation Bias. In this model, the positive examples from Group B are under-represented in the training data.

Specifically, the biased training data is drawn as follows:

  1. Examples are sampled i.i.d. from the true distribution over individuals described above.

  2. The label for each example is generated according to the label process from Section 2.1 with hypotheses h*_A, h*_B and noise rate η.

  3. For each pair (x, y), if x belongs to Group B and y = 1, then the data-point is discarded from our training set independently with probability 1 − β.

Thus we see fewer positive examples from Group B in our training data: β is the probability that a positive example from Group B stays in the training data, and smaller β means more severe under-representation.
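The three steps above amount to a filter over already-labeled draws. The following Python sketch uses a hypothetical (x, group, y) triple encoding of our own; `beta` is the keep-probability from the text.

```python
import random

def apply_under_representation(data, beta, rng):
    """Discard each positive Group-B example independently with
    probability 1 - beta; all other points are kept unchanged.
    data: iterable of (x, group, y) triples with y in {0, 1}."""
    return [(x, g, y) for x, g, y in data
            if not (g == 'B' and y == 1 and rng.random() >= beta)]
```

Setting `beta = 1` leaves the sample untouched, while `beta = 0` removes every positive Group-B point, the extreme case in which no learner could identify the positive region of h*_B.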

If η = 0, then the positive and negative examples of Group B occupy disjoint regions of the input space, so if we draw sufficiently many examples, with high probability we will see enough positive examples in the positive region of h*_B to find a low-empirical-error classifier equivalent to h*_B. (We would learn h*_B with ERM and uniform convergence, using the fact that H has finite VC dimension.)

In contrast, for non-zero η, our label model interacting with the bias model can induce a problematic phenomenon that fools the ERM classifier. For non-zero η there is error even for the Bayes-Optimal classifier, and thus in the positive region of the classifier there are positive examples mixed with negative examples; the fraction of negative examples is amplified by the bias process.

If β is sufficiently small, there could in fact be more negative examples of Group B than positive examples in the positive region of h*_B. If this occurs, then the bias model snaps the hypothesis that is optimal for unconstrained ERM (on the biased data) to classifying all individuals from Group B as negative. This can be observed in Figure 1.

Figure 1: (a) Un-Corrupted Data; (b) Corrupted Data under Under-Representation Bias. The schematic on the left displays data points with the Bayes-Optimal classifier drawn as a hyperplane. The schematic on the right displays data drawn from the same distribution subject to Under-Representation Bias with small β: now there are more negative examples than positive examples above the hyperplane, so the lowest-error hypothesis classifies all examples on the right as negative.

Under-Representation Bias is related to the selective labels problem of [kleinbergselective], since we are learning on a filtered distribution where the filtering process is correlated with the group label. Our model is functionally equivalent to over-representing the negatives of Group B in the training data, an empirical phenomenon observed in [dirtydata].

2.3 Alternative Bias Model: Labeling Bias

We now consider a bias model that captures the notion of implicit bias, which we call Labeling Bias. In particular, a possible source of bias in machine learning is the label generating process, especially in applications where the sensitive attribute can be inferred by the labeler, consciously or unconsciously. For example, training data for an automated resume scoring system could be based upon the historical scores of resumes created by a biased hiring manager or a committee of experts. This source of labels could then systematically score individuals from Group B as having lower resume scores, an observation noted in randomized real-world investigations [bertrand2004emily].

Formally, the labeling bias model is:

  1. Examples are sampled i.i.d. from the true distribution over individuals described above.

  2. The labels for each example are generated according to the label process from Section 2.1 with hypotheses h*_A, h*_B and noise rate η.

  3. For each pair (x, y), if x belongs to Group B and y = 1, then independently with probability ν, the label of this point is flipped to negative.

This process is one-sided: true positives become apparent negatives in the biased training data, so apparent negatives become over-represented. We are making a conceptual distinction that the true labels (Step 2) are those generated by the original label model; the examples flipped by the bias process (Step 3) are not really negative, they are just mislabeled.

As ν increases, more and more of the individuals in the minority group appear negative in the training data. Once the number of apparent positive samples above the decision surface of h*_B is smaller than the number of apparent negative samples there, the optimal unconstrained classifier (according to the biased data) simply classifies all those points as negative.
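The flip-to-all-negative threshold described here can be made concrete with a small calculation. Under the assumption (our derivation, following the uniform-error setup of Section 1.1, not a formula quoted from the paper) that the region h*_B labels positive contains a (1 − η) fraction of positives and an η fraction of negatives before bias, the apparent class weights after one-sided flips are:

```python
def erm_prefers_all_negative(eta, nu):
    """Per unit mass in the region h*_B labels positive, apparent positives
    carry weight (1 - eta) * (1 - nu) after one-sided flips, while apparent
    negatives carry eta + (1 - eta) * nu.  Unconstrained ERM restricted to
    Group B prefers the all-negative rule once apparent negatives dominate.
    (Derivation ours, under the uniform-error assumption.)"""
    apparent_pos = (1 - eta) * (1 - nu)
    apparent_neg = eta + (1 - eta) * nu
    return apparent_neg > apparent_pos
```

For example, at a small noise rate the all-negative rule only wins once ν is large, but as η grows toward 1/2 even modest labeling bias tips the balance.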

2.4 Under-Representation Bias and Labeling Bias

We now consider a more general model that combines Under-Representation Bias and Labeling Bias, and moreover we allow either positives or negatives of Group B (or both) to be under-represented. Specifically, we now have three parameters: β_pos, β_neg, and ν. Given examples drawn from the true distribution, we discard each positive example of Group B with probability 1 − β_pos and discard each negative example of Group B with probability 1 − β_neg to model the Under-Representation Bias. Next, each remaining positive example of Group B is mislabeled as negative with probability ν to model the Labeling Bias. Note that the under-representation comes first: β_pos and β_neg represent the probabilities of true positive and true negative examples from Group B staying in the data-set, respectively, regardless of whether they are subsequently mislabeled by the agent's labelers.
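Combining the two mechanisms in the stated order gives the following Python sketch. The parameter names `beta_pos`, `beta_neg`, `nu` mirror the three parameters of the text, while the (x, group, y) triple encoding is our own hypothetical convention.

```python
import random

def apply_combined_bias(data, beta_pos, beta_neg, nu, rng):
    """Under-representation first (keep positive / negative Group-B points
    with probability beta_pos / beta_neg), then one-sided Labeling Bias
    (flip each surviving positive Group-B label with probability nu)."""
    out = []
    for x, g, y in data:
        if g == 'B':
            if rng.random() >= (beta_pos if y == 1 else beta_neg):
                continue  # filtered out of the training set
            if y == 1 and rng.random() < nu:
                y = 0  # mislabeled by the biased labeler
        out.append((x, g, y))
    return out
```

The identity parameters (1, 1, 0) reproduce the unbiased sample; setting only `nu` recovers pure Labeling Bias, and setting only the `beta` parameters recovers pure Under-Representation Bias.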

2.5 Fairness Interventions

Now we introduce several fairness interventions and define a notion of successful recovery from the biased training distribution.

We consider multiple fairness constraints to examine whether the criteria have different behavior in different bias regimes. The fairness constraints we focus on are Equal Opportunity, Equalized Odds, and Demographic Parity.

Definition 2.1.

Classifier h satisfies Equal Opportunity on data distribution D [eodds] if

Pr[h(x) = 1 | y = 1, x ∈ Group A] = Pr[h(x) = 1 | y = 1, x ∈ Group B].

This requires that the true positive rate in Group A is the same as the true positive rate in Group B.

Equalized Odds is a similar notion, also introduced in [eodds]. In addition to equal true positive rates, Equalized Odds requires that the false positive rates are equal across the two groups. Equivalently, Equalized Odds requires that the prediction h(x) be independent of the sensitive attribute, conditioned on the true label y. We also consider Demographic Parity, which requires Pr[h(x) = 1 | x ∈ Group A] = Pr[h(x) = 1 | x ∈ Group B] [dwork12].
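All three criteria can be checked empirically on a labeled sample by comparing per-group rates; a gap of zero means the criterion holds exactly. The Python sketch below uses a hypothetical (x, group, y) data encoding of our own, and lets the classifier condition on group membership, as the model allows.

```python
def group_rates(data, clf):
    """Per-group (TPR, FPR, positive-prediction rate) for a classifier.
    data: list of (x, group, y) triples; clf(x, group) -> {0, 1}."""
    out = {}
    for g in ('A', 'B'):
        pts = [(y, clf(x, grp)) for x, grp, y in data if grp == g]
        tpr = sum(p for y, p in pts if y == 1) / sum(1 for y, _ in pts if y == 1)
        fpr = sum(p for y, p in pts if y == 0) / sum(1 for y, _ in pts if y == 0)
        ppr = sum(p for _, p in pts) / len(pts)
        out[g] = (tpr, fpr, ppr)
    return out

def fairness_gaps(data, clf):
    """(Equal Opportunity gap, additional Equalized Odds gap, Demographic
    Parity gap); a criterion is satisfied when its gap is zero."""
    r = group_rates(data, clf)
    return tuple(abs(r['A'][i] - r['B'][i]) for i in range(3))
```

Note that a classifier can satisfy Demographic Parity (equal positive-prediction rates) while violating Equal Opportunity and Equalized Odds, which is exactly the separation the recovery results below exploit.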

For each of these criteria, the overall training procedure solves a constrained ERM problem. (We do not consider methods for efficiently solving the constrained ERM problem.)

We also consider data Re-Weighting, where we change the training data distribution to correct for the bias process and then run ERM on the new distribution. The overall gist of how the training data becomes biased in our models is that the positive samples from Group B are under-represented in the training data, so we can intervene by up-weighting the observed positives in the Group B training data until the fraction of positives in Group B matches the fraction of positives in the Group A training data.

In the training process we only have access to samples from the training distribution and thus when using a fairness criterion to select among models we check the requirement on the biased training data.

Observe that in our model of label generation, the Bayes-Optimal classifier on the true distribution is the pair h*_A, h*_B used to generate the labels initially, regardless of the values of the bias parameters. Thus our goal for the learning process is to recover the original optimal classifiers, subject to training data from a range of bias models and the true label process with noise rate η. A more effective learning method recovers the optimal classifiers for a wider range of the model parameters (the parameters that characterize the bias process and the true label process). Accordingly, we define Strong-Recovery:

Definition 2.2.

A fairness intervention in a given bias model satisfies Strong-Recovery if, when given data corrupted by that bias model, the training procedure recovers the Bayes-Optimal classifiers h*_A, h*_B, given sufficient samples, for all admissible values of the noise rate, the group proportions, and the bias parameters.

3 Recovery Behavior Across Bias Models

There are two failure modes for learning a fairness constrained classifier that we will need to be concerned with. First, the Bayes Optimal Hypothesis may not satisfy the fairness constraint evaluated on the biased data. Second, within the set of hypotheses satisfying the fairness constraint, another hypothesis (with higher error on the true distribution) may have lower error than the Bayes Optimal Classifier on the biased data. We now describe how the multiple fairness interventions provably avoid or fail to avoid these pitfalls in increasingly complex bias models. We defer formal proofs to Section 4.

3.1 Under-Representation Bias

Equal Opportunity and Equalized Odds both perform well in this bias model and avoid both failure modes, subject to an identical constraint on the bias and demographic parameters.

First, from the definition of the Under-Representation Bias model, observe that h*_A, h*_B satisfies both fairness notions on the biased data, so the first failure mode does not occur.

Second, Equal Opportunity intuitively prevents the failure mode in which a hypothesis is produced that appears better than h*_B on the biased data, such as classifying all examples from Group B as negative, by forcing the two classifiers to classify the same fraction of positive examples as positive. So, if we classify all the examples from Group B as negative, we have to do the same with Group A, inducing large error on the training data from the majority Group A. In particular, so long as the fraction of total data from Group B is not too large and η is not too close to 1/2, this is not a worthwhile trade-off for ERM (classifying all samples as negative does not have lower perceived error on the biased data than h*), and so it will not produce this outcome.

A formal proof of correctness is given in Section 4.1. Specifically, we prove that Equal Opportunity strongly recovers from Under-Representation Bias so long as an inequality relating the bias, noise, and demographic parameters holds; the inequality is satisfied, in particular, whenever Group B's share of the population is sufficiently small. Equalized Odds also recovers in this bias model under the same conditions as Equal Opportunity.

Figure 2: This figure indicates the parameter region in which Equal Opportunity-constrained ERM recovers h*_B under the Under-Representation Bias model, and is a visualization of Equation 2. We label each parameter pair blue if it satisfies the inequality (so h*_B is recovered) and red otherwise. The plot shows that a smaller disadvantaged-group proportion allows recovery from a smaller β. The dashed black line indicates the boundary between recovering and failing to recover h*_B.

In contrast, Demographic Parity fails to recover even when the noise rate is zero. Under Under-Representation Bias, the Bayes-Optimal classifier does not satisfy Demographic Parity on the biased data: discarding positives from Group B lowers the fraction of Group B samples labelled positive below the corresponding fraction for Group A.

Similarly, as β decreases, in order to match the fraction of positive classifications made by h*_A, the classifier for Group B is forced to classify a larger region of the input space as positive than h*_B would in the absence of biased data, and so we do not recover h*_B.

Another way to intervene in the Under-Representation Bias model is simply to re-weight the training data to account for the under-sampling of positives from Group B. If we know positives from Group B are under-represented, we can change our objective function by re-scaling each indicator so that minimizing the sum of indicators measures the loss on the true distribution rather than the loss on the biased training distribution.

Concretely, each indicator for a positive example from Group B is scaled up by a factor of 1/β, and we use this new indicator in the objective function. The new loss function is an unbiased estimator of the true risk, so uniform convergence of this estimator suffices to learn h*_B. We can infer the value of β from the data for Group A if we know the data from Group B is corrupted by this bias model. One concern with re-weighting in general is that the functional form of the correction is tied to the exact bias model.
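The re-weighted objective can be sketched as follows. This is a Python sketch, and the normalization by total weight is our own simplification of the estimator described above (the text's estimator normalizes by the original sample size).

```python
def reweighted_risk(data, clf, beta):
    """Empirical risk with each surviving positive Group-B example
    up-weighted by 1 / beta, so that it stands in for the 1 / beta
    true-distribution draws it represents.
    data: (x, group, y) triples; clf(x, group) -> {0, 1}."""
    err = tot = 0.0
    for x, g, y in data:
        w = 1.0 / beta if (g == 'B' and y == 1) else 1.0
        tot += w
        err += w * (clf(x, g) != y)
    return err / tot
```

With `beta = 1` this reduces to the ordinary empirical error; with `beta < 1` each surviving positive Group-B point counts more, restoring the weight the discarded points would have contributed.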

In summary, for the Under-Representation Bias model, the fairness interventions Equalized Odds, Equal Opportunity, and Re-Weighting recover h*_B under a range of parameters. However, Demographic Parity is inadequate even for mild bias and does not recover h*_B for non-vacuous bias parameters.

3.2 Labeling Bias

In Section 4, we prove that Equal Opportunity constrained ERM on data biased by the Labeling Bias model also finds the Bayes-Optimal Classifier, under similar parameter conditions to the previous bias model.

Interestingly, in contrast to Under-Representation Bias, Labeling Bias cannot be corrected by Equalized Odds. The problem is the first failure mode. For example, consider ν > 0 with η = 0 and no under-representation. The Bayes-Optimal classifier for Group A then has a false positive rate of 0 and a true positive rate of 1. However, since a ν fraction of Group B's positives appear negative inside the positive region of h*_B, there is no classifier for Group B that achieves both of these rates simultaneously: the only way to classify those apparently negative individuals in the positive region as negative is for the classifier to decrease its true positive rate below 1. Therefore, Equalized Odds rules out the use of h*_B. This violation holds for η > 0 as well.

In contrast, h*_B does satisfy Equal Opportunity on the biased data, and given the conditions in Theorem 4.1, it is the lowest-error such classifier on the biased data.

Demographic Parity experiences similar limitations as in the Under-Representation Bias model.

The Re-Weighting intervention changes the weighting of observed positives in the training data for Group B so that we have the same fraction of positives in Group B as in Group A. Let p_A denote the fraction of positive individuals in Group A and p_B the observed fraction of positives in Group B in the biased data; 1 − p_A and 1 − p_B then refer to the corresponding fractions of negative individuals in Group A and Group B in the biased data.

We need a re-weighting factor w on the observed positives of Group B such that the re-weighted fraction of positives in Group B equals the fraction of positives in Group A:

We prove in Section 4.2 that this correction factor leads to the positive region of h*_B having a higher weight of positive examples than negative examples, and simultaneously the negative region of h*_B having a higher weight of negative examples than positive examples. This causes ERM to learn the optimal hypothesis h*_B. We can infer the value of w by comparing the fraction of positives in Group A with that in Group B.
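Since the displayed matching condition does not survive in this rendering, the following sketch reconstructs the algebra implied by the surrounding text: choose w so that the re-weighted positive fraction of Group B equals Group A's positive fraction. This algebra is our reconstruction, not a formula quoted from the paper.

```python
def reweight_factor(pos_frac_A, pos_frac_B_obs):
    """Weight w on the observed positives of Group B so the re-weighted
    positive fraction in Group B matches Group A's:
        w*pB / (w*pB + (1 - pB)) = pA   =>   w = pA*(1 - pB) / (pB*(1 - pA)).
    (Our reconstruction of the matching condition described in the text.)"""
    pA, pB = pos_frac_A, pos_frac_B_obs
    return pA * (1 - pB) / (pB * (1 - pA))
```

When the observed fractions already match, the factor is 1 and re-weighting is a no-op; the more labeling bias depresses Group B's observed positive fraction, the larger the factor becomes.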

In summary, Equal Opportunity and the Re-Weighting Interventions recover well in this bias model (Labeling Bias) while Equalized Odds and Demographic Parity are inadequate.

3.3 Under-Representation Bias and Labeling Bias

In this most general model, which combines the two previous models, Re-Weighting the data is no longer sufficient to recover the true classifier. For example, consider bias parameters for which the discarded negatives exactly offset the positives that appear negative after flipping. Starting from some number of Group B points, the surviving data can then contain, in expectation, the same overall fraction of apparent positives as the true distribution, even though on the positive side of h*_B the correctly labelled positives sit alongside an equal number of apparently negative samples.

The Re-Weighting intervention then does nothing in expectation, because the overall fractions are still correct. ERM is now indifferent between h*_B and labeling all samples from Group B as negative, and if we perturb the bias parameters slightly in the adversarial direction, in expectation ERM strictly prefers labeling all the samples negatively.
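The paper's concrete numbers for this example do not survive in this rendering; the following sketch uses hypothetical parameters of our own choosing (η = 0, positive-region mass q = 1/4, β_pos = 1, β_neg = 1/3, ν = 1/2) that exhibit the same phenomenon: the overall apparent positive fraction is unchanged, so Re-Weighting is inert, while the positive region is split evenly and ERM is indifferent.

```python
from fractions import Fraction as F

def biased_composition(n_B, q, beta_pos, beta_neg, nu):
    """Expected Group-B composition after combined bias, specialized to
    eta = 0 so the positive region of h*_B (mass q) holds all true positives.
    Returns (apparent positives, flipped 'negatives' in the positive region,
    surviving true negatives)."""
    kept_pos = n_B * q * beta_pos
    return kept_pos * (1 - nu), kept_pos * nu, n_B * (1 - q) * beta_neg
```

Exact rational arithmetic via `fractions` makes the cancellation visible: starting from 1200 points, 150 apparent positives face 150 flipped "negatives" in the positive region, and 150 out of 600 surviving points are positive, matching the true base rate of 1/4.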

While the Re-Weighting method fails, we prove that Equal Opportunity-constrained ERM recovers the Bayes-Optimal classifier as long as a condition holds ensuring that Group A has sufficient mass and the signal is not too noisy. As with the previous model, Demographic Parity and Equalized Odds are not satisfied by h*_B even on minimally biased data, and so they will not recover the Bayes-Optimal classifier.

4 Our Main Results

We now present our main theorem formally. Define the biased error of a classifier as its error rate computed on the biased distribution.

Theorem 4.1.

Assume true labels are generated by the pair h*_A, h*_B and then corrupted by both Under-Representation Bias and Labeling Bias with parameters β_pos, β_neg, and ν, and assume that the following pair of inequalities (Condition 4.1), requiring that the labels are not too noisy and that Group A has sufficient mass, holds:


Then h*_A, h*_B is the lowest-biased-error classifier satisfying Equality of Opportunity on the biased training distribution, and thus is recovered by Equal Opportunity-constrained ERM.

The parameter ranges are as defined in Section 2, and Condition 4.1 refers to the displayed pair of inequalities.

This case contains our other results as special cases, and in the next section we prove our main theorem in this bias model. Note that if the first inequality of Condition 4.1 is not satisfied, then the all-negative hypothesis has the lowest biased error among hypotheses satisfying Equal Opportunity on the biased training distribution. Similarly, if the second inequality is not satisfied, then the all-positive hypothesis has the lowest biased error among such hypotheses. Thus Theorem 4.1 is tight.

To give a feel for the formulas in Theorem 4.1, note that the case of a small disadvantaged group is good for our intervention, because the advantaged Group A is then large enough to pull the classification of the disadvantaged Group B in the right direction. For example, when Group B's share of the population is small enough, the bounds are satisfied for a wide range of noise rates η, for any under-representation biases β_pos, β_neg and any labeling bias ν.

Thus, Equal Opportunity strongly recovers in the combined Under-Representation and Labeling Bias model.

Table 1 summarizes the results for the three core interventions and the three core bias models. Demographic Parity is omitted from the table since it cannot recover under the bias models and thus is inadequate. The contents of each cell indicate whether recovery is possible in a bias model with an intervention, and what conditions need to be satisfied for recovery.

Intervention          | Under-Representation | Labeling Bias | Both
Equal Opportunity-ERM | Yes                  | Yes           | Yes, under Condition 4.1
Equalized Odds        | Yes                  | No            | No
Re-weighting Class B  | Yes                  | Yes           | No
Table 1: Summary of recovery behavior of multiple fairness interventions in bias models.

4.1 Proof of Main Theorem

In this section we present the proof of the main result, Theorem 4.1. We want to show that, given Condition 4.1, the lowest-biased-error classifier satisfying Equal Opportunity on the biased data is h*_A, h*_B.

The first step of the proof is to show that h*_A, h*_B satisfies Equal Opportunity on the biased training data. Note that the lemmas and claims here are all stated in the combined Under-Representation and Labeling Bias model, the most general bias model.

Lemma 4.2.

satisfies Equal Opportunity on the biased data distribution.


First, let’s consider the easiest case with , , and . Recall that is the pair of classifiers used to generate the labels. When , is a perfect classifier for both groups so Equal Opportunity is trivially satisfied. Now, let’s consider arbitrary . Recall that .

By our assumption that Group A and Group B have equal values of and we have

Next consider when we have both Under-Representation Bias and Labeling Bias. Recall that is the probability that a positive or negative sample from Group is not filtered out of the training data while is the probability a positive label is flipped and this flipping occurs after the filtering process. Then,

so Equal Opportunity is still satisfied.

In words, the bias model removes or flips positive points from Group independent of their location relative to the optimal hypothesis. Thus positive points throughout the input space are equally likely to be removed, so the overall probability of true positives being classified as positive is unchanged. ∎
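The location-independence argument can be checked with a quick Monte Carlo sketch. The thresholds, β = 0.5, and ν = 0.3 below are illustrative choices, not values from the paper: for any fixed classifier, the true positive rate measured on the biased sample matches the true positive rate on the clean distribution, so classifiers with equal true TPRs still satisfy Equal Opportunity after the bias is applied.

```python
import random

random.seed(0)

# Illustrative 1-D stand-in: x uniform on [0,1], true label positive iff
# x > 0.3, and some candidate classifier h. beta is the probability a
# positive example survives filtering and nu the probability its label
# is flipped, both independent of x, as in the bias model above.
points = [(x, int(x > 0.3)) for x in (random.random() for _ in range(200000))]
h = lambda x: int(x > 0.5)

def tpr_true(h, points):
    pos = [(x, y) for x, y in points if y == 1]
    return sum(h(x) for x, _ in pos) / len(pos)

def tpr_biased(h, points, beta, nu):
    """TPR of h on the biased sample: each true positive survives
    filtering with prob beta and keeps its label with prob 1 - nu."""
    tp = pos = 0
    for x, y in points:
        if y == 1 and random.random() < beta and random.random() >= nu:
            pos += 1
            tp += h(x)
    return tp / pos

print(round(tpr_true(h, points), 3), round(tpr_biased(h, points, 0.5, 0.3), 3))
```

The two printed rates agree up to sampling noise, for any choice of `beta` and `nu`, because the surviving positives are a uniform subsample of the true positives.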

Now we describe how a candidate classifier differs from . We can describe the difference between the classifiers by noting the regions in the input space that each classifier gives a specific label. This gives rise to four regions of interest with probability mass as follows:

These probabilities are measured with reference to the regions of the input space before the bias process is applied. and are functions of to make explicit that there may be multiple hypotheses with different functional forms that could allocate the same amount of probability mass to the parts of the input space where and agree on labeling as positive and negative respectively.

The partition of probability mass into these regions is easiest to visualize for hyperplanes but holds for other hypothesis classes as well. and are defined similarly with respect to and . A schematic with hyperplanes is given in Figure 3.

Figure 3: Differences between and measured with probabilities in the true data distribution (before the effects of the bias model).
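For intuition, the four-region decomposition is easy to compute in a one-dimensional threshold example, an illustrative stand-in for the hyperplane schematic; the thresholds below are made up:

```python
import random

def region_masses(t_star, t, n=100000):
    """Monte Carlo masses of the four regions on which two threshold
    classifiers h*(x) = 1[x > t_star] and h(x) = 1[x > t] agree or
    disagree, for x uniform on [0,1]."""
    random.seed(1)
    counts = {"both_pos": 0, "both_neg": 0, "h_pos_only": 0, "h_neg_only": 0}
    for _ in range(n):
        x = random.random()
        star, cand = x > t_star, x > t
        if star and cand:
            counts["both_pos"] += 1
        elif not star and not cand:
            counts["both_neg"] += 1
        elif cand:
            counts["h_pos_only"] += 1   # h says positive, h* says negative
        else:
            counts["h_neg_only"] += 1   # h says negative, h* says positive
    return {k: v / n for k, v in counts.items()}

print(region_masses(t_star=0.5, t=0.3))
```

With `t = 0.3 < t_star = 0.5`, the candidate disagrees with the optimal threshold only on the strip between them, so one of the two disagreement regions has mass zero, mirroring how the region masses pin down the difference between the two hypotheses.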

To show that has the lowest error on the true distribution, we first show how, given any pair of classifiers and which jointly satisfy Equal Opportunity on the biased distribution, we can transform into a pair of classifiers still satisfying Equal Opportunity with at most one non-zero parameter from , and at most one non-zero parameter from , while not increasing biased error.

The final step of our proof argues that out of the family of all hypotheses with (1) at most one non-zero parameter for the hypothesis on Group , (2) at most one non-zero parameter for the hypothesis on Group , and (3) jointly satisfying Equal Opportunity on the biased data, has the lowest biased error.

These steps combined imply that is the lowest biased error hypothesis that satisfies Equal Opportunity.

Lemma 4.3.

Given a pair of classifiers and which satisfy Equal Opportunity on the biased data we can find a pair of classifiers and satisfying

  1. At most one of is non-zero and at most one of is non-zero.

  2. has error at most that of on the biased distribution.

  3. and satisfy Equal Opportunity.


We want to exhibit a pair of classifiers with lower biased error that zeros out one of the parameters. We do this by modifying each classifier separately, while keeping the true positive rate on the biased data fixed to ensure we satisfy Equal Opportunity.

First, consider Group and suppose that , since otherwise we do not need to modify . We hold the true positive rate of constant and shrink towards zero. As we shrink , we must also shrink towards zero in order to hold the true positive rate fixed (and thus satisfy Equal Opportunity).

The un-normalized True Positive Rate (constrained by Equal Opportunity) is . (The normalization factor for these rates is the same for Group and Group , so it cancels.) Since the term is independent of the classifier , keeping the true positive rate constant is equivalent to keeping constant.

Let be the amount we wish to shrink and let be the amount we must shrink to keep fixed. Then,

We continue shrinking these parameters until either or hits zero. To see that this can always be done, note that for . Which term hits zero first depends on the sign of .

Observe that for Group this process clearly reduces training error, since we are decreasing both and and the error on Group is monotone increasing (and linear) with respect to .

We then separately apply this same shrinking process to Group . Now we show that the biased error decreases for Group .

Thus the overall biased error change for Group is the sum of the two above terms,

The first two terms vanish because of .

Since this term is negative, we have shown that this modification process decreases error on the biased training data for both Group and Group while keeping the true positive rate fixed. ∎

Lemma 4.4.

If we have two classifiers each with one non-zero parameter and the classifiers satisfy the Equal Opportunity constraint, then and .


Recall that the Equal Opportunity constraint requires that these expressions be equal.

Then the lemma follows from inspecting the second equality. ∎

This lemma makes explicit that when the classifiers each have only one non-zero parameter and satisfy Equal Opportunity, then the non-zero parameter corresponds to the same region.

Lemma 4.5.

Of hypotheses satisfying ( and ) or ( and ), if these inequalities hold:

then the lowest biased error classifier satisfying Equal Opportunity on the biased data is .


First, we sketch the proof informally. Consider three cases which depend on how the bias process affects the unconstrained optimum for Group on the biased data. In the first case, in the biased data distribution, the region has more positive than negative samples in expectation and the region has more negative than positive samples in expectation. In the second case, there are more positive than negative samples throughout the entire input space in the biased distribution. In the third and final case, there are more negative than positive samples throughout the input space in the biased distribution.

In these three cases, the optimal hypothesis is exactly one of , respectively. The latter two hypotheses label all inputs as positive and all inputs as negative, respectively. These three hypotheses correspond to hypotheses with at most one non-zero parameter.

For instance, occurs when and . Each of the three hypotheses occur when the one non-zero parameter attains a location on the boundary of its range of values. When is allowed to be non-zero, if instead (and thus it also must be that ), the hypothesis is equivalent to . A similar relationship holds for and .

To prove the lemma, we show that if has lower biased error than and on the biased data distribution, then has the lowest error among all hypotheses with at most one non-zero parameter satisfying Equal Opportunity.

To see this, consider and with the same non-zero parameter equal to . Then the error of is a linear function of . Similarly, the error of is a linear function of . The overall error of is a weighted combination of the error of and the error of or , and is therefore linear in . The optimal hypothesis parametrized by must then occur on the boundary of the range of , so the optimal hypothesis is one of . We then show that the inequalities we assume in the theorem enforce that has strictly lower error than or . Formally, we enumerate the possible events:

Type   Sign of True Label   Sign of Label in Biased Data   Un-Normalized Probability of Event
A + +
A + -
A - +
A - -
B + +
B + -
B - +
B - -

The probabilities in the far right-hand column are not normalized. First we show that . and , thus if and only if or



Now we consider compared to . Then if and only if .



Thus we have shown that the error of is less than the error of and if and only if both Lines 5 and 6 hold, which we assume in the lemma.

Now we show that the error of is linear in . There are two cases, depending on which parameter of is non-zero.

Let be a hypothesis such that and and .

In the other case, let and and .

Thus the error of is linear in and boundary values for correspond to the hypotheses in . These two arguments show that:

  1. The error of any single-parameter hypothesis is a weighted sum of ( and ) or a weighted sum of ( and ), and so is linear in . The boundary values of correspond to .

  2. Since the optimal value of a linear function occurs on the boundaries of its range, the optimal Equal Opportunity classifier with at most one non-zero parameter is one of .

  3. The inequalities in the theorem statement enforce that has lower biased error than either or , so has the lowest biased error of any single parameter hypothesis satisfying Equal Opportunity.

If the conditions in the lemma do not hold, then will not have lower error than both and . ∎
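The boundary argument above can be sketched numerically: a linear error function of the single free parameter attains its minimum at an endpoint of the parameter's range. The coefficients and range in this sketch are illustrative, not values from the analysis:

```python
def argmin_linear(a, b, d_max, steps=1000):
    """Minimize the linear error model error(d) = a*d + b over a grid
    on [0, d_max]; the minimizer always lands on an endpoint."""
    grid = [i * d_max / steps for i in range(steps + 1)]
    return min(grid, key=lambda d: a * d + b)

# Positive slope: the optimum is the zero-parameter hypothesis (d = 0).
assert argmin_linear(a=0.4, b=0.1, d_max=0.25) == 0.0
# Negative slope: the optimum is the boundary hypothesis (d = d_max).
assert argmin_linear(a=-0.4, b=0.1, d_max=0.25) == 0.25
```

This is exactly why the enumeration can be restricted to the three boundary hypotheses: a linear objective never has an interior optimum.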

4.2 Verification that Re-Weighting Recovers from Labeling Bias

The Re-Weighting intervention multiplies the loss term for mis-classifying positive examples in Group by a factor such that the weighted fraction of positive examples in the biased data for Group equals the overall fraction of positive examples in Group .
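As a sketch of this matching condition, the weight on positive Group-B training examples can be computed in closed form. The formula below is derived from the stated goal (make the weighted positive fraction equal the true one), not quoted from the paper:

```python
def reweight_factor(p_true, p_biased):
    """Weight w to place on positive Group-B examples so that the
    weighted positive fraction of the biased data equals the true
    positive fraction p_true, i.e. solve
        w * p_biased / (w * p_biased + (1 - p_biased)) = p_true
    for w (both fractions are assumed to lie strictly in (0, 1))."""
    return p_true * (1 - p_biased) / ((1 - p_true) * p_biased)

# Sanity check: weighting restores the target positive fraction.
w = reweight_factor(p_true=0.5, p_biased=0.2)
assert abs(w * 0.2 / (w * 0.2 + 0.8) - 0.5) < 1e-12
```

For example, if the bias process shrinks the positive fraction from 0.5 to 0.2, each surviving positive example is counted with weight 4 in the loss.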

The goal of this re-weighting is that the ratio of positive to negative samples in the positive region of is greater than , while the ratio is less than in the negative region of . Thus the re-weighted probabilities must simultaneously satisfy:

The two constraints are equivalent to requiring that:


Recall from Section 3.2 that

First we show the right hand inequality.

Observe that both terms are linear in . When , the inequality becomes . In our bias model , but if , the inequality becomes . Thus Line 7 holds if both and .

is clearly true because .

To see that , note that this is equivalent to , where the right-hand side is the overall fraction of negative examples in . This is clearly true because the positive region of has exactly an fraction of negatives, and the negative region of has a fraction of negatives.

Now we show the left hand inequality in Line 7.