On the Fairness of Machine-Assisted Human Decisions

by Talia Gillis, et al.

When machine-learning algorithms are deployed in high-stakes decisions, we want to ensure that their deployment leads to fair and equitable outcomes. This concern has motivated a fast-growing literature that focuses on diagnosing and addressing disparities in machine predictions. However, many machine predictions are deployed to assist in decisions where a human decision-maker retains the ultimate decision authority. In this article, we therefore consider how properties of machine predictions affect the resulting human decisions. We show in a formal model that the inclusion of a biased human decision-maker can revert common relationships between the structure of the algorithm and the qualities of resulting decisions. Specifically, we document that excluding information about protected groups from the prediction may fail to reduce, and may even increase, ultimate disparities. While our concrete results rely on specific assumptions about the data, algorithm, and decision-maker, they show more broadly that any study of critical properties of complex decision systems, such as the fairness of machine-assisted human decisions, should go beyond focusing on the underlying algorithmic predictions in isolation.





1 Introduction

When we analyze the properties of machine-learning predictions, we often consider settings in which they are implemented automatically. In this article, we instead consider how properties of a machine-learning algorithm affect a human decision-maker’s choices that take these predictions as an input. Focusing on a measure of disparity with respect to the resulting decisions, we document cases in which the interaction of a biased decision-maker with a machine-learning algorithm alters usual trade-offs between fairness and accuracy.

Machine-learning tools are increasingly used in high-stakes decisions. Algorithms that predict recidivism are employed in pre-trial bail decisions. In medicine, machine-learning predictions are used to make testing decisions. In hiring, predictive algorithms screen applicants in order to make interview decisions. Lenders use sophisticated statistical models to make default predictions that underlie credit approval decisions.

When machines are used in such high-risk contexts, we often care about their properties beyond overall accuracy. A growing literature specifically studies the fairness properties of machine decisions. In medical applications we may take special interest in when an algorithm errs, examining the incidence of false positives and false negatives (Mullainathan and Obermeyer, 2019). In decisions subject to heightened legal oversight such as hiring decisions, we may care about differentiated treatment of or impact on legally designated protected classes (Raghavan et al., 2020). Chouldechova (2016) and Kleinberg et al. (2016) study how different fairness criteria apply to machine classifications.

Typically, fairness properties of algorithmic decisions are analyzed as if the machine predictions were implemented directly. But in many cases machine predictions are not implemented automatically, and instead inform a human decision-maker who has the decision authority. This is the case for recommender systems or decision support systems. The human is considered vital in these systems due to their domain knowledge (Lawrence et al., 2006), which encompasses a human decision-maker’s intuition and understanding of a problem as well as any additional information they may observe. We may also require that a human make the final decision in a system as a matter of accountability or comfort. In fact, the recent proposal of the European Union for the regulation of artificial intelligence, the most comprehensive and ambitious proposal to date, requires that humans retain authority when algorithms are used in decision-making (European Commission, 2021).

In this article we ask how questions around fairness and bias play out when machine predictions support rather than replace human decisions, and ask how incorporating a human decision-maker impacts the relationship between the structure of (machine) predictions and the properties of resulting (human) decisions. Specifically, we consider a decision that aims to predict a label from features with minimal prediction loss. At the time of the decision, the decision-maker has access to a machine prediction of the label of interest. This prediction is derived from a training data-set that the decision-maker does not have direct access to. This setup may represent a judge’s bail decision for which a risk score may be available, or a loan officer’s approval decision with the help of a credit score.

We focus on the fairness properties of the resulting decision, and analyze possible trade-offs with accuracy. For the continuous prediction decisions in our model, we capture fairness by the disparity that encodes how the decision varies between instances that only differ in their membership to a (protected) group. We measure disparity as we may want decisions not to differ between groups, holding other characteristics constant. We contrast disparity with accuracy (expected loss) of the decision. By focusing on conditional statistical parity, we consider a specific notion of fairness that is directly related to the inclusion of protected characteristics in the decision. We discuss other notions of fairness in the extension section.

In a baseline case where machine predictions are implemented directly, excluding protected characteristics reduces disparities, conditional on all other features. Thus, when the labels differ across groups in the data, the objectives of minimizing risk and minimizing disparity may be in conflict. This presents us with a trade-off to consider when choosing which inputs to include in the algorithm.

Having established the baseline case, we switch to a human decision-maker who uses the machine prediction to inform their decision. Adopting a standard model from statistical decision theory, the human decision-maker updates their prior beliefs using the information contained in the machine predictions. Thus the role of the predictions moves from taking action to informing the decision-maker about the world through a concise summary (the prediction itself). Changing the inputs available to the prediction now changes the information represented to the decision-maker.

We consider a human decision-maker who starts with biased beliefs about the differences in true labels across protected groups. Prior work studies and provides evidence of the role of biased beliefs in discrimination (Coffman et al., 2021; Bohren et al., 2019a). In such cases, the decision-maker believes that differences between groups are larger than they are on average in the training data available to the algorithm. The design of the algorithm may then have implications for how the human decision-maker updates their beliefs about the differences between groups before taking a decision.

When the human has a biased prior, common relationships between the inclusion of protected characteristics and the disparities in the final decision may flip. We show formally how the decisions of a biased decision-maker may have larger disparity when the machine predictions exclude protected characteristics. The intuition behind this result is straightforward: if the machine input is not informative about the differences between groups, the prior bias of the human decision-maker remains unmitigated. We show this reversal in a stylized example and then provide general theorems that apply in large samples. Our results therefore show that, in the case of algorithmic assistance, some of the common trade-offs between fairness and accuracy do not apply.

Our results suggest that we have to take the structure of decisions into account when relating machine predictions to the (fairness) properties of algorithms. While our analysis shows that typical intuitions about the effect of including protected-class information may not apply to machine-assisted human decisions, there may be other reasons to avoid protected characteristics that we do not capture in our model and which alter conclusions about their inclusion. Importantly, our analysis demonstrates that any such conclusion should take the specific way in which predictions are to be used in the decision-making system into account. More broadly, we highlight the importance of analyzing the impact of machine predictions in the context of the full decision environment, which includes the way in which (possibly biased) data informs machine predictions and how humans (with possibly biased priors or preferences) use these predictions.

1.1 Related literature

We contribute to an interdisciplinary literature on algorithmic fairness that spans computer science, statistics, economics, law, and operations. For example, Kleinberg et al. (2016) and Chouldechova (2016) study tensions between different fairness qualities of algorithms. Lakkaraju et al. (2017) and Jiang and Nachum (2020) study bias coming from label bias in the training data. Corbett-Davies and Goel (2018) review prevalent measures of fairness and discuss their shortcomings. A specific question arising in this literature concerns the impact different restrictions on algorithmic inputs have on the fairness of the resulting decisions (e.g. Barocas and Selbst, 2016; Gillis and Spiess, 2018; Yang and Dobbie, 2020; Hellman, 2020), and whether there are legal restrictions on algorithmic inputs (e.g. Bent, 2019; Gillis, 2021). Relative to this literature, we focus on the question of how properties of an underlying prediction algorithm affect decisions when there is a human decision-maker in the loop.

We also relate to work on the interaction of machine predictions and human decisions. Similar to our setup, Bastani et al. (2021) study how machine learning can improve human decision-making. Ibrahim et al. (2021) consider information flowing from a human to a machine, identifying that a signal by the human related to their private information is more useful for machine prediction than a human’s direct forecast. Athey et al. (2020) consider conditions under which it is optimal to allocate decision authority to a human rather than an algorithm. Dietvorst et al. (2018), Green and Chen (2019), and Stevenson and Doleac (2019) study frictions in the adoption of machine predictions. Grgić-Hlača et al. (2019) and Fogliato et al. (2021) run vignette experiments to analyze the effects of algorithmic advice on judicial decisions. De-Arteaga et al. (2020) consider instances in which humans are able to identify and override incorrect estimates of risk. Relative to these approaches, we specifically focus on the interaction of machine predictions with a biased human decision-maker.

More specifically, we relate to a literature that explicitly considers fairness properties of machine-assisted human decisions. Most closely related to our theoretical results, Morgan and Pass (2019) analyze specific notions of discrimination for a machine-aided human classification decision, and show that there may be a trade-off between avoiding discrimination in the underlying machine classification and avoiding discrimination in the human decision, where only trivial cases allow for avoiding both at the same time. Relative to their approach, we focus on the role of biases in human beliefs, and our analysis still applies to cases where there are no true differences between groups. Imai et al. (2020) study the effect of providing judges with risk assessment tools in pre-trial decisions and consider the impacts on the fairness of the resulting decisions. Unlike their work, we focus on the effect of excluding protected characteristics, rather than the effect of providing advice.

The article is also related to an economics literature on fairness and discrimination, which attempts to identify sources of biased decisions and distinguishes between preference- and belief-based explanations (Bordalo et al., 2019; Bohren et al., 2019a, b; Coffman et al., 2021). Prior work provides evidence of how inaccurate beliefs can lead to biases in observed decisions. Building on this work, our article considers how information from a machine prediction interacts with inaccurate beliefs.

Finally, we relate to work on the communication of statistical results, the statistics and optimization of learning optimal decisions, as well as information design. Most similar to our approach, Andrews and Shapiro (2021) model the communication of statistical results to a decision-maker, in the same way we model the machine-learning algorithm providing an input to the decision-maker’s choice. Relative to this work, we focus specifically on the relationship of the inclusion of protected characteristics to disparities in decisions when beliefs are biased. A related literature on forming optimal decisions from training data also emphasizes the distinction between prediction and resulting decision (e.g. Bertsimas and Kallus, 2020). More specifically, Kleinberg et al. (2018b) argue that fairness constraints should be imposed in the decision stage, rather than the prediction stage. Canetti et al. (2019) and Mishler et al. (2021) consider whether and how fairness can be achieved through post-processing. Our information structure also resembles that of Kamenica and Gentzkow (2011), who demonstrate how an informed sender with control of information can design a signaling scheme that influences the behavior of a decision-making receiver with misaligned preferences.

1.2 Structure of the article

The remainder of this article proceeds as follows. Section 2 introduces our model. Section 3 demonstrates our main results in a stylized example before we generalize these results in Section 4. In Section 5 we discuss some implications of our result. Finally, we consider extensions in Section 6 before concluding in Section 7.

2 Setup

For an instance $i$, we consider personalized decisions $d(x_i, g_i)$ that may vary by features $x_i$ and group identity $g_i$. Here, features may comprise baseline characteristics available at the time the decision is taken, and the group identity may encode additional sensitive attributes such as gender or ethnicity/race that may be subject to legal constraints or ethical considerations. A decision $d$ leads to loss $\ell(d, y_i)$, where $y_i$ is the true label of the instance that is unavailable at the time of the decision. For simplicity, we will focus here on a simple prediction decision relative to a true label with squared-error loss $\ell(d, y) = (d - y)^2$. We can think of this choice as an explicit prediction decision or an implicit assessment where loss approximates a consequence of the associated decision. We briefly mention binary decisions in Section 6.

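We assume that the true label $y_i$ has mean $\mu(x, g) = E[y_i \mid x_i = x, g_i = g]$ and variance $\sigma^2(x, g) = \mathrm{Var}(y_i \mid x_i = x, g_i = g)$. For tractability, we assume here that the variance is finite, constant, and known ($\sigma^2(x, g) = \sigma^2$), that the error term $\varepsilon_i = y_i - \mu(x_i, g_i)$ is Normally distributed, that $x_i$ has finite support, and that $g_i \in \{0, 1\}$ is binary. The expected (out-of-sample) loss (risk) of the decision rule $d$ is $R(d) = E[\ell(d(x_i, g_i), y_i)]$. Writing $R_{x,g}(d)$ for the risk at $(x, g)$ and $R_x(d)$ for the average risk at $x$, we have that $R(d) = E[R_{x_i}(d)]$. The risk expresses the accuracy of the decision.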

In addition to the accuracy of the decision, we may also care about whether the decision treats instances differently when they vary only in their group identity. We define $\Delta_x(d) = d(x, 1) - d(x, 0)$ as the decision disparity of decision rule $d$ between group 1 and group 0, and write $\Delta(d) = E[\Delta_{x_i}(d)]$ for its average across instances.
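As a minimal sketch of these two quantities, the following Python snippet evaluates the risk and the disparity of a fixed (non-random) decision rule on a finite grid of feature and group cells. The function names, the dictionary-based representation, and the toy numbers are illustrative assumptions rather than part of the model; the risk computation relies on the standard decomposition of expected squared error into squared distance to the cell mean plus noise variance.

```python
from typing import Callable, Dict, Tuple

Cell = Tuple[str, int]  # (feature value x, group g in {0, 1}); hypothetical encoding


def risk(decision: Callable[[str, int], float],
         mu: Dict[Cell, float],
         prob: Dict[Cell, float],
         sigma2: float) -> float:
    """Expected squared-error loss of a fixed decision rule.

    For a label with mean mu and variance sigma2,
    E[(d - y)^2] = (d - mu)^2 + sigma2 in every cell.
    """
    return sum(p * ((decision(x, g) - mu[(x, g)]) ** 2 + sigma2)
               for (x, g), p in prob.items())


def disparity_at(decision: Callable[[str, int], float], x: str) -> float:
    """Decision for group 1 minus decision for group 0 at features x."""
    return decision(x, 1) - decision(x, 0)


def average_disparity(decision: Callable[[str, int], float],
                      prob: Dict[Cell, float]) -> float:
    """Average of disparity_at over the marginal distribution of x."""
    px: Dict[str, float] = {}
    for (x, _g), p in prob.items():
        px[x] = px.get(x, 0.0) + p
    return sum(p * disparity_at(decision, x) for x, p in px.items())


# Toy numbers (made up): one feature value, two equally likely groups.
mu = {("a", 0): 1.0, ("a", 1): 2.0}
prob = {("a", 0): 0.5, ("a", 1): 0.5}


def group_aware(x: str, g: int) -> float:
    return mu[(x, g)]  # uses the true cell means


def group_blind(x: str, g: int) -> float:
    return 1.5  # same decision for both groups


print(risk(group_aware, mu, prob, 1.0), average_disparity(group_aware, prob))
print(risk(group_blind, mu, prob, 1.0), average_disparity(group_blind, prob))
```

In this toy case the group-blind rule has zero disparity but higher risk, which is the kind of tension between the two measures that the remainder of the setup examines.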

Disparities may be of interest when there are legal or ethical grounds for not treating people differently who only differ in their group membership. This may be true when the ground-truth discrepancies are zero, and we are worried that an unfair decision introduces biases. But even when the true discrepancies are not zero – which could be due to omitted variables, different outcomes that reflect past or future discrimination, or a causal relationship – we may want to ensure that discrepancies are small to address biased data, correct institutional discrimination, or comply with legal restrictions. By considering disparities, we focus on conditional statistical parity as our notion of fairness; we briefly discuss extension to accuracy-based fairness concepts in Section 6.

We assume that the decision is taken by a decision-maker with decision authority who has a belief (prior) $\pi$ over the mean vector $\mu = (\mu(x, g))_{x, g}$. This belief incorporates any ex-ante beliefs and prior data the decision-maker may have observed. In addition to this belief, the decision-maker also observes a machine prediction $\hat f$, which for simplicity we assume comes from a training data set of $n$ iid draws $(y_j, x_j, g_j)$ that are independent of the deployment data and independent of the decision-maker’s prior (and not available to the decision-maker directly). Specifically, we consider the two predictions

$$\hat f(x) = \frac{1}{n_x} \sum_{j : x_j = x} y_j, \qquad \hat f(x, g) = \frac{1}{n_{x,g}} \sum_{j : x_j = x,\, g_j = g} y_j,$$

where $n_x$ and $n_{x,g}$ denote the numbers of training observations with $x_j = x$ and with $(x_j, g_j) = (x, g)$, respectively. The first prediction does not vary with group identity, while the second one does. We discuss expanding to more complex functional forms in Section 6 below, but for now focus on these simple averages, which allow us to model learning by the agent.
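The two predictions are plain cell averages, so they can be sketched in a few lines of Python. The function name `fit_averages` and the representation of the training data as `(y, x, g)` triples are assumptions made for this illustration only.

```python
from collections import defaultdict


def fit_averages(training):
    """Return (group_blind, group_specific) cell averages.

    training: iterable of (y, x, g) triples.
    group_blind[x] averages y over all rows with that x;
    group_specific[(x, g)] averages y over rows with that x and that g.
    """
    sum_x, n_x = defaultdict(float), defaultdict(int)
    sum_xg, n_xg = defaultdict(float), defaultdict(int)
    for y, x, g in training:
        sum_x[x] += y
        n_x[x] += 1
        sum_xg[(x, g)] += y
        n_xg[(x, g)] += 1
    group_blind = {x: sum_x[x] / n_x[x] for x in n_x}
    group_specific = {cell: sum_xg[cell] / n_xg[cell] for cell in n_xg}
    return group_blind, group_specific


# Made-up training rows: one feature value, one row in group 0, two in group 1.
toy = [(1.0, "a", 0), (3.0, "a", 1), (5.0, "a", 1)]
blind, specific = fit_averages(toy)
print(blind, specific)
```

Cells that never appear in the training data are simply absent from the returned dictionaries; how a decision-maker should treat such cells is outside the scope of this sketch.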

We consider three decisions taken by the decision-maker to minimize average expected loss. First, we consider the decision that the decision-maker would take without access to any prediction. Second, we consider the decision the decision-maker would take when given access to the group-blind prediction $\hat f(x)$ for the instance $(x, g)$. Finally, we consider the analogous decision given the more detailed, group-specific prediction $\hat f(x, g)$. We assume that the decision-maker minimizes expected loss, averaged over their prior (footnote 1). Under squared-error loss, the three decisions are therefore the corresponding prior and posterior means,

$$d^{0}(x, g) = E_\pi[\mu(x, g)], \qquad d^{-}(x, g) = E_\pi[\mu(x, g) \mid \hat f(x)], \qquad d^{+}(x, g) = E_\pi[\mu(x, g) \mid \hat f(x, g)].$$

Note that $d^{-}$ and $d^{+}$ are random variables since they depend on the training data, even for fixed $(x, g)$.

Footnote 1: We assume that the decision-maker observes a single instance at a time, and at this point takes into account only the prediction for this specific instance. In principle, the decision-maker could also learn from other predictions; however, we assume that the structure of the full prediction function is too complex for or not available to the decision-maker, and the decision-maker solely updates based on the prediction for the instance at hand. We discuss extensions in Section 6.


We are interested in the relative accuracy and disparities of these three decisions of the decision-maker, and compare them to applying the machine predictions directly. While not explicitly discussed in the results below, combining human priors with machine predictions can improve the accuracy of decisions by combining information even when not enforced by institutional constraints. In the main part of this article, however, we take human decision authority as given and focus on comparing these three decisions, rather than on whether a given decision should be delegated to a human decision-maker, a machine, a human-assisted machine, or a machine-assisted human.

3 Illustration in a simple example

We begin with a simple example in which we focus on instances with fixed features $x$, so that the only variation is the information about group membership $g$; for simplicity, we drop the corresponding subscripts and inputs in this section. For this simple example, we assume

  • drawn independently upon realization of , and write and ;

  • the decision-maker holds a prior belief that , that these distributions are independent across , and write and ;

  • the group distribution is ;

  • we have training data with .

In Section 4, we will drop these distributional assumptions about beliefs and generalize the main insights from the example.

Under these assumptions, the machine predictions are given by

and the decision-maker’s optimal decisions are

The two assisted decisions have an intuitive structure. When given access to the group-specific prediction, the decision-maker combines their prior about each group mean with a normal signal of that mean whose accuracy is a function of sample size and prior variance. We can find the resulting posterior expectation through inverse-variance weighting. The decision-maker who only observes the group-blind prediction, on the other hand, updates about the average (first term), but not about the difference (second term), which is orthogonal in this specific example.
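The inverse-variance weighting described here can be sketched as follows. Since the exact distributional assumptions of the example are not reproduced in this text, the sketch makes its own explicit assumptions: independent Normal priors with means `m0`, `m1` and common variance `tau2` for the two group means, known noise variance `sigma2`, and balanced training data with `n / 2` observations per group. All names are hypothetical.

```python
def posterior_mean(prior_mean, prior_var, signal, signal_var):
    """Normal-normal update: precision-weighted average of prior mean and signal."""
    w_prior, w_signal = 1.0 / prior_var, 1.0 / signal_var
    return (w_prior * prior_mean + w_signal * signal) / (w_prior + w_signal)


def assisted_decisions(m0, m1, tau2, sigma2, n, ybar0, ybar1):
    """Decisions for groups (0, 1) under group-specific and group-blind assistance.

    Assumes independent N(m_g, tau2) priors, n / 2 training points per group,
    and known noise variance sigma2 (all illustrative assumptions).
    """
    # Group-specific assistance: each group mean is updated with its own
    # training average, which has sampling variance sigma2 / (n / 2).
    plus = (posterior_mean(m0, tau2, ybar0, 2 * sigma2 / n),
            posterior_mean(m1, tau2, ybar1, 2 * sigma2 / n))
    # Group-blind assistance: only the pooled average is observed. Under the
    # stated assumptions it is informative about the cross-group average
    # (prior variance tau2 / 2, sampling variance sigma2 / n) but not about
    # the difference, so the prior gap m1 - m0 is carried over unchanged.
    pooled = 0.5 * (ybar0 + ybar1)
    average = posterior_mean(0.5 * (m0 + m1), tau2 / 2, pooled, sigma2 / n)
    gap = m1 - m0
    minus = (average - 0.5 * gap, average + 0.5 * gap)
    return plus, minus


# Made-up numbers: the prior expects a gap of 2, the training averages show none.
plus, minus = assisted_decisions(0.0, 2.0, 1.0, 1.0, 4, 1.0, 1.0)
print(plus[1] - plus[0], minus[1] - minus[0])
```

With these numbers the gap in decisions shrinks under group-specific assistance, while it stays at the prior gap of 2 under group-blind assistance.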

Table 1: Disparities and risks in the example

For this example, Table 1 lists the resulting expected risks and disparities, which are both taken as averages over the training sample given the true group means at the given features, in terms of the ground truth, the decision-maker’s beliefs, the noise level, the prior variance, and the sample size. We first summarize the features of the example in terms of expected disparities; we will generalize these findings and drop expectations in the next section below.

Remark 1 (Disparity reversal in the example).

If the human decision-maker ex-ante believes that the disparity is larger than it actually is, then

while for the underlying predictions

Hence, in this example, the usual ordering of disparities is reversed. Under the automated approach that implements the predictions directly, inclusion of group identity allows for differentiated action. Thus the group-blind prediction will not exhibit disparity in the actions it takes, while the group-specific prediction will on average separate its decisions by the true disparity in the data. Since the human decision-maker does not update about disparities when given access only to the group-blind prediction, the prior disparity persists, leading to excess disparity. Giving the biased decision-maker access to the group-specific prediction, on the other hand, reduces disparities relative to the unaided decision-maker as well as the decision-maker who only sees the group-blind prediction.
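The ordering described in this paragraph can be checked numerically. The sketch below computes expected disparities in closed form under assumptions chosen for illustration (balanced training data of size `n`, independent Normal priors with common variance `tau2`, known noise variance `sigma2`); these need not coincide with the exact assumptions behind Table 1, and the dictionary keys are hypothetical labels.

```python
def expected_disparities(true_gap, prior_gap, tau2, sigma2, n):
    """Expected group-1-minus-group-0 gaps, averaged over training draws.

    Illustrative assumptions: n / 2 training points per group, independent
    Normal priors with variance tau2, and known noise variance sigma2.
    """
    # Weight that the posterior mean puts on a group-specific training average.
    w = tau2 / (tau2 + 2 * sigma2 / n)
    return {
        "prediction_blind": 0.0,           # pooled average ignores the group
        "prediction_specific": true_gap,   # group averages are unbiased
        "human_unaided": prior_gap,        # prior means only
        "human_blind": prior_gap,          # belief about the gap is never updated
        "human_specific": (1 - w) * prior_gap + w * true_gap,
    }


# Made-up numbers: the prior gap (2) exceeds the true gap (1).
out = expected_disparities(true_gap=1.0, prior_gap=2.0, tau2=1.0, sigma2=1.0, n=4)
for name, value in out.items():
    print(name, round(value, 3))
```

For the predictions, including group identity raises disparity from 0 to the true gap; for the human decisions, the same inclusion lowers expected disparity from the prior gap toward the true gap.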

We next inspect the relationship of disparities and accuracy. Fairness and accuracy are sometimes seen as representing a trade-off when considering the inclusion of protected characteristics; we show that this trade-off may disappear under our model of machine-assisted human decisions, and already for pure automation will depend on the bias–variance trade-off in training.

Remark 2 (Trade-off reversal in the example).

For and , we find that:

  1. Trade-off regime. If then there is a disparity–accuracy trade-off for automation,

    which disappears for assistance,

  2. Dominance regime. If then there is no trade-off for either decision, and

When the sample size is large enough (and thus the sampling noise small) relative to the disparity in the data, including the group identity in prediction makes the automated decision more accurate at the cost of disparity, while the exclusion of information makes both disparity and accuracy worse for assistance. If true disparities are very small, then even for automation there is no trade-off between these two goals: by excluding group information, the automated rule reduces variance more than it increases loss due to (statistical) bias. In cases where the prior disparity is sufficiently close to the true disparity in the data, the assisted human decision even exhibits the reverse of the usual trade-off, where exclusion of group information increases both disparity and accuracy.
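The bias-variance logic for the automated rules can be made concrete with a short calculation. The sketch below assumes two groups with `n / 2` training observations each and equal weight in deployment, which is an assumption made for this illustration; under it, the pooled average has squared bias `(true_gap / 2) ** 2` and variance `sigma2 / n` for either group, while a group-specific average is unbiased with variance `2 * sigma2 / n`.

```python
import math


def automation_excess_risk(true_gap, sigma2, n):
    """Excess risk (beyond the irreducible noise sigma2) of the automated rules.

    Illustrative assumptions: n / 2 training points per group and equal
    group weights in deployment. Returns (pooled, separate).
    """
    pooled = (true_gap / 2) ** 2 + sigma2 / n  # squared bias plus variance
    separate = 2 * sigma2 / n                  # unbiased, but half the data
    return pooled, separate


def regime(true_gap, sigma2, n):
    """Classify the automated case by comparing the gap to its noise threshold."""
    threshold = 2 * math.sqrt(sigma2 / n)
    return "trade-off" if abs(true_gap) > threshold else "dominance"


print(regime(1.0, 1.0, 100), automation_excess_risk(1.0, 1.0, 100))
print(regime(0.1, 1.0, 100), automation_excess_risk(0.1, 1.0, 100))
```

In the "trade-off" case the group-specific rule is more accurate but more disparate; in the "dominance" case the pooled rule is at least as accurate and has no disparity, so excluding group information acts like regularization.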

4 Main results

In this section, we analyze general patterns in the disparities (and accuracy) in the model from Section 2. As a baseline, we first consider the direct implementation of the predictions. For our main results, we then consider decisions by the human decision-maker. Throughout, we hold the covariate fixed as we evaluate the decision rule, which is without loss of generality in our framework.

If we directly implemented the group-specific machine prediction (which varies by group) and the group-blind machine prediction (which ignores group membership), we note that disparities would trivially be

(with expectations over the training data), and


almost surely, since the group-specific prediction can vary by group, while the group-blind prediction does not. Often, these differences in disparity are seen as one side of a trade-off with accuracy, where the more disparate rule is also more accurate; we note that if we take the perspective that the predictions are learned from training data, then there is not necessarily a trade-off:

Remark 3 (Trade-off vs dominance regimes for machine decisions).

For every and sample sizes there exists an such that:

  1. Trade-off regime. If , then

  2. Dominance regime. If , then

In other words, when the true disparity in outcomes between the two groups is small, then there is no trade-off between increasing accuracy and increasing disparities; in that case, ignoring group identity and learning jointly serves as a form of regularization that improves predictions in this simple learning framework.

We now turn to human decisions with machine assistance, where we investigate the interplay between decision-maker biases and the nature of machine assistance. Specifically, we assume that the decision-maker, while aiming to maximize accuracy, may have biased beliefs about the relative means of the two groups, which we express by (excess) disparity in their prior. To formalize our assumption, it will be helpful to define as

the average of instances with in the training data.

Assumption 1 (-disparate beliefs).

The decision-maker’s belief about means at assumes that there is a disparity of at least between groups and ,

In this assumption, we condition on the average to rule out cases where beliefs about the difference in means are overwhelmed by updates about the average. We also make a regularity assumption about the prior, in addition to the Normal error terms assumed in Section 2.

Assumption 2 (Finite prior moments).

For all , .

Under these assumptions, we obtain, almost surely with respect to the prior, a reversal in disparities of the two assisted decisions, relative to the two underlying predictions:

Theorem 1 (Disparity reversal).

Assume that the decision-maker has -disparate beliefs, that the regularity conditions hold, and that . Then -almost surely for every there exists some

such that with probability (over draws of the training data) at least

we have

whenever .

The statement holds for a set of true means

that have prior probability one, which may exclude areas of the parameter space that the human decision-maker rules out ex-ante. Specifically, the conclusion of the theorem may fail to hold when the decision-maker has a dogmatic prior that cannot be overcome by the data.

The intuition behind this result is straightforward: if the decision-maker is biased in the sense that they overestimate the disparity relative to the data, then a prediction that does not vary by group preserves that disparity, while separate predictions help overcome it. Relative to the machine baseline in (1), the disparity is reversed: including protected characteristics increases disparities in automation, while reducing bias relative to the no-assistance and group-blind-assisted decisions of the human decision-maker.

Corollary 1 (Disparity reordering).

Under the conditions of Theorem 1 with , with probability at least

When there is a trade-off between disparity and accuracy in the case where the machine prediction is directly implemented, then this trade-off is eliminated in large samples as we move from automation to assistance.

Theorem 2 (Trade-off reversal).

Assume that the decision-maker has -disparate beliefs, that the regularity conditions hold, and that . Then -almost surely for every and there exists some such that with probability (over draws of the training data) at least we have that


whenever and

Hence, incorporating the biased human decision-maker not only reverses the intuition around the effect of the inclusion of group information in prediction functions; taking into account the interaction of prediction and human bias also negates the usual trade-off between accuracy and disparity.

5 Interpretation and implications

In the previous section, we showed that considering a biased decision-maker who is assisted by a prediction algorithm may reverse the effect of including protected characteristics in the prediction: while an algorithm that does not differentiate by protected characteristics may reduce disparities when applied directly (“automation”), excluding group information may be ineffective or even counterproductive for reducing disparities when the algorithm provides an input to human decision-making (“assistance”). In this section, we consider implications of this result for the debate around the use of protected characteristics in algorithms and the evaluation of algorithmic properties in context.

5.1 Use of protected characteristics

Excluding characteristics from consideration by a machine-learning algorithm – only in deployment or in training and deployment – is sometimes considered a way to ensure that people with otherwise similar features are treated similarly (footnote 2). Our results show that this is not necessarily true for machine-assisted human decisions and thus provide a potential rationale for including protected characteristics. However, there are other economic, legal, and ethical considerations around the inclusion of protected information that we do not model and that may lead to different conclusions, including bias in existing data, biased preferences (as opposed to beliefs) of the agent, dogmatic beliefs that are hard to overcome with data, and limited rationality that prevents the agent from updating towards less biased choices. We review some of these extensions in Section 6 below.

Footnote 2: A growing literature questions the exclusion of group information because it may be ineffective (Gillis and Spiess, 2018; Gillis, 2021) or even counterproductive (Kleinberg et al., 2018b). We focus here on the direct effects of exclusion on differences in decisions even in the simple case where exclusion would be effective for automation.

While our analysis remains limited to the specific case of a human with a biased prior, the formal results clarify that studying the properties of machine-assisted human decisions requires careful consideration of the role of the human decision-maker and the context in which the decision takes place. In particular, we document that in a standard model of choice under uncertainty, properties of the prediction function do not directly translate into analogous properties of resulting decisions (similar to the work of Andrews and Shapiro, 2021, for reporting statistical results to a decision-maker), and that a naive application of existing results may have unintended consequences when applied to machine-assisted human decisions.

5.2 Fairness in context

Our results clarify that determining the fairness, bias, and impact of algorithmic decisions requires going beyond the properties of a prediction rule in isolation. Specifically, in the chain from data to prediction to decision (Figure 1), biases in the data and properties of prediction rules are not the only factors that shape the properties of the resulting decisions. The process by which data is transformed into a prediction function (the algorithm) and how predictions are used to make decisions (in our case, the human decision-maker with their beliefs) shape the properties of the decision. For example, Remark 3 shows that the training process (in this case, the signal-to-noise ratio in the data and implicit regularization) matters for the relationship between accuracy and disparity. Our main result then shows that the mapping from prediction function to human decisions is non-trivial and may likewise alter commonly assumed trade-offs.

data → prediction → decision
Figure 1: A view on data-driven decisions

The need to consider properties of algorithmic decisions in their proper context often requires extending current frameworks that narrowly focus on the properties of static decision rules without modelling how they are learned from data or leveraged to take decisions. In this article, we instead highlight cases in which an analysis of fairness relies on a view of the whole process by which data and human beliefs are turned into decisions. While this framework still falls short of modelling the additional historical, institutional, and dynamic context that may be required to describe equilibrium outcomes and their welfare implications, we hope to highlight the value of analyzing critical properties of algorithmic decisions in their context.

6 Extensions and open questions

Our current model remains limited to prediction decisions by a rational decision-maker with biased beliefs, where predictions are provided by simple averages from unbiased training data. In this section, we briefly discuss extensions to our approach that aim to make the analysis more complete and applicable.

6.1 Binary decisions

Many of the assisted decisions studied in the fairness literature, such as employment, pretrial release, and lending, are binary rather than continuous. Thresholds are commonly used to convert real-valued parameters into binary (or discrete) decisions. In these cases, we can distinguish between an algorithm’s prediction, for example a defendant’s risk score for recidivism, and the decision itself, such as whether to release a defendant on bail. Even when a prediction is used automatically in decision-making, this implicitly assumes the use of a decision rule that translates the prediction into a decision, such as a risk threshold below which defendants are released. We have chosen prediction decisions with a continuous true label for convenience. While we do not expect the main takeaways to change when applied to binary decisions based on threshold rules, we see a more complete treatment that formally includes binary decisions as a natural next step in the analysis.
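As a minimal sketch of such a threshold rule, the Python snippet below maps continuous risk scores to binary release decisions and compares release rates between two groups. The function names, the threshold of 0.5, and the scores are hypothetical and chosen only for illustration.

```python
def release_decision(risk_score, threshold=0.5):
    """Return True (release) when the predicted risk lies below the threshold."""
    return risk_score < threshold


def decision_gap(scores_a, scores_b, threshold=0.5):
    """Difference in release rates between two groups of risk scores."""
    rate_a = sum(release_decision(s, threshold) for s in scores_a) / len(scores_a)
    rate_b = sum(release_decision(s, threshold) for s in scores_b) / len(scores_b)
    return rate_a - rate_b


# Illustrative scores only: group_b equals group_a shifted up by 0.05.
group_a = [0.30, 0.45, 0.48, 0.70]
group_b = [0.35, 0.50, 0.53, 0.75]
gap = decision_gap(group_a, group_b)
```

In this example a uniform shift of 0.05 in the scores changes the release rate from three in four to one in four, which illustrates why disparities in continuous predictions and disparities in thresholded decisions need not have the same magnitude.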

6.2 Biased preferences

Throughout this work we have assumed that the decision-maker and the algorithm designer are aligned in the goal of minimizing risk. As Kamenica and Gentzkow (2011) show, if preferences are misaligned, the algorithm designer has the potential to improve the decisions taken by designing the structure of the information revealed to the decision-maker. An interesting extension is the optimal design of algorithms for cases where the source of bias is preference misalignment rather than belief bias.

6.3 Biased data, biased equilibria

In our model, we have assumed that the training data provides unbiased signals of some ground-truth label of interest. When the data itself is biased, however, the exclusion of protected characteristics may correct for biases and lead to more accurate predictions. Relatedly, the data may itself be the result of decisions by algorithms and (biased) decision-makers, leading to biased conclusions that may sustain discriminatory equilibria. Modelling biases in the data itself is therefore an important extension of our current model.

6.4 Machine learning beyond simple averages, and transparency

Our model uses simple averages to model machine-learning algorithms, which are easy to interpret statistically and can provide a clear answer to the problem of information transfer we wish to solve. An important extension of this work is to model more complex regularization schemes. For example, information from two types could be averaged if they draw from similar distributions, and such data combination could happen adaptively.

Modelling such an extension faces at least two challenges that we side-step in our current results. First, pointwise predictions could then contain information about other parts of the distribution, including direct information about differences between groups. Second, more complex algorithms raise the question of transparency and of whether the decision-maker in our model would be able to understand the algorithm well enough to update optimally. Here, both transparent and non-transparent algorithms could have advantages.
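One way to picture adaptive data combination is a partial-pooling estimator that shrinks the average for one type toward the pooled average when the two samples look alike. The sketch below is purely illustrative: the weighting rule is an ad hoc assumption chosen for simplicity, not a scheme analyzed in this article, and the function name is hypothetical.

```python
from statistics import mean, variance


def partially_pooled_average(own, other):
    """Shrink the average of `own` toward the pooled average of both samples.

    Hypothetical adaptive rule: the weight on the own-sample average grows
    with the squared gap between the two sample averages, relative to the
    sampling variance of the own-sample average.
    """
    own_avg = mean(own)
    pooled_avg = mean(list(own) + list(other))
    sampling_var = variance(own) / len(own)  # needs at least two observations
    gap_sq = (own_avg - mean(other)) ** 2
    total = gap_sq + sampling_var
    weight_own = gap_sq / total if total > 0 else 1.0
    return weight_own * own_avg + (1.0 - weight_own) * pooled_avg


similar = partially_pooled_average([1.0, 3.0], [2.0, 2.0])      # equal averages: fully pooled
different = partially_pooled_average([10.0, 12.0], [0.0, 2.0])  # distant averages: mostly own data
```

Because the weight depends on the gap between the two samples, the resulting pointwise prediction already encodes information about differences between types, which is the first challenge noted above.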

6.5 Noisy decisions

A fundamental problem when modeling human decision-making is the inconsistency of human actions under seemingly similar conditions. For example, Kleinberg et al. (2018a) find that (less noisy) predictions of (noisy) human decisions may outperform the latter in judicial decisions. While our current model assumes that decisions given data and features are not stochastic, our model extends naturally to noisy decisions.

6.6 Alternative notions of fairness, disparities in accuracy

We have focused so far on conditional statistical parity in predictions as our measure of fairness. However, disparities are only one way of expressing unfair decisions; for example, we may also be interested in differences in the accuracy of decisions across groups. Discussing fairness measures based on mistakes becomes particularly relevant as we move from prediction to binary decisions. For example, Morgan and Pass (2019) consider equalized odds as a notion of fair treatment by the machine in their model of computer-assisted decision-making. The fact that different notions of fairness cannot be achieved simultaneously (Kleinberg et al., 2016; Chouldechova, 2016) suggests that the conclusions we would draw in our model could depend on the specific measure we consider.
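The tension between fairness notions can be seen in a small numerical example. The Python sketch below computes an unconditional statistical-parity gap and the two equalized-odds gaps (differences in true-positive and false-positive rates) for binary decisions. The function names and the toy data are hypothetical; the data are constructed so that the groups have different base rates.

```python
def rate(values):
    """Share of positive (1) entries; NaN for an empty list."""
    return sum(values) / len(values) if values else float("nan")


def fairness_gaps(y_true, y_pred, group):
    """Gaps between group 1 and group 0 in positive rate, TPR, and FPR."""
    def decisions(g, label=None):
        # Decisions for group g, optionally restricted to a given true label.
        return [p for t, p, a in zip(y_true, y_pred, group)
                if a == g and (label is None or t == label)]

    return {
        "parity": rate(decisions(1)) - rate(decisions(0)),
        "tpr": rate(decisions(1, 1)) - rate(decisions(0, 1)),
        "fpr": rate(decisions(1, 0)) - rate(decisions(0, 0)),
    }


# Toy data: group 0 has base rate 1/2, group 1 has base rate 1/4.
group  = [0, 0, 0, 0, 1, 1, 1, 1]
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
gaps = fairness_gaps(y_true, y_pred, group)
```

Here both groups receive positive decisions at the same rate, so the parity gap is zero, yet the false-positive rate is higher for group 1. Which of the two measures is used would therefore change the assessment of the same set of decisions.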

6.7 Distributional assumptions

In our model from Section 2, we currently assume that error terms are Normally distributed with fixed variance, which ensures that the error terms themselves are not informative about the means. (The prior belief, on the other hand, does not generally have to be Normal.) We conjecture that the assumption of Normality can be replaced by a more general class of smooth distributions. Likewise, we believe that the restriction that the main results hold almost surely with respect to the prior can be weakened to holding for any true mean vector inside an open set that has strictly positive prior probability, under regularity assumptions.

7 Conclusion

In this article, we present a model in which the fairness implications of a machine-assisted human decision cannot be assessed purely from the mathematical properties of underlying predictions. Instead, we argue that the nature of decision authority, beliefs, incentives, existing bias, and equilibria all shape the fairness of complex decision systems. When analyzing disparities in algorithmic decisions, this view provides a research agenda towards less discriminatory decisions by designing processes that consider fairness implications in all aspects of their design. We believe that this agenda requires bringing together techniques, ideas, scholars, and stakeholders from across fields and across application areas.


We thank Asa Palley, Stefan Wager, and Larry Wein for helpful discussions, comments, and suggestions.


  • Andrews and Shapiro (2021) Andrews, I. and Shapiro, J. M. (2021). A model of scientific communication. Econometrica, 89(5):2117–2142.
  • Athey et al. (2020) Athey, S. C., Bryan, K. A., and Gans, J. S. (2020). The Allocation of Decision Authority to Human and Artificial Intelligence. AEA Papers and Proceedings, 110:80–84.
  • Barocas and Selbst (2016) Barocas, S. and Selbst, A. D. (2016). Big data’s disparate impact. Calif. L. Rev., 104:671.
  • Bastani et al. (2021) Bastani, H., Bastani, O., and Sinchaisri, W. P. (2021). Improving human decision-making with machine learning. arXiv preprint arXiv:2108.08454.
  • Bent (2019) Bent, J. R. (2019). Is algorithmic affirmative action legal. Geo. LJ, 108:803.
  • Bertsimas and Kallus (2020) Bertsimas, D. and Kallus, N. (2020). From predictive to prescriptive analytics. Management Science, 66(3):1025–1044.
  • Bohren et al. (2019a) Bohren, J. A., Haggag, K., Imas, A., and Pope, D. G. (2019a). Inaccurate Statistical Discrimination. SSRN Electronic Journal.
  • Bohren et al. (2019b) Bohren, J. A., Imas, A., and Rosenberg, M. (2019b). The Dynamics of Discrimination: Theory and Evidence. American Economic Review, 109(10):3395–3436.
  • Bordalo et al. (2019) Bordalo, P., Coffman, K., Gennaioli, N., and Shleifer, A. (2019). Beliefs about Gender. American Economic Review, 109(3):739–773.
  • Canetti et al. (2019) Canetti, R., Cohen, A., Dikkala, N., Ramnarayan, G., Scheffler, S., and Smith, A. (2019). From Soft Classifiers to Hard Decisions: How fair can we be? In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* ’19, pages 309–318, New York, NY, USA. Association for Computing Machinery.
  • Chouldechova (2016) Chouldechova, A. (2016). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. arXiv:1610.07524 [cs, stat]. arXiv: 1610.07524.
  • Coffman et al. (2021) Coffman, K. B., Exley, C. L., and Niederle, M. (2021). The Role of Beliefs in Driving Gender Discrimination. Management Science, 67(6):3551–3569.
  • Corbett-Davies and Goel (2018) Corbett-Davies, S. and Goel, S. (2018). The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning.
  • De-Arteaga et al. (2020) De-Arteaga, M., Fogliato, R., and Chouldechova, A. (2020). A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, pages 1–12, Honolulu HI USA. ACM.
  • Dietvorst et al. (2018) Dietvorst, B. J., Simmons, J. P., and Massey, C. (2018). Overcoming Algorithm Aversion: People Will Use Imperfect Algorithms If They Can (Even Slightly) Modify Them. Management Science, 64(3):1155–1170.
  • European Commission (2021) European Commission (2021). Proposal for a regulation of the european parliament and of the council laying down harmonised rules on artificial intelligence.
  • Fogliato et al. (2021) Fogliato, R., Chouldechova, A., and Lipton, Z. (2021). The Impact of Algorithmic Risk Assessments on Human Predictions and its Analysis via Crowdsourcing Studies. Proc. ACM Hum.-Comput. Interact., 5(CSCW2):1–24.
  • Gillis (2021) Gillis, T. B. (2021). The input fallacy. Minnesota Law Review, forthcoming, 2022.
  • Gillis and Spiess (2018) Gillis, T. B. and Spiess, J. L. (2018). Big Data and Discrimination. The University of Chicago Law Review, page 29.
  • Green and Chen (2019) Green, B. and Chen, Y. (2019). Disparate Interactions: An Algorithm-in-the-Loop Analysis of Fairness in Risk Assessments. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pages 90–99, Atlanta GA USA. ACM.
  • Grgić-Hlača et al. (2019) Grgić-Hlača, N., Engel, C., and Gummadi, K. P. (2019). Human Decision Making with Machine Assistance: An Experiment on Bailing and Jailing. Proc. ACM Hum.-Comput. Interact., 3(CSCW):1–25.
  • Hellman (2020) Hellman, D. (2020). Measuring algorithmic fairness. Va. L. Rev., 106:811.
  • Ibrahim et al. (2021) Ibrahim, R., Kim, S.-H., and Tong, J. (2021). Eliciting Human Judgment for Prediction Algorithms. Management Science, 67(4):2314–2325. Publisher: INFORMS.
  • Imai et al. (2020) Imai, K., Jiang, Z., Greiner, J., Halen, R., and Shin, S. (2020). Experimental Evaluation of Algorithm-Assisted Human Decision-Making: Application to Pretrial Public Safety Assessment.
  • Jiang and Nachum (2020) Jiang, H. and Nachum, O. (2020). Identifying and Correcting Label Bias in Machine Learning. In International Conference on Artificial Intelligence and Statistics, pages 702–712. PMLR. ISSN: 2640-3498.
  • Kamenica and Gentzkow (2011) Kamenica, E. and Gentzkow, M. (2011). Bayesian Persuasion. American Economic Review, 101(6):2590–2615.
  • Kleinberg et al. (2018a) Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., and Mullainathan, S. (2018a). Human Decisions and Machine Predictions. The Quarterly Journal of Economics, 133(1):237–293. Publisher: Oxford Academic.
  • Kleinberg et al. (2018b) Kleinberg, J., Ludwig, J., Mullainathan, S., and Rambachan, A. (2018b). Algorithmic fairness. In AEA papers and proceedings, volume 108, pages 22–27.
  • Kleinberg et al. (2016) Kleinberg, J., Mullainathan, S., and Raghavan, M. (2016). Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807.
  • Lakkaraju et al. (2017) Lakkaraju, H., Kleinberg, J., Leskovec, J., Ludwig, J., and Mullainathan, S. (2017). The Selective Labels Problem: Evaluating Algorithmic Predictions in the Presence of Unobservables. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 275–284, Halifax NS Canada. ACM.
  • Lawrence et al. (2006) Lawrence, M., Goodwin, P., O’Connor, M., and Önkal, D. (2006). Judgmental forecasting: A review of progress over the last 25 years. International Journal of Forecasting, 22(3):493–518.
  • Miller (2018) Miller, J. W. (2018). A detailed treatment of doob’s theorem. arXiv preprint arXiv:1801.03122.
  • Mishler et al. (2021) Mishler, A., Kennedy, E. H., and Chouldechova, A. (2021). Fairness in Risk Assessment Instruments: Post-Processing to Achieve Counterfactual Equalized Odds. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’21, pages 386–400, New York, NY, USA. Association for Computing Machinery.
  • Morgan and Pass (2019) Morgan, A. and Pass, R. (2019). Paradoxes in Fair Computer-Aided Decision Making. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, AIES ’19, pages 85–90, New York, NY, USA. Association for Computing Machinery.
  • Mullainathan and Obermeyer (2019) Mullainathan, S. and Obermeyer, Z. (2019). Diagnosing Physician Error: A Machine Learning Approach to Low-Value Health Care. National Bureau of Economic Research working paper.
  • Raghavan et al. (2020) Raghavan, M., Barocas, S., Kleinberg, J., and Levy, K. (2020). Mitigating bias in algorithmic hiring: evaluating claims and practices. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 469–481, Barcelona Spain. ACM.
  • Stevenson and Doleac (2019) Stevenson, M. and Doleac, J. L. (2019). Algorithmic Risk Assessment in the Hands of Humans. IZA Discussion Paper No. 12853.
  • Yang and Dobbie (2020) Yang, C. S. and Dobbie, W. (2020). Equal protection under algorithms: A new statistical and legal framework. Michigan Law Review, 119(2):291–395.

Appendix A Solution for the example

In this section we present the calculations necessary for all computations in the example given in Section 3. Here, we assume that:

  • is drawn independently with and .

  • The decision-maker holds a prior belief that independently across , with and .

  • independent of everything else.

  • The training data is balanced, .

A.1 A convenient reparametrization

Writing for the respective averages, we have that

which are all independent. Furthermore, we can write:

A.2 Human decisions

We update as:

A.3 Biases and variances

By independence, we can calculate biases and variances (given the true values of ) separately by average and disparity to find:

The results in Table 1 follow, noting that :

A.4 Proofs of the remarks in the example section

See 1

Proof of 1.

Immediate from the explicit calculations, where we note that is a convex combination of and . ∎

See 2

Proof of 2.

The results for automation are immediate from the explicit expressions above. For assistance we note that risks are equal for

yielding the threshold for . ∎

Appendix B Proofs of main results

See 3


The statements about disparities are immediate from positive variance and . Since we assume a constant variance , the statement about risks holds by

where we note that is symmetrical in around zero and strictly decreasing in , positive at , and negative as . Hence there exists such . ∎

See 1


By -disparate beliefs and for , , (where ),

almost surely, where we have used that the do not vary with and are not informative about the . Similarly, . At the same time,

as -almost surely by Doob’s posterior consistency theorem (e.g. Miller, 2018, Theorem 2.2) where we have used that is a sufficient statistic for . In particular,

as . The result follows. ∎

See 1

Proof of 1.

The result follows as in the two previous proofs and

See 2

Proof of 2.

Since -almost surely