The current interest in fairness properties of algorithms includes several distinct themes, one of which is the question of fairness in allocating scarce resources. Research on this question has a long history, with foundational work in the 1980s and 1990s on allocating resources in computer systems [Jaffe81bottleneckflow, Chiu1989analysis, kelly1998rate]. Allocation problems continue to form an important topic for fairness considerations, especially as automated systems make allocation decisions in a wide range of areas that reach far beyond the original computational settings of the problem [eubanks2018automating, procaccia13cakecacm].
Recently, Elzayn_2019 (Elzayn_2019) considered a novel allocation problem in this style. In their formulation, individuals are divided into groups, each of which has some probability distribution of candidates who desire the resource. There is a fixed amount of the resource, and the question is how to allocate it across the groups. Such a set-up arises in many applications; as one concrete motivating example for purposes of discussion, suppose that we have a set number of doctors, each with a maximum number of people they can assist, and we would like to allocate them across a set of geographically distributed regions. Each region contains a population with a potentially different probability distribution over the number of sick people in need of doctors. We know the distribution of demand for each region, but we will only know the actual demand, drawn from these distributions, once we have allocated the doctors by dividing them up in some fashion across the regions. If a community has too few doctors, some sick people will go unassisted; and if a community has more doctors than it needs for the number of sick people it has, some of the doctors’ capacity to help will go unused. Formally, the expected number of candidates that receive a resource under a certain allocation is called the utilization of that allocation.
One allocation strategy could be to put the doctors where they are most likely to be used, with the aim of maximizing utilization. However, such an allocation could potentially “starve” certain regions, leaving them with very few doctors even though they have a non-trivial level of need. Such an allocation violates a natural definition of fairness: sick people in some regions would have a higher probability of receiving assistance than sick people in other regions, which conflicts with the premise that people are equally deserving of care from the doctors.
In this work, we thus say that an allocation is fair across multiple regions if a candidate who needs the resource has the same probability of receiving it, regardless of the region they belong to: the identity of the region doesn’t impact the probability that those in need obtain assistance. At some points, we relax this definition slightly to say that an allocation is $\alpha$-fair if the probability that a candidate receives the resource in two different regions is within a maximum difference of $\alpha$, for $\alpha \in [0, 1]$, across all of the regions. Note that $\alpha = 0$ corresponds to our initial notion of fairness, and $\alpha = 1$ imposes no constraint at all on the allocation.
Elzayn_2019 (Elzayn_2019) consider a version of this allocation problem in the scenario where the candidate probability distributions are unknown and must be learned. At each time step, a certain allocation is chosen and the feedback obtained reveals only the number of candidates who received the resource, not the true number. They adapt learning results from previous dark pool trading scenarios proposed in ganchev2009censored (ganchev2009censored). From these results, they construct algorithms that learn utilization-maximizing and $\alpha$-fair utilization-maximizing allocations through this type of censored feedback. (Footnote 1: They consider a related, but different definition of fairness from the one used in this work. Later sections will expand on this difference.) They also define the Price of Fairness (PoF) as the ratio of the utilizations of the max-utilizing allocation and the max-utilizing $\alpha$-fair allocation. From an empirical dataset, they calculate the PoF for various levels of $\alpha$ in practice. However, they largely left open the following category of questions: can we obtain theoretical upper bounds on the PoF over different probability distributions? These questions appear to become quite complex even when the distributions are known.
The present work: The interaction of fairness and utilization.
In this work we obtain bounds on the price of fairness for this family of resource allocation problems, both through general results that hold for all distributions and through stronger results that are specific to common families of distributions. We also distinguish between two versions of the problem — one in which we require that the allocation to each group be integer-valued, and one in which we allow resources to be allocated fractionally (or probabilistically, which yields equivalent results in expectation).
In the case of integer-valued resource allocation, we show through a constructive proof that the PoF can be unboundedly large. If we allow resources to be allocated fractionally, then we show that the PoF is bounded above by $1/\alpha$ for arbitrary distributions and $\alpha > 0$. In the case where $\alpha = 0$, the PoF can be unboundedly large for fractional allocations as well.
We then show that much stronger upper bounds can be obtained for large families of distributions that are often used to model levels of demand. First we show that certain families of natural distributions have PoF equal to 1, the smallest possible bound: for these distributions, there is no trade-off between fairness and maximum utilization. We show that distributions with this property include Exponential and Weibull distributions. We then consider the family of Power Law distributions; we show that these distributions have a PoF that can be strictly greater than 1, but is always bounded above by a universal constant. Table 3 in Appendix A contains a high-level summary of results explored in this paper.
Overall, our results reveal a rich picture in the trade-off between fairness and utilization for this type of allocation problem, through the different families of bounds for the price of fairness — ranging from distributions where fairness can be achieved while maximizing utilization, to those in which the gap is bounded by a constant, to others for which the gap can become large. The techniques used for the analysis suggest further opportunities for reasoning about the ways in which balancing the probability of service to different groups can both constrain and also be compatible with other objectives.
2 Motivating example
To motivate the problem and to make some of the types of calculations clearer, we start with the following example. Suppose there is a remote stretch of coastline with two very small hamlets, $A$ and $B$, each with a small set of houses. This stretch of coastline is prone to severe localized storms that lead to power outages. Each hamlet has a slightly different, known probability distribution over storms. In any particular week, the probability distribution over the number of houses impacted by power outages in each hamlet is as follows:
In our example, assume that each week’s power outage is independent of other weeks, and the hamlets are far enough apart that the power outages in one hamlet aren’t correlated with the power outages in the other. A regional planning committee has at its disposal 2 generators, each of which can restore power to 1 house. They are trying to decide how to allocate these generators across the two hamlets. The generators cannot be transferred from one hamlet to another after a storm strikes. If there are $v$ generators allocated to a hamlet and $c$ houses in need of generators, the number of houses that receive generators is $\min(c, v)$.
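The regional planning committee first decides that it wishes to maximize utilization: the expected number of houses in need of generators that receive them. There are three options for allocations, $(2, 0)$, $(1, 1)$, and $(0, 2)$, where $(v_A, v_B)$ means that hamlet $i$ gets $v_i$ generators. Formally, writing $c_i$ for the number of houses in hamlet $i$ that need generators, utilization can be written as:
$$U(v_A, v_B) = \sum_{i \in \{A, B\}} \mathbb{E}\left[\min(c_i, v_i)\right]$$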
The utilizations across these three allocations are given in Table 1. The allocation that maximizes utilization is $(0, 2)$.
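The utilization calculation for the three integer allocations can be sketched in a few lines. This is an illustration only: the two demand distributions below are hypothetical stand-ins (they are not the values from the paper's table), and the names `utilization`, `pmf_a`, and `pmf_b` are chosen here for convenience.

```python
def utilization(allocation, pmfs):
    """Expected number of served houses: sum over hamlets of E[min(c, v)]."""
    return sum(
        sum(p * min(c, v) for c, p in pmf.items())
        for v, pmf in zip(allocation, pmfs)
    )

# Hypothetical demand distributions (NOT the values from the paper's table):
# pmf[c] is the probability that c houses lose power in a given week.
pmf_a = {0: 0.6, 1: 0.3, 2: 0.1}
pmf_b = {0: 0.2, 1: 0.3, 2: 0.5}

generators = 2
# Every way to split the generators: (hamlet A, hamlet B).
allocations = [(a, generators - a) for a in range(generators + 1)]
table = {alloc: utilization(alloc, (pmf_a, pmf_b)) for alloc in allocations}
best = max(table, key=table.get)

for alloc, value in table.items():
    print(alloc, round(value, 3))
print("max-utilization allocation:", best)
```

With these assumed numbers, the allocation that sends both generators to the hamlet with heavier demand has the highest utilization, which mirrors the qualitative situation described in the text.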
However, something bothers the regional committee: it feels fundamentally unfair that all of the houses that receive generators will be in hamlet B. As a way to formalize this, we could ask what fraction of houses in need of generators obtain them, on average, and aim to select an allocation that brings this proportion as close to equality as possible between the two hamlets. First, we derive the relevant probability for an arbitrary hamlet:
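We will assume that there is a total number $M$ of houses in the hamlet and that all houses within the same hamlet are interchangeable. The total number of houses in the hamlet that need generators is given by $c$, and the hamlet is allocated $v$ generators. Given that $c$ houses need generators, there is a probability $c/M$ that a randomly selected house will be one of those. Given that a house is in need of a generator, the probability it obtains one is equal to $\min(c, v)/c$.
By the definition of conditional probability,
$$P(\text{receives} \mid \text{needs}) = \frac{P(\text{receives and needs})}{P(\text{needs})}.$$
The denominator expands as
$$P(\text{needs}) = \sum_{c \geq 1} P(c) \cdot \frac{c}{M} = \frac{\mathbb{E}[c]}{M}.$$
Similarly, we can expand out the probability in the numerator:
$$P(\text{receives and needs}) = \sum_{c \geq 1} P(c) \cdot \frac{c}{M} \cdot \frac{\min(c, v)}{c} = \frac{1}{M} \sum_{c \geq 1} P(c) \min(c, v) = \frac{\mathbb{E}[\min(c, v)]}{M}.$$
The factors of $c$ cancel, so the term simplifies as shown. Note that when $c = 0$ no house needs a generator, which is why the sum starts from $c = 1$.
Combining these results tells us that:
$$P(\text{receives} \mid \text{needs}) = \frac{\mathbb{E}[\min(c, v)]}{\mathbb{E}[c]} = q_i(v_i)$$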
where in the last step we have defined the probability of receiving the resource, conditional on needing it, as $q_i(v_i)$, where $i$ indexes the hamlet that the house is present in. Note that this definition differs from Elzayn_2019 (Elzayn_2019): that definition calculated the probability of candidates receiving the resource weighted by each time period. By contrast, our definition is based on the per-person probability of receiving the resource, which we believe is a more natural definition. One goal might be to bring these probabilities across the hamlets to be as close as possible to each other.
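As a sanity check on the derivation, the sketch below computes the conditional probability in two ways: directly from the per-house probabilities, and from the ratio of expected served demand to expected demand. The demand distribution, the hamlet size `M`, and all function names are assumptions made for illustration.

```python
def prob_need(pmf, total_houses):
    """P(a randomly chosen house needs a generator) = sum_c P(c) * c / M."""
    return sum(p * c / total_houses for c, p in pmf.items())

def prob_need_and_receive(pmf, v, total_houses):
    """P(house needs and receives) = sum_{c>=1} P(c) * (c/M) * (min(c, v)/c)."""
    return sum(
        p * (c / total_houses) * (min(c, v) / c)
        for c, p in pmf.items()
        if c >= 1  # no house is in need when c = 0
    )

def q(pmf, v):
    """Ratio form: E[min(c, v)] / E[c]; the hamlet size M cancels out."""
    expected_served = sum(p * min(c, v) for c, p in pmf.items())
    expected_need = sum(p * c for c, p in pmf.items())
    return expected_served / expected_need

pmf = {0: 0.2, 1: 0.3, 2: 0.5}  # hypothetical demand distribution
M = 5                            # hypothetical number of houses in the hamlet
v = 1                            # generators allocated to the hamlet

conditional = prob_need_and_receive(pmf, v, M) / prob_need(pmf, M)
print(conditional, q(pmf, v))
```

Both routes give the same value, and changing `M` leaves it unchanged, which is why the number of houses does not appear in the final expression.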
Given these numbers, the regional committee could look at the utilization and fairness values of each allocation and make their own decision about how resources should be allocated. In particular, they might decide that they want the fractions for the two hamlets to be within a certain difference $\alpha$ of each other and pick the allocation with maximal utilization, subject to that constraint. It is this scenario that much of this paper will consider: comparing max-utilization with maximizing utilization subject to an $\alpha$-fairness constraint.
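The committee's procedure (pick the highest-utilization allocation among those that satisfy the fairness constraint, then compare it to the unconstrained optimum) can be sketched by brute force for two hamlets. The demand distributions are hypothetical, and `price_of_fairness` is a name introduced here, not one from the paper.

```python
def expected_served(pmf, v):
    """E[min(c, v)] for a demand distribution given as {c: probability}."""
    return sum(p * min(c, v) for c, p in pmf.items())

def q(pmf, v):
    """Probability that a house in need receives a generator."""
    return expected_served(pmf, v) / sum(p * c for c, p in pmf.items())

def price_of_fairness(pmf_a, pmf_b, generators, alpha):
    """Return (best, best_fair, PoF) over integer allocations to two hamlets."""
    allocations = [(a, generators - a) for a in range(generators + 1)]

    def utilization(alloc):
        return expected_served(pmf_a, alloc[0]) + expected_served(pmf_b, alloc[1])

    def gap(alloc):
        return abs(q(pmf_a, alloc[0]) - q(pmf_b, alloc[1]))

    fair = [alloc for alloc in allocations if gap(alloc) <= alpha]
    if not fair:
        return None  # no integer allocation meets the fairness constraint
    best = max(allocations, key=utilization)
    best_fair = max(fair, key=utilization)
    return best, best_fair, utilization(best) / utilization(best_fair)

# Hypothetical demand distributions, not the values from the paper's table.
pmf_a = {0: 0.6, 1: 0.3, 2: 0.1}
pmf_b = {0: 0.2, 1: 0.3, 2: 0.5}
result = price_of_fairness(pmf_a, pmf_b, generators=2, alpha=0.25)
print(result)
```

With these assumed numbers, a tight enough constraint leaves no feasible integer allocation at all (the function returns `None`), which hints at why integer and fractional allocation behave so differently.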
3 Applications and formal model
The purpose of this paper is not to provide guidance for which allocation decision-makers should choose, but rather to provide general bounds on when this procedure will result in a large trade-off of utilization and fairness, and when the trade-off will be smaller. One main departure from the illustrative example is that much of this paper will focus on cases with many candidates and many resources, which in the limit will be approximated by continuous probability distributions. We will discuss the contrasting cases of continuous and discrete probability distributions further in later sections.
3.1 Model detail
In this section, we take the motivating example and make it more abstract. We have a set of $N$ groups. Each group $i$ has a distribution $\mathcal{C}_i$ over the number of candidates $c_i$ that are present during a particular unit of time. We assume that
and that each time period’s number of candidates is independently and identically distributed. The distribution could be discrete or continuous. As mentioned previously, much of this paper will focus on the continuous probability distribution case, but the results obtained largely do not depend on this factor.
In the model, we assume we have a number of resources $R$ that we can allocate across these groups. We assume that allocations must be selected before the number of candidates is realized each time period. If $v_i$ resources and $c_i$ candidates are present in a given group, then $\min(c_i, v_i)$ candidates will receive the resource. If $v_i > c_i$, some resources will go unused, and if $v_i < c_i$, some candidates will go without resources. We assume that the only distinguishing characteristic candidates have is which group they come from: otherwise, they are interchangeable.
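We have two different objective functions: utilization and fairness. Utilization is the expected number of candidates who receive the resource,
$$\sum_{i=1}^{N} \mathbb{E}_{c_i \sim \mathcal{C}_i}\left[\min(c_i, v_i)\right],$$
and fairness is measured by the largest gap between groups in the probability that a candidate in need receives the resource,
$$\max_{i, j} \left| q_i(v_i) - q_j(v_j) \right|, \qquad q_i(v_i) = \frac{\mathbb{E}_{c_i \sim \mathcal{C}_i}\left[\min(c_i, v_i)\right]}{\mathbb{E}_{c_i \sim \mathcal{C}_i}\left[c_i\right]}.$$
The fairness term bounds the difference in the probability of a candidate receiving a resource between groups. We say an allocation is $\alpha$-fair if the fairness objective above has value at most $\alpha$, so all fractions are within $\alpha$ of each other.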
If there exists an allocation that simultaneously maximizes utilization and is fair, then the PoF is 1 and there is no tension between our two objectives. Otherwise, the PoF is greater than 1 and we will need to trade off between these objectives in order to make a decision about which allocation to choose. We will use the standard notation of having $f$ represent the probability density function (or probability mass function) and $F$ represent the cumulative distribution function of a distribution.
There are a few assumptions in this model that are important to recognize. To start, we require that any allocation always use its entire budget $R$. An $\alpha$-fair allocation might prefer to use less than its entire budget in order to achieve fairness: this is a case that our model specifically disallows. Secondly, our definition of fairness revolves around the maximum difference in probability of receiving the resource. A different metric could have used the average difference in probability, for example. While these (and other) different models might also be valid and interesting avenues to explore, the rest of this paper will use the assumptions and model listed above.
3.2 Application areas
Despite the above motivating example, the focus of this paper is not to model any one specific application such as disaster relief aid. Instead, we view this model as representing the core features of allocation problems in a variety of contexts. A few of these are listed below. We note that in each case, the potential application area will have key features that are not already captured in the above model. This is intentional because the goal of this paper is to provide broad results for a broad class of problems, rather than model more precisely a particular application area. Both Elzayn_2019 (Elzayn_2019) and ensign18a (ensign18a) use related models in the contexts of allocating police officers to districts with differing crime rates. While this is also an area of application for this model, it raises additional issues that we do not model here, such as the possibility that the presence of police officers will have an impact on the rate of crime.
Doctors: As described in the introduction, one example could be allocating doctors across regions with different probabilities of ill patients. Again, it is worth noting that the presence of doctors might influence the level of illnesses.
Blood banks: Donated blood needs to be allocated across different hospitals with potentially different needs for blood. This area has the added complication that blood donations are not interchangeable: candidates can only receive a certain subset of the available types of blood.
Schools: A public school district might try to allocate certain resources (teachers, computers, specialized classes) across schools of different size, and therefore different distributions of need.
4 General bounds on Price of Fairness
One natural question we might ask is, “What is the maximum amount of utilization we give up by requiring fairness?” This amount depends on the set of probability distributions $\{\mathcal{C}_i\}$ and our budget $R$. If we have these quantities, we can directly calculate the PoF by first calculating the unconstrained max-utilization, then the $\alpha$-fair max-utilization. Elzayn_2019 (Elzayn_2019) provides algorithms to calculate both of these quantities so long as the allocations are integers. However, in many cases we might not know the specific $\{\mathcal{C}_i\}$ in advance, but we might still want to have a bound on the PoF. In this section, we explore general bounds on the PoF when we do not know the candidate distributions in advance.
4.1 Discrete resource allocation has unbounded PoF
Suppose we allocate resources in integer units: for example, one generator at a time. In this formulation, we find that the PoF is unbounded: achieving fairness could require giving up arbitrarily large amounts of utilization. This proof is presented in detail in the Appendix; a rough sketch is below.
Suppose we are given a desired $\alpha$ and a maximum PoF we are willing to accept. (We require $\alpha < 1$ because if $\alpha = 1$ the fairness constraint has no effect, so the PoF will always be 1.) Then, it is possible to construct a set of candidate distributions and a level of resources such that the PoF exceeds this maximum. The proof involves creating two different candidate distributions with parameters depending on $\alpha$ and the maximum PoF, and then showing that the PoF of allocating resources across those groups always exceeds the maximum.
4.2 PoF bounded by $1/\alpha$ under fractional allocation
A critical reader might wonder about the PoF if we were allowed to allocate resources fractionally. Such allocations would offer more flexibility and might allow for lower PoF. In the motivating example of allocating generators across hamlets, for example, this might correspond to allocating 0.7 generators to Hamlet A and 1.3 generators to Hamlet B. For divisible resources (such as potable water stores), this type of allocation makes intuitive sense. For indivisible resources, like generators, we can view the allocation as probabilistic: an allocation of $(0.7, 1.3)$, for example, could be viewed as the allocation $(1, 1)$ with probability 0.7 and the allocation $(0, 2)$ with probability 0.3.
The expected utilization is equal to 0.7 times the utilization of $(1, 1)$ plus 0.3 times the utilization of $(0, 2)$. Both the “deterministic, fractional” and “probabilistic, integer-valued” interpretations result in the same expected utilization. This fact is proved in the Appendix. Given this result, we will switch between these two interpretations (probabilistic and deterministic) depending on which makes more sense for the problem at hand.
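The equivalence of the two interpretations can be checked numerically for a single group. The sketch below assumes integer-valued demand, so that $\min(c, v)$ is linear in $v$ between consecutive integers; the demand distribution and function names are hypothetical.

```python
import math

def expected_served(pmf, v):
    """E[min(c, v)] for integer-valued demand c and a possibly fractional v."""
    return sum(p * min(c, v) for c, p in pmf.items())

def randomized_expected_served(pmf, v):
    """Round v down or up at random so that the expected allocation equals v."""
    low = math.floor(v)
    frac = v - low  # probability of rounding up
    return (1 - frac) * expected_served(pmf, low) + frac * expected_served(pmf, low + 1)

pmf = {0: 0.2, 1: 0.3, 2: 0.5}  # hypothetical integer-valued demand
for v in (0.7, 1.3):
    print(v, expected_served(pmf, v), randomized_expected_served(pmf, v))
```

For an allocation across several groups, the roundings would need to be coupled so that every realized integer allocation still sums to the budget; because expectation is linear, the per-group check above is what determines the expected utilization.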
Given the ability to allocate resources fractionally, we will find that the PoF is no longer unbounded. In particular, the PoF is upper bounded by $1/\alpha$. Note that this proof relies on an algorithm to calculate a max-utilizing allocation in the fractional case, as well as other related proofs, all present in this paper’s Appendix.
Given a set of candidate distributions $\{\mathcal{C}_i\}$, a level of resources $R$, and the ability to allocate resources fractionally, it is possible to find an $\alpha$-fair allocation with PoF at most $1/\alpha$, though this allocation may not use all of the resources.
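First, we find a max-utilization allocation $\{v_i\}$. Next, we divide the groups into two categories:
$$A = \{i : q_i(v_i) \leq \alpha\}, \qquad B = \{i : q_i(v_i) > \alpha\}.$$
If $B = \emptyset$, then the max-utilization allocation is already $\alpha$-fair and PoF = 1. Otherwise, modify the allocation for each $i \in B$ to some $v_i' \leq v_i$ so that $q_i(v_i') = \alpha$ exactly. We are able to do this because we know that $q_i$ is continuous and increases monotonically from 0 to 1.
This new allocation is $\alpha$-fair: every group now has $0 \leq q_i \leq \alpha$, so $|q_i - q_j| \leq \alpha$ for all $i, j$. It uses at most the same amount of resources that the optimal allocation used, so it is achievable. The only case where it uses exactly the same amount of resources is when $B$ is the empty set. For $i \in B$, we know that:
$$\mathbb{E}\left[\min(c_i, v_i')\right] = \alpha \cdot \mathbb{E}[c_i] \geq \alpha \cdot \mathbb{E}\left[\min(c_i, v_i)\right].$$
This tells us that the utilization of the groups in $B$ is greater than or equal to $\alpha$ multiplied by their utilization under the unconstrained maximum-utilization allocation. The utilization of the groups in $A$ is identical to their unconstrained utilization. Then
$$\sum_{i \in A} \mathbb{E}\left[\min(c_i, v_i)\right] + \sum_{i \in B} \mathbb{E}\left[\min(c_i, v_i')\right] \geq \sum_{i \in A} \mathbb{E}\left[\min(c_i, v_i)\right] + \alpha \sum_{i \in B} \mathbb{E}\left[\min(c_i, v_i)\right] \geq \alpha \sum_{i \in A} \mathbb{E}\left[\min(c_i, v_i)\right] + \alpha \sum_{i \in B} \mathbb{E}\left[\min(c_i, v_i)\right] = \alpha \sum_{i} \mathbb{E}\left[\min(c_i, v_i)\right]$$
where the inequality in the second to last step comes from the fact that $A$ could be the empty set. ∎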
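The capping step of this proof can be sketched numerically. The sketch assumes that the two categories are the groups with $q_i \leq \alpha$ and the groups with $q_i > \alpha$, and it uses two hypothetical groups (one Exponential, one Uniform) chosen because their $q_i$ has a closed-form inverse. All class and function names are illustrative, and the final step of spending the leftover budget is not implemented.

```python
import math

class Exponential:
    """Demand ~ Exponential(rate): q(v) = 1 - exp(-rate * v)."""
    def __init__(self, rate):
        self.rate, self.mean = rate, 1 / rate
    def q(self, v):
        return 1 - math.exp(-self.rate * v)
    def inv_q(self, p):
        return -math.log(1 - p) / self.rate
    def inv_survival(self, mu):  # v such that 1 - F(v) = mu
        return -math.log(mu) / self.rate

class Uniform:
    """Demand ~ Uniform(0, b): q(v) = 1 - (1 - v/b)^2 for v <= b."""
    def __init__(self, b):
        self.b, self.mean = b, b / 2
    def q(self, v):
        return 1 - (1 - min(v / self.b, 1.0)) ** 2
    def inv_q(self, p):
        return self.b * (1 - math.sqrt(1 - p))
    def inv_survival(self, mu):
        return self.b * (1 - mu)

def max_utilization_allocation(groups, budget, iters=200):
    """Bisect on the common marginal value mu = 1 - F_i(v_i) until the budget is met."""
    lo, hi = 1e-12, 1.0
    for _ in range(iters):
        mu = (lo + hi) / 2
        if sum(g.inv_survival(mu) for g in groups) > budget:
            lo = mu  # allocating too much: raise the marginal threshold
        else:
            hi = mu
    return [g.inv_survival(hi) for g in groups]

def utilization(groups, alloc):
    return sum(g.mean * g.q(v) for g, v in zip(groups, alloc))

def cap_to_alpha(groups, alloc, alpha):
    """Groups with q_i > alpha are reduced until q_i = alpha; others are unchanged."""
    return [g.inv_q(alpha) if g.q(v) > alpha else v for g, v in zip(groups, alloc)]

groups = [Exponential(1.0), Uniform(10.0)]  # hypothetical groups
budget, alpha = 1.0, 0.1
v_opt = max_utilization_allocation(groups, budget)
v_fair = cap_to_alpha(groups, v_opt, alpha)
qs = [g.q(v) for g, v in zip(groups, v_fair)]
ratio = utilization(groups, v_opt) / utilization(groups, v_fair)
print(v_opt, v_fair, qs, ratio)
```

In this example only the second group is capped, the resulting probabilities differ by less than $\alpha$, and the utilization ratio stays well below $1/\alpha$. The capped allocation uses less than the full budget, which is the gap the supporting lemma is meant to close.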
Note that we have required that all allocations use all of the resources, so we will need to convert the allocation calculated above into one with $\sum_i v_i = R$. The supporting lemma below (proved in the Appendix) provides this final step.
If there exists a fractional allocation $\{v_i\}$ over candidate distributions with $\sum_i v_i < R$ that is $\alpha$-fair, there also exists an allocation $\{v_i'\}$ that is $\alpha$-fair and has utilization at least as large as that of $\{v_i\}$, but additionally has the property that $\sum_i v_i' = R$.
Taken together, these two lemmas show that, for every optimal allocation, we can find an allocation that is $\alpha$-fair, uses all of the resources $R$, and has PoF at most $1/\alpha$. This tells us that the PoF for the max-utilization $\alpha$-fair allocation is at most $1/\alpha$.
4.3 PoF unbounded for $\alpha = 0$ under fractional allocation
However, this bound is undefined for $\alpha = 0$. The proof below shows that we can achieve arbitrarily high PoF with only 2 groups, even when we increase the budget by a constant factor in the $\alpha = 0$ case.
Suppose that in the case requiring allocation, the budget is multiplied by . For any , we can create a set of groups such that requiring fairness involves a PoF , assuming fractional allocation of resources.
We have 2 groups, one with distribution and one with distribution , described below:
Set , , , and . Finally, set .
Because , candidates are more likely to be present in Group 1, so the optimal allocation is .
The fair allocation has . In the case that , the probabilities can be calculated as follows:
As stated above, this equation only holds for . If , always. However, we must have in the fair allocation: we have insufficient resources to have . This implies that:
where we have incorporated the fact that the budget is times larger. The solution to this system of equations is:
This gives us a PoF as follows:
where we have used the fact that . Continuing to simplify gives us:
Using the facts that, and , the bound becomes:
as desired. ∎
Note that in the proof above is left as a free parameter. In particular, it could be set to something decreasing in , such as , which would create an example with tail probabilities decreasing in , but still PoF greater than .
5 PoF equal to 1 for certain families of distributions
The previous section showed that with fractional allocation, the upper bound on the Price of Fairness is $1/\alpha$. This could still be a fairly large price to pay: if the desired level of fairness is $\alpha = 0.1$, we might be obliged to reduce utilization by 90% in order to achieve the desired level of equity. Depending on the context, this could be a very high price. However, the upper bound obtained in the previous section was derived without any reliance on the characteristics of the candidate distributions $\{\mathcal{C}_i\}$. In this section, we will show that for certain reasonable distributions, the Price of Fairness is much lower. In some cases, the max-utilization allocation is already fair!
5.1 Illustrative example for exponential distribution
First, we will work out an example illustrating this phenomenon for $N = 2$ groups, both with exponential distributions. As proved in the Appendix, there exists only one max-utilizing allocation, which is the one that has $F_1(v_1) = F_2(v_2)$, with $v_1 + v_2 = R$. This implies:
This allocation is max-utilizing. We will also show that it is fair, which means that PoF = 1. To do this, we will calculate $q_i(v_i)$ for both groups.
This demonstrates that, for the given scenario, the max-utilizing allocation already satisfies fairness. This result is true more broadly than just in this illustrative example. In the next section, we will describe a property of probability distributions such that the max-utilizing allocation is also fair.
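The two-group exponential example can be checked numerically. The sketch assumes the max-utilizing allocation equalizes the CDFs across groups, which follows from setting the marginal utilizations $1 - F_i(v_i)$ equal. The rates, the budget, and the function names are hypothetical, and a coarse grid search is included only as an independent check.

```python
import math

def exponential_max_util(rates, budget):
    """Equalize F_i(v_i) = 1 - exp(-rate_i * v_i), i.e. make rate_i * v_i constant."""
    t = budget / sum(1 / r for r in rates)
    return [t / r for r in rates]

rates = [0.5, 2.0]  # hypothetical rate parameters
budget = 3.0        # hypothetical budget
alloc = exponential_max_util(rates, budget)

# For an exponential, E[min(c, v)] = (1 - exp(-rate * v)) / rate and E[c] = 1 / rate,
# so q(v) = 1 - exp(-rate * v).
q_values = [1 - math.exp(-r * v) for r, v in zip(rates, alloc)]

def utilization(v1):
    """Total utilization when group 1 gets v1 and group 2 gets the rest."""
    shares = (v1, budget - v1)
    return sum((1 - math.exp(-r * v)) / r for r, v in zip(rates, shares))

# Independent check: grid search over the share given to group 1.
grid = [k * budget / 3000 for k in range(3001)]
best_v1 = max(grid, key=utilization)
print(alloc, q_values, best_v1)
```

The closed-form allocation matches the grid-search maximizer, and both groups end up with the same probability of service, so no utilization is lost by requiring fairness in this example.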
5.2 Proof of general property
Consider a set of continuous candidate distributions with and . Then, suppose the set of candidate distributions has the following property:
Then under fractional allocation of resources, the max-utilization allocation is already fair. In other words,
Intuitively, this property means that the CDFs of all of the candidate distributions are versions of the same function, scaled by the ratio of their expected values. If this property holds, it will turn out mathematically that the max-utilizing allocation is already fair.
First, we can verify that the exponential distribution satisfies these properties. The exponential distribution has the given expectation and CDF:
Substitution shows that the premise of the lemmas above holds:
Next, we will explore why this property of probability distributions leads to a PoF of 1.
As shown in the Appendix, the max-utilization allocation has the property that
only if . In this case, .
We assume , so . We additionally require , which tells us that there exists only one max-utilizing allocation. Next, we examine the fairness constraint and rewrite it as:
We will find it convenient to rewrite this as:
For notational convenience, we denote . We will prove this theorem by considering subterms of each side independently and proving that they are equal.
where the first step comes from the fact that ∎
The max-utilizing allocation has the property that . Having the additional property that implies that . Because the statement of the above lemma holds , it also holds for , which implies that
Next, we consider the second subterm in the equality:
What is written out in terms of the CDF of a distribution? We will consider as a different probability distribution. It has PDF and CDF as shown below:
Then, we can rewrite one of the terms above as:
We will use this fact to evaluate the following term, substituting as the input :
where we have used the assumption of equality in the CDFs in the last step. Next, we note that
By the reverse chain rule,
, so if we denote the anti-derivative of by , then
This allows us to rewrite the item above as
This gives us a value for the right-hand side of the equality we are trying to show. For the left-hand side, we can use results of our previous analysis of to find that:
So the two terms are equivalent. ∎
Taken together, these two lemmas tell us that
so any max-utilization allocation is already fair.
5.3 Distributions that satisfy the given premise
We have previously shown that exponential distributions satisfy the given premise, which is perhaps not very surprising. Exponential distributions have certain nice properties: they are memoryless, for example, and are not heavy-tailed. At first, it might seem reasonable to guess that this phenomenon of PoF equal to 1 is only present in exponential distributions.
However, there exist further families of distributions that also satisfy the premise above and therefore have PoF equal to 1; these include Weibull distributions that share the same shape parameter. They have expectation and CDF as given below:
Again, substitution shows that the premise of the above lemmas holds:
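The Weibull case can also be checked numerically. The sketch below reads the premise as "the CDF evaluated at a multiple of the mean is the same function for every group", which is an interpretation of the intuitive description given earlier rather than the paper's exact statement. The shape, scales, budget, and function names are hypothetical; the integral uses a simple midpoint rule.

```python
import math

def weibull_cdf(x, scale, shape):
    return 1 - math.exp(-((x / scale) ** shape))

def weibull_mean(scale, shape):
    return scale * math.gamma(1 + 1 / shape)

def q(v, scale, shape, steps=20000):
    """E[min(c, v)] / E[c], with E[min(c, v)] = integral from 0 to v of (1 - F(t)) dt."""
    h = v / steps
    served = h * sum(
        1 - weibull_cdf((k + 0.5) * h, scale, shape) for k in range(steps)
    )
    return served / weibull_mean(scale, shape)

shape = 1.5          # hypothetical shape parameter shared by both groups
scales = [1.0, 4.0]  # hypothetical scale parameters
budget = 5.0

# Premise check: F_i(x * E_i) does not depend on the group's scale.
means = [weibull_mean(s, shape) for s in scales]
scaled_cdfs = [
    [weibull_cdf(x * m, s, shape) for x in (0.25, 1.0, 2.0)]
    for s, m in zip(scales, means)
]

# With a shared shape, equalizing F_i(v_i) means v_i is proportional to scale_i.
alloc = [budget * s / sum(scales) for s in scales]
cdf_at_alloc = [weibull_cdf(v, s, shape) for v, s in zip(alloc, scales)]
q_values = [q(v, s, shape) for v, s in zip(alloc, scales)]
print(scaled_cdfs, cdf_at_alloc, q_values)
```

Under these assumptions the allocation that equalizes the CDFs also equalizes the probability of service, consistent with a PoF of 1 for Weibull distributions that share a shape parameter.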