More than of Singapore citizens and permanent residents live in public housing projects; these apartments are sold in a large-scale public market centrally managed by a government body called the Housing and Development Board (HDB). In addition to providing a public good (affordable apartments in a small country with little real estate), HDB estates serve a role in the social integration of Singapore’s diverse ethnic groups (Chinese, Malay and Indian/Others). As per the Ethnic Integration Policy introduced in 1989 Parliament of Singapore. Parliament Debates: Official Report. (1989), every public housing development must hold a certain percentage of every major ethnic group, which is somewhat proportional to the true percentages of these populations; for example, since 5 March 2010, every HDB housing block is required to hold no more than Chinese, Malay, and Indian/Others Housing and Development Board, Singapore (2010); Deng et al. (2013). Ethnic quotas ensure a diverse population composition at every estate, reflecting that of the general Singaporean population, and preventing the de-facto formation of segregated ethnic communities in public housing estates. HDB uses a lottery mechanism to allocate new developments: all applicants who apply for a particular development pick their flats in random order; however, these ethnicity constraints introduce some peculiarities. For example, consider an applicant of Chinese ethnicity applying for an apartment block with flats, up to of which may be assigned to ethnically Chinese applicants, and at most of which can be assigned to ethnically Malay applicants. Assume that is th in line to select an apartment; will she get a chance to pick one? If at least Chinese applicants were allowed to choose a flat before , the Chinese ethnic quota for the block will have been filled and applicant will no longer be eligible for the block. 
On the other hand, suppose that is th in line to select an apartment; if Malay applicants end up before in the lottery, then of them will be rejected, and will have a spot. (While this example is, of course, highly stylized, the effects it describes are quite real: one often hears stories of young couples who arrive at the HDB office to select a flat, only to be notified that their ethnic quota had just been filled.)
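The two scenarios in this example can be replayed with a small simulation. The sketch below is only illustrative: the function name `run_lottery`, the block size of 10 flats, and the quotas of 8 and 3 are hypothetical values chosen for the illustration, not figures from HDB, and the picking order is fixed by hand rather than drawn at random.

```python
def run_lottery(order, quotas, num_flats):
    """Serve applicants in the given order; each one takes a flat if a flat
    is left and the quota of her ethnic group in the block is not yet filled."""
    taken = {group: 0 for group in quotas}
    housed = []
    for applicant, group in order:
        if len(housed) == num_flats:
            break  # the block is full
        if taken[group] < quotas[group]:
            taken[group] += 1
            housed.append(applicant)
        # otherwise the applicant is turned away and the next one is served
    return housed


# Hypothetical block: 10 flats, at most 8 Chinese and 3 Malay households.
quotas = {"Chinese": 8, "Malay": 3}

# Scenario 1: eight Chinese applicants and one Malay applicant precede "x".
order_1 = [(f"c{i}", "Chinese") for i in range(8)]
order_1 += [("m0", "Malay"), ("x", "Chinese")]

# Scenario 2: five Malay and four Chinese applicants precede "x".
order_2 = [(f"m{i}", "Malay") for i in range(5)]
order_2 += [(f"c{i}", "Chinese") for i in range(4)]
order_2 += [("x", "Chinese")]

housed_1 = run_lottery(order_1, quotas, 10)  # Chinese quota fills before "x"
housed_2 = run_lottery(order_2, quotas, 10)  # two Malay applicants are rejected
```

In the first ordering the Chinese quota is exhausted before applicant "x" is served; in the second, the rejected Malay applicants leave flats free and "x" obtains one.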
As the example above shows, ethnic quotas add another layer of complexity to what is, at its foundation, a straightforward allocation problem. Indeed, the allocation of public housing is an economic problem similar to the classic assignment problem: a central planner (HDB) wishes to allocate goods (apartments) to agents (residents) in a manner satisfying certain economic criteria. Diversity, on the other hand, is a social goal external to the underlying economic domain; imposing it may result in reduced social welfare.
1.1. Our Contributions
We study the interplay between diversity and social welfare in the public housing market; we model this as an assignment problem with additional type-block constraints: agents are of multiple types and goods are divided into blocks. A limited number of goods in each block can be allocated to agents of each type; we call these upper bounds type-block capacities. This limitation results in several interesting outcomes. While the optimal assignment problem is well-known to be poly-time solvable, we show that imposing type-block constraints makes it computationally intractable (Section 3). However, we identify a -approximation algorithm (Section 3.1), as well as some agent utility models for which one can find optimal assignments with type-block constraints in polynomial time (Section 3.2). Next, we show that the potential utility loss from imposing type-block constraints, which we term the price of diversity, can be bounded by natural problem parameters (Section 4). Finally, we analyze the empirical price of diversity on simulated instances generated from real HDB construction data (Section 5).
1.2. Related work
One can think of public housing allocation as a bipartite matching problem Lovász and Plummer (2009), where agents are assigned apartments; an edge from an agent to an apartment is weighted with the utility agent will receive if she is allocated apartment . There is a rich literature on weighted bipartite matching problems (also known as assignment problems Munkres (1957)), and polynomial-time algorithms for the unconstrained version have long been known (e.g. Kuhn (1955)). Several generalizations and/or constrained versions have been examined; for example, the assignment problem with subset constraints studied by bauer2005subsetconst 2004 can be thought of as a special case of our problem, with a single block or a single type. If all agents of each type have identical utilities for all apartments in each block, and each type-block capacity is smaller than both the corresponding type and block sizes, then the problem reduces to a special case of the polynomial-time solvable capacitated -matching on a bipartite graph Ahn and Guha (2014). However, to the best of our knowledge, we are the first to study the effects of general diversity constraints on the hardness as well as the optimal objective value of the assignment problem.
Singapore’s public housing is our primary motivating domain; however, type-block constraints can naturally arise in many other settings related to assignment/allocation problems with no monetary transfers Hylland and Zeckhauser (1979); Zhou (1990). For example, consider the course allocation problem analyzed by Budish and Cantillon (2012); one might require that each course has students from different departments and impose maximal quotas to ensure this. In public school allocation Abdulkadiroğlu and Sönmez (2003); Abdulkadiroğlu et al. (2009); Pathak and Sönmez (2013), one might require that certain schools admit students from diverse neighborhoods to prevent de-facto segregation. Other examples include matching medical interns or residents to hospitals Roth (1984), allocating subsidized on-campus housing to students Abdulkadiroğlu and Sönmez (1998), appointing teachers at public schools in different regions as done by some non-profit organizations Featherstone (2015), or assigning first-year business school students to overseas programs Featherstone (2015). This line of work mainly explores the interaction between individual selfish-rational behavior and allocative efficiency (e.g. Pareto-optimality) of matching mechanisms, under either ordinal preferences or cardinal utilities, one-sided or two-sided (see, e.g. Bogomolnaia and Moulin (2001); Bhalgat et al. (2011); Bade (2016); Anshelevich et al. (2013) and references therein); we, on the other hand, focus on the impact of type-block constraints on welfare loss, when agents’ utilities are known to a central planner.
Another relevant strand of literature is that on the fair allocation of indivisible goods (see, e.g., Procaccia and Wang (2014); Caragiannis et al. (2016); Kurokawa et al. (2016); Barman et al. (2017); Barman and Murthy (2017) and references therein): fairness is usually quantified in terms of the utilities or preferences of agents for allocated items (e.g. proportionality, envy-freeness and the maximin share guarantee) but our contribution deals with a different notion of fairness: the proportionate representation of groups in the realized allocation, with no regard to agents’ utilities.
In a recent paper, Immorlica et al. (2017) study the efficiency of lottery mechanisms such as the ones used by HDB to allocate apartments; however, their work does not account for block ethnicity constraints. As we show both theoretically and empirically, these type-block constraints can have a significant effect on allocative efficiency.
1.3. The Singapore Public Housing Allocation System
The Singapore public housing system, managed by the HDB, provides low-cost apartments to Singapore citizens and permanent residents. Public housing is a dominant force in Singapore: as of 2017, approximately of apartments in Singapore are HDB flats Department of Statistics, Singapore (2017). New HDB flats are purchased directly from the government, which offers them at a heavily subsidized rate. New apartments are typically released at quarterly sales launches; these would normally consist of plans for several estates at various locations around Singapore: an estate normally consists of four or five blocks (each apartment block has approximately 100 apartments), sharing some communal facilities (e.g. a playground, a food court, a few shops etc.). Estates take between 3 and 5 years to complete, during which HDB publicly advertises calls to ballot for an apartment in the new estate. A household (say, a newly married couple looking for a new house) would normally ballot for a few estates (balloting is cheap: only S$10 per application Housing and Development Board, Singapore (2015)). HDB allocates apartments using a lottery: all applicants to a certain estate choose their flat in some random order; they are only allowed to select an apartment in a block such that their ethnic quota is not reached. We mention that there are a few complications here: first-time applicants and low-income families usually receive priority numbers in the lottery scheme; moreover, the same estate may have several balloting rounds in order to ensure that all apartments are allocated by the time of completion. However, the focus of this work is the welfare effects of using ethnic quotas rather than the specific intricacies of the HDB lottery mechanism. We must mention here the existing literature on the documentation of Singapore’s residential desegregation policies Chua (1991); Deng et al. (2013); Phang and Kim (2011) and the empirical evaluation of their impact on various socioeconomic factors Sim et al. (2003); Wong (2014); to the best of our knowledge, ours is the first formal approach towards this problem.
We first describe a formal model for the housing allocation problem with ethnicity quotas. Throughout the paper, given , we let be the set .
An instance of the Assignment with Type Constraints (AssignTC) problem is given by:
(i) a set of agents partitioned into types ,
(ii) a set of items/goods partitioned into blocks ,
(iii) a utility for each agent and each item ,
(iv) a capacity for all , indicating the upper bound on the number of agents of type allowed in .
We assume here that the inequality holds for all type-block pairs without loss of generality since it is not possible to assign more than agents of type in block by definition. In the Singapore public allocation problem, tenant households are the agents and apartments are the items; types correspond to ethnic groups (Chinese, Malay, Indian/Others) and blocks to actual apartment blocks in a sales launch. In general, such partitions could be based on any criterion such as gender, profession, geographical location, or suchlike. For our theoretical analysis, we consider the idealized scenario where we have a central planner who has access to the utilities of each agent for all items, and determines an assignment that maximizes social welfare under type-block constraints.
An assignment of items to agents can be represented by a -matrix where if and only if item is assigned to agent ; a feasible solution is an assignment in which each item is allocated to at most one agent, and each agent receives at most one item, respecting the type-block capacities defined in (iv). We define the objective value (or total utility) as the utilitarian social welfare, i.e. the sum of the utilities of all agents in an assignment. Clearly, this optimization problem can be formulated as the following integer linear program:
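A program of this shape can be written as below. The notation is an assumption made for this sketch, since the symbols are not fixed in the text here: $N$ and $M$ are the sets of agents and items, $N_p$ and $M_q$ are the $p$-th type and $q$-th block (out of $k$ types and $l$ blocks), $u(i,j)$ is the utility of agent $i$ for item $j$, $\lambda_{pq}$ is the type-block capacity, and $x_{ij}$ is the entry of the assignment matrix. The numbering is arranged so that (2) is the type-block constraint and (3)-(5) are the matching constraints, which is how Section 4 refers to them.

```latex
\begin{align}
\max \quad & \sum_{i \in N} \sum_{j \in M} x_{ij}\, u(i,j) \tag{1}\\
\text{s.t.} \quad
 & \sum_{i \in N_p} \sum_{j \in M_q} x_{ij} \le \lambda_{pq}
   && \forall p \in [k],\ \forall q \in [l] \tag{2}\\
 & \sum_{j \in M} x_{ij} \le 1 && \forall i \in N \tag{3}\\
 & \sum_{i \in N} x_{ij} \le 1 && \forall j \in M \tag{4}\\
 & x_{ij} \in \{0,1\} && \forall i \in N,\ \forall j \in M \tag{5}
\end{align}
```

Dropping constraint (2) leaves the classic assignment problem, so the type-block capacities are the only source of additional difficulty.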
3. The Complexity of the Assignment Problem with Type Constraints
Our first main result is that the decision problem we introduce in Section 2 is NP-complete. We prove this by describing a polynomial-time reduction from the NP-complete Bounded Color Matching problem Garey and Johnson (1979), defined as follows:
An instance of the Bounded Color Matching (BCMatching) problem is given by (i) a bipartite graph , where the set of edges is partitioned into subsets representing the different edge colors, (ii) a capacity for each color , (iii) a profit for each edge , and (iv) a positive integer . It is a ‘yes’-instance iff there exists a matching (i.e. a collection of pairwise non-adjacent edges) such that the sum of the profits of all edges in the matching is at least , and there are at most edges of color in it, i.e. and for all .
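The ‘yes’-instance condition amounts to three checks on a candidate edge set: no shared endpoints, no color over its capacity, and enough total profit. A minimal sketch of such a certificate check follows; the function name and the dictionary-based encoding of colors, profits, and capacities are hypothetical choices made for illustration.

```python
def is_bc_yes_certificate(matching, color, profit, capacity, threshold):
    """Check a candidate edge set against the BCMatching conditions.

    matching: iterable of edges (u, v) with u on the left side of the
    bipartite graph and v on the right side.
    color, profit: dicts mapping each edge to its color and its profit.
    capacity: dict mapping each color to its capacity.
    """
    matching = list(matching)
    left = [u for u, _ in matching]
    right = [v for _, v in matching]
    if len(set(left)) < len(left) or len(set(right)) < len(right):
        return False  # two chosen edges share an endpoint
    used = {}
    for edge in matching:
        used[color[edge]] = used.get(color[edge], 0) + 1
    if any(count > capacity[c] for c, count in used.items()):
        return False  # some color exceeds its capacity
    return sum(profit[edge] for edge in matching) >= threshold


# Tiny illustrative graph with three edges.
color = {("a", "x"): "red", ("a", "y"): "blue", ("b", "y"): "red"}
profit = {("a", "x"): 3, ("a", "y"): 2, ("b", "y"): 4}
candidate = [("a", "x"), ("b", "y")]
```

With a red capacity of 1, `candidate` is rejected because it uses two red edges; with a red capacity of 2 it is accepted for any threshold up to 7.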
The AssignTC problem is NP-complete.
That the problem is in NP is immediate: given an assignment, one can verify in poly-time that it satisfies the problem constraints and compute total social welfare. Given an instance of BCMatching, we construct an instance of the AssignTC problem as follows (see Example 3 for an illustration). Each edge is an agent, whose type is its color. Items in our construction are partitioned into two blocks: and . The items in block correspond to the vertices in : there is one item for each node . For every , we add items to , for a total of items. Thus, there is a total of items. Block accepts at most agents of type , whereas block has unlimited type-block capacity; in other words, and for all . Given , we define the utility function of agent as follows:
Here, is an arbitrarily large constant, e.g., . Finally, let .
We begin by showing that if the original BCMatching instance is a ‘yes’ instance, then so is our constructed AssignTC instance. Let be a valid matching whose value is at least ; let us construct an assignment of items to agents via as follows. Observe some node ; if then we assign the item to the agent ; the remaining agents of the form , with , are arbitrarily assigned to the items . If contains no edges originating in , then we arbitrarily choose edges originating in and assign the corresponding agents to the items . We now show that this indeed results in a valid assignment satisfying the type-block constraints.
First, by construction, every agent is assigned at most one item. Moreover, since is a matching, every item is assigned to at most one agent of the form ; hence, every item in is assigned to at most one agent by construction.
Let be the edges of color in . Since matching satisfies the capacity constraints of the BCMatching instance, we have for all ; in particular, the number of items in assigned to agents of type is no more than . Thus, the type-block constraints for are satisfied. On the other hand, the type-block constraints for are trivially satisfied. We conclude that our constructed assignment is indeed valid, and satisfies the type-block constraints.
Finally, we want to show that total social welfare exceeds the prescribed bound. Let us fix a node . By our construction, if the edge is in the matching , then agent is assigned the item for a utility of . Thus the total welfare of agents in equals , which is at least by choice of . In addition, for every , there are exactly agents assigned to items in for a total utility of . Summing over all , we have that the total utility derived by agents in is
Putting it all together, we have that the total utility obtained by our assignment is at least .
Next, we assume that our constructed AssignTC instance is a ‘yes’ instance, and show that the original BCMatching instance must also be a ‘yes’ instance. Let be a constrained assignment whose social welfare is at least . Let be the set of edges corresponding to agents assigned to items in ; we show that is a valid matching whose value is at least . First, for any , must assign the item to at most one agent . Next, since is greater than the total utility obtainable from assigning all items in , it must be the case that assigns all items to agents of the form , with , for every node ; thus, there can be at most one edge in that is incident on for every . Next, since satisfies the type-block constraints, we know that for every , there are at most agents from that are assigned items in ; thus, satisfies the capacity constraints. Finally, the utility extracted from the agents assigned to items in is exactly ; the total utility of the matching is at least , thus has a total profit of at least in the original BCMatching instance, and we are done. ∎
In Figure 1, the graph , with , , and , is an instance of the BCMatching problem; edge labels are profits. The associated instance of the AssignTC problem is defined by and , where , , and ; the utility of an agent for an item is equal to if there is no edge between them, to if the edge is dashed, and to the edge label otherwise.
3.1. A Polynomial-Time Constant Factor Approximation Algorithm
Having established that the AssignTC problem is computationally intractable in general, we next present an efficient constant-factor approximation algorithm: we construct an approximation-preserving reduction Orponen and Mannila (1987), in fact an S-reduction Crescenzi (1997), from this problem to the BCMatching problem (Definition 3), for which a polynomial-time approximation algorithm is known.
There exists a poly-time -approximation algorithm for the AssignTC problem.
Given an instance of the AssignTC problem, we define a complete bipartite graph whose nodes correspond to the sets of agents and items , and give the edge joining agent-node to item-node a profit equal to the utility for all , . We also give all edges joining agents of one type to items in one block the same color, so that there are colors indexed lexicographically by pairs ; let the capacity for color be . This produces, in time, an instance of BCMatching; the size of this instance is obviously polynomial in that of the original, and, by construction, there is a one-to-one correspondence between the sets of feasible solutions of the original and reduced instances with each corresponding pair having the same objective value (sum of edge-profits/utilities), so that the optimal values of the instances are also equal. We can now apply the polynomial-time -approximation algorithm introduced by Stamoulis (2014) for BCMatching on general weighted graphs. ∎
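The reduction in this proof is mechanical enough to write down directly. The sketch below builds the colored complete bipartite graph from an AssignTC instance; the function name and the list/dictionary encoding of the instance are hypothetical, and the approximation algorithm that would then be run on the output is not implemented here.

```python
def assign_tc_to_bcmatching(agent_type, item_block, utility, capacity):
    """Build the BCMatching instance described in the proof.

    agent_type[i]: type index of agent i.
    item_block[j]: block index of item j.
    utility[i][j]: utility of agent i for item j.
    capacity[(p, q)]: type-block capacity for type p and block q.
    Returns a list of colored, weighted edges and the color capacities.
    """
    edges = []
    for i, p in enumerate(agent_type):
        for j, q in enumerate(item_block):
            edges.append({
                "agent": i,
                "item": j,
                "profit": utility[i][j],  # edge profit equals the utility
                "color": (p, q),          # one color per type-block pair
            })
    color_capacity = dict(capacity)       # capacity of color (p, q)
    return edges, color_capacity


# Illustrative instance: 3 agents of 2 types, 3 items in 2 blocks.
agent_type = [0, 0, 1]
item_block = [0, 0, 1]
utility = [[5, 4, 1], [5, 4, 1], [1, 1, 2]]
capacity = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 1}
edges, color_capacity = assign_tc_to_bcmatching(
    agent_type, item_block, utility, capacity
)
```

Every agent-item pair becomes exactly one edge, so a matching in the output that respects the color capacities corresponds to a feasible assignment with the same total utility.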
Theorem 3.1 offers a -approximation to the AssignTC problem; whether a better poly-time approximation algorithm exists is left for future work.
3.2. Uniformity Breeds Simplicity: Polynomial-Time Special Cases
Our results thus far make no assumptions on agent utilities; as we now show, the AssignTC problem admits a poly-time algorithm under some assumptions on the utility model.
Definition (Type-uniformity and Block-uniformity).
A utility model is called type-uniform if all agents of the same type have the same utility for each item, i.e. for all and for all , there exists such that for all . A utility model is called block-uniform if all items in the same block offer the same utility to every agent; that is, for all and for all , there exists such that for all .
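Both conditions are easy to test on a concrete utility matrix. The following sketch assumes a hypothetical encoding in which `utility[i][j]` is the utility of agent `i` for item `j`, `agent_type[i]` is the type of agent `i`, and `item_block[j]` is the block of item `j`.

```python
def is_type_uniform(agent_type, utility):
    """True if all agents of the same type have identical utility rows."""
    first_row = {}
    for i, p in enumerate(agent_type):
        row = tuple(utility[i])
        if first_row.setdefault(p, row) != row:
            return False
    return True


def is_block_uniform(item_block, utility):
    """True if every agent values all items of the same block equally."""
    for row in utility:
        first_value = {}
        for j, q in enumerate(item_block):
            if first_value.setdefault(q, row[j]) != row[j]:
                return False
    return True


agent_type = [0, 0, 1]
item_block = [0, 0, 1]
by_type = [[5, 4, 1], [5, 4, 1], [1, 1, 2]]   # rows repeat within a type
by_block = [[3, 3, 1], [2, 2, 7], [1, 1, 2]]  # columns repeat within a block
```

Here `by_type` is type-uniform but not block-uniform, and `by_block` is block-uniform but not type-uniform.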
Type uniformity assumes a strong correlation between an agent’s type and her utility function. In the context of the HDB allocation problem, type uniformity implies that Singaporeans of the same ethnicity share the same preferences over apartments (perhaps due to cultural or socioeconomic factors). Cases that deal with uniform goods satisfy the block-uniformity assumption: e.g. students applying for spots in public schools or job applicants applying for multiple (identical) positions; in the HDB domain, block-uniformity captures purely location-based preferences, i.e. a tenant does not care which apartment she gets as long as it is in a specific block close to her workplace, family, or favorite public space.
The AssignTC problem can be solved in time under either a type-uniform or a block-uniform utility model.
We prove the result for a type-uniform utility model; the result for block-uniform utilities can be similarly derived. We propose an algorithm based on the Minimum-Cost Flow problem, which is known to be solvable in polynomial time. Recall that a flow network is a directed graph with a source node and a sink node , where each arc has a cost and a capacity representing the maximum amount that can flow on the arc; for convenience, we set and for all such that . Let us denote by and the matrices of costs and capacities respectively defined by and . A flow in the network is a function satisfying:
for all (capacity constraints),
(skew symmetry), and
for all (flow conservation).
The value of a flow is defined by and its cost is given by . The optimization problem can be formulated as follows. Given a value , find a flow that minimizes the cost subject to . This optimization problem that takes as input the graph , the matrices and , and the value , will be denoted by MinCostFlow hereafter; given an instance of the MinCostFlow problem, we let be the cost of the optimal flow for that instance.
Given an instance of AssignTC, we construct a flow network and matrices and as follows (see Figure 2 for an illustration). The node set is partitioned into layers: . is the agent type layer: there is one node for all agent types . is the type-block layer: it has a node for every type-block pair . Finally, is the item layer: there is one node for all items . The arcs in are as follows: for every in , there is an arc from to whose capacity is . Fixing , there is an arc from to every , where the capacity of is the quota for type in block (i.e. ). Finally, given , there is an arc from to iff ; in that case, we have . The costs associated with arcs from to (i.e. arcs of the form where ) are ; recall that is the utility that every agent of type assigns to item . All other arc costs are set to . We begin by proving a few technical lemmas on the above network.
Given a positive integer , there exists an optimal flow that is integer-valued since is integer-valued as well. Let be an integer-valued optimal flow, taken over all possible values of ; that is:
Finding the flow involves solving instances of MinCostFlow by definition; thus, one can find in polynomial time. Given as defined in (6), let be defined as follows: for every item , if for some , then we choose an arbitrary unassigned agent and set .
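The layered network and the flow-to-assignment step can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the procedure of the proof: instead of solving one MinCostFlow instance per flow value, it uses the standard successive-shortest-path method and stops as soon as the cheapest augmenting path no longer has negative cost, which is a swapped-in way of minimizing cost over all flow values. The function name and the input encoding are hypothetical.

```python
from collections import deque


def solve_type_uniform(type_sizes, blocks, capacity, utility):
    """type_sizes[p]: number of agents of type p; blocks[q]: item ids in
    block q; capacity[p][q]: type-block capacity; utility[p][j]: common
    utility of type-p agents for item j.
    Returns (welfare, {item: type receiving it})."""
    k, l = len(type_sizes), len(blocks)
    items = [j for block in blocks for j in block]
    source, sink = 0, 1
    type_node = {p: 2 + p for p in range(k)}
    pair_node = {(p, q): 2 + k + p * l + q for p in range(k) for q in range(l)}
    item_node = {j: 2 + k + k * l + pos for pos, j in enumerate(items)}
    # Each arc is [head, residual capacity, cost, index of reverse arc].
    graph = [[] for _ in range(2 + k + k * l + len(items))]

    def add_arc(a, b, cap, cost):
        graph[a].append([b, cap, cost, len(graph[b])])
        graph[b].append([a, 0, -cost, len(graph[a]) - 1])

    item_arcs = []  # (type, item, tail node, arc index) of pair-to-item arcs
    for p in range(k):
        add_arc(source, type_node[p], type_sizes[p], 0)
        for q in range(l):
            add_arc(type_node[p], pair_node[p, q], capacity[p][q], 0)
            for j in blocks[q]:
                tail = pair_node[p, q]
                item_arcs.append((p, j, tail, len(graph[tail])))
                add_arc(tail, item_node[j], 1, -utility[p][j])
    for j in items:
        add_arc(item_node[j], sink, 1, 0)

    welfare = 0
    while True:
        # Queue-based Bellman-Ford: arc costs may be negative.
        dist = [float("inf")] * len(graph)
        parent = [None] * len(graph)
        queued = [False] * len(graph)
        dist[source] = 0
        queue = deque([source])
        while queue:
            a = queue.popleft()
            queued[a] = False
            for idx, (b, cap, cost, _) in enumerate(graph[a]):
                if cap > 0 and dist[a] + cost < dist[b]:
                    dist[b], parent[b] = dist[a] + cost, (a, idx)
                    if not queued[b]:
                        queued[b] = True
                        queue.append(b)
        if parent[sink] is None or dist[sink] >= 0:
            break  # no augmenting path lowers the cost any further
        node = sink
        while node != source:  # push one unit along the path
            a, idx = parent[node]
            graph[a][idx][1] -= 1
            graph[node][graph[a][idx][3]][1] += 1
            node = a
        welfare -= dist[sink]
    # A saturated pair-to-item arc means the item goes to that type.
    item_type = {j: p for p, j, a, idx in item_arcs if graph[a][idx][1] == 0}
    return welfare, item_type


welfare, item_type = solve_type_uniform(
    type_sizes=[2, 1],
    blocks=[[0, 1], [2]],
    capacity=[[1, 1], [1, 1]],
    utility=[[5, 4, 1], [1, 1, 2]],
)
```

To obtain an assignment of agents, each item in `item_type` is handed to an arbitrary still-unassigned agent of the recorded type, as in the construction above.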
is a feasible solution of the AssignTC instance .
First, we assign at most one item to every agent by construction; next, let us show that each item is assigned to at most one agent. Since is a flow, we have due to flow conservation; note that the capacity of the arc is , thus at most one arc has . Finally, since item is assigned to an agent in iff , we conclude that item is assigned to at most one of the agents in .
Next, let us prove that assignment satisfies the type-block constraints; in other words, we need to show that:
Since is a flow, we have for every type-block pair due to flow conservation; moreover, we have by construction. As a consequence, we necessarily have for all . Since an item is matched with some agent if and only if we have , we conclude that (7) indeed holds. ∎
Now, let us establish a relation between the cost of and the utility of the feasible assignment .
The cost of the flow satisfies .
By construction, the cost of can only be induced by arcs from nodes in to nodes in , where the cost of all arcs of the form , with , is equal to (the negative of the uniform utility derived from item by members of ). In other words, the cost of can be written as follows:
As previously argued, we have that for all arcs ; moreover, iff item is assigned to some agent in . Therefore, we obtain:
where the second equality holds since all agents in have the same utility by assumption. ∎
Finally, we show that for every feasible solution to the AssignTC instance , there exists a flow with a matching cost.
Let be a feasible assignment for the AssignTC instance ; there exists some feasible flow such that . Moreover, we have .
Given a feasible assignment , we define as follows:
The function is indeed a flow: trivially verifies the skew symmetry condition by construction; next, we show that satisfies flow conservation. For all , the incoming flow to node from node is , and the outgoing flow to every is since is partitioned into ; hence flow is conserved. For a node , the incoming flow equals and an amount of flows to every node such that , thus flow is conserved. For a node such that , its incoming flow equals from every , for a total flow of , which equals its outgoing flow to . To conclude, satisfies flow conservation.
Now let us prove that satisfies the capacity constraints (i.e. for all arcs ). For all , we have since every agent is matched with at most one item. For all , we have since satisfies the type-block constraints. For all arcs , we have since item is matched with at most one of the agents in . For all , we have since item is matched with at most one of the agents in . Hence, satisfies the capacity constraints and is a valid flow. Note that we have:
Then, since is a feasible assignment of the AssignTC instance , we conclude that we have . We just need to prove that we have , and we are done. By definition of the flow network, only arcs of the form contribute to the cost and we have ; therefore, . Since (by definition of ) and for all agents (by hypothesis), we finally obtain . ∎
We are now ready to prove Theorem 3.2.
Proof of Theorem 3.2.
We begin by observing the flow as defined in (6), and the assignment derived from it. First, according to Lemma 3.2, is a feasible assignment of the AssignTC instance . Moreover, we have according to Lemma 3.2. Finally, for any feasible assignment of the AssignTC instance , there exists a flow such that ; furthermore, since , flow is a feasible solution of the MinCostFlow instance for some . Therefore, we have:
Thus, is an optimal solution of the AssignTC instance ; since can be computed in poly-time (Proposition 6), we are done. ∎
4. The Price of Diversity
We now turn to the allocative efficiency of the constrained assignment. As before, an instance of the AssignTC problem is given by a set of agents partitioned into types , a set of items partitioned into , a list of capacity values , and agent utilities for items given by . We denote the set of all assignments of items to agents satisfying only the matching constraints (3-5) of Section 2 by , and that of all assignments additionally satisfying the type-block constraints (2) by ; the corresponding optimal social welfares for any given utility matrix are:
Clearly, since ; we define the following natural measure of this welfare loss that lies in :
For any instance of the AssignTC problem, the Price of Diversity is given by:
The main result of this section is to establish an upper bound on that is independent of the utility model. Denote the ratio of a type-block capacity to the size of the corresponding block by:
For any instance of AssignTC, we have:
and the above upper bound is tight.
In general, the bound in Theorem 4 grows linearly in (e.g. if the capacities are fixed constants). However, type-block capacities are determined by a central planner in our model; a natural way of setting them is to fix the proportional capacities or quotas in advance, and then compute when block sizes become available: by committing to a fixed minimum type-block quota (i.e. for all ), the planner can ensure a of at most , regardless of the problem size and utility function. Higher values of reduce the upper bound on but also increase the capacity of a block for every ethnicity, potentially affecting the diversity objective adversely: it thus functions as a tunable tradeoff parameter between ethnic integration and worst-case welfare loss. In fact, in the Singapore allocation problem, the Ethnic Integration Policy fixes a universal percentage cap for each of the three ethnicities in all blocks; these percentages are set slightly higher than the actual respective population proportions: the current block quotas are for Chinese, for Malays and for Indian/Others Deng et al. (2013); plugging in these to the bound in Theorem 4, we have that the Singapore housing system has . This bound makes no assumptions on agent utilities; in other words, it holds under any utility model.
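On very small instances the price of diversity can be computed by exhaustive search. The sketch below makes two assumptions that are not spelled out in the text here: that the price of diversity is the ratio of the unconstrained optimal welfare to the constrained optimal welfare, and that the bound takes the form of the reciprocal of the smallest capacity-to-block-size ratio. The function name and the instance are hypothetical, and the search is exponential, so it is meant only for toy examples.

```python
def best_welfare(agent_type, item_block, utility, capacity=None):
    """Exhaustively search all assignments (each agent gets at most one
    item, each item goes to at most one agent). If `capacity` is given,
    only assignments respecting capacity[(type, block)] are considered."""
    n, m = len(agent_type), len(item_block)
    best = 0
    used, counts = set(), {}

    def search(i, total):
        nonlocal best
        if i == n:
            best = max(best, total)
            return
        search(i + 1, total)  # agent i stays unassigned
        for j in range(m):
            key = (agent_type[i], item_block[j])
            if j in used:
                continue
            if capacity is not None and counts.get(key, 0) >= capacity[key]:
                continue
            used.add(j)
            counts[key] = counts.get(key, 0) + 1
            search(i + 1, total + utility[i][j])
            used.remove(j)
            counts[key] -= 1

    search(0, 0)
    return best


# Two type-0 agents who only value block 0, and one indifferent type-1 agent.
agent_type = [0, 0, 1]
item_block = [0, 0, 1]
utility = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
capacity = {(0, 0): 1, (0, 1): 1, (1, 0): 2, (1, 1): 1}

opt = best_welfare(agent_type, item_block, utility)
opt_constrained = best_welfare(agent_type, item_block, utility, capacity)
price_of_diversity = opt / opt_constrained  # assumed definition of the ratio

block_size = {q: item_block.count(q) for q in set(item_block)}
min_ratio = min(cap / block_size[q] for (p, q), cap in capacity.items())
assumed_bound = 1 / min_ratio
```

In this instance only one of the two type-0 agents may enter block 0, so half of the unconstrained welfare is lost and the ratio meets the assumed bound with equality.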
The proof relies on the following lemma. Given an assignment , let denote the total utility of agents in under :
For any instance of AssignTC and any optimal unconstrained assignment , we have:
Based on the optimal assignment we can construct an assignment satisfying the type-block constraints, by ‘revoking’ items in from agents in whenever the type-block constraint is violated for . By revoking from agents with the smallest utilities, we ensure that at least proportion of the utility remains under for . Summing over blocks, we obtain:
Hence, since , we have:
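The revocation step used in this argument can be sketched as follows: within each type-block pair, keep only as many agents as the capacity allows, retaining those with the largest utilities. The function name and the dictionary encoding of the assignment (agent index mapped to item index) are hypothetical choices for the illustration.

```python
def revoke_to_feasible(assignment, agent_type, item_block, utility, capacity):
    """Turn an unconstrained assignment {agent: item} into one that respects
    capacity[(type, block)] by revoking items from the lowest-utility agents
    of every over-full type-block pair."""
    groups = {}
    for i, j in assignment.items():
        groups.setdefault((agent_type[i], item_block[j]), []).append(i)
    kept = {}
    for key, agents in groups.items():
        # Highest utilities first, so the smallest ones are the ones dropped.
        agents.sort(key=lambda i: utility[i][assignment[i]], reverse=True)
        for i in agents[:capacity[key]]:
            kept[i] = assignment[i]
    return kept


# Three type-0 agents all placed in block 0, whose capacity for type 0 is 2.
agent_type = [0, 0, 0]
item_block = [0, 0, 0]
utility = [[9, 0, 0], [0, 2, 0], [0, 0, 6]]
unconstrained = {0: 0, 1: 1, 2: 2}
feasible = revoke_to_feasible(
    unconstrained, agent_type, item_block, utility, {(0, 0): 2}
)
```

Agent 1, who has the smallest utility in the pair, loses her item, and the retained utility (15 out of 17) is at least the two-thirds share that the capacity-to-group-size ratio guarantees.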
We can now complete the proof of the theorem.
Proof of Theorem 4.
Since we have for all , Lemma 4 implies that:
Depending on the utility matrix , this upper bound can be tight whenever for some type-block pair in the set . We identify an agent utility matrix for which the bound holds with equality:
The optimal assignment without type-block constraints fully allocates the items in block