1 Introduction
Participatory budgeting is a democratic approach to the allocation of public funds. In the participatory budgeting paradigm, city governments fund public projects based on constituents’ votes. In contrast to budget committees, which operate behind closed doors, participatory budgeting promises to directly take the voices of the community into account. Since 2014, Paris has allocated more than €100 million per year using constituents’ votes. Many other cities around the globe — including Porto Alegre, New York City, Boston, Chicago, San Francisco, Lisbon, Madrid, Seoul, Chengdu, and Toronto — employ participatory budgeting (Cabannes, 2004, 2014; Aziz and Shah, 2020).
Typically, participatory budgeting is used at the district level. Each district of the city is allotted a budget proportional to its size. Constituents living in a given district vote on projects local to the district, such as park, road, or school improvements, using some version of approval voting. Then, the district’s budget is spent according to these votes. For instance, in Paris the participatory budget is split between 20 districts (a.k.a. arrondissements); constituents vote, and then each district runs a greedy algorithm to maximize the total social welfare (i.e., the total number of votes) of the funded projects. (More specifically, projects are selected in descending order of vote count until the budget runs out.)
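The greedy district-level rule described above can be sketched as follows. This is a minimal illustration rather than the procedure any city actually runs: the function name `greedy_by_votes`, the data layout, and the tie-breaking rule are hypothetical, and we assume that a project that no longer fits is skipped so that cheaper, lower-ranked projects may still be funded (the text only says that selection proceeds in descending order of votes until the budget runs out).

```python
from typing import Dict, List, Tuple


def greedy_by_votes(projects: Dict[str, Tuple[int, float]], budget: float) -> List[str]:
    """Fund projects in descending order of vote count while they still fit.

    `projects` maps a (hypothetical) project name to (votes, cost).
    Assumption: a project that does not fit is skipped, not a hard stop.
    """
    remaining = budget
    funded: List[str] = []
    # Sort by votes, highest first; break ties by name so the result is deterministic.
    ranked = sorted(projects.items(), key=lambda kv: (-kv[1][0], kv[0]))
    for name, (_votes, cost) in ranked:
        if cost <= remaining:
            funded.append(name)
            remaining -= cost
    return funded


example = {"park": (120, 40.0), "road": (90, 70.0), "school": (60, 30.0)}
selected = greedy_by_votes(example, budget=100.0)
```

In this toy instance the road project is skipped because it no longer fits after the park is funded, so the school project is funded instead.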
Having separate elections for each district leads to several problems. Foremost, projects that are not local to a single district cannot be accommodated. For this reason, Paris must run an additional election for citywide projects. However, this splits the available budget for participatory budgeting between district-level and citywide elections in an ad hoc manner, which is not informed by votes. (In 2016, this split in Paris was €64.3 million for district elections and €30 million for citywide elections (Cabannes, 2017).) Further, people may have interests in multiple districts, such as those who live and work in different districts. For this reason, Paris has to allow residents to choose the district in which they vote. Lastly, a project that only benefits voters at the edge of a district may receive a number of votes that is not proportional to the number of potential beneficiaries.
A simple solution to these problems is a single citywide election. However, such a voting scheme may result in unfair outcomes. For instance, if votes are aggregated to maximize social welfare (as is presently done in Paris at the district level), then it is possible that some districts have none of their preferred projects funded despite deserving a large proportion of the budget. Such outcomes are likely when some districts are much more populous than others, in which case projects local to small districts cannot gather sufficiently many votes. Ideally, we would like a system that balances the tradeoff between social welfare and fairness without an arbitrary, predetermined split between district-specific and citywide funding. This motivates our central research question:
How can we maximize social welfare in a way that is fair to all districts?
Intuitively, a solution that is fair to all districts should somehow represent each district’s constituents. One way to formalize this intuition is to stipulate that no district should be able to obtain higher utility by purchasing projects with its proportional share of the budget. In particular, each district should receive at least as much utility as it would have received had it held a district-level election with its proportional share of the budget. We call this guarantee district fairness. (Our notion of district fairness can be thought of as a form of individual rationality where every district is seen as an “individual.”) A district-fair allocation of funds always exists, since an outcome obtained by holding separate district elections is district-fair. We aim to find district-fair outcomes that maximize social welfare. Such an outcome will be a Pareto improvement on the status quo of district-level participatory budgeting, in the sense that no district’s welfare decreases.
Our Results.
In our model we think of (utilitarian) social welfare as induced by a given value assigned by each district to each project; our goal is to maximize the sum of these values over districts and selected projects. Note that this model captures the setting of approval votes, where each voter decides on a collection of projects to vote for; the social welfare of a district for a project would then be interpreted as the project’s overall number of approvals from voters in that district. This observation is important because some variant of approval voting is used in most real-world participatory budgeting elections, including in Paris.
We also assume that each district is endowed with an arbitrary fraction of the total budget. Clearly this captures, as a special case, the common setting where the endowment of each district is proportional to its size. Moreover, the reasoning behind the existence of district-fair outcomes immediately applies to the more general setting.
We first show that it is NP-hard to compute an allocation that is welfare-maximizing subject to district fairness. This result holds even for the case of approval votes and proportional budgets, and therefore the generality of our model only strengthens our positive (algorithmic) results without weakening the main negative (hardness) result. We also show that the natural linear program (LP) formulation of the problem has an unbounded integrality gap. Since participatory budgeting elections can be large (hundreds of projects are proposed and hundreds of thousands of votes are cast in Paris), computational complexity can become a problem in practice. Thus, we seek polynomial-time solutions with reasonable approximation guarantees.
There are several ways one might relax our problem or trade off its parameters against each other. In this work, we design polynomial-time algorithms that work when we relax or approximate some of the following: (1) the achieved social welfare; (2) the spent budget; (3) the fairness of the solution; and (4) the absence of randomization.
We first relax (4) by considering distributions over outcomes, a.k.a. “lotteries”. We show that using a multiplicative-weights-type algorithm, one can efficiently find a lottery that guarantees budget feasibility (ex post), optimum social welfare (ex post), and district-fairness in expectation up to an $\epsilon$ (ex ante). Since the fairness guarantee only holds in expectation, some districts may be underserved once the lottery is realized. However, since participatory budgeting typically happens repeatedly (e.g., annually), such districts could be compensated in the next election, for example by increasing their share of the budget in the next year.
We next consider what sort of deterministic guarantees are achievable. To this end, we show how to use techniques from submodular optimization to find an outcome that is district-fair “up to one project” and which achieves optimum social welfare, with the caveat that the outcome may need to spend more money than was originally budgeted. We also give a randomized algorithm with the same guarantees but which overshoots the budget by only a smaller fraction with high probability. Additionally, as a corollary of these results, we give both deterministic and randomized algorithms that achieve weaker utility and fairness guarantees but do not overspend the available budget.
Related Work.
The social choice literature on participatory budgeting has both studied the voting rules used in practice and designed original voting schemes. Goel et al. (2019) study knapsack voting, used for example in Madrid (Cabannes, 2014), where voters cannot approve more projects than fit into the budget constraint. Talmon and Faliszewski (2019) axiomatically study a variety of approval-based rules that maximize social welfare, both greedy and optimal ones.
The unit-cost case (where all projects have the same cost) is the best studied, under the name of multiwinner or committee elections (Faliszewski et al., 2017). For example, this setting models the election of a parliament. A main focus of that literature is the computational complexity of winner determination for various voting rules. More relevant for our purposes are fairness axioms used in this setting. The most prominent such axioms are variants of justified representation (Aziz et al., 2017). These axioms are formulated for approval votes, and require that arbitrary subgroups of the electorate be represented in the outcome if they are cohesive, in the sense that there are a sufficient number of projects that are approved by every member of the subgroup. Several voting rules are known to satisfy these conditions, including Phragmén’s rule and Thiele’s Proportional Approval Voting (Janson, 2016; Sánchez-Fernández et al., 2017; Brill et al., 2017; Aziz et al., 2018). By contrast, district-fairness gives guarantees to a specific selection of subgroups (i.e., disjoint districts) but does not require these groups to be cohesive.
A very strong fairness axiom that is sometimes discussed in the context of committee elections and participatory budgeting is the core (Fain et al., 2016; Aziz et al., 2017; Fain et al., 2018). It insists that every subgroup (or coalition) must be represented (in the sense that it should not be possible for the subgroup to propose an alternative use of their proportional share of the budget that each group member prefers to the chosen outcome), without a cohesiveness requirement. For approval-based elections, it is a major open question whether there always exists a core outcome. For general additive utilities, there are instances where no core outcome exists (Fain et al., 2018), but several researchers have proved the existence of approximations to the core (Jiang et al., 2020; Fain et al., 2018; Cheng et al., 2019; Peters and Skowron, 2020). A district-fair outcome is, in a sense, in the core: no subgroup which coincides with a district can block the outcome. Thus, our work shows that for general utilities, a core-like outcome exists if we only allow a specific collection of (disjoint) coalitions to block.
The problem of knapsack sharing (Brown, 1979) has a similar motivation to our problem. The knapsack sharing problem supposes that the projects are separated into districts (instead of, in our case, the voters), and each project comes with a cost and a value. The aim is to find a budget-feasible set of projects that maximizes the minimum total value of the projects in a district. Note that in this formulation all districts are treated equally (there is no weighting by district population) and that there is no notion of the value of a project to a specific district. The literature contains a variety of algorithms for solving this NP-hard problem (e.g., Yamada and Futakawa, 1997; Yamada et al., 1998; Hifi et al., 2005; Fujimoto and Yamada, 2006).
2 Formal Problem, Notation and Definitions
Formally, the setting we consider is as follows. We are given a budget $b \ge 0$. There are $m$ possible projects $P = \{p_1, \ldots, p_m\}$ with associated non-negative costs $c(p_1), \ldots, c(p_m)$. We refer to a subset $W \subseteq P$ as an outcome. The cost of an outcome $W$ is $c(W) := \sum_{p_j \in W} c(p_j)$. We say that a subset $W \subseteq P$ is budget-feasible if $c(W) \le b$.
There are $k$ districts $D = \{d_1, \ldots, d_k\}$. The social welfare (or utility) that project $p_j$ provides to district $d_i$ is $sw_i(p_j) \ge 0$. We assume that utilities are additive; i.e., the utility that an outcome $W$ provides to district $d_i$ is $sw_i(W) := \sum_{p_j \in W} sw_i(p_j)$. Furthermore, the total social welfare of $W$ is $sw(W) := \sum_{d_i \in D} sw_i(W)$.
Throughout this work we assume that $c(p_j)$ and $sw_i(p_j)$ are both $\mathrm{poly}(m, k)$ for each $i$ and $j$. (A function $f$ is $\mathrm{poly}(m, k)$ if there exists a constant $a$ such that $f = O((m + k)^a)$.) We can relax this assumption using well-known bucketing techniques at the cost of an arbitrarily small $\epsilon$ in the guarantees of our algorithms. See the fully polynomial-time approximation scheme for the knapsack problem (Chekuri and Khanna, 2005) for an example of this technique.
To model the participatory budgeting setting, we assume that each district deserves some portion of the budget and, in turn, deserves at least the utility it could achieve if it spent its budget on its most preferred projects. Specifically, each district $d_i$ deserves some budget $b_i \ge 0$, where $\sum_i b_i = b$. District $d_i$ deserves utility $sw_i(F_i)$, where $F_i := \arg\max_{W \subseteq P,\ c(W) \le b_i} sw_i(W)$ is $d_i$’s favorite outcome costing at most $b_i$.
Definition 1 (District-Fair Outcome).
We say that an outcome $W$ is district-fair (DF) if $sw_i(W) \ge sw_i(F_i)$ for all $d_i \in D$.
Computing $F_i$ is precisely an instance of the knapsack problem; by our assumption that utilities and costs are polynomially bounded, this knapsack instance is solvable in polynomial time (Chekuri and Khanna, 2005). Thus, we will assume $F_i$ is known.
Note that the outcome $\bigcup_i F_i$ is both budget-feasible (its cost is at most $\sum_i b_i = b$) and district-fair, so an outcome with both properties always exists. Our goal is to find a budget-feasible and district-fair outcome $W$ which maximizes social welfare $sw(W)$. We call our problem district-fair welfare maximization. Throughout this paper, we let $W^* := \arg\max_W sw(W)$ be some optimal solution, where the argmax is taken over budget-feasible and district-fair solutions. Similarly, we let $OPT := sw(W^*)$.
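As a minimal sketch of the two computations just described, the following code computes the utility a district deserves with the textbook 0/1 knapsack dynamic program and then checks district fairness of a given outcome. The function names and the toy instance are hypothetical, and we assume integer costs and integer budget shares, in line with the polynomial-boundedness assumption above.

```python
from typing import Iterable, Sequence


def favorite_outcome_value(costs: Sequence[int], utils: Sequence[int], share: int) -> int:
    """Utility of a district's favorite outcome costing at most `share` (0/1 knapsack)."""
    best = [0] * (share + 1)  # best[c] = max utility achievable with cost at most c
    for cost, util in zip(costs, utils):
        # Iterate budgets downwards so that each project is used at most once.
        for c in range(share, cost - 1, -1):
            best[c] = max(best[c], best[c - cost] + util)
    return best[share]


def is_district_fair(outcome: Iterable[int], costs: Sequence[int],
                     utils_by_district: Sequence[Sequence[int]],
                     shares: Sequence[int]) -> bool:
    """Check that every district gets at least the utility of its favorite outcome."""
    chosen = set(outcome)
    for utils, share in zip(utils_by_district, shares):
        achieved = sum(utils[j] for j in chosen)
        if achieved < favorite_outcome_value(costs, utils, share):
            return False
    return True


# Toy instance (hypothetical): three projects, two districts, total budget 4.
costs = [2, 2, 3]
utils_by_district = [[3, 0, 2], [0, 4, 2]]
shares = [2, 2]
fair = is_district_fair({0, 1}, costs, utils_by_district, shares)
unfair = is_district_fair({2}, costs, utils_by_district, shares)
```

Here the outcome consisting of the first two projects gives each district exactly what it deserves, whereas funding only the third project leaves both districts short.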
We consider two relaxations of district fairness. The first relaxation extends the concept to lotteries over outcomes: each district only needs to be approximately satisfied in expectation. We give an efficient algorithm to compute optimal district-fair lotteries in Section 4.
Definition 2 (District-Fair Lottery).
Given $\epsilon > 0$, we say that a probability distribution $\mathcal{W}$ over outcomes of cost at most $b$ is an $\epsilon$-district-fair ($\epsilon$-DF) lottery if $\mathbb{E}_{W \sim \mathcal{W}}[sw_i(W)] \ge sw_i(F_i) - \epsilon$ for every district $d_i$.
The second relaxation is district-fairness up to one good (DF1). Intuitively, an allocation is DF1 if each district would be satisfied if one additional project were funded.
Definition 3 (DF1).
An outcome $W$ is DF1 if for every $d_i \in D$, there exists a project $p_j \in P$ such that $sw_i(W \cup \{p_j\}) \ge sw_i(F_i)$.
DF1 is inspired by the well-studied notion of EF1 (envy-freeness up to one good) from the private goods setting (Budish, 2011). This relaxation is mild, and unlike relaxations that require district-fairness to hold on average over districts, it is a uniform relaxation which provides guarantees for all districts. We study DF1 outcomes in Section 5.
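The DF1 condition can be checked directly from its definition. The sketch below is illustrative only: the name `is_df1` is hypothetical, and the deserved utilities are passed in as a precomputed list (one entry per district) rather than recomputed by a knapsack routine.

```python
from typing import Iterable, Sequence


def is_df1(outcome: Iterable[int], utils_by_district: Sequence[Sequence[int]],
           deserved: Sequence[int]) -> bool:
    """True if every district would be satisfied after adding one more project."""
    chosen = set(outcome)
    num_projects = len(utils_by_district[0])
    for utils, target in zip(utils_by_district, deserved):
        achieved = sum(utils[j] for j in chosen)
        # Value of the single best project not yet funded (0 if none remains).
        best_extra = max((utils[j] for j in range(num_projects) if j not in chosen),
                         default=0)
        if achieved + best_extra < target:
            return False
    return True


# Toy instance (hypothetical): district 0 deserves 5, district 1 deserves 4.
utils_by_district = [[3, 2, 0], [0, 0, 4]]
deserved = [5, 4]
df1_ok = is_df1({0}, utils_by_district, deserved)
df1_fail = is_df1({2}, utils_by_district, deserved)
```

Funding only project 0 is DF1 in this instance because each district can be made whole with one further project, while funding only project 2 is not, since district 0 would need two more projects.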
3 NP-Hardness
Our first result shows that the problem of optimizing social welfare subject to district-fairness is NP-hard even in the restricted setting of approval votes (i.e., voters provide binary yes/no opinions over projects) and budgets proportional to district sizes. In fact, our problem remains NP-hard in this restricted setting even when each district contains only one voter and projects have unit costs.
We reduce from exact 3-cover (X3C), which is known to be NP-hard (Garey and Johnson, 1979). The idea of our reduction is as follows. Given an instance of X3C, we define a district for each of the elements in the universe, and then add a large number of dummy districts. We then define a project for each set in our problem instance, which gives one unit of utility to each district corresponding to an element that it covers. We also define a large set of dummy projects that are approved by all dummy districts. We then ask whether there exists a district-fair outcome that attains high social welfare. An optimal solution for our district-fair welfare maximization problem, then, will first try to solve the X3C instance as efficiently as possible so that it can spend as much of its budget as possible on high-utility dummy projects. We formalize this idea in the following proof.
Theorem 1.
It is NP-complete to decide, given an instance of district-fair welfare maximization and an integer $\ell$, whether there exists a budget-feasible and district-fair outcome $W$ such that $sw(W) \ge \ell$. NP-hardness holds even in the restricted setting of approval votes and budgets proportional to district sizes, and when each district contains one voter and all projects have unit cost.
Proof.
The stated problem is trivially in NP. For NP-hardness we reduce from X3C. In an instance of X3C, we are given a universe and a collection of 3-element subsets of . It is a “yes”-instance if there exists a selection such that .
Given an instance of X3C, we construct an instance of our problem as follows. Let . We have districts, . Let , where each in corresponds to element . Additionally, let , where each is a dummy district. We have projects, . Let , where corresponds to set , and let , where each is a dummy project. Utilities are as follows: every dummy district approves every dummy project, so for each and . Also, each non-dummy district approves of non-dummy sets to reflect the structure of the X3C instance: that is, for each we have if and . All other utilities are 0: that is, for all other and . Each project has cost 1, and our budget is . We assume all districts contain 1 voter, so for every district . Clearly, for each . We ask whether there exists a district-fair outcome with social welfare at least .
If there exists a solution to the X3C instance, then is an outcome with cost . Clearly, is district-fair, and its social welfare is , so this is a “yes”-instance for the district-fair welfare maximization problem.
Conversely, suppose that there exists a district-fair, budget-feasible outcome with social welfare at least . Note that all projects in together give overall welfare at most . Thus, we must have since otherwise the total welfare of is less than . Hence . By district-fairness, for each , there must be some such that . These two facts together imply that is a solution to the X3C instance. ∎
This NP-hardness result holds even if each district consists of a single voter and all projects have unit cost. As we show in Appendix A in the supplementary material, this special case admits a polynomial-time approximation. Our algorithm is based on a greedy algorithm and a combinatorial argument which “matches away” high-utility goods of the optimal solution. One might hope to achieve an approximation result for the general case. A natural approach would be to round the optimal solution to the LP relaxation of the natural ILP formulation of our problem. However, a simple example in Appendix B in the supplementary material shows that the integrality gap of that formulation is unbounded, so this approach will not work.
4 Optimal District-Fair Lottery
In this section, we allow randomness and consider lotteries over outcomes. Our main result for the lottery setting is an $\epsilon$-DF lottery which always achieves the optimal social welfare subject to district fairness. The welfare guarantee is ex post, so that every outcome in the lottery’s support achieves optimal welfare. For the remainder of this section we let $\epsilon$ refer to the $\epsilon$ in the $\epsilon$-DF definition.
Theorem 2.
There is an algorithm which, in $\mathrm{poly}(m, k, 1/\epsilon)$ time, returns an $\epsilon$-DF lottery $\mathcal{W}$ such that for all outcomes $W$ in the support of $\mathcal{W}$, we have $sw(W) \ge OPT$.
The intuition for our algorithm is as follows. We begin by showing that our problem is polynomial-time solvable if the number of districts is constant. Such an algorithm is useful because we can artificially make the number of districts constant by convexly combining all districts into a single district $D$. We can then compute as our solution a utility-optimal outcome $W$ which is fair for $D$ but not necessarily fair for each $d_i$ individually. However, we can bias our solution to try to satisfy fairness for certain districts by increasing the weights of these districts in our convex combination. Thus, if $W$ is not fair for $d_i$, we might naturally increase the proportional share of $d_i$ in the convex combination and recompute $W$ in the hopes that the new outcome we compute will be fair for $d_i$. We obtain our lottery by repeatedly increasing the weight of districts that do not have their fairness constraint satisfied, and then taking a uniform distribution over the resulting outcomes.
Turning to the proof, we begin by describing how to solve our problem in polynomial time when $k$ is a constant. Our algorithm will solve the natural dynamic program (DP). Specifically, consider the true/false value $R(j, b', s_1, \ldots, s_k)$, which is the answer to the question, “Does there exist an outcome of cost at most $b'$ using projects $p_1, \ldots, p_j$ wherein each district $d_i$ achieves social welfare at least $s_i$?” If the answer to this question is yes, then either the desired utilities are possible with the stated budget without using $p_j$, or there is an outcome which uses at most $b' - c(p_j)$ budget that doesn’t use $p_j$ in which every district gets at least its specified utility minus how much it values $p_j$. Thus, $R(j, b', s_1, \ldots, s_k)$ is true if and only if either $R(j-1, b', s_1, \ldots, s_k)$ is true or $R(j-1, b' - c(p_j), s_1 - sw_1(p_j), \ldots, s_k - sw_k(p_j))$ is true, giving us a definition by recurrence.
By our assumption that all costs and utilities are polynomially bounded, we can easily solve the dynamic program (DP) for the above recurrence, giving the following result.
Lemma 3.
There is an algorithm that finds a budget-feasible district-fair outcome $W$ with $sw(W) = OPT$ in $\mathrm{poly}(m, k)^{O(k)}$ time.
Proof.
Our algorithm simply fills in the DP table and returns the outcome corresponding to the entry $R(m, b, s_1, \ldots, s_k)$ in our DP table which is true, satisfies $s_i \ge sw_i(F_i)$ for all $i$, and which maximizes $\sum_i s_i$. The recurrence is correct by the above reasoning.
To see why we can fill in the DP table in the stated time, note that we can trivially solve our base case, $j = 0$, for each $b'$ and each possible value of each $s_i$ in polynomial time. Since $sw_i(W)$ is polynomially bounded in $m$ and $k$, we need only check polynomially many (in $m$ and $k$) values for each $s_i$. Lastly, since $j$ and $b'$ are bounded by a polynomial in $m$ and $k$, we conclude that our DP table has $\mathrm{poly}(m, k)^{O(k)}$ entries, giving the desired runtime. ∎
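The dynamic program behind Lemma 3 can be sketched as follows. Rather than filling a boolean table indexed by thresholds, this illustrative variant enumerates the reachable states (cost, utility of district 1, ..., utility of district k) project by project, which carries the same information when costs and utilities are small integers. The function name, the state encoding, and the toy instance are assumptions made for this sketch only.

```python
from typing import Dict, FrozenSet, Optional, Sequence, Tuple


def best_fair_outcome(costs: Sequence[int],
                      utils_by_district: Sequence[Sequence[int]],
                      deserved: Sequence[int],
                      budget: int) -> Optional[FrozenSet[int]]:
    """Welfare-maximizing budget-feasible district-fair outcome, or None if none exists."""
    k = len(utils_by_district)
    # reachable[(cost, u_1, ..., u_k)] = one outcome attaining exactly that state.
    reachable: Dict[Tuple[int, ...], FrozenSet[int]] = {(0,) * (k + 1): frozenset()}
    for j, cost in enumerate(costs):
        new_states: Dict[Tuple[int, ...], FrozenSet[int]] = {}
        for state, outcome in reachable.items():
            new_cost = state[0] + cost
            if new_cost > budget:
                continue  # adding project j would break budget feasibility
            new_state = (new_cost,) + tuple(
                state[i + 1] + utils_by_district[i][j] for i in range(k))
            if new_state not in reachable:
                new_states.setdefault(new_state, outcome | {j})
        reachable.update(new_states)
    best, best_welfare = None, -1
    for state, outcome in reachable.items():
        if all(state[i + 1] >= deserved[i] for i in range(k)):
            welfare = sum(state[1:])  # total welfare is the sum of district utilities
            if welfare > best_welfare:
                best, best_welfare = outcome, welfare
    return best


costs = [2, 2, 3]
utils_by_district = [[3, 0, 2], [0, 4, 2]]
deserved = [3, 4]
tight = best_fair_outcome(costs, utils_by_district, deserved, budget=4)
loose = best_fair_outcome(costs, utils_by_district, deserved, budget=7)
```

The number of states grows exponentially in the number of districts, which matches the lemma: the approach is only polynomial when $k$ is a constant.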
We now describe our multiplicative-weights-type algorithm to produce our lottery using the above algorithm. (We will only need to invoke the above algorithm for the case $k = 1$. This amounts to solving the knapsack problem with a single covering constraint, which to our knowledge is not one of the standard variants of the knapsack problem.) We let $w_i^{(t)}$ be the “weight” of district $d_i$ in iteration $t$ and let $\Phi^{(t)} := \sum_i w_i^{(t)}$ be the total weight in iteration $t$. Initially, our weights are uniform: $w_i^{(1)} := 1$ for all $i$.
For any iteration $t$ and district $d_i$ we let $p_i^{(t)} := w_i^{(t)} / \Phi^{(t)}$ be the proportion of the weight that district $d_i$ has in iteration $t$. These $p_i^{(t)}$ will induce our convex combination over districts; in particular we let $D^{(t)}$ be a district which values project $p_j$ to extent $\sum_i p_i^{(t)} sw_i(p_j)$ and which deserves $\sum_i p_i^{(t)} sw_i(F_i)$ utility. Also, let $w_{\max}$ be the maximum welfare of an outcome.
With the above notation in hand, we can give our instantiation of multiplicative weights where is the number of iterations of our algorithm.

For all iterations :

Let be an outcome that maximizes subject to and . We can compute using Lemma 3.

Let be our “mistakes”, indicating how far off a district was from getting what it deserved.

Update weights: .


Return lottery , the uniform distribution over .
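The loop above can be sketched in code under explicit assumptions. The exact form of the mistakes and of the weight update is not recoverable from the text here, so this sketch assumes a standard Hedge-style rule: mistakes are the surplus over the deserved utility, normalized by the maximum welfare, and weights are multiplied by `exp(-eta * mistake)`. The oracle is a brute-force search over subsets (suitable only for tiny instances) standing in for the algorithm of Lemma 3, and all names are hypothetical.

```python
import itertools
import math
from typing import FrozenSet, List, Sequence


def mw_lottery(costs: Sequence[int], utils: Sequence[Sequence[int]],
               deserved: Sequence[int], budget: int,
               rounds: int, eta: float) -> List[FrozenSet[int]]:
    """Return the support of a uniform lottery, one outcome per iteration."""
    k, m = len(utils), len(costs)
    subsets = [frozenset(s) for r in range(m + 1)
               for s in itertools.combinations(range(m), r)]
    feasible = [s for s in subsets if sum(costs[j] for j in s) <= budget]

    def sw_i(i: int, s: FrozenSet[int]) -> int:
        return sum(utils[i][j] for j in s)

    def sw(s: FrozenSet[int]) -> int:
        return sum(sw_i(i, s) for i in range(k))

    w_max = max(max(sw(s) for s in feasible), 1)  # normalizer (assumed form)
    weights = [1.0] * k
    support: List[FrozenSet[int]] = []
    for _ in range(rounds):
        total = sum(weights)
        p = [w / total for w in weights]
        target = sum(p[i] * deserved[i] for i in range(k))
        # Oracle: best feasible outcome that is fair for the combined district.
        candidates = [s for s in feasible
                      if sum(p[i] * sw_i(i, s) for i in range(k)) >= target - 1e-9]
        outcome = max(candidates, key=sw)
        support.append(outcome)
        # Assumed update: under-served districts (negative mistake) gain weight.
        mistakes = [(sw_i(i, outcome) - deserved[i]) / w_max for i in range(k)]
        weights = [w * math.exp(-eta * mi) for w, mi in zip(weights, mistakes)]
    return support


costs = [1, 1, 2]
utils = [[2, 0, 0], [0, 1, 5]]
deserved = [2, 1]
support = mw_lottery(costs, utils, deserved, budget=2, rounds=4, eta=1.0)
```

In this toy run the first iteration picks the high-welfare project that ignores district 0; the resulting weight shift forces every later iteration to pick the district-fair outcome, so the expected shortfall of district 0 shrinks as the number of rounds grows.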
We now restate the usual multiplicative weights guarantee in terms of our algorithm. This lemma guarantees that, on average, the multiplicative weights strategy is competitive with the best “expert.” In the following is the usual inner product.
Lemma 4 (Arora et al., 2012).
For all we have
We can use this lemma to show the desired guarantees.
Proof of Theorem 2.
We use the algorithm described above.
Our algorithm is polynomial time since it runs for polynomially many iterations and in each iteration we compute a solution for a problem on only one district, which is solvable in polynomial time by Lemma 3. Also, note that by Lemma 3 we know that $c(W_t) \le b$ for all $t$, so all outcomes in the lottery are budget-feasible.
We now argue that the above lottery is utility-optimal. Fix an iteration $t$. Notice that since $W^*$ is fair for all districts, it is also fair for $D^{(t)}$. In particular,
$$\sum_i p_i^{(t)} \, sw_i(W^*) \ge \sum_i p_i^{(t)} \, sw_i(F_i).$$
Thus, $W^*$ is a budget-feasible solution for the problem of finding a max-utility outcome which is fair for $D^{(t)}$. Thus, $sw(W_t)$ can only be larger than $sw(W^*)$, meaning that $sw(W_t) \ge OPT$.
We now argue that the above lottery is $\epsilon$-DF in expectation. Fix a district $d_i$. By Lemma 4 we know that
(1) 
Now notice that by definition of and since our lottery is uniform over all we know that the right-hand side of Equation 1 is
Thus, to show that , it suffices to show that the left-hand side of Equation 1 is at least . That is, we must show . However, this amounts to simply showing that is fair for ; in particular, we have that the left-hand side is
It holds that since we always choose a solution which is fair for , and so we conclude that the left-hand side of Equation 1 is at least . ∎
5 Optimal DF1 Outcome with Extra Budget
We now study how well we can do if we allow ourselves to overspend the available budget. Certainly it is possible to achieve district fairness and optimal fairness-constrained utility if the algorithm can spend double the available budget: we can compute an outcome $W_1$ with $c(W_1) \le b$ that is welfare-maximizing without attempting to satisfy district-fairness, and we can compute some outcome $W_2$ with $c(W_2) \le b$ that is district-fair (see Section 2); then $W_1 \cup W_2$ satisfies district fairness and we clearly have $sw(W_1 \cup W_2) \ge OPT$ and $c(W_1 \cup W_2) \le 2b$. In this section, we show that we can find a solution that requires less than twice the budget if we slightly relax the district fairness requirement to DF1. Our main result for the DF1 setting is a deterministic algorithm which achieves DF1 and optimal social welfare if one overspends by roughly 64.7% of the budget.
Theorem 5.
For any constant , there is a time algorithm which, given an instance of districtfair welfare maximization, returns an outcome such that is DF1, , and .
Overspending by 64.7% is a worst-case result, and the algorithm may often overspend less. If the context does not permit any overspending, one can run the same algorithm with a reduced budget; then the output will be feasible for the true budget, yet will satisfy weaker fairness and social welfare guarantees. More precisely, given an instance and a multiplier , we define an instance , which is identical to but in which each district contributes only and thus deserves utility , where is ’s favorite outcome which costs at most . Additionally, let represent the maximum achievable social welfare over all district-fair solutions in using a budget of at most . Then, applying Theorem 5 to results in an outcome which is DF1 and utility-optimal on this reduced instance and does not overspend the original budget .
Corollary 6.
For any constant , there is a time algorithm which, given an instance of districtfair welfare maximization, returns an outcome such that is DF1 for , , and .
Our result uses submodular optimization as a subroutine. If one allows randomization in this subroutine, algorithms with better approximation ratios are known. Thus, we can prove a similar theorem (and corollary) with a randomized algorithm which achieves DF1 and optimal social welfare while overspending its budget by only a fraction of the budget, with high probability (i.e., with probability where is some polynomial in and ). We defer details of our randomized algorithm to Appendix C in the supplementary material.
In the remainder of this section, we will prove Theorem 5. Our main tool is a notion of the “coverage” of a partial outcome. An outcome has high coverage if we do not need to spend much more money to make it district-fair. On a high level, our proof consists of two main steps. First, we show how to complete an outcome with good coverage into a DF1 outcome. Second, we show how to frame the problem of finding a solution with good coverage and social welfare as a submodular maximization problem subject to linear constraints, allowing us to use a result by Mizrachi et al. (2018).
We begin by formalizing the coverage of a solution. Roughly, if we imagine that initially every district requires its portion of the budget for fairness, then fractional coverage captures how much less districts must spend to satisfy their own fairness constraints. Thus, if we imagine that our algorithm first spends its budget to satisfy fairness as efficiently as possible, and then spends the remainder of its budget on the highest utility projects, then the coverage of a collection of projects is roughly how much budget this collection “frees up” for the algorithm to spend on the highest utility projects. More formally, we define coverage by way of the notions of fractional outcomes and residual budget requirements.
Definition 4 (fractional outcomes).
A fractional outcome is a vector $X \in [0, 1]^m$, where $X_j$ is the fraction of project $p_j$ that is funded. We overload notation and let the social welfare of $X$ for district $d_i$ be $sw_i(X) := \sum_j X_j \, sw_i(p_j)$. Similarly, the social welfare of $X$ is $sw(X) := \sum_i sw_i(X)$. Lastly, we define the cost of $X$ as $c(X) := \sum_j X_j \, c(p_j)$.
We now define the residual budget requirement of a district, given an outcome, which can be understood as the minimum amount of additional money that must be spent to satisfy the district, if fractional outcomes are allowed.
Definition 5 ($r_i(W)$).
The residual budget requirement $r_i(W)$ of district $d_i$ given (integral) outcome $W$ is the minimum cost of a fractional outcome $X$ such that $sw_i(W) + sw_i(X) \ge sw_i(F_i)$ and $X_j = 0$ for all $p_j \in W$.
We can now define the coverage of an outcome for a particular district in terms of the total amount of budget they deserve and their residual budget requirement.
Definition 6 ($f_i(W)$).
The coverage of an outcome $W$ for district $d_i$ is the difference between the amount of budget they deserve, $b_i$, and their residual budget requirement: $f_i(W) := b_i - r_i(W)$.
Lastly, we define the coverage of an outcome.
Definition 7 ($f(W)$).
The overall coverage of an outcome $W$ is the sum over all districts of the coverage $W$ affords each district: $f(W) := \sum_i f_i(W)$.
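The residual budget requirement is a fractional covering knapsack, which a greedy rule solves exactly: buy the remaining utility from unfunded projects in increasing order of cost per unit of utility. The following sketch computes it together with the coverage; the function names and the toy instance are hypothetical, and the deserved utilities are assumed to be precomputed.

```python
from typing import Iterable, Sequence


def residual_budget(outcome: Iterable[int], costs: Sequence[float],
                    utils: Sequence[float], target: float) -> float:
    """Minimum fractional cost of extra projects needed to reach `target` utility."""
    chosen = set(outcome)
    need = target - sum(utils[j] for j in chosen)
    if need <= 0:
        return 0.0
    # Cheapest utility first: sort unfunded useful projects by cost per unit utility.
    order = sorted((j for j in range(len(costs)) if j not in chosen and utils[j] > 0),
                   key=lambda j: costs[j] / utils[j])
    spent = 0.0
    for j in order:
        fraction = min(1.0, need / utils[j])
        spent += fraction * costs[j]
        need -= fraction * utils[j]
        if need <= 1e-12:
            return spent
    return float("inf")  # target unreachable; cannot happen for a true deserved utility


def coverage(outcome: Iterable[int], costs: Sequence[float],
             utils_by_district: Sequence[Sequence[float]],
             shares: Sequence[float], deserved: Sequence[float]) -> float:
    """Overall coverage: sum over districts of share minus residual requirement."""
    chosen = set(outcome)
    return sum(share - residual_budget(chosen, costs, utils, target)
               for utils, share, target in zip(utils_by_district, shares, deserved))


costs = [2, 2, 3]
utils_by_district = [[3, 0, 2], [0, 4, 2]]
shares = [2, 2]
deserved = [3, 4]
partial = coverage({2}, costs, utils_by_district, shares, deserved)
```

In this instance the empty outcome has coverage 0, a district-fair outcome has coverage equal to the full budget, and funding only the third project lands in between.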
Next, we establish a useful property of DF1 solutions. In particular, given a set of projects that achieves relatively good fairness on average, we can then buy a small subset of projects that results in fairness up to one good for all districts. In particular, given a collection of projects that covers an $\alpha$ fraction of all fairness constraints, we can use at most an extra $(1 - \alpha)$ fraction of our budget in order to complete this to a DF1 solution. Moreover, this completion is quite intuitive: purchase any project whose added coverage is at least its cost, until the outcome is DF1.
Formally, we state the following DF1 completion lemma.
Lemma 7 (DF1 Completion).
Given an outcome $W$ with $f(W) \ge \alpha b$, one can compute in polynomial time a set $W' \subseteq P \setminus W$ such that $W \cup W'$ is DF1 and $c(W') \le (1 - \alpha) b$.
Proof.
We first prove that for every non-DF1 outcome $W$, there exists a project $p_j$ that we can add to $W$ which increases its coverage by at least $c(p_j)$. Suppose that $W$ is an outcome that fails DF1, and let $d_i$ be a district such that $sw_i(W \cup \{p_j\}) < sw_i(F_i)$ for all $p_j \in P$. Let $X$ be the fractional outcome witnessing $r_i(W)$; thus $c(X) = r_i(W)$. We may assume without loss of generality that all but at most one project is integral in $X$ (because there is always some optimal $X$ with this property by additivity of $sw_i$). Since $W$ fails DF1 for $d_i$, there is some $p_j$ such that $X_j = 1$. Then $r_i(W \cup \{p_j\}) \le r_i(W) - c(p_j)$ (witnessed by the fractional outcome obtained from $X$ by removing $p_j$ from it). Thus, from definitions, $f_i(W \cup \{p_j\}) \ge f_i(W) + c(p_j)$, and hence $f(W \cup \{p_j\}) \ge f(W) + c(p_j)$.
Now suppose we are given an outcome $W$ with $f(W) \ge \alpha b$, which fails DF1. We can identify a project $p_j$ as above, add it to $W$, and increase the coverage by at least $c(p_j)$. We repeat this until the outcome is DF1. This process must stop, since at each step the coverage increases by at least $c(p_j)$ but by definition the coverage can never exceed $b$. For the same reason, the cost of the projects we have added to $W$ cannot exceed $b - f(W)$, and thus $c(W') \le (1 - \alpha) b$. ∎
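The completion procedure from the proof can be sketched as a simple loop: while the outcome fails DF1, add some project whose gain in overall coverage is at least its cost. This is an illustrative sketch with hypothetical names; it recomputes coverage by brute force at each step and assumes the deserved utilities are precomputed and attainable.

```python
from typing import Iterable, Sequence, Set


def residual(chosen: Set[int], costs: Sequence[float],
             utils: Sequence[float], target: float) -> float:
    """Greedy fractional covering knapsack (cheapest utility first)."""
    need = target - sum(utils[j] for j in chosen)
    spent = 0.0
    order = sorted((j for j in range(len(costs)) if j not in chosen and utils[j] > 0),
                   key=lambda j: costs[j] / utils[j])
    for j in order:
        if need <= 1e-12:
            break
        fraction = min(1.0, need / utils[j])
        spent += fraction * costs[j]
        need -= fraction * utils[j]
    return spent


def coverage(chosen, costs, utils_by_district, shares, deserved) -> float:
    return sum(b_i - residual(chosen, costs, u, t)
               for u, b_i, t in zip(utils_by_district, shares, deserved))


def is_df1(chosen, utils_by_district, deserved) -> bool:
    m = len(utils_by_district[0])
    for u, t in zip(utils_by_district, deserved):
        extra = max((u[j] for j in range(m) if j not in chosen), default=0)
        if sum(u[j] for j in chosen) + extra < t:
            return False
    return True


def complete_to_df1(outcome: Iterable[int], costs, utils_by_district,
                    shares, deserved) -> Set[int]:
    """Return the set of projects added to turn `outcome` into a DF1 outcome."""
    current, added = set(outcome), set()
    while not is_df1(current, utils_by_district, deserved):
        base = coverage(current, costs, utils_by_district, shares, deserved)
        for j in range(len(costs)):
            if j in current:
                continue
            gain = coverage(current | {j}, costs, utils_by_district, shares, deserved) - base
            if gain >= costs[j] - 1e-9:  # the project pays for itself in coverage
                current.add(j)
                added.add(j)
                break
        else:
            raise RuntimeError("no project pays for itself; check the inputs")
    return added


costs = [1, 1, 1, 1]
utils_by_district = [[2, 2, 0, 0], [0, 0, 3, 1]]
shares = [2, 1]
deserved = [4, 3]
added = complete_to_df1({3}, costs, utils_by_district, shares, deserved)
```

Starting from the outcome that funds only the last project, a single added project suffices here, and its cost is well within the uncovered part of the budget.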
With this lemma in hand, we now turn to the problem of finding high-coverage outcomes with good welfare. Let $\ell$ be a lower bound on the social welfare we desire. We rephrase our problem as an optimization problem in which we maximize the coverage of an outcome subject to a linear knapsack constraint and a linear covering constraint. The knapsack constraint enforces budget feasibility, and the covering constraint encodes the requirement that the total utility of the outcome is at least $\ell$.
$$\max_{W \subseteq P} \; f(W) \quad \text{subject to} \quad c(W) \le b, \quad sw(W) \ge \ell. \qquad \text{(DF1P)}$$
The main tool we apply is a theorem on the maximization of nondecreasing submodular functions of Mizrachi et al. (2018). Recall that a set function is nondecreasing if its value never decreases as elements are added to its input, and submodular if it exhibits diminishing returns.
Definition 8.
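Given a finite set $\Omega$, a set function $g : 2^{\Omega} \to \mathbb{R}$ is nondecreasing and submodular if for every $A \subseteq B \subseteq \Omega$ we have $g(A) \le g(B)$ and $g(A \cup \{x\}) - g(A) \ge g(B \cup \{x\}) - g(B)$ for all $x \in \Omega \setminus B$.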
The theorem we apply is as follows.
Theorem 8 (Mizrachi et al., 2018, Theorem 5).
For each constant , there exists a deterministic algorithm for maximizing a nondecreasing submodular function subject to one packing constraint and one covering constraint that runs in time , where is the size of the support of the set function, satisfies the covering constraint up to a factor of and the packing constraint up to a factor of , and achieves an approximation ratio of .
We apply this theorem to find a solution that satisfies a fraction of coverage and achieves optimal fairness-constrained utility. Then, we apply Lemma 7 to augment our solution using an additional fraction of our budget in order to obtain a final solution which satisfies full DF1. However, in order to apply Theorem 8, we must first establish that is a nondecreasing submodular function. In particular, note that the coverage functions for each district are clearly nondecreasing and submodular. It follows that their sum, is also nondecreasing and submodular, yielding the following lemma.
Lemma 9.
The function is nondecreasing and submodular.
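The step from the per-district coverage functions to their sum can be spelled out as follows. The notation is an assumption made for this sketch: $q_d$ stands for the coverage function of district $d$, $q = \sum_d q_d$ for the total coverage, and $S \subseteq T$ are outcomes with $x \notin T$.

```latex
\begin{align*}
q(T) - q(S)
  &= \sum_{d} \bigl( q_d(T) - q_d(S) \bigr) \;\ge\; 0, \\
\bigl[ q(S \cup \{x\}) - q(S) \bigr] - \bigl[ q(T \cup \{x\}) - q(T) \bigr]
  &= \sum_{d} \Bigl( \bigl[ q_d(S \cup \{x\}) - q_d(S) \bigr]
     - \bigl[ q_d(T \cup \{x\}) - q_d(T) \bigr] \Bigr) \;\ge\; 0.
\end{align*}
```

Each summand in the first line is nonnegative because every $q_d$ is nondecreasing, and each summand in the second line is nonnegative because every $q_d$ is submodular.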
We are now ready to prove Theorem 5, which applies the DF1 completion lemma to an approximately optimal solution for the problem DF1P.
Proof of Theorem 5.
Recall that we have assumed that the maximum utility of an outcome is polynomially bounded in and and that the maximum utility is integral. Thus, the value of falls in a polynomial range. For each value in this range, solve the problem DF1P using the algorithm from Theorem 8. Now consider all values of for which the algorithm returned a solution with ; such a value must exist since we are guaranteed this condition when (since for this value, the optimum of problem (DF1P) is ). Among all solutions we found that satisfy , take the one that maximizes . This solution provides social welfare at least .
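The enumeration in this proof can be sketched as a simple outer loop. This is an illustrative sketch only: `solve_df1p` stands in for the algorithm of Theorem 8 and is not implemented here, and `coverage`, `utility`, `max_welfare`, and `min_coverage` are hypothetical parameters supplied by the caller (the coverage threshold in particular is whatever the proof requires).

```python
def best_over_welfare_targets(max_welfare, solve_df1p, coverage, utility,
                              min_coverage):
    """Try every integral welfare target and keep the best acceptable outcome.

    solve_df1p(target) is assumed to return an outcome, or None if the
    solver finds nothing for that target.
    """
    best = None
    for target in range(max_welfare + 1):
        outcome = solve_df1p(target)
        # Discard targets whose solution does not reach the coverage bound.
        if outcome is None or coverage(outcome) < min_coverage:
            continue
        # Among the acceptable solutions, keep the one of maximum utility.
        if best is None or utility(outcome) > utility(best):
            best = outcome
    return best
```

Because the maximum utility is assumed to be integral and polynomially bounded, the loop makes only polynomially many calls to the solver.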
6 Discussion
Our results extend to the special case of unit costs, also known as committee selection. In committee selection, we elect a committee to represent voters in a larger governmental body such as a parliament. Often, to ensure local representation, the electorate is split into voting districts, which elect their representatives separately. The districts may be apportioned different numbers of representatives, for example based on district size. While this scheme guarantees each district representation, it may well be possible to increase the welfare of the voters in a district, for example by electing a diverse array of candidates with expertise in various areas who can gather votes from across the electorate. Thus, it is natural for all districts to elect the committee together if we impose district-fairness constraints. This way, we can maximize social welfare of the final committee while guaranteeing each district fair representation. This gives a more holistic view of committee selection in exactly the same way we addressed participatory budgeting, only instead of pooling the budget between districts, we now pool seats on a committee.
Our model implicitly treats districts as atoms, and so district fairness is a kind of individual rationality property. In turn, individual rationality is a type of strategyproofness: it ensures that districts have no incentive to leave the central election and hold a separate one instead. Is it possible to design a voting scheme that is fully strategyproof for districts, so that districts do not have incentives to misreport the utilities of their residents? Unfortunately not: Peters (2018) proves an impossibility theorem about committee elections which implies that there does not exist a voting rule that is efficient, district-fair, and also strategyproof. This result holds even for approval votes.
Several open questions remain. Most obvious is the question of whether we can achieve welfare maximization and DF1 in polynomial time while guaranteeing to overspend the budget by less than . More broadly, it would be interesting to study our problem with more general utility functions such as submodular or even general monotone valuation functions. Additionally, it would be exciting to study approximation algorithms which promise full district fairness. In Appendix A in the supplementary material, we present an algorithm which satisfies district fairness and provides a approximation to optimal district-fair social welfare in the special case of unanimous districts; it would be interesting to extend this result to the general case.
References
The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing 8 (1), pp. 121–164.
Justified representation in approval-based committee voting. Social Choice and Welfare 48 (2), pp. 461–485.
On the complexity of extended and proportional justified representation. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI), pp. 902–909.
Participatory budgeting: models and approaches. In Pathways Between Social Science and Computational Social Science: Theories, Methods, and Interpretations, T. Rudas and G. Péli (Eds.).
Phragmén’s voting methods and justified representation. In Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI), pp. 406–413.
The knapsack sharing problem. Operations Research 27 (2), pp. 341–355.
The combinatorial assignment problem: approximate competitive equilibrium from equal incomes. Journal of Political Economy 119 (6), pp. 1061–1103.
Participatory budgeting: a significant contribution to participatory democracy. Environment and Urbanization 16 (1), pp. 27–46.
Participatory budgeting in Paris: act, reflect, grow. Another city is possible with participatory budgeting, pp. 179–203.
Contribution of participatory budgeting to provision and management of basic services. London: IIED.
A polynomial time approximation scheme for the multiple knapsack problem. SIAM Journal on Computing 35 (3), pp. 713–728.
Group fairness in committee selection. In Proceedings of the 20th ACM Conference on Economics and Computation (ACM EC), pp. 263–279.
The core of the participatory budgeting problem. In Proceedings of the 12th International Conference on Web and Internet Economics (WINE), pp. 384–399.
Fair allocation of indivisible public goods. In Proceedings of the 19th ACM Conference on Economics and Computation (ACM EC), pp. 575–592. Note: Extended version arXiv:1805.03164.
Multiwinner voting: a new challenge for social choice theory. In Trends in Computational Social Choice, U. Endriss (Ed.).
An exact algorithm for the knapsack sharing problem with common items. European Journal of Operational Research 171 (2), pp. 693–707.
Computers and intractability. Vol. 174, Freeman San Francisco.
Knapsack voting for participatory budgeting. ACM Transactions on Economics and Computation (TEAC) 7 (2), pp. 1–27.
An exact algorithm for the knapsack sharing problem. Computers & Operations Research 32 (5), pp. 1311–1324.
Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association 58 (301), pp. 13–30.
Phragmén’s and Thiele’s election methods. Technical report. Note: arXiv:1611.08826 [math.HO].
Approximately stable committee selection. In Proceedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing (STOC), pp. 463–472.
A tight approximation for submodular maximization with mixed packing and covering constraints. Note: arXiv:1804.10947.
Proportionality and strategyproofness in multiwinner elections. In Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 1549–1557.
Proportionality and the limits of welfarism. In Proceedings of the 21st ACM Conference on Economics and Computation (ACM EC), pp. 793–794.
Proportional justified representation. In Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI), pp. 670–676.
A framework for approval-based budgeting methods. In Proceedings of the 33rd Conference on Artificial Intelligence (AAAI), Vol. 33, pp. 2181–2188.
The design of approximation algorithms. Cambridge University Press.
Some exact algorithms for the knapsack sharing problem. European Journal of Operational Research 106 (1), pp. 177–183.
Heuristic and reduction algorithms for the knapsack sharing problem. Computers & Operations Research 24 (10), pp. 961–967.
Appendix A Approximation for Unanimous Districts with Unit Costs
In this section we study approximation algorithms for the simplest version of our problem which we know to be NP-hard: when each district consists of a single voter and every project has unit cost. In fact, we will study a strictly more general setting than each district consisting of a single voter; namely, we study the setting where each district is “unanimous.” Formally, we study instances of district-fair welfare maximization where for all and for all where is the number of voters in . For this setting we will give a approximation.
Our algorithm will make use of the following notion of conditional coverage which builds on Definition 6.
Definition 9.
The coverage of a project given an outcome is .
Notice that by our assumption of unit cost and unanimous districts we have that .
We now present our greedy algorithm that satisfies district fairness and achieves a approximation to the optimal district-fair utility for the setting of unanimous districts and unit costs. Formally, the algorithm, which we call the Unanimous Greedy Algorithm (UGA), proceeds as follows.

1. Given an instance , initialize .
2. For :
   (a) Let be the maximum possible coverage and let be all projects which achieve this coverage.
   (b) Let be the maximum-coverage project with maximum utility.
   (c) Update .
3. Return .
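The steps above can be sketched in code under an explicit modelling assumption that is not taken from the paper: each unanimous district is described by its number of voters, an integer budget share, and the set of projects it approves, and the coverage of an outcome for a district is the number of its approved projects in the outcome, capped at its share. All names below are hypothetical, and the sketch is meant only to illustrate the greedy pattern (pick a project of maximum conditional coverage, breaking ties by utility), not to reproduce the exact definitions.

```python
def unanimous_greedy(projects, districts, budget):
    """Greedy selection with unit costs.

    districts maps a name to (voters, share, approved_projects).
    ASSUMPTION: coverage for a district = min(share, approved projects chosen).
    """
    def coverage(outcome):
        return sum(min(share, len(outcome & approved))
                   for _, share, approved in districts.values())

    def utility(project):
        # Unanimous districts: every voter of an approving district gains 1.
        return sum(voters for voters, _, approved in districts.values()
                   if project in approved)

    chosen = set()
    for _ in range(budget):
        remaining = [p for p in projects if p not in chosen]
        if not remaining:
            break
        base = coverage(chosen)
        # Conditional coverage of each project given the current outcome.
        gain = {p: coverage(chosen | {p}) - base for p in remaining}
        top = max(gain.values())
        # Among maximum-coverage projects, take one of maximum utility.
        best = max((p for p in remaining if gain[p] == top), key=utility)
        chosen.add(best)
    return chosen
```

For example, with two districts of sizes 10 and 5 that each hold a share of one unit and approve `{"a", "b"}` and `{"b", "c"}` respectively, the first round selects `"b"` (conditional coverage 2) and the second round, where every remaining project has conditional coverage 0, selects `"a"` on utility.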
Theorem 10.
Given an instance consisting of unanimous approval districts, UGA returns a solution which satisfies district fairness and achieves a approximation to the optimal district-fair utility.
Proof.
Let represent the result of UGA, and let represent the optimal district-fair outcome. Furthermore, let and be all projects purchased by UGA that had conditional coverage at least 2, exactly 1 and exactly 0, respectively, when purchased by UGA. Clearly is district-fair and budget-feasible and so we need only argue that it achieves at least utility.
Now, consider the following subproblem, which we will call , which intuitively is our original instance but where all of is forced to be in a solution and no projects from are available. More formally, is but where our budget is changed to , is changed to for all and the set of purchasable projects is . is the same for all in as in . Notice that the coverage of any project in is at most but the total coverage required for fairness is , meaning that every budget-feasible and district-fair solution for has size exactly . Also notice that is not only feasible for but also attains the optimal utility among all district-fair and budget-feasible solutions.
We claim that there exists a subset which is district-fair and budget-feasible for . To see this, note that we can iteratively build by initializing it to and then repeatedly adding to it any such that in is at least . After such additions we are guaranteed to have a district-fair and budget-feasible solution for and such an always exists since is district-fair for . As noted above, any district-fair and budget-feasible solution for has size and so .
Thus, since is optimal for , we know that achieves at least as high utility as , i.e., , and
(2) 
It remains to understand the utility of . However, note that at least half of the projects other than must be in the phase. That is, . Intuitively, this means that UGA “frees up” at least money to spend on high-utility projects. Let be all projects not in or . We have that is the utility of the top projects in . On the other hand, consider projects in . We can divide these into projects which are in and and those which are not. In particular, let and let so that . Now notice that trivially
(3) 
On the other hand, and and so is at most the utility of the highest utility projects in . Since our utilities are additive and is the utility of the highest utility projects in , it follows that
(4) 
Since , we can combine the above bounds to conclude our approximation. Namely, applying the additivity of our utilities and combining Equations 2, 3 and 4 we have
and so we conclude that .
∎
Appendix B Integrality Gap
Here, we investigate the integrality gap of the natural LP for our problem. As a reminder, the integrality gap of an LP measures how much better a fractional solution can do than an integral solution. An unbounded integrality gap shows that any analysis of an approximation algorithm which charges the value of its integral solution to the value of the optimal LP gives an unboundedly bad approximation ratio. For this reason integrality gaps are sometimes taken as evidence of hardness of approximation. For more details on the topic of integrality gaps see Williamson and Shmoys (2011).
We will show that our LP has an unbounded integrality gap, which suggests that approximation algorithms which return budget-feasible and district-fair solutions with nearly optimal social welfare may be difficult or impossible to attain for the general case.
Formally our LP and its integrality gap are as follows. Our LP has a variable for each project corresponding to the extent to which we choose .
(DFLP) 
We let denote the polytope of the above LP for an instance of district-fair welfare maximization.
The integrality gap of DFLP is defined as
The basic idea of our integrality gap construction is as follows. We will construct an instance of social-welfare maximization where the preferences of each district are “circular”. In particular, each district will like two projects and every project will be liked by exactly two districts. As in our NP-hardness proof, we will also have a collection of dummy projects which are given very high utility by dummy districts which deserve no utility. An optimal fractional solution will be able to choose each non-dummy project to extent essentially to satisfy district fairness and then spend its remaining budget on high-utility dummy projects. On the other hand, the optimal integral solution will have to spend its entire budget satisfying fairness.
Theorem 11.
There does not exist a function such that the integrality gap of DFLP is at most . Further, this integrality gap holds even when all projects have unit cost.
Proof.
Fix and a sufficiently small . We define our instance of social-welfare maximization on districts where will be non-dummy districts and the district will be a dummy district. Similarly, we will have projects where will be non-dummy projects and the remaining projects will be dummy projects.
For each non-dummy district we let and define its utility for project as
For the dummy district we let and define its utility for project as
for sufficiently large to be chosen later. Notice that for each dummy project . Lastly, we let our budget and we let for all .
Now notice that each non-dummy district has . Consequently, any district-fair integral solution must include all non-dummy projects, namely . However, since , it follows that the only district-fair integral solution is where .
On the other hand, consider the following fractional solution . For each non-dummy project we let . For each dummy project we let be . Clearly . Moreover, notice that for each district we have and so our solution is indeed in the polytope of DFLP. However, since for each dummy project we have that .
Thus, for the above instance we have that the ratio of the optimal integral solution to the optimal fractional solution is at most
Since can be chosen independently of and , we have that the above instance has integrality gap strictly less than for any function of and . ∎
We note that the proof of the above result also rules out any integrality gap which is larger than where is the total number of voters across all districts.
Appendix C Randomized Optimal DF1 Outcome with Extra Budget
In this section we give our randomized analogues of Theorem 5 and Corollary 6. We use the notation of Section 5 throughout this section. Whereas our deterministic algorithms overspend budget by , our randomized algorithms will only overspend it by with high probability. Formally, we show the following theorem.
Theorem 12.
There is a time algorithm which, given an instance of district-fair welfare maximization, returns an outcome such that is DF1, with high probability, and for any fixed constant .
As with Corollary 6 for Theorem 5, we immediately have a corollary which gives an algorithm which does not overspend its budget.
Corollary 13.
There is a time algorithm which, given an instance of district-fair welfare maximization, returns an outcome such that is DF1 for , with high probability, and for any fixed constant .
On a high level, this proof will closely follow that of Theorem 5. In particular, it uses the same submodular optimization framing of the problem (i.e., DF1P). However, there are some notable differences. In particular, we leverage the following randomized result from Mizrachi et al. (2018) instead of the previous deterministic result from Mizrachi et al. (2018).
Theorem 14 (Mizrachi et al., 2018, Theorem 1).
For each constant