1 Introduction
In a decision-making scenario, the task is to aggregate the opinions of a group of different people into a common decision. This process is often distributed, in the sense that smaller groups first reach an agreement, and then the final outcome is determined based on the options proposed by each such group. This can be due to scalability issues (e.g., it is hard to coordinate a decision between a very large number of participants), due to different roles of the groups (e.g., when each group represents a country in the European Union), or simply due to established institutional procedures (e.g., electoral systems).
For example, in the US presidential elections (https://www.usa.gov/election), the voters in each of the 50 states cast their votes within their regional district, and each state declares a winner; the final winner is taken as the one that wins a weighted plurality vote over the state winners, with the weight of each state being proportional to its size. Another example is the Eurovision Song Contest (https://eurovision.tv/about/voting), where each participating country holds a local voting process (consisting of a committee vote and an Internet vote from the people of the country) and then assigns points to the 10 most popular options, on a 1-12 scale (with 11 and 9 omitted). The winner of the competition is the participant with the most total points.
The foundation of utilitarian economics, which originated near the end of the 18th century, revolves around the idea that the outcome of a decision-making process should be one that maximizes the well-being of the society, which is typically captured by the notion of the social welfare. A fundamental question that has been studied extensively in the related literature is whether the rules that are being used for decision making actually achieve this goal, or to what extent they fail to do so. This motivates the following question: What is the effect of distributed decision making on the social welfare?
The importance of this investigation is highlighted by the example of the 2016 US presidential election (Wikipedia, 2016). While 48.2% of the US population (that participated in the election) viewed Hillary Clinton as the best candidate, Donald Trump won the election with only 46.1% of the popular vote. This was due to the district-based electoral system, and the outcome would have been different if there had been a single pool of voters instead. A similar phenomenon occurred in the 2000 presidential election as well, when Al Gore won the popular vote, but George W. Bush was elected president.
1.1 Our setting and contribution
For concreteness, we use the terminology of voting as a proxy for any distributed decision-making scenario. A set of voters are called to vote on a set of alternatives through a district-based election. In other words, the set of voters is partitioned into districts and each district holds a local election, following some voting rule. The winners of the local elections are then aggregated into the single winner of the general election. Note that this setting models many scenarios of interest, such as those highlighted in the above discussion.
We are interested in the effect of the distributed nature of the election on the social welfare of the voters (the sum of their valuations for the chosen outcome). Typically, this effect is quantified by the notion of distortion (Procaccia and Rosenschein, 2006), which is defined as the worst-case ratio between the maximum social welfare for any of the outcomes and the social welfare for the outcome chosen through voting. Concretely, we are interested in bounding the distortion of voting rules for district-based elections.
We consider three cases when it comes to the district partition:

symmetric districts, in which every district has the same number of voters and contributes the same weight to the final outcome,

unweighted districts, in which the weight is still the same, but the sizes of the districts may vary, and finally

unrestricted districts, where the sizes and the weights of the districts are unconstrained.
For each of these cases, we show upper and lower bounds on the distortion of voting rules, under standard assumptions.
First, in Section 3, we consider general voting rules (which might have access to the numerical valuations of the voters) and provide distortion guarantees for any voting rule as a function of the worst-case distortion of the voting rule when applied to a single district. As a corollary, we obtain distortion bounds for Range Voting, i.e., the rule that outputs the alternative that maximizes the social welfare, and prove that this mechanism is optimal among all voting rules for the problem. Then, in Section 4, we consider ordinal rules and provide a general lower bound on the distortion of any such rule. For the widely used Plurality voting rule, we provide tight distortion bounds, proving that it is asymptotically the best ordinal voting rule in terms of distortion. In Section 5, we provide experiments based on real data to evaluate the distortion on “average case” and “average worst case” district partitions. Finally, in Section 6, we explore whether districting (i.e., manually partitioning the voters into districts in the best way possible) allows us to recover the winner of Plurality or Range Voting in the election without districts. We conclude with possible avenues for future work in Section 7.
1.2 Related Work
The distortion framework was first proposed by Procaccia and Rosenschein (2006) and subsequently it was adopted by a series of papers; for instance, see (Anshelevich et al., 2018; Anshelevich and Postl, 2017; Benade et al., 2017; Bhaskar et al., 2018; Boutilier et al., 2015; Caragiannis et al., 2017; Filos-Ratsikas and Miltersen, 2014). The original idea of the distortion measure was to quantify the loss in performance due to the lack of information, meaning how well an ordinal voting rule (that has access only to the preference orderings induced by the numerical values of the voters) can approximate the cardinal objective. In our paper, the distortion will be attributed to two factors: always the fact that the election is being done in districts, and possibly also the fact that the voting rules employed are ordinal. Our setting follows closely that of Boutilier et al. (2015) and Caragiannis et al. (2017), with the novelty of introducing district-based elections and measuring their distortion. The worst-case distortion bounds of voting rules in the absence of districts can be found in the aforementioned papers.
The ill effects of district-based elections have been highlighted in a series of related articles, mainly revolving around the issue of gerrymandering (Schuck, 1987), that is, the systematic manipulation of the geographical boundaries of an electoral constituency in favor of a particular political party. The effects of gerrymandering have been studied in the related literature before (Borodin et al., 2018; Cohen-Zemach et al., 2018; Lev and Lewenberg, 2019), but never in relation to the induced distortion of the elections. While our district partitions are not necessarily geographically based, our worst-case bounds capture the potential effects of gerrymandering on the deterioration of the social welfare. Other works on district-based elections and distributed decision-making include (Bachrach et al., 2016; Erdélyi et al., 2015).
Related to our results in Section 6 is the paper by Lewenberg et al. (2017), where the authors explore the effects of districting with respect to the winner of Plurality, when ballot boxes are placed on the real plane, and voters are partitioned into districts based on their nearest ballot box. The extra constraints imposed by the geographical nature of the districts in their setting lead to an NP-hardness result for the districting problem, whereas for our unconstrained (other than being symmetric) districts, we prove that making the Plurality winner the winner of the general election is always possible in polynomial time. In contrast, the problem becomes NP-hard when we are interested in the winner of Range Voting instead of Plurality.
2 Preliminaries
A general election is defined as a tuple , where

is a set of alternatives;

is a set of voters;

is a set of districts, with district containing voters such that (i.e., the districts define a partition of the set of voters);

is a valuation profile for the voters, where contains the valuation of voter for all alternatives, and is the set of all such valuation profiles;

is a set of voting rules (one for each district), where is a map of valuation profiles with voters to alternatives.
For each voter , we denote by the district she belongs to. For each district , a local or district election between its members takes place, and the winner of this election is the alternative that gets elected according to . The outcome of the general election is an alternative
where is equal to if the event is true, and otherwise. In simple words, the winner of the general election is the alternative with the highest weighted approval score, breaking ties arbitrarily. For example, when all weights are , is the alternative that wins the most local elections.
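The aggregation step described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `general_election_winner`, the dictionary-based inputs, and the deterministic tie-breaking by name are assumptions of the sketch, not notation from the paper (which allows arbitrary tie-breaking).

```python
from collections import defaultdict

def general_election_winner(district_winners, weights):
    """Aggregate local winners into the winner of the general election.

    district_winners: dict mapping district id -> locally elected alternative
    weights:          dict mapping district id -> weight of that district
    """
    score = defaultdict(float)
    for district, winner in district_winners.items():
        # Each district contributes its full weight to its local winner.
        score[winner] += weights[district]
    top = max(score.values())
    # The text allows arbitrary tie-breaking; choosing the smallest name
    # is an assumption made only to keep this sketch deterministic.
    return min(a for a, s in score.items() if s == top)

local = {"d1": "a", "d2": "b", "d3": "b"}
print(general_election_winner(local, {"d1": 1, "d2": 1, "d3": 1}))  # b
print(general_election_winner(local, {"d1": 3, "d2": 1, "d3": 1}))  # a
```

With unit weights the alternative that wins the most districts is elected, whereas a single heavy district can overturn that outcome.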
Following the standard convention, we adopt the unit-sum representation of valuations, according to which for every voter . For a given valuation profile , the social welfare of alternative is defined as the total value the agents have for her:
Throughout the paper, we assume that the same voting rule is applied in every local election (possibly for a different number of voters though, depending on how the districts are defined); we denote this voting rule by and also let be the alternative that is chosen by when the voters have the valuation profile .
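As a small illustration of the unit-sum convention and of the social welfare, consider the following sketch. The representation of a valuation profile as a list of dictionaries, and the names `is_unit_sum` and `social_welfare`, are assumptions made here for illustration only.

```python
def is_unit_sum(profile, tol=1e-9):
    """Check that every voter's values over the alternatives sum to 1."""
    return all(abs(sum(v.values()) - 1.0) <= tol for v in profile)

def social_welfare(profile, alternative):
    """Total value that the voters in `profile` have for `alternative`."""
    return sum(v[alternative] for v in profile)

# Two voters, three alternatives; each dictionary is one voter's valuation.
profile = [
    {"a": 0.5, "b": 0.25, "c": 0.25},
    {"a": 0.0, "b": 1.0, "c": 0.0},
]
print(is_unit_sum(profile))          # True
print(social_welfare(profile, "b"))  # 1.25
```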
The distortion of a voting rule in a local election with voters is defined as the worst-case ratio, over all possible valuation profiles of the voters participating in that election, between the maximum social welfare of any alternative and the social welfare of the alternative chosen by the voting rule:
The distortion of a voting rule in a general election is defined as the worst-case ratio, over all possible general elections that use as the voting rule within the districts, between the maximum social welfare of any alternative and the social welfare of the alternative chosen by the general election:
Again, in simple words, the distortion of a voting rule is the worst case over all the possible valuations that voters can have and over all possible ways of partitioning these voters into districts. When , we recover the standard definition of the distortion.
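The distortion is a worst case over all instances, so it cannot be computed by enumeration in general. What can be computed is the welfare ratio of one given general election, which is a lower bound on the distortion of the rule. The following sketch does exactly that; the helper names (`welfare`, `range_voting`, `instance_ratio`), the list-of-dictionaries representation, and the tie-breaking by name are assumptions of the sketch.

```python
from collections import defaultdict

def welfare(voters, a):
    """Social welfare of alternative `a` for the given voters."""
    return sum(v[a] for v in voters)

def range_voting(voters):
    """Local rule used in this example: maximize the district's welfare."""
    return max(sorted(voters[0]), key=lambda a: welfare(voters, a))

def instance_ratio(districts, weights, rule):
    """Welfare ratio (optimal / elected) of ONE general election."""
    score = defaultdict(float)
    for voters, w in zip(districts, weights):
        score[rule(voters)] += w          # local election, then weighting
    top = max(score.values())
    winner = min(a for a in score if score[a] == top)  # assumed tie-break
    everyone = [v for voters in districts for v in voters]
    best = max(welfare(everyone, a) for a in everyone[0])
    return best / welfare(everyone, winner)

# Three districts of two voters each, all with weight 1.
d1 = [{"a": 0.625, "b": 0.375}] * 2   # "a" wins locally
d2 = [{"a": 0.625, "b": 0.375}] * 2   # "a" wins locally
d3 = [{"a": 0.0, "b": 1.0}] * 2       # "b" wins locally
print(instance_ratio([d1, d2, d3], [1, 1, 1], range_voting))  # 1.4
```

Here "a" wins two of the three districts and is elected with welfare 2.5, although "b" has welfare 3.5 over the whole electorate.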
Next, we define some standard properties of voting rules.
Definition 2.1 (Properties of voting rules).
A voting rule is

ordinal, if the outcome only depends on the preference orderings induced by the valuations and not the actual numerical values themselves. Formally, given a valuation profile , let be the ordinal preference profile formed by the values of the agents for the alternatives (assuming some fixed tiebreaking rule). A voting rule is ordinal if for any two valuation profiles and such that , it holds that .

unanimous, if whenever all agents agree on an alternative, that alternative gets elected. Formally, whenever there exists an alternative for whom for all voters and all alternatives , then .

(strictly) Pareto efficient, if whenever all agents agree that an alternative is better than , then cannot be elected instead of . Formally, if for all , then . (Footnote 3: We remark that Pareto efficiency usually requires that there is no other alternative who all voters weakly prefer and who one voter strictly prefers. For our lower bounds however, using the definition of strict Pareto efficiency is sufficient; actually, it makes our bounds even stronger. Also note that when the valuations do not exhibit ties (and therefore the induced preference orderings are strict), the two definitions coincide.)
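The ordinal profile induced by a valuation profile, and the strict domination test behind strict Pareto efficiency, can be sketched as follows. The function names and the tie-breaking by alternative name (standing in for "some fixed tie-breaking rule") are assumptions of this sketch.

```python
def induced_ranking(valuation):
    """Order alternatives by decreasing value; ties are broken by name."""
    return tuple(sorted(valuation, key=lambda a: (-valuation[a], a)))

def ordinal_profile(profile):
    """Preference orderings induced by the voters' numerical values."""
    return [induced_ranking(v) for v in profile]

def strictly_dominated(profile, y):
    """True if some alternative is strictly better than `y` for every voter."""
    return any(all(v[x] > v[y] for v in profile)
               for x in profile[0] if x != y)

# Two different valuation profiles that induce the same ordinal profile:
p1 = [{"a": 0.6, "b": 0.4}, {"a": 0.1, "b": 0.9}]
p2 = [{"a": 0.9, "b": 0.1}, {"a": 0.4, "b": 0.6}]
print(ordinal_profile(p1) == ordinal_profile(p2))  # True

p3 = [{"a": 0.7, "b": 0.3}, {"a": 0.6, "b": 0.4}]
print(strictly_dominated(p3, "b"))  # True
```

An ordinal rule must return the same alternative on `p1` and `p2`, and a strictly Pareto efficient rule may not elect "b" on `p3`.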
Remark.
It is not hard to see that we can assume that the best voting rule in terms of distortion is Pareto efficient, without loss of generality. Indeed, for any voting rule that is not Pareto efficient, we can construct the following Pareto efficient rule : for every input on which outputs a Pareto efficient alternative, outputs the same alternative; for every input on which outputs an alternative that is not Pareto efficient, outputs a maximal Pareto improvement, that is, a Pareto efficient alternative which all voters (weakly) prefer to the alternative chosen by .
Clearly, is Pareto efficient and achieves a social welfare at least as high as . Note also that Pareto efficiency implies unanimity, so we will be using both properties in our proofs without loss of generality.
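The construction in this remark can be sketched as a wrapper around an arbitrary rule. The wrapper name `pareto_fix`, the list-of-dictionaries profile, and the choice of the smallest-named improvement at each step are assumptions of the sketch; the sketch uses the usual (weak) notion of Pareto improvement, in which every voter weakly prefers the new alternative and at least one voter strictly prefers it.

```python
def pareto_fix(rule):
    """Wrap `rule` so that its output is replaced by a Pareto improvement
    until no further improvement exists."""
    def fixed(profile):
        chosen = rule(profile)
        while True:
            better = [
                x for x in profile[0]
                if x != chosen
                and all(v[x] >= v[chosen] for v in profile)
                and any(v[x] > v[chosen] for v in profile)
            ]
            if not better:
                return chosen
            # Every step strictly increases the social welfare, so the
            # loop ends after at most (number of alternatives) steps.
            chosen = min(better)
    return fixed

always_b = lambda profile: "b"            # a deliberately poor rule
profile = [{"a": 0.7, "b": 0.3}, {"a": 0.6, "b": 0.4}]
print(pareto_fix(always_b)(profile))      # a
```

Since every voter weakly prefers the final alternative to the original one, the wrapped rule never achieves a lower social welfare than the rule it wraps.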
Finally, most of the voting rules that are being employed in practice are ordinal, with the notable exception of Range Voting, which is the voting rule that outputs the alternative that maximizes the social welfare.
We consider the following three basic cases for the general elections, depending on the size and the weight of the districts:

Symmetric Elections: all districts consist of voters and have the same weight, i.e., and for each .

Unweighted Elections: all districts have the same weight, but not necessarily the same number of voters, i.e., for each .

Unrestricted Elections: there are no restrictions on the sizes and weights of the districts.
Of course, the class of symmetric elections is a subclass of that of unweighted elections which in turn is a subclass of the class of unrestricted elections.
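The three classes can be told apart from the district sizes and weights alone. The following sketch returns the most specific class an election belongs to; the function name `election_class` is hypothetical, and treating "equal weights" as "all weights identical" (rather than a particular normalized value) is an assumption of the sketch.

```python
def election_class(sizes, weights):
    """Most specific class of a general election.

    sizes:   number of voters in each district
    weights: weight of each district (same order as `sizes`)
    """
    same_weight = len(set(weights)) == 1
    same_size = len(set(sizes)) == 1
    if same_weight and same_size:
        return "symmetric"
    if same_weight:
        return "unweighted"
    return "unrestricted"

print(election_class([5, 5, 5], [1, 1, 1]))  # symmetric
print(election_class([5, 9, 2], [1, 1, 1]))  # unweighted
print(election_class([5, 9, 2], [3, 1, 1]))  # unrestricted
```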
3 The effect of districts for general voting rules
Our aim in this section is to showcase the immediate effect of using districts to distributively aggregate votes. To this end, we present tight bounds on the distortion of all voting rules in a general election. We will first state a general theorem relating the distortion of any general election that uses a voting rule for the local elections, with the distortion of the voting rule.
Theorem 3.1.
Let be a voting rule with . Then, the distortion of in the general election is

for symmetric elections;

for unweighted elections;

for unrestricted elections.
Proof.
We prove the first two parts together, and the third one separately.
Parts (i) and (ii).
Consider a general unweighted election with a set of alternatives, a set of voters, a set of districts such that each district consists of voters (if the election is symmetric, then ) and has weight . Let be the valuation profile consisting of the valuations of all voters for all alternatives. Let be the winner of the election and denote by the set of districts in which wins according to . Then, we have that
(1) 
By the unit-sum assumption, we have that
Since is the winner in the districts of according to , and has distortion at most , it must be the case that for every . Also, we have that . Hence, we obtain
(2) 
Let be the optimal alternative, and let be the set of districts in which is the winner. We split the social welfare of into three parts:
We can now make the following observations:

Since wins in the districts of according to , the first part is at most .

Since the value of each voter in for is by definition at most , the second part is at most .

Since for every district and loses in , the total value of the voters in for cannot exceed , as otherwise, would have distortion larger than . As a result, the third part is at most .

Since is the election winner, and .
Putting all of these together, we upper-bound the social welfare of as follows:
(3) 
Hence, by (1), (2) and (3), we obtain
The proof of part (ii) is now complete. For part (i), we get the desired bound of by simply setting .
Part (iii).
Observe that part (iii) does not follow directly from the proof of part (ii): now that the districts may have arbitrary weights, the number of districts that the election winner wins does not need to be higher than the number of districts in which is the winner. In other words, it might be the case that . However, since , inequality (2) can be simplified to
(4) 
For the optimal alternative we can also simplify our arguments by using the trivial fact that all voters not in districts of have by definition value at most for . Then, we obtain
(5) 
By combining (1), (4) and (5), we finally have that
This completes the proof. ∎
We now turn to concrete voting rules and consider perhaps the most natural such rule: Range Voting (RV).
Definition 3.2 (Range Voting (RV)).
Given a valuation profile with voters, Range Voting elects the alternative that maximizes the social welfare of the voters.
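A minimal sketch of Range Voting on a single electorate is given below. The function name `range_voting`, the list-of-dictionaries profile, and the tie-breaking by alternative name are assumptions made for illustration.

```python
def range_voting(profile):
    """Elect the alternative with the maximum social welfare."""
    alternatives = sorted(profile[0])     # sorted => ties broken by name
    return max(alternatives, key=lambda a: sum(v[a] for v in profile))

profile = [
    {"a": 1.0, "b": 0.0},
    {"a": 0.4, "b": 0.6},
    {"a": 0.4, "b": 0.6},
]
print(range_voting(profile))  # a
```

In this profile "a" has welfare 1.8 against 1.2 for "b", even though two of the three voters rank "b" first.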
Note that the rule is both unanimous and Pareto efficient. Immediately from the definition of the rule and Theorem 3.1, we have the following corollary.
Corollary 3.3.
The distortion of RV in the general election is

for symmetric elections;

for unweighted elections;

for unrestricted elections.
We continue by presenting matching lower bounds on the distortion of any voting rule in a general election. The high-level idea in the proof of the following theorem is that the election winner is chosen arbitrarily among the alternatives with equal weight, which may cause the cardinal information within the districts to be lost.
Theorem 3.4.
The distortion of all voting rules in a general election is at least

for symmetric elections;

for unweighted elections;

for unrestricted elections.
Proof.
We prove the first two parts together and the third one separately.
Parts (i) and (ii).
Consider an unweighted general election with a set of districts such that and district consists of voters for . Suppose that the valuations of the voters are such that there are different district winners . Then, without loss of generality, the election winner is one of these alternatives; let be the winner. Let . We define the following valuation profile :

all voters in district have value for and for every other alternative;

all voters in district have value for and for everyone else;

all voters in district for have value for , for and for everyone else.
Note that since the voting rule is unanimous without loss of generality, the winner of the first district is , the winner of the second district is and the winner of district for is .
The optimal alternative is with
while the winner of the election has
As tends to zero, the ratio becomes
The bounds follow by setting and for unweighted elections, and for symmetric elections.
Part (iii).
For the unrestricted case, consider a general election such that there is a district with weight . Since has so much weight, the winner of this district is also the election winner. Let and be two distinguished alternatives, and . We define the following valuation profile :

all voters in district have value for and for every other alternative;

all voters in each district have value for and for everyone else.
Since the voting rule is unanimous without loss of generality, is the winner in district and is the winner in every other district.
The optimal alternative is with
while the election winner is alternative with
As tends to zero, the ratio becomes
and the proof follows by setting . ∎
4 Ordinal voting rules and Plurality
Although Range Voting is quite natural, its documented drawback is that it requires a very detailed informational structure from the voters, making the elicitation process rather complicated. For this reason, most voting rules that have been applied in practice are ordinal (see Definition 2.1), as such rules present the voters with the much less demanding task of reporting a preference ordering over the alternatives, rather than actual numerical values.
Thus, a very meaningful question, from a practical point of view, is “What is the distortion of ordinal voting rules?” The most widely used such rule is Plurality Voting. Besides its simplicity, the importance of this voting rule also comes from its extensive use in practice. For instance, it is used in national elections in a number of countries, such as the USA and the UK.
Definition 4.1 (Plurality Voting (PV)).
Given a valuation profile and its induced ordinal preference profile , PV elects the alternative with the most first position appearances in , breaking ties arbitrarily.
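Plurality Voting only needs each voter's top choice, which the sketch below derives from the valuations. The function name `plurality`, the list-of-dictionaries profile, and the tie-breaking by alternative name (both for a voter's top choice and for the winner) are assumptions of the sketch.

```python
from collections import Counter

def plurality(profile):
    """Elect the alternative ranked first by the most voters."""
    # Top choice of each voter: highest value, ties broken by name.
    tops = Counter(min(v, key=lambda a: (-v[a], a)) for v in profile)
    most = max(tops.values())
    return min(a for a, c in tops.items() if c == most)

profile = [
    {"a": 1.0, "b": 0.0},
    {"a": 0.4, "b": 0.6},
    {"a": 0.4, "b": 0.6},
]
print(plurality(profile))  # b
```

On this profile PV elects "b" with two first-place votes, although "a" has the larger social welfare (1.8 against 1.2); this is the loss that comes from using only ordinal information.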
It is known that the distortion of Plurality Voting is (Caragiannis et al., 2017). Therefore, if we plug in this number to our general bound in Theorem 3.1, we obtain corresponding upper bounds for PV. However, in the following we obtain much better bounds, taking advantage of the structure of the mechanism; these bounds are actually tight.
Theorem 4.2.
The distortion of PV is exactly

for symmetric elections;

for unweighted elections;

, for unrestricted elections.
Proof.
We prove the upper and the lower bounds separately, starting with the former.
Upper bounds.
Consider a general unweighted election with a set of alternatives, a set of voters, a set of districts such that each district consists of voters and has weight . Let be the valuation profile consisting of the valuations of all voters for all alternatives, which induces the ordinal preference profile . To simplify our discussion, let be the set of voters in district that rank alternative in the first position, and also set .
Let be the winner of the election and denote by the set of districts in which wins according to PV. Then, we have that
(6) 
Since has the plurality of votes in each district , we have that for every , and by the fact that , we obtain that . Similarly, for each agent we have that for every , and by the unit-sum assumption, we obtain that . We also have that . Hence,
(7) 
Let be the optimal alternative, and denote by the set of districts in which is the winner. We split the social welfare of into three parts:
(8) 
We will now bound each term individually. First consider a district . Then, the welfare of the agents in for can be written as
Since is the favourite alternative of every agent , . By definition, the value of every agent for is at most . The value of every agent for can be at most since otherwise would definitely be the favourite alternative of such an agent. Combining these observations, we get
where the second inequality follows by considering the value of all agents in for alternative , while the third inequality follows by the fact that wins by plurality. By summing over all districts in , we can bound the first term of (8) as follows:
(9) 
For the second term of (8), by definition we have that the value of each agent in the districts of for alternative can be at most , and therefore
For the third term of (8), observe that the total value of the agents in a district for must be at most ; otherwise would necessarily be ranked first in strictly more than half of the agents’ preferences and therefore win in the district. Hence,
By substituting the bounds for the three terms of (8), as well as by taking into account the facts that and , we can finally upper-bound the social welfare of as follows:
(10) 
By (6), (7) and (10), we can upper-bound the distortion of PV as follows:
This completes the proof of part (ii). For part (i), we get the desired bound of by simply setting .
For part (iii), since , we simplify inequality (7) to
(11) 
For the optimal alternative we also simplify our arguments by using the trivial fact that all agents not in districts of have by definition value at most for . Then, by also using inequality (9), we obtain
(12) 
By combining (6), (11) and (12), we finally have that
This completes the proof of the upper bounds.
Lower bounds.
We now provide matching lower bounds. For unweighted districts, consider a general election with a set of districts such that , district consists of agents for , and is a multiple of . We enumerate the alternatives as . Suppose that the agent preferences are such that there are different district winners , where and . Then, one of these alternatives is selected as the election winner.
We define the approval votes and the valuation profile of the voters as follows:

The voters in district are split into sets , …, of size each such that the voters of set approve alternative . Since PV is Pareto efficient, we can assume without loss of generality that the winner in this district is . The valuations are such that the voters in set for have value for and , the voters in set have value for all alternatives, and the voters in set have value for .

The voters in district all approve alternative and have value for her.

The voters in district for are split into two sets of equal size such that the voters in the first set approve alternative and the voters in the second set approve . The voters in the first set have value for both and , while the voters in the second set have value for .
The optimal alternative is with
while the winner of the election has
Therefore, the distortion is equal to
The bound follows by selecting and . For part (i), we simply set .
For the unrestricted case, consider a general election with districts such that there is a district with weight . Since has so much weight, the winner of this district is the election winner as well. We enumerate the alternatives as . We define the approval votes and the valuation profile of the agents as follows:

District : voters approve and have value for all alternatives; voters approve and have value for her; voters approve alternative for , and have value for and . We assume without loss of generality that the winner in this district is (since PV is Pareto efficient).

District : all voters approve and have value for her.
The optimal alternative is with
while the election winner is alternative with
Therefore, the distortion is equal to
The proof follows by selecting . ∎
Our next theorem shows that PV is asymptotically the best possible voting rule among all (deterministic) ordinal voting rules.
Theorem 4.3.
The distortion of any ordinal voting rule is

, for symmetric elections;

, for unweighted elections;

, for unrestricted elections.
Proof.
Fix an arbitrary deterministic ordinal voting rule ; as we explained earlier in Section 2, we can assume without loss of generality that is Pareto efficient.
Parts (i) and (ii).
Consider a general election with a set of districts such that , district consists of agents for , and is an integer multiple of . We enumerate the alternatives as and let , . We will construct an ordinal preference profile such that there are different district winners . Then, without loss of generality, one of these alternatives is selected as the winner of the general election.
We define the rankings and the valuation profile of the voters as follows:

The voters in district are partitioned into sets , …, of equal size . The voters in set have the ranking . Since each alternative appears exactly the same number of times in each position and since is Pareto efficient, any alternative can be selected as the winner of ; thus, without loss of generality, we may assume that the winner is . The valuations are such that the voters in set for have value for alternative , while the voters in set have value for all alternatives.

All voters in district rank alternative first and the other alternatives arbitrarily in the remaining positions. Clearly, since is Pareto efficient, is the winner of . The valuations are such that all voters have value for .

For each , the agents in district are partitioned into two sets of equal size. All voters in the first set rank alternative first, alternative second, and then the other alternatives arbitrarily. All voters in the second set rank alternative first, alternative second, and then the other alternatives arbitrarily. Given these rankings and by the fact that is Pareto efficient, the winner of the district is either or . Without loss of generality, we assume that the tie is broken in favour of alternative . The valuations are such that the voters in the first set have value for and , while the voters in the second set have value for .
Given the above valuation profile, the optimal alternative is , while the winner of the election may be . Since
and