In light of the classic impossibility results for axiomatic approaches to social choice and voting [18, 28], a fruitful approach has been to treat voting as an implicit optimization problem of finding the “best” candidate for the population in aggregate [9, 11, 24, 25]. Using this approach, voting systems can be compared based on how much they distort the outcome, in the sense of leading to the election of suboptimal candidates. A particularly natural optimization objective is the sum of distances between voters and the chosen candidate in a suitable metric space [1, 2, 3, 19]. The underlying assumption is that the closer a candidate is to a voter, the more similar their positions on key questions are. Because proximity implies that the voter would benefit from the candidate’s election, voters will rank candidates by increasing distance, a model known as single-peaked preferences [7, 15, 8, 23, 22, 6, 27, 5].
Even in the absence of strategic voting, voting systems can lead to high distortion in this setting, because they typically allow only for communication of ordinal preferences, i.e., rankings of candidates. (Of course, it is also highly questionable that voters would be able to quantify distances in a metric space sufficiently accurately, in particular given that the metric space is primarily a modeling tool rather than an actual concrete object.) In a beautiful piece of recent work, Anshelevich et al. showed that this approach can draw very clear distinctions between voting systems: some voting systems (in particular, Copeland and related systems) have distortion bounded by a small constant, while most others (including Plurality, Veto, $k$-approval, and Borda Count) have unbounded distortion, growing linearly in the number of voters or candidates.
The examples giving bad distortion typically have the property that the candidates are not “representative” of the voters. Anshelevich et al. show more positive results when there are no near-ties for first place in any voter’s ranking. Cheng et al. propose instead a model of representativeness in which the candidates are drawn randomly from the population of voters; under this model, they show smaller constant distortion bounds than the worst-case bounds for majority voting with two candidates. Cheng et al. left as an open question the analysis of the distortion of voting systems for more than two representative candidates.
In the present work, we study the distortion of positional voting systems with representative candidates. Informally (formal definitions of all concepts are given in Section 2), a positional voting system is one in which each voter writes down an ordering of candidates, and the system assigns a score to each candidate based solely on his position in the voter’s ordering. (For consistency, we always use male pronouns for candidates and female pronouns for voters.) The map from positions to scores is known as the scoring rule of the voting system; for $n$ candidates, it is a function assigning a score to each of the $n$ ballot positions. The total score of a candidate is the sum of scores he obtains from all voters, and the winner is the candidate with maximum total score. The most well-known explicitly positional voting system is Borda Count, in which the score decreases linearly with the position. Many other systems are naturally cast in this framework, including Plurality (in which voters give 1 point to their first choice only) and Veto (in which voters give 1 point to all but their last choice).
In analyzing positional voting systems, we assume that voters are not strategic, i.e., they report their true ranking of candidates based on proximity in the metric space. This is in keeping with the line of work on analyzing the distortion of social choice functions, and avoids issues of game-theoretic modeling and equilibrium existence or selection, which are not our focus.
As our main contribution, we characterize when a positional voting system is guaranteed to have constant distortion, regardless of the underlying metric space of voters and candidates, and regardless of the number $n$ of candidates that are drawn from the voter distribution. The characterization relies almost entirely on the “limit voting system.” By normalizing both the scores and the candidate index to lie in $[0,1]$ (we associate the $i$-th out of $n$ candidates with his quantile), we can take a suitable limit $g$ of the scoring functions $g_n$ as $n \to \infty$.
Our main result (Corollary 3.2 in Section 3) states the following: (1) If the limit scoring rule $g$ is not constant on the open interval $(0,1)$, then the voting system has constant distortion. (2) If $g$ is a constant other than 1 on the open interval $(0,1)$, then the voting system does not have constant distortion. The only remaining case is when $g \equiv 1$ on $(0,1)$. In that case, the rate of convergence of the scoring rules $g_n$ to $g$ matters, and a precise characterization is given by Theorem 3.1.
As direct applications of our main result, we obtain that Borda Count and $k$-approval with $k$ linear in the number of candidates have constant distortion for representative candidates; on the other hand, Plurality, Veto, the Nauru Dowdall method (see Section 2), and $k$-approval for constant $k$ have super-constant distortion. In fact, it is easy to adapt the proof of Theorem 3.1 to show that the distortion of Plurality, Veto, and $k$-approval for constant $k$, even with representative candidates, grows at least linearly in the number of candidates.
Our results provide interesting contrasts to the results of Anshelevich et al. Under adversarial candidates, all of the above-mentioned voting rules have distortion growing linearly in the number of voters or candidates; the focus on representative candidates allowed us to distinguish the performance of Borda Count and $k$-approval with linear $k$ from that of the other voting systems. Thus, an analysis in terms of representative candidates allows us to draw distinctions between voting systems which in a worst-case setting seem to be equally bad.
As a by-product of the proof of our main theorem, in Lemma 3.3, we show that every voting system (positional or otherwise) has distortion at most linear in the number of candidates when the candidates are representative. Combined with the lower bound alluded to above, this pins down the distortion of Plurality, Veto, and $k$-approval for constant $k$ with representative candidates as linear in the number of candidates. For Veto, this result also contrasts with the worst-case bound of Anshelevich et al., which showed that the distortion can grow unboundedly even for a fixed number of candidates.
2.1 Voters, Metric Space, and Preferences
The voters/candidates are embedded in a closed metric space equipped with a distance function on pairs of points. The distance captures the dissimilarity in opinions between voters (and candidates): the closer two voters or candidates are, the more similar they are. The distribution of voters in the space is given by a (measurable) density function. We allow for this distribution to have point masses. (Since the continuum model allows for point masses, it subsumes finite sets of voters. Changing all our results to finite or countable voter sets is merely cosmetic.) Unless there is no risk of confusion, we will be careful to distinguish between a location and a specific voter or candidate at that location. We apply the distance function equally to locations/voters/candidates.
We frequently use the standard notion of a ball in a metric space. For balls (and other sets) , we write .
An election is run between candidates according to rules defined in Section 2.3. The candidates are assumed to be representative of the population, in the sense that their locations are drawn i.i.d. from the distribution of voters.
Each voter ranks the candidates by non-decreasing distance from herself in the metric space. Ties are broken arbitrarily, but consistently, meaning that all voters at the same location have the same ranking. (Our results do not depend on specific tie breaking rules.) Each voter or location thus induces a ranking over the candidates. The distance-based ranking assumption means that a candidate who is ranked ahead of another is at least as close to the voter, and that a candidate who is strictly closer to the voter is ranked ahead of the other. As mentioned in the introduction, we assume that voters are not strategic; i.e., they express their true ranking of candidates based on proximity in the metric space.
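As a minimal illustration of distance-based ranking, the following Python sketch orders candidates by non-decreasing distance from a voter. The names `rank_candidates` and `line_distance`, the choice of the real line as metric space, and breaking ties by candidate index are assumptions made for this example; the text only requires that ties be broken consistently.

```python
from typing import Callable, List, Sequence


def rank_candidates(voter: float,
                    candidates: Sequence[float],
                    dist: Callable[[float, float], float]) -> List[int]:
    """Return candidate indices ordered by non-decreasing distance from `voter`."""
    # sorted() is stable, so equidistant candidates keep their index order.
    # Every voter at the same location therefore reports the same ranking.
    return sorted(range(len(candidates)),
                  key=lambda j: dist(voter, candidates[j]))


def line_distance(x: float, y: float) -> float:
    """Hypothetical metric for the example: absolute difference on the real line."""
    return abs(x - y)


# A voter at 0.0 facing candidates at 0.5, -0.2 and 0.5 (candidates 0 and 2 tie).
ranking = rank_candidates(0.0, [0.5, -0.2, 0.5], line_distance)
print(ranking)
```

Because the tie-breaking depends only on the location and the candidate indices, two voters at the same point always produce the same ballot.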
2.2 Social Cost and Distortion
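Candidates are “better” if they are closer to voters on average. The social cost of a candidate (or location) is his average distance to the voters, i.e., the distance to each location integrated against the voter density.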
Among the set of candidates running, one is socially optimal, i.e., has minimum social cost. The overall optimal location is any 1-median of the metric space. (If there are multiple optimal locations, consider one of them fixed arbitrarily.) The overall optimal location always exists, because the metric space is assumed to be closed, and the cost function is continuous and bounded below by 0. Note that it is not necessary that there be any voters located at the overall optimal location.
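The difference between the best running candidate and the overall optimal location can be seen in a small finite example. The sketch below assumes a finite set of weighted voters on the real line (weights summing to 1); the helper names `social_cost` and `best_of` are hypothetical. On the line, some voter location is always a 1-median, so searching over voter locations suffices in this special case.

```python
from typing import Sequence


def social_cost(x: float, voters: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average distance from location x to the voters (weights sum to 1)."""
    return sum(w * abs(x - v) for v, w in zip(voters, weights))


def best_of(locations: Sequence[float],
            voters: Sequence[float],
            weights: Sequence[float]) -> float:
    """Location of minimum social cost among `locations` (the first one on ties)."""
    return min(locations, key=lambda x: social_cost(x, voters, weights))


voters = [0.0, 1.0, 10.0]
weights = [1 / 3, 1 / 3, 1 / 3]
candidates = [0.0, 10.0]  # the voter at 1.0 is not running

best_candidate = best_of(candidates, voters, weights)
optimal_location = best_of(voters, voters, weights)
print(best_candidate, social_cost(best_candidate, voters, weights))
print(optimal_location, social_cost(optimal_location, voters, weights))
```

Here the best running candidate sits at 0.0, while the overall optimum is the median location 1.0, which has strictly smaller cost.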
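Based on the votes, a voting system will determine a winner for the set of candidates, who will often be different from the socially optimal candidate. The distortion measures how much worse the winner is than the optimum: it is the ratio between the social cost of the winner and the social cost of the socially optimal candidate.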
We are interested in the expected distortion of positional voting systems under i.i.d. random candidates, i.e., the expectation of the distortion over the random draw of the candidate set.
Our distortion bounds are achieved by lower-bounding . A particularly useful quantity in this context is the fraction of voters outside a ball of radius around , which we denote by . The following lemma captures some useful simple facts that we use:
For any candidate or location ,
The cost of any candidate or location can be written as
For all , the cost of the optimum location is lower-bounded by
The proof of the first inequality simply applies the triangle inequality under the integral:
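For the second equation, observe that the cost of a candidate or location is the expected distance to a voter drawn from the voter distribution, and the expectation of any non-negative random variable $X$ can be rewritten as $\mathbb{E}[X] = \int_0^\infty \Pr[X \ge r] \, dr$.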
For the third inequality, we apply the previous part with , and lower bound
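The three facts used in this proof can be sanity-checked numerically. The sketch below is one reading of the proof outline, not a transcription of the lemma: it assumes finite weighted voters on the real line and checks (i) that the costs of two locations differ by at most their distance, (ii) the tail-integral formula for the cost, and (iii) a Markov-style lower bound on the cost of the optimum. All function names are hypothetical.

```python
from typing import Sequence


def cost(x: float, voters: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average distance from x to the voters (weights sum to 1)."""
    return sum(w * abs(x - v) for v, w in zip(voters, weights))


def tail_mass(x: float, r: float, voters: Sequence[float], weights: Sequence[float]) -> float:
    """Fraction of voters at distance strictly more than r from x."""
    return sum(w for v, w in zip(voters, weights) if abs(x - v) > r)


def cost_via_tail_integral(x: float, voters: Sequence[float], weights: Sequence[float]) -> float:
    """Integrate the step function r -> tail_mass(x, r) exactly over its breakpoints."""
    total, prev = 0.0, 0.0
    for r in sorted({abs(x - v) for v in voters}):
        # The tail mass is constant on [prev, r), so this piece is a rectangle.
        total += (r - prev) * tail_mass(x, prev, voters, weights)
        prev = r
    return total


voters = [0.0, 1.0, 4.0, 9.0]
weights = [0.25, 0.25, 0.25, 0.25]
points = [-2.0, 0.0, 0.5, 1.0, 3.0, 9.0, 12.0]
radii = [0.0, 0.5, 1.0, 2.0, 3.0, 5.0, 8.0, 10.0]
opt = min(voters, key=lambda x: cost(x, voters, weights))  # a 1-median on the line

fact1 = all(cost(x, voters, weights) <= cost(y, voters, weights) + abs(x - y) + 1e-12
            for x in points for y in points)
fact2 = all(abs(cost_via_tail_integral(x, voters, weights) - cost(x, voters, weights)) < 1e-12
            for x in points)
fact3 = all(cost(opt, voters, weights) >= r * tail_mass(opt, r, voters, weights) - 1e-12
            for r in radii)
print(fact1, fact2, fact3)
```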
2.3 Positional Voting Systems and Scoring Rules
We are interested in positional voting systems. Such systems are based on scoring rules: voters give a ranking of candidates, and with each position is associated a score.
Definition 2.1 (Scoring Rule).
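A scoring rule for $n$ candidates is a non-increasing function from ballot positions to scores in $[0,1]$ that assigns score 1 to the first position and score 0 to the last position.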
Definition 2.2 (Positional Voting System).
A positional voting system is a sequence of scoring rules , one for each number of candidates .
The interpretation of a scoring rule is that if a voter puts a candidate in a given position on her ballot, then the candidate obtains the score associated with that position from her. The total score of a candidate is then the sum of the points he obtains from all voters.
The winning candidate is one with highest total score among the set of candidates; again, ties are broken arbitrarily, and our results do not depend on tie breaking.
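To make the scoring and winner determination concrete, here is a small sketch of a positional election with finite weighted voters on the real line. It is an illustration under stated assumptions, not the paper's formalism: positions are 0-indexed, `score[i]` is the score of position `i`, ties are broken toward the lower candidate index, and the distortion is computed as the winner's social cost divided by that of the best running candidate (assumed to be positive).

```python
from typing import List, Sequence


def rank(voter: float, candidates: Sequence[float]) -> List[int]:
    """Candidate indices by non-decreasing distance; ties go to the lower index."""
    return sorted(range(len(candidates)), key=lambda j: abs(voter - candidates[j]))


def total_scores(voters, weights, candidates, score) -> List[float]:
    """Sum, over voters, of the score each candidate earns from his ballot position."""
    totals = [0.0] * len(candidates)
    for v, w in zip(voters, weights):
        for position, cand in enumerate(rank(v, candidates)):
            totals[cand] += w * score[position]
    return totals


def winner(voters, weights, candidates, score) -> int:
    """Index of a candidate with the highest total score (first one on ties)."""
    totals = total_scores(voters, weights, candidates, score)
    return max(range(len(totals)), key=totals.__getitem__)


def social_cost(x: float, voters, weights) -> float:
    return sum(w * abs(x - v) for v, w in zip(voters, weights))


def distortion(voters, weights, candidates, score) -> float:
    """Cost of the winner divided by the cost of the best running candidate."""
    costs = [social_cost(c, voters, weights) for c in candidates]
    return costs[winner(voters, weights, candidates, score)] / min(costs)


voters = [0.0, 1.0, 2.0]
weights = [0.4, 0.3, 0.3]
candidates = [0.0, 1.0, 2.0]
plurality = [1.0, 0.0, 0.0]   # 1 point for the first choice only
borda = [1.0, 0.5, 0.0]       # normalized, linearly decreasing scores

print(winner(voters, weights, candidates, plurality),
      distortion(voters, weights, candidates, plurality))
print(winner(voters, weights, candidates, borda),
      distortion(voters, weights, candidates, borda))
```

In this instance Plurality elects the candidate at 0.0 because he has the most first-place votes, while the normalized Borda scores elect the central candidate, who is also the socially optimal one.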
The restriction to monotone non-increasing scoring rules is standard when studying positional voting systems. One justification is that in any positional voting system violating this restriction, truth-telling is a dominated strategy, rendering such a system uninteresting for most practical purposes. Given this restriction, the assumption that the first position receives score 1 and the last position receives score 0 is without loss of generality, because a score-based rule is invariant under affine transformations.
Next, we want to capture the notion that a positional voting system is “consistent” as we vary the number of candidates . Intuitively, we want to exclude contrived voting systems such as “If the number of candidates is even, then use Borda Count; otherwise use Plurality voting.” This is captured by the following definition.
Definition 2.3 (Consistency).
Let be a positional voting system with scoring rules . We say that is consistent if there exists a function such that for each rational quantile and accuracy parameter , there exists a threshold such that and for all . We call the limit scoring rule of .
Intuitively, this definition says that the sequence of scoring rules is consistent with a single scoring rule in the limit. Using the fact that is monotone non-increasing for each , it can be shown that is also monotone non-increasing. We note that converges pointwise to in a precise and natural sense. Formally, when is rational, there exists an infinite sequence of integers with , and consistency implies that must equal the limit of for that sequence of values of . Therefore the limit scoring rule is uniquely defined if it exists.
All positional voting systems we are aware of are consistent according to Definition 2.3.
To illustrate the notion of a consistent positional voting system, consider the following examples, encompassing most well-known scoring rules.
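In Plurality voting with $n$ candidates, the first position receives score 1 and all other positions receive score 0. The limit scoring rule is 1 at quantile 0 and 0 at all quantiles $x > 0$.
In Veto voting with $n$ candidates, all positions except the last receive score 1, and the last position receives score 0. The limit scoring rule is 1 at all quantiles $x < 1$ and 0 at quantile 1.
In $k$-approval voting with constant $k$, the first $k$ positions receive score 1, and all other positions receive score 0. The limit scoring rule is 1 at quantile 0 and 0 at all quantiles $x > 0$, i.e., the same as for Plurality voting. (This relies on $k$ being constant, or more generally, $k = o(n)$.)
In $k$-approval voting with linear $k$, there exists a constant fraction such that the positions up to that fraction of the candidates receive score 1, and all larger positions receive score 0. The limit scoring rule is 1 for quantiles below that fraction and 0 for quantiles above it.
The Borda voting rule assigns scores that decrease linearly from 1 for the first position to 0 for the last position (after normalization). The limit scoring rule maps each quantile $x$ to $1 - x$.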
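The examples above can be explored with a short sketch that builds each scoring rule for a given number of candidates and evaluates it at a fixed rational quantile. Several conventions here are assumptions for illustration only: positions are indexed from 0 to n-1, position i corresponds to quantile i/(n-1), and the linear approval rule approves the first floor(3n/4) positions. All function names are hypothetical.

```python
from fractions import Fraction
from typing import Callable, List


def plurality(n: int) -> List[float]:
    return [1.0] + [0.0] * (n - 1)


def veto(n: int) -> List[float]:
    return [1.0] * (n - 1) + [0.0]


def k_approval(n: int, k: int) -> List[float]:
    """Score 1 for the first k positions; needs 1 <= k <= n - 1 to stay normalized."""
    return [1.0 if i < k else 0.0 for i in range(n)]


def linear_approval(n: int, fraction: Fraction = Fraction(3, 4)) -> List[float]:
    """Approve a constant fraction of the candidates (assumed rounding: floor)."""
    k = max(1, min(n - 1, int(fraction * n)))
    return k_approval(n, k)


def borda(n: int) -> List[float]:
    return [1.0 - i / (n - 1) for i in range(n)]


def score_at_quantile(rule: Callable[[int], List[float]], x: Fraction, n: int) -> float:
    """Score of the position whose quantile is x; x * (n - 1) must be an integer."""
    position = x * (n - 1)
    if position.denominator != 1:
        raise ValueError("quantile does not correspond to a position for this n")
    return rule(n)[int(position)]


half = Fraction(1, 2)
sizes = [11, 101, 1001]  # n - 1 is even, so quantile 1/2 is a valid position
rules = {
    "plurality": plurality,
    "veto": veto,
    "3-approval": lambda n: k_approval(n, 3),
    "3n/4-approval": linear_approval,
    "borda": borda,
}
table = {name: [score_at_quantile(rule, half, n) for n in sizes]
         for name, rule in rules.items()}
for name, values in table.items():
    print(name, values)
```

At quantile 1/2 the values settle at 0 for Plurality and 3-approval, at 1 for Veto and for approval of three quarters of the candidates, and at 1/2 for Borda, matching the limit scoring rules described above.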
3 The Main Characterization Result
In this section, we state and prove our main theorem, characterizing positional voting systems with constant distortion.
Theorem 3.1.
Let be a positional voting system with a sequence of scoring rules for . Then, has constant expected distortion if and only if there exist constants and such that for all ,
Corollary 3.2.
Let be a consistent positional voting system with limit scoring rule .
If is not constant on the open interval , then has constant expected distortion.
If is equal to a constant other than 1 on the open interval , then does not have constant expected distortion.
The constant in Theorem 3.1 and Corollary 3.2 depends on , but not on the metric space or the number of candidates. Corollary 3.2 has the advantage of determining constant expected distortion only based on the limit scoring rule . The only case when it does not apply is when for all . In that case, the higher complexity of Theorem 3.1 is indeed necessary to determine whether has constant distortion. Fortunately, Veto voting is the only rule of practical importance for which on , and it is easily analyzed.
Before presenting the proofs, we apply the characterization to the positional voting systems from Example 2.1. Using the limit scoring rules derived in Example 2.1, Corollary 3.2 implies constant expected distortion for Borda Count and $k$-approval with linear $k$, and super-constant expected distortion for Plurality, $k$-approval with constant $k$, and the Dowdall method.
This leaves Veto voting, for which it is easy to apply Theorem 3.1 directly. Because for all , for any constant and large enough , the left-hand side of (4) is 0, while the right-hand side is positive. Hence, (4) can never be satisfied for sufficiently large , implying super-constant expected distortion. The proof easily generalizes to show that when voters can veto candidates, the distortion is super-constant.
In this section, we prove that condition (4) suffices for constant distortion. First, because of the monotonicity of , if (4) holds for , then it also holds for all . Now, the high-level idea of the proof is the following: we define a radius large enough so that the ball around the socially optimal location contains a very large (but still constant) fraction of all voters, such that satisfies (4). If the number of candidates is large enough (a large constant), standard Chernoff bounds ensure that as grows large, most candidates who are running will be from inside . In turn, if many candidates inside are running, all candidates outside are very far down on almost everyone’s ballot, and therefore cannot win. In particular, Inequality (4) implies that the total score of an average candidate in exceeds the maximum possible total score of a candidate outside . This allows us to bound the expected distortion in terms of the cost of .
The case of small is much easier, since we can treat as a constant. In that case, the following lemma is sufficient.
Lemma 3.3.
If candidates are drawn i.i.d. at random from , the expected distortion is at most .
The proof illustrates some of the key ideas that will be used later in the more technical proof for a large number of candidates. We want to bound
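In order for a candidate at distance at least $r$ from the optimal location to win, it is necessary that at least one such candidate be running. By a union bound over the $n$ candidates, the probability of this event is at most $n$ times the fraction of voters at distance at least $r$ from the optimal location, so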
Lower-bounding the cost of the optimum candidate from in terms of the overall best location , the expected distortion is
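The union bound used in this proof can be compared against the exact probability. In the sketch below, `p` stands for the fraction of voters at distance at least some radius from the optimal location and `n` for the number of i.i.d. candidates; both names, and the function names, are chosen for this illustration.

```python
def prob_some_candidate_far(p: float, n: int) -> float:
    """Exact probability that at least one of n i.i.d. candidates lands in a set of mass p."""
    return 1.0 - (1.0 - p) ** n


def union_bound(p: float, n: int) -> float:
    """Union bound over the n candidates, capped at 1."""
    return min(1.0, n * p)


masses = [0.0, 0.001, 0.01, 0.1, 0.5, 1.0]
sizes = [1, 2, 5, 20, 100]
bound_holds = all(prob_some_candidate_far(p, n) <= union_bound(p, n) + 1e-12
                  for p in masses for n in sizes)
print(bound_holds)
print(prob_some_candidate_far(0.01, 20), union_bound(0.01, 20))
```

The bound is nearly tight when `n * p` is small, which is the regime that matters for far-away candidates.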
In preparation for the case of large , we begin with the following technical lemma, which shows that whenever (4) holds, it will also hold when the terms on the left-hand side are “shifted,” and the right-hand side can be increased by a factor of 2 (or, for that matter, any constant factor).
Lemma 3.4.
Assume that there exist and such that (4) holds. Then, there exists such that for all , all , and all integers ,
We now flesh out the details of the construction. By Lemma 3.4, there exists and such that (5) holds for all , all , and all integers . For simplicity of notation, write , and let and . Notice that only depend on , but not on the metric space or number of voters.
Let , so that , and . (Both inequalities hold with equality unless there is a discrete point mass at distance from .)
Consider any and write and , as depicted in Figure 1. When candidates are drawn i.i.d. from , the expected fraction of candidates drawn from outside of is exactly . Let be the event that more than candidates are from outside . Lemma 3.5 uses Chernoff bounds and the definitions of the parameters to show that happens with sufficiently small probability; Lemma 3.6 then shows that unless happens, the distortion is constant.
By the Chernoff bound , applied with and , the probability of is at most
Recall that . Because , we have that ; in particular, , so the probability can be upper-bounded by making the exponent as small as possible. Because , the exponent is lower-bounded by 1. Thus, we obtain that the probability of is at most . ∎
Lemma 3.6.
Whenever does not happen, the winner of the election is from .
Let . Assume that exactly out of the candidates are drawn from . Consider a candidate . We will compare the average number of points of candidates in with the maximum possible number of points of candidate , and show that the former exceeds the latter.
Each voter gives at most one point to . On the other hand, even if ranks all of in the last positions, the total number of points assigned by to is at least . The difference between the number of votes to and the average number of votes to candidates in is thus at most
Because no more than a fraction of voters are strictly outside , the total advantage of over an average candidate in resulting from such voters is at most
Each voter will rank all candidates in (who are at distance at most from her) ahead of all candidates outside (who are at distance strictly more than from her).
Let be such that ranks in position . Then, gets points from . Because ranks all of ahead of , she gives at least points in total to . Hence, the difference in the number of points that gives to an average candidate in and the number of votes that gives to is at least
Because at least a fraction of voters are in , the total advantage of an average candidate in resulting from voters in is at least
We show that , using condition (5). Because is monotone non-increasing, and because , we get that
Because and is monotone, we get that . Hence,
We now wrap up the sufficiency portion of the proof of Theorem 3.1. We distinguish two cases, based on the number of candidates . If , then Lemma 3.3 implies an upper bound of on the expected distortion. Now assume that . Recall that . By Lemmas 3.5 and 3.6, for any , the probability that the election’s winner is outside is at most . The rest of the proof is similar to that of Lemma 3.3. We again use that
To upper-bound , recall that at least a fraction of voters are outside of or on the boundary. Therefore, by Inequality (3), . Substituting this bound, the expected cost of the winning candidate is at most
as depends only on the voting system , but not on the metric space or the number of candidates. This completes the proof of sufficiency.
3.1.1 Proof of Lemma 3.4
Because condition (4) holds for all , we may assume that . Define , and consider any . Fix , and write and . Let be arbitrary. We define
By monotonicity of ,
furthermore, . Therefore, it suffices to show that . By condition (4) and monotonicity of , and because ,
To upper-bound in terms of , we show that the contribution of to the preceding sum is small, and upper-bound in terms of . Because , using the monotonicity of , we can write
Combining the preceding inequalities, we now obtain that
Solving for , and using that the definition of ensures , we now bound
completing the proof.
We will show that the distortion of is not bounded by any constant.
The high-level idea of the construction is as follows: we define two tightly knit clusters and that are far away from each other. contains a large fraction of the population, and thus should in an optimal solution be the one that the winner is chosen from. We will ensure that with probability at least , the winner instead comes from . Because is far from , most of the population then is far from the chosen candidate, giving much worse cost than optimal.
The metrics underlying and are as follows: will essentially provide an “ordering,” meaning that whichever set of candidates is drawn from , all voters in (and essentially all in ) agree on their ordering of the candidates. This will ensure that one candidate from will get a sufficiently large fraction of first-place votes, and will be ranked highly enough by voters from , too. will be based on a large number of discrete locations . Their pairwise distances are chosen i.i.d.: as a result, the rankings of voters are uniformly random, and there is no consensus among voters in on which of their candidates they prefer. Because the vote is thus split, the best candidate from will win instead.
The following parameters (whose values are chosen with foresight) will be used to define the metric space.
Let be any constant; we will construct a metric space and number of candidates for which the distortion is at least .
Let solve the quadratic equation . A solution exists because at , the left-hand side is ; it goes to infinity as , while the right-hand side is a positive constant. is the fraction of voters in the small cluster .
Let denote the fraction of voters in the large cluster .
Let be the distance between the clusters and . (Each cluster will have diameter at most .)
Let satisfy ; such an exists because the left-hand side goes to 0 as . is a high-probability upper bound on the fraction of candidates that will be drawn from .
Let ; this is a lower bound on the number of candidates that ensures that the actual fraction of candidates drawn from is at most with sufficiently high probability.
Let be the whose existence is guaranteed by the assumption (6) (for and ).
Let ; this is the number of discrete locations we construct within the larger cluster .
We now formally define the metric space consisting of two clusters:
The metric space consists of two clusters and . has discrete locations, and has a point mass of on each such location. The total probability mass on is , distributed uniformly over the interval . Locations in are identified by . The distances are defined as follows:
For each distinct pair , the distance is drawn independently uniformly at random from .
For each distinct pair of locations, the distance is defined to be .
Partition into disjoint intervals of length each, one for each permutation of the locations in . For and , let be the position of in , and define the distance between and to be .
Definition 3.1 defines a metric.
Non-negativity, symmetry, and indiscernibles hold by definition. Because all distances within clusters are in , and distances across clusters are more than 2, the triangle inequality holds for all pairs and all pairs .
Because for all and , and distances within or are at least 1, there can be no shorter path than the direct one between any and . Therefore, the triangle inequality is satisfied. ∎
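The key observation behind this argument, namely that arbitrary distances confined to a range whose upper end is at most twice its lower end always satisfy the triangle inequality, can be checked mechanically. The sketch below assumes the range $[1, 2]$ for within-cluster distances, which is inferred from the proof text rather than quoted from the definition; the function names are hypothetical.

```python
import itertools
import random
from typing import List


def random_distance_matrix(m: int, seed: int = 0) -> List[List[float]]:
    """Symmetric matrix with zero diagonal and i.i.d. off-diagonal entries in [1, 2]."""
    rng = random.Random(seed)
    d = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            d[i][j] = d[j][i] = rng.uniform(1.0, 2.0)
    return d


def satisfies_triangle_inequality(d: List[List[float]]) -> bool:
    """Check d[i][k] <= d[i][j] + d[j][k] for every triple of indices."""
    m = len(d)
    return all(d[i][k] <= d[i][j] + d[j][k]
               for i, j, k in itertools.product(range(m), repeat=3))


random_ok = satisfies_triangle_inequality(random_distance_matrix(8, seed=1))
# Counterexample outside the range: a direct distance of 5 with a detour of length 2.
bad = [[0.0, 1.0, 5.0],
       [1.0, 0.0, 1.0],
       [5.0, 1.0, 0.0]]
bad_ok = satisfies_triangle_inequality(bad)
print(random_ok, bad_ok)
```

Any two-step path has length at least 2, while any direct distance is at most 2, so the random draws never need to be coordinated.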
Now consider a (random) set of candidates, drawn i.i.d. from . We are interested in the event that the resulting slate of candidates is highly representative of the voters, in the following sense.
Let be the (random) set of candidates drawn from . Let be defined as the conjunction of the following:
For each location , the set contains at most one candidate from .
At least a fraction of candidates in is drawn from (and thus at most an fraction of candidates are from ).
At least an fraction of candidates in is drawn from .
No pair has .
happens with probability at least .
We upper-bound the probability of the complement of each of the four constituent sub-events.
For each of the at most pairs of candidates, the probability that they are both drawn from the same location is at most . By a union bound over all pairs, the probability that any location has at least two candidates is at most .
Let the random variable be the number of candidates drawn from . Then, , and is a sum of i.i.d. Bernoulli random variables. By the Hoeffding bound , with , we obtain that the fraction of candidates from is too small with probability at most .
The proof is essentially identical to the previous case (except because , the bounds are even stronger), so this event happens with probability at least as well.
Consider all intervals of of length , starting at for some . If with existed, they would both be contained in at least one such interval (because the interval length is twice as long as the distance).
For any of the intervals , the probability that a specific pair of candidates is drawn from is at most . By a union bound over all (at most ) pairs of candidates and all intervals, the probability that any pair is drawn from any interval is at most .
Because , a union bound shows that happens with probability at least . ∎
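The first sub-event is a birthday-style collision bound, and the union bound over pairs can be compared with the exact collision probability. The sketch below makes a simplifying assumption that differs from the construction: all `n` candidates are drawn uniformly from `m` equally likely locations. The function names are hypothetical.

```python
from math import comb


def collision_probability(n: int, m: int) -> float:
    """Exact probability that some location receives at least two of n uniform draws."""
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= max(0.0, 1.0 - i / m)
    return 1.0 - p_distinct


def pair_union_bound(n: int, m: int) -> float:
    """Union bound over all pairs: each pair collides with probability 1/m."""
    return min(1.0, comb(n, 2) / m)


cases = [(2, 10), (5, 100), (23, 365), (10, 10_000), (50, 40)]
bound_holds = all(collision_probability(n, m) <= pair_union_bound(n, m) + 1e-12
                  for n, m in cases)
print(bound_holds)
print(collision_probability(23, 365), pair_union_bound(23, 365))
```

Choosing the number of locations large relative to the square of the number of candidates drives both quantities toward zero, which is the role the number of discrete locations plays in the construction.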
Whenever happens, the winning candidate is from