Approximating Voting Rules from Truncated Ballots

02/13/2020 ∙ by Manel Ayadi, et al. ∙ 0

Classical voting rules assume that ballots are complete preference orders over candidates. However, when the number of candidates is large, it is too costly to ask voters to rank all of them. We propose fixing a rank k, asking all voters to specify their best k candidates, and then considering "top-k approximations" of rules, which take into account only the top-k candidates of each ballot. We consider two measures of the quality of the approximation: the probability of selecting the same winner as the original rule, and the score ratio. We conduct a worst-case study (for the latter measure only) and, for both measures, an average-case study and a study based on real data sets.


1 Introduction

The input of a voting rule is usually a collection of complete rankings over candidates (although there are exceptions, such as approval voting). However, requiring a voter to provide a complete ranking over the whole set of candidates can be difficult and costly in terms of time and cognitive effort. We propose asking voters to report only their top-k candidates, for some (small) fixed value of k (the resulting ballots are then said to be top-k). Not only does this save communication effort, but it is also often easier for a voter to identify the top part of their preference relation than the bottom part. However, this raises the issue of how usual voting rules should be adapted to top-k ballots. Reporting top-k ballots is a specific form of voting with incomplete preferences, and is closely related to vote elicitation. Work on these topics is reviewed in the recent handbook chapter [5]. Existing work on truncated ballots can be classified into two classes according to the type of interaction with the voters:

(i) Interactive elicitation

An interactive elicitation protocol asks voters to expand their truncated ballots in an incremental way, until the outcome of the vote is eventually determined. This line of research starts with Kalech et al. [14], who begin with top-1 ballots, then top-2, and so on, until there is sufficient information to determine the winner. Lu and Boutilier [17, 16] propose an incremental elicitation process using minimax regret to predict the correct winner given partial information. A more general incremental elicitation framework, with more types of elicitation questions, is cost-effective elicitation [25]. Naamani Dery et al. [10] present two elicitation algorithms for finding a winner with little communication between voters.

(ii) Non-interactive elicitation

The central authority elicits the top-k ballots at once, for a fixed value of k, and outputs a winner without requiring voters to provide extra information. One possibility consists in computing possible winners given these truncated ballots: this is the path followed by Baumeister et al. [2] (who also consider double-truncated ballots where each voter ranks some of her top and bottom candidates). Another possibility, which is the one we follow, consists in generalizing the definition of a voting rule so that it takes truncated ballots as input. In this line, Oren et al. [21] analyze top-k voting by assessing the values of k needed to ensure the true winner is found with high probability for specific preference distributions. Skowron et al. [23] use top-k voting as a way to approximate some multiwinner rules. Filmus and Oren [12] study the performance of top-k voting under the impartial culture distribution for the Borda, Harmonic and Copeland rules. They assess the values of k needed to find the true winner with high probability, and they report on numerical experiments showing that under the impartial culture, top-k ballots for reasonably small values of k give accurate results.

Bentert and Skowron [3] focus on top-k approximations of voting rules that are defined via the maximization of a score (positional scoring rules and maximin). They evaluate the quality of the approximation of a voting rule by a top-k rule by the worst-case ratio between the scores, with respect to the original profile, of the winner of the original rule and the winner of the approximate rule. They identify the top-k rules that best approximate positional scoring rules (we give more details in Section 5). Their theoretical analysis is complemented by numerical experiments using profiles generated from different distributions over preferences: they show that for the Borda rule a small value of k is enough to achieve a high approximation guarantee, while maximin needs more information from sufficiently many voters to determine the winner.

Ayadi et al. [1] evaluate the extent to which STV with top-k ballots approximates STV with full information. They show that for small k, top-k ballots are enough to identify the correct winner quite frequently, especially for data taken from real elections. Finally, the recognition of single-peaked top-k profiles is studied in [15], while the computational issues of manipulating rules with top-k profiles are addressed in [20].

Our contribution concerns non-interactive elicitation. We adapt different voting rules to truncated ballots: we define approximations of voting rules which take as input the top-k candidates of each ballot. The question is then: are these approximations good predictors of the original rule? We answer this question by considering two measures: the probability that the approximate rule selects the ‘true’ winner, and the ratio between the scores (for the original rule) of the true winner and the winner of the approximate rule. For the latter measure we give a worst-case theoretical analysis. For both measures we give an empirical study, based on randomly generated profiles and on real-world data. Our findings are that for several common voting rules, both for randomly generated profiles and real data, a very small k suffices.

Our research can be seen as a continuation of Filmus and Oren [12]. We go further on several points: we consider more voting rules; beyond impartial culture, we consider a large scope of distributions; we study score distortion; and we include experiments using real-world data sets. Our work is also closely related to [3], who have obtained related results independently (see Sections 4 and 5 for a discussion).

Our interpretation of top-k ballots is epistemic: the central authority in charge of collecting the votes and computing the outcome ignores the voters’ preferences below the top-k candidates of each voter, and has to cope with it as much as possible. Voters may very well have a complete preference order in their head (although this need not be the case), but they will simply not be asked to report it.

Section 2 gives some background. Section 3 defines top-k approximations of different voting rules. Section 4 analyses empirically the probability that approximate rules select the true winner. Section 5 analyses score distortion, theoretically and empirically.

2 Preliminaries

An election is a triple $E = (N, A, P)$ where $N = \{1, \dots, n\}$ is the set of voters; $A$ is the set of candidates, with $|A| = m$; and $P = (\succ_1, \dots, \succ_n)$ is the preference profile of the voters in $N$, where for each $i \in N$, $\succ_i$ is a linear order over $A$. $\mathcal{P}_m$ is the set of all profiles over $m$ alternatives (for varying $n$).

Given a profile $P$, $N_P(a, b)$ is the number of voters who prefer $a$ to $b$ in $P$. The majority graph $M(P)$ is the graph whose set of vertices is the set of candidates $A$ and in which, for all $a, b \in A$, there is a directed edge from $a$ to $b$ (denoted by $a \to b$) in $M(P)$ if $N_P(a, b) > N_P(b, a)$.

A resolute voting rule is a function $f : \mathcal{P}_m \to A$. Resolute rules are typically obtained by composing an irresolute rule (mapping an election to a non-empty subset of candidates, called co-winners) with a tie-breaking mechanism.

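A positional scoring rule (PSR) $f_s$ is defined by a non-negative vector $s = (s_1, \dots, s_m)$ such that $s_1 \geq \dots \geq s_m$ and $s_1 > s_m$. Each candidate receives $s_j$ points from each voter who ranks her in the $j$-th position, and the score of a candidate is the total number of points she receives from all voters, i.e., $S(x) = \sum_{i \in N} s_{r_i(x)}$, where $r_i(x)$ denotes the position of $x$ in the ranking of voter $i$. The winner is the candidate with the highest total score. Examples of scoring rules are the Borda and Harmonic rules, with $s_{\text{Borda}} = (m-1, m-2, \dots, 0)$ and $s_{\text{Harmonic}} = (1, 1/2, \dots, 1/m)$.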
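The following minimal Python sketch illustrates how a positional scoring rule can be computed. The function names, the tuple encoding of rankings and the alphabetical tie-breaking are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction

def borda_vector(m):
    """Borda scoring vector (m-1, m-2, ..., 0)."""
    return [m - 1 - j for j in range(m)]

def harmonic_vector(m):
    """Harmonic scoring vector (1, 1/2, ..., 1/m), kept exact with Fraction."""
    return [Fraction(1, j + 1) for j in range(m)]

def psr_scores(profile, vector):
    """Total score of each candidate; profile is a list of complete rankings."""
    scores = {c: 0 for c in profile[0]}
    for ranking in profile:
        for position, candidate in enumerate(ranking):
            scores[candidate] += vector[position]
    return scores

def psr_winner(profile, vector):
    """Highest score wins; ties go to the alphabetically first candidate (assumption)."""
    scores = psr_scores(profile, vector)
    return min(scores, key=lambda c: (-scores[c], c))

# Three voters rank a > b > c, two voters rank b > c > a.
profile = [("a", "b", "c")] * 3 + [("b", "c", "a")] * 2
borda_winner = psr_winner(profile, borda_vector(3))        # "b": scores a=6, b=7, c=2
harmonic_winner = psr_winner(profile, harmonic_vector(3))  # "a": scores a=11/3, b=7/2, c=2
```

On this small profile the two rules disagree: Harmonic puts more weight on first positions and therefore behaves more like plurality than Borda does.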

We now define three pairwise comparison rules.

The Copeland rule outputs the candidate maximizing the Copeland score, where the Copeland score of $x$ is the number of candidates $y$ with $x \to y$ in the majority graph $M(P)$, plus half the number of candidates $y \neq x$ with no edge between $x$ and $y$ in $M(P)$.

The Ranked Pairs (RP) rule proceeds by ranking all pairs of candidates $(a, b)$ according to $N_P(a, b)$, the number of voters who prefer $a$ to $b$ (using tie-breaking when necessary); starting from an empty graph over $A$, it then considers all pairs in the described order and includes a pair in the graph if and only if it does not create a cycle in it. At the end of the process, the graph is a complete ranking, whose top element is the winner.
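The Ranked Pairs procedure described above can be sketched as follows. This is a hedged illustration only: the helper names are hypothetical, and ties between pairs are broken alphabetically, which is an assumption rather than the tie-breaking used in the paper.

```python
def pairwise_counts(profile):
    """counts[(a, b)] = number of voters ranking a above b."""
    candidates = sorted(profile[0])
    counts = {(a, b): 0 for a in candidates for b in candidates if a != b}
    for ranking in profile:
        position = {c: i for i, c in enumerate(ranking)}
        for a, b in counts:
            if position[a] < position[b]:
                counts[(a, b)] += 1
    return candidates, counts

def reaches(graph, source, target):
    """Depth-first search: is there a directed path from source to target?"""
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return False

def ranked_pairs_winner(profile):
    candidates, counts = pairwise_counts(profile)
    # Strongest pairs first; alphabetical order breaks ties (assumption).
    ordered_pairs = sorted(counts, key=lambda pair: (-counts[pair], pair))
    graph = {c: set() for c in candidates}
    for a, b in ordered_pairs:
        # Adding a -> b creates a cycle exactly when b already reaches a.
        if not reaches(graph, b, a):
            graph[a].add(b)
    # The final graph is an acyclic tournament; its top element has no incoming edge.
    beaten = set().union(*graph.values())
    return next(c for c in candidates if c not in beaten)

# A profile with a majority cycle: a beats b (6-3), b beats c (7-2), c beats a (5-4).
cycle_profile = [("a", "b", "c")] * 4 + [("b", "c", "a")] * 3 + [("c", "a", "b")] * 2
rp_winner = ranked_pairs_winner(cycle_profile)  # "a": the weakest edge c -> a is skipped
```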

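The maximin rule outputs the candidates $x$ that maximize $\min_{y \neq x} N_P(x, y)$, i.e., the smallest number of voters preferring $x$ to any other candidate $y$.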
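As an illustration of the Copeland and maximin definitions, here is a short sketch built on pairwise counts. Function names and the alphabetical tie-breaking are assumptions made for the example.

```python
def pairwise_counts(profile):
    """counts[(a, b)] = number of voters ranking a above b."""
    candidates = sorted(profile[0])
    counts = {(a, b): 0 for a in candidates for b in candidates if a != b}
    for ranking in profile:
        position = {c: i for i, c in enumerate(ranking)}
        for a, b in counts:
            if position[a] < position[b]:
                counts[(a, b)] += 1
    return candidates, counts

def copeland_winner(profile):
    candidates, counts = pairwise_counts(profile)

    def score(x):
        others = [y for y in candidates if y != x]
        wins = sum(counts[(x, y)] > counts[(y, x)] for y in others)
        ties = sum(counts[(x, y)] == counts[(y, x)] for y in others)
        return wins + 0.5 * ties  # a missing edge (pairwise tie) counts for one half

    return min(candidates, key=lambda x: (-score(x), x))

def maximin_winner(profile):
    candidates, counts = pairwise_counts(profile)

    def score(x):
        # Worst pairwise support of x against any opponent.
        return min(counts[(x, y)] for y in candidates if y != x)

    return min(candidates, key=lambda x: (-score(x), x))

# Three voters rank a > b > c, two voters rank b > c > a.
profile = [("a", "b", "c")] * 3 + [("b", "c", "a")] * 2
copeland = copeland_winner(profile)  # "a": beats b and c by 3 votes to 2
maximin = maximin_winner(profile)    # "a": worst pairwise support is 3
```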

For the experiments using randomly generated profiles, we use the Mallows $\phi$-model [18]. It is a (realistic) family of distributions over rankings, parametrized by a modal or reference ranking $\sigma$ and a dispersion parameter $\phi \in (0, 1]$: $P(r; \sigma, \phi) = \frac{1}{Z} \phi^{d(r, \sigma)}$, where $r$ is any ranking, $d$ is the Kendall tau distance and $Z$ is a normalization constant. With small values of $\phi$, the mass is concentrated around $\sigma$, while $\phi = 1$ gives the uniform distribution, known as Impartial Culture (IC), where all profiles are equiprobable.
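One common way to draw rankings from a Mallows model is the repeated insertion method. The sketch below assumes that method (the paper does not say which sampler was used) and uses hypothetical function names.

```python
import random

def sample_mallows(reference, phi, rng):
    """Draw one ranking from a Mallows model by repeated insertion.

    The i-th candidate of the reference ranking (0-based) is inserted at
    index j <= i with probability proportional to phi ** (i - j), since that
    insertion creates exactly i - j inversions with respect to the reference.
    """
    ranking = []
    for i, candidate in enumerate(reference):
        weights = [phi ** (i - j) for j in range(i + 1)]
        j = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(j, candidate)
    return tuple(ranking)

def kendall_tau(r1, r2):
    """Number of candidate pairs ordered differently in r1 and r2."""
    pos = {c: i for i, c in enumerate(r2)}
    return sum(
        1
        for i in range(len(r1))
        for j in range(i + 1, len(r1))
        if pos[r1[i]] > pos[r1[j]]
    )

rng = random.Random(42)  # fixed seed, for reproducibility only
reference = ("a", "b", "c", "d", "e")
profile = [sample_mallows(reference, 0.8, rng) for _ in range(5)]
```

With `phi = 1.0` all insertion positions are equally likely, which corresponds to Impartial Culture; as `phi` decreases, sampled rankings stay closer to the reference ranking in Kendall tau distance.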

3 Approximating Voting Rules from Truncated Ballots

Given a rank $k$, a top-k election is a triple $E' = (N, A, R)$ where $N$ and $A$ are as before, and $R = (\succ_1^k, \dots, \succ_n^k)$, where each $\succ_i^k$ is a ranking of $k$ out of the $m$ candidates in $A$. $R$ is called a top-k profile. If $P$ is a complete profile, $\succ_i^k$ is the top-k truncation of $\succ_i$ (i.e., the best $k$ candidates, ranked as in $\succ_i$), and $P_k = (\succ_1^k, \dots, \succ_n^k)$ is the top-k profile induced from $P$ and $k$. A top-k (resolute) voting rule is a function $f_k$ that maps each top-k election to a candidate in $A$. We sometimes apply a top-k rule to a complete profile, with $f_k(P) = f_k(P_k)$. We now define several top-k rules.
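The truncation step, and the convention of applying a top-k rule to a complete profile by truncating it first, can be sketched as follows. All names are hypothetical, and plurality is used only as the simplest example of a rule defined on top-1 ballots.

```python
def truncate_ballot(ranking, k):
    """Keep the k best candidates of a complete ranking, in the same order."""
    return tuple(ranking[:k])

def truncate_profile(profile, k):
    """Top-k profile induced from a complete profile."""
    return [truncate_ballot(ranking, k) for ranking in profile]

def plurality_top1(top1_profile, candidates):
    """A rule defined directly on top-1 ballots (alphabetical tie-breaking, assumption)."""
    counts = {c: 0 for c in candidates}
    for ballot in top1_profile:
        counts[ballot[0]] += 1
    return min(sorted(candidates), key=lambda c: -counts[c])

def apply_topk_rule(rule, profile, k):
    """Apply a top-k rule to a complete profile by truncating it first."""
    candidates = sorted(profile[0])
    return rule(truncate_profile(profile, k), candidates)

profile = [("a", "b", "c"), ("c", "a", "b"), ("c", "b", "a")]
top2 = truncate_profile(profile, 2)                   # [("a", "b"), ("c", "a"), ("c", "b")]
winner = apply_topk_rule(plurality_top1, profile, 1)  # "c"
```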

3.1 Borda and Positional Scoring Rules

Definition 1.

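A top-k PSR $f_k^{s'}$ is defined by a scoring vector $s' = (s'_1, \dots, s'_k, s^*)$ such that $s'_1 \geq \dots \geq s'_k \geq s^*$ and $s'_1 > s^*$. Each candidate in a top-k vote receives $s'_j$ points from each voter who ranks her in the $j$-th position. A non-ranked candidate gets $s^*$ points. The winner is the candidate with the highest total score.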

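When starting from a specific PSR for complete ballots, defined by scoring vector $s = (s_1, \dots, s_m)$, two choices of $s'$ particularly make sense:

  • zero score: $s' = (s_1, \dots, s_k, 0)$

  • average score: $s' = \left(s_1, \dots, s_k, \frac{1}{m-k} \sum_{j=k+1}^{m} s_j\right)$

We denote the corresponding approximate rules as $f_k^{0}$ and $f_k^{av}$. For the Borda rule, $\text{Borda}_k^{av}$ is known under the name average score modified Borda Count [8, 13], while $\text{Borda}_k^{0}$ is known under the name modified Borda Count [11]. In the experiments we report on only one of these two variants, as the other gives very similar results.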
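The two choices of score for non-ranked candidates can be illustrated with the sketch below. It assumes that the "average score" is the mean of the original scores of positions k+1 to m, and that k < m; the function names are hypothetical.

```python
from fractions import Fraction

def topk_vector(vector, k, mode):
    """Return (scores of the k ranked positions, score of a non-ranked candidate)."""
    m = len(vector)
    if mode == "zero":
        star = 0
    elif mode == "average":
        # Assumption: mean of the original scores of the m - k unranked positions.
        star = Fraction(sum(vector[k:]), m - k)
    else:
        raise ValueError("mode must be 'zero' or 'average'")
    return list(vector[:k]), star

def topk_psr_scores(topk_profile, candidates, head, star):
    """Scores under a top-k PSR: ranked candidates get head[j], others get star."""
    scores = {c: 0 for c in candidates}
    for ballot in topk_profile:
        ranked = set(ballot)
        for position, candidate in enumerate(ballot):
            scores[candidate] += head[position]
        for candidate in candidates:
            if candidate not in ranked:
                scores[candidate] += star
    return scores

candidates = ["a", "b", "c", "d"]
borda = [3, 2, 1, 0]
top2_profile = [("a", "b"), ("a", "c"), ("b", "d")]

head0, star0 = topk_vector(borda, 2, "zero")         # ([3, 2], 0)
head_av, star_av = topk_vector(borda, 2, "average")  # ([3, 2], 1/2)
zero_scores = topk_psr_scores(top2_profile, candidates, head0, star0)
average_scores = topk_psr_scores(top2_profile, candidates, head_av, star_av)
```

On this profile both variants elect `a`; they differ only in how much credit each unranked candidate receives per ballot.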

Young [24] characterized positional scoring rules by these four properties, which we describe informally (for resolute rules):

  • Neutrality: all candidates are treated equally

  • Anonymity: all voters are treated equally

  • Reinforcement: if $P$ and $Q$ are two profiles (on disjoint electorates) and $x$ is the winner for $P$ and the winner for $Q$, then it is also the winner for $P \cup Q$.

  • Continuity: if $P$ and $Q$ are two profiles and $x$ is the winner for $P$ but not for $Q$, adding sufficiently many copies of the votes of $P$ to $Q$ leads to the election of $x$.

A voting rule $f$ is a PSR if and only if it satisfies neutrality, anonymity, reinforcement and continuity [24].

These four properties still make sense for truncated ballots. It is not difficult to generalize Young’s result to top-k PSRs:

Theorem 1.

A top-k voting rule is a top-k PSR if and only if it satisfies neutrality, anonymity, reinforcement, and continuity.

Proof.

The left-to-right direction is obvious. For the right-to-left direction, let us first define the top-k-only property: a standard voting rule $f$ is top-k-only if for any two complete profiles $P, P'$, if $P_k = P'_k$, then $f(P) = f(P')$. Then (1) a positional scoring rule $f_s$ is top-k-only if and only if $s_{k+1} = \dots = s_m$ (if this equality is not satisfied, then it is easy to construct two profiles $P, P'$ such that $P_k = P'_k$ and $f_s(P) \neq f_s(P')$). Now, assume $f_k$ is a top-k rule satisfying neutrality, anonymity, reinforcement, and continuity. Let $f$ be the standard voting rule defined by $f(P) = f_k(P_k)$. Clearly, $f$ also satisfies neutrality, anonymity, reinforcement, and continuity, and due to Young’s characterization result, $f$ is a PSR, associated with some vector $s$. Because $f$ is also top-k-only, using (1) we have $s_{k+1} = \dots = s_m$; therefore, $f_k$ is a top-k PSR. ∎
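To make step (1) of the proof concrete, the following sketch exhibits two complete profiles with the same top-1 truncation but different Borda winners, so that Borda (whose scores for positions 2 and 3 differ when there are three candidates) is not top-1-only. The profiles and helper names are illustrative assumptions, with alphabetical tie-breaking.

```python
def borda_winner(profile):
    """Borda winner of a complete profile (alphabetical tie-breaking, assumption)."""
    m = len(profile[0])
    scores = {c: 0 for c in profile[0]}
    for ranking in profile:
        for position, candidate in enumerate(ranking):
            scores[candidate] += m - 1 - position
    return min(scores, key=lambda c: (-scores[c], c))

def truncate_profile(profile, k):
    return [tuple(ranking[:k]) for ranking in profile]

# Both profiles have the same top candidates (a, a, b, b, c) but differ below the top.
p1 = [("a", "b", "c")] * 2 + [("b", "c", "a")] * 2 + [("c", "a", "b")]
p2 = [("a", "c", "b")] * 2 + [("b", "a", "c")] * 2 + [("c", "a", "b")]

same_truncation = truncate_profile(p1, 1) == truncate_profile(p2, 1)  # True
winner_p1 = borda_winner(p1)  # "b": scores a=5, b=6, c=4
winner_p2 = borda_winner(p2)  # "a": scores a=7, b=4, c=4
```

Any rule that sees only the top-1 ballots must return the same candidate on both profiles, so it cannot agree with Borda on both.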

3.2 Rules Based on Pairwise Comparisons

Given a truncated ballot $\succ_i^k$ and two candidates $a, b$, we say that $a$ dominates $b$ in $\succ_i^k$, denoted by $a >_i^k b$, if one of these two conditions holds: (1) $a$ and $b$ are both listed in $\succ_i^k$, and $a \succ_i^k b$; (2) $a$ is listed in $\succ_i^k$, and $b$ is not.

For instance, for $A = \{a, b, c, d\}$, $k = 2$, and $\succ_i^k = (a \succ b)$, $a$ dominates $b$, both $a$ and $b$ dominate $c$ and $d$, but $c$ and $d$ remain incomparable in $\succ_i^k$. Now, the notions of pairwise comparison and majority graph are extended to top-k truncated profiles in a straightforward way:

Definition 2.

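Given a top-k profile $R$, $N_R(a, b)$ is the number of voters in $R$ for whom $a$ dominates $b$. The top-k majority graph $M_k(R)$ induced by $R$ is the graph whose set of vertices is the set of candidates and in which there is a directed edge from $a$ to $b$ if $N_R(a, b) > N_R(b, a)$.

The top-k rules $\text{Copeland}_k$, $\text{Maximin}_k$ and $\text{RP}_k$ are defined exactly as their standard counterparts, but starting from the top-k pairwise comparisons and majority graph instead of the standard ones. Note that with complete ballots each of these rules coincides with its standard counterpart, and that for $k = 1$ each of them (like all rules we consider) coincides with plurality.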
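The domination relation and the resulting top-k pairwise counts can be sketched as follows. The code assumes that an edge from a to b requires strictly more voters for whom a dominates b than voters for whom b dominates a, and it breaks ties alphabetically; both are assumptions, as are the function names.

```python
def dominates(ballot, a, b):
    """a dominates b in a truncated ballot: a is listed and b is either unlisted or ranked lower."""
    if a not in ballot:
        return False
    if b not in ballot:
        return True
    return ballot.index(a) < ballot.index(b)

def topk_pairwise_counts(topk_profile, candidates):
    """counts[(a, b)] = number of truncated ballots in which a dominates b."""
    counts = {(a, b): 0 for a in candidates for b in candidates if a != b}
    for ballot in topk_profile:
        for a, b in counts:
            if dominates(ballot, a, b):
                counts[(a, b)] += 1
    return counts

def topk_copeland_winner(topk_profile, candidates):
    counts = topk_pairwise_counts(topk_profile, candidates)

    def score(x):
        others = [y for y in candidates if y != x]
        wins = sum(counts[(x, y)] > counts[(y, x)] for y in others)
        ties = sum(counts[(x, y)] == counts[(y, x)] for y in others)
        return wins + 0.5 * ties

    return min(sorted(candidates), key=lambda x: (-score(x), x))

def topk_maximin_winner(topk_profile, candidates):
    counts = topk_pairwise_counts(topk_profile, candidates)

    def score(x):
        return min(counts[(x, y)] for y in candidates if y != x)

    return min(sorted(candidates), key=lambda x: (-score(x), x))

candidates = ["a", "b", "c", "d"]
top2_profile = [("a", "b")] * 3 + [("b", "c")] * 2 + [("c", "a")]
counts = topk_pairwise_counts(top2_profile, candidates)
copeland_k = topk_copeland_winner(top2_profile, candidates)  # "a"
maximin_k = topk_maximin_winner(top2_profile, candidates)    # "a"
```

In this example `a` and `c` are tied (3 dominations each way), so no edge links them and each receives half a Copeland point for that pair.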

Example 1.

Let us consider this 62-voter profile: 20 votes , 10 votes , 15 votes and 17 votes: .

Figure 1: top-k approximations of Copeland and Maximin

Fig. 1 (a) shows the top- majority graph and the Copeland winner for , and Fig. 1 (b) shows the top- pairwise majority matrix and the winner for . In both cases, the winner for (resp. ) is (resp. ). For RP, the winner under for is the same as the winner under since the k-truncated majority graph does not create cycles.

4 Probability of Selecting the True Winner

The first way of measuring the quality of the top-k approximations is to determine the probability that they output the ‘true winner’, that is, the winner of the original voting rule, under various distributions (Subsection 4.1) and for real-world data (Subsection 4.2). In both cases, the procedure is similar: given a voting rule $f$, we consider many profiles, and for each profile $P$ we compare $f(P)$ to $f_k(P)$ for each $k$. The difference between Subsections 4.1 and 4.2 is that in the former we randomly draw profiles according to a given distribution, while in the latter we draw a profile by selecting votes at random from the database. We include in our experiments the rule $\text{STV}_k$ defined by Ayadi et al. [1], which takes top-k ballots as input, and we compare it to our truncated rules. $\text{STV}_k$ proceeds as follows: in each round the candidate with the smallest number of votes is eliminated (using tie-breaking when necessary); if all candidates ranked in a ballot have been eliminated, that ballot is ‘exhausted’ and ignored during further counting.
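The elimination procedure for truncated STV described above can be sketched as follows. This is a simplified illustration, not the implementation of [1]: it eliminates candidates until one remains, and among tied candidates it eliminates the alphabetically last one, which is an assumed tie-breaking rule.

```python
def stv_topk_winner(topk_profile, candidates):
    """STV on truncated ballots; ballots whose ranked candidates are all eliminated are ignored."""
    remaining = sorted(candidates)
    while len(remaining) > 1:
        tally = {c: 0 for c in remaining}
        for ballot in topk_profile:
            # A ballot counts for its best-ranked candidate still in the race.
            for candidate in ballot:
                if candidate in tally:
                    tally[candidate] += 1
                    break
            # If the loop finds nobody, the ballot is exhausted and adds nothing.
        # Fewest votes is eliminated; among ties, the alphabetically last (assumption).
        loser = min(reversed(remaining), key=lambda c: tally[c])
        remaining.remove(loser)
    return remaining[0]

candidates = ["a", "b", "c", "d"]
top2_profile = [("a", "b")] * 5 + [("b", "d")] * 3 + [("c", "b")] * 2 + [("d", "c")] * 2
stv_winner = stv_topk_winner(top2_profile, candidates)  # "a"
```

In this run `d` is eliminated first and its two ballots transfer to `c`; `b` is eliminated next, and the three ballots `(b, d)` become exhausted because both of their ranked candidates are gone; `a` then beats `c` by 5 votes to 4.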

4.1 Experiments Using Mallows Model

Here we follow the research direction initiated by Filmus and Oren [12], but we consider more rules, and beyond Impartial Culture we also consider correlated distributions within the Mallows model. For each experiment we draw 1000 random preference profiles. In the first set of experiments, we fix the number of candidates $m$, we let $n$ and $\phi$ vary, and we measure the accuracy of the approximate rule for $k = 1$ and $k = 2$. Results are reported in Table 1. Note that for $k = 1$, our results can be viewed as answering the question: with which probability does the true winner with respect to the chosen rule coincide with the plurality winner?
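The experimental loop can be sketched as below for one rule, Borda with the average-score truncation. The sampler (repeated insertion), the integer encoding of candidates, the tie-breaking by lowest index and the parameter values in the usage line are all assumptions chosen for illustration; they are not the settings of the paper.

```python
import random
from fractions import Fraction

def sample_mallows(m, phi, rng):
    """Repeated insertion sampler; the reference ranking is 0, 1, ..., m-1."""
    ranking = []
    for i in range(m):
        weights = [phi ** (i - j) for j in range(i + 1)]
        j = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(j, i)
    return ranking

def borda_winner(profile, m):
    scores = [0] * m
    for ranking in profile:
        for position, candidate in enumerate(ranking):
            scores[candidate] += m - 1 - position
    return max(range(m), key=lambda c: (scores[c], -c))  # ties: lowest index

def topk_borda_av_winner(profile, m, k):
    """Borda restricted to the top k positions; unranked candidates get the average remaining score."""
    star = Fraction(sum(range(m - k)), m - k)  # mean of the Borda scores 0, ..., m-k-1 (needs k < m)
    scores = [Fraction(0)] * m
    for ranking in profile:
        for position, candidate in enumerate(ranking):
            scores[candidate] += (m - 1 - position) if position < k else star
    return max(range(m), key=lambda c: (scores[c], -c))

def success_rate(m, n, phi, k, trials, seed):
    """Fraction of sampled profiles on which the top-k rule elects the true Borda winner."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        profile = [sample_mallows(m, phi, rng) for _ in range(n)]
        hits += borda_winner(profile, m) == topk_borda_av_winner(profile, m, k)
    return hits / trials

rate = success_rate(m=5, n=100, phi=0.8, k=1, trials=200, seed=0)
```

With `k = m - 1` the truncated ballots carry the full rankings, so the estimated success rate is exactly 1; smaller `k` and larger `phi` make agreement less likely.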

Rule     | φ   | k=1: n=100  n=200  n=300  n=400  n=500 | k=2: n=100  n=200  n=300  n=400  n=500
Borda    | 0.7 |      0.902  0.958  0.986  0.992  1.000 |      0.951  0.980  0.992  1.000  1.000
Borda    | 0.8 |      0.770  0.855  0.900  0.940  0.963 |      0.853  0.913  0.956  0.972  0.986
Borda    | 0.9 |      0.588  0.694  0.685  0.718  0.771 |      0.772  0.805  0.827  0.846  0.873
Borda    | 1.0 |      0.434  0.445  0.424  0.422  0.397 |      0.576  0.560  0.586  0.598  0.584
Copeland | 0.7 |      0.908  0.968  0.991  0.994  1.000 |      0.947  0.990  1.000  1.000  1.000
Copeland | 0.8 |      0.736  0.847  0.891  0.934  0.949 |      0.822  0.904  0.952  0.984  0.982
Copeland | 0.9 |      0.497  0.567  0.655  0.684  0.726 |      0.620  0.690  0.770  0.805  0.838
Copeland | 1.0 |      0.325  0.332  0.323  0.343  0.319 |      0.458  0.432  0.450  0.442  0.425
Maximin  | 0.7 |      0.908  0.969  0.986  0.990  1.000 |      0.968  0.991  1.000  1.000  1.000
Maximin  | 0.8 |      0.787  0.856  0.915  0.939  0.955 |      0.872  0.934  0.961  0.976  0.977
Maximin  | 0.9 |      0.570  0.633  0.691  0.717  0.748 |      0.735  0.760  0.794  0.838  0.869
Maximin  | 1.0 |      0.415  0.400  0.423  0.393  0.391 |      0.520  0.532  0.544  0.545  0.525
Harmonic | 0.7 |      0.941  0.986  0.996  1.000  1.000 |      0.980  0.992  1.000  1.000  1.000
Harmonic | 0.8 |      0.895  0.916  0.958  0.959  0.968 |      0.958  0.974  0.987  0.988  0.996
Harmonic | 0.9 |      0.805  0.808  0.830  0.866  0.863 |      0.895  0.921  0.934  0.939  0.952
Harmonic | 1.0 |      0.725  0.742  0.740  0.697  0.737 |      0.872  0.867  0.859  0.861  0.859
RP       | 0.7 |      0.926  0.972  0.995  0.995  1.000 |      0.963  0.994  1.000  1.000  1.000
RP       | 0.8 |      0.778  0.856  0.908  0.939  0.957 |      0.871  0.928  0.967  0.983  0.989
RP       | 0.9 |      0.587  0.640  0.674  0.718  0.749 |      0.725  0.765  0.777  0.838  0.862
RP       | 1.0 |      0.426  0.405  0.416  0.375  0.385 |      0.558  0.524  0.557  0.498  0.519
STV      | 0.7 |      0.907  0.981  0.985  0.998  1.000 |      0.959  0.993  0.997  1.000  1.000
STV      | 0.8 |      0.808  0.865  0.917  0.918  0.943 |      0.882  0.933  0.962  0.966  0.974
STV      | 0.9 |      0.603  0.640  0.721  0.729  0.763 |      0.742  0.776  0.792  0.855  0.846
STV      | 1.0 |      0.450  0.464  0.477  0.471  0.468 |      0.576  0.593  0.610  0.592  0.585
Table 1: Success rate, Mallows model: fixed $m$, varying $n$ and $\phi$, and $k \in \{1, 2\}$.

For $k = 1$: when $\phi = 0.7$ and $n = 100$, prediction accuracy reaches 90% for Borda, Copeland, Maximin and STV, 92% for RP, and 94% for Harmonic. When $n = 500$, the accuracy is perfect for all rules. For $\phi = 0.8$, the success rate decreases but results are still good with a large number of voters. For $\phi = 0.9$ and $n = 500$, the rate reaches 86% for Harmonic and 72% for Copeland, with intermediate (and similar) results for Borda, Maximin, RP and STV. Under IC ($\phi = 1$), the rate drops dramatically, except for Harmonic (73% when $n = 500$, against 46% for STV, 31% for Copeland and around 40% for the remaining rules).

For $k = 2$: the probability of selecting the true winner reaches 100% (resp. about 98%) when $\phi = 0.7$ (resp. $\phi = 0.8$) and $n = 400$ (resp. $n = 500$). With high values of $\phi$, Harmonic still outperforms the other rules, followed by Borda and STV, then the remaining rules. Consistently with the results obtained by Bentert and Skowron [3] for IC, approximating the maximin rule is harder than approximating positional scoring rules: maximin needs more information from the voters in order to obtain high approximation guarantees. In all cases, top-2 ballots seem to be sufficient in practice to predict the winner with 100% accuracy when $\phi$ is low.

In the second set of experiments, we are interested in determining the value of $k$ needed to predict the correct winner in large elections with a high value of $\phi$. We take , , and . Fig. 2 shows the results, where 1000 random preference profiles are generated for each experiment. Results suggest that in large elections and unless $\phi$ is very high (), top-k rules are able to identify the true winner when (resp. ) for Harmonic (resp. the remaining rules) out of . We can also observe the behavior of the different truncated rules when : the best accuracy is again obtained by Harmonic, and the accuracies of all other rules are very close, which we found surprising. When , the latter behavior changes: Harmonic still has the best results, followed by and STV, then the remaining rules. The good performance of Harmonic in all cases can be explained by the fact that the closer the scoring vector is to plurality, the better the prediction.

Figure 2: Success rate, Mallows model: , , varying and .

Next, for each value of , , and , we generated 1000 random profiles, and for each of our rules, we determined the minimal value (as a function of ) such that the winner is correctly determined from the top- votes for all generated profiles. The results for are:

  • for , is always sufficient, whatever .

  • for , (resp. ) is always sufficient for (resp. ), whatever the value of .

  • for , we observe that the minimal value of such that the correct winner is always correctly predicted is around (for and (for .

  • for , the minimal value of is : we always find a generated profile for which we get an incorrect result if the profile is not complete.

The results for Copeland, maximin, RP and STV are similar to those for Borda. For Harmonic, we observe that is always sufficient for and , and that for (resp. ), the value of needed is around (resp. ).

In order to see how our approximations behave with small number of voters and a high dispersion parameter, we take , , , and . The results are on Fig. 3. The worst performance is obtained with Copeland, while the other rules perform more or less equally well. These results are consistent with the results obtained by Skowron et al. [23] for multiwinner rules: elections with few voters and high dispersion appear to be the worst-case scenario for predicting the correct winner using top-truncated ballots. For Harmonic, even with few voters, winner prediction is almost perfect when and .

Figure 3: Success rate, Mallows model: , , varying and .

4.2 Experiments Using Real Data Sets

We now consider a real data set from PrefLib [19]: the 2002 election for the Dublin North constituency, with 12 candidates and 3662 voters. We consider samples of voters among (), starting by and increment in steps of . In each experiment, 1000 random profiles are constructed with voters; then we consider the top-k ballots obtained from these profiles, with , and we compute the frequency with which we select the true winner. Fig. 4 shows results for Dublin with small elections () while Fig. 5 presents results for large elections (). Arrows indicate the number of voters from which the prediction is perfect.

Consistently with the results of Fig. 3, for small elections; the success rate is low when is too small, except for Harmonic where it gives the best performance followed by STV (especially when ) then the remaining rules, e.g. For Harmonic (resp. STV), 92% (resp. 82%) accuracy is reached with , and against around 75% for the remaining rules.

Figure 4: Success rate, Dublin, varying ; .
Figure 5: Success rate, Dublin, varying ; .

For large elections, when , the different approximations exhibit almost the same behavior, except Harmonic, which performs better, especially with few voters. Obviously, increasing the value of $k$ leads to a decrease in the number of voters needed for correct winner selection. In general, the different approximations need a sufficient number of voters to converge to the correct prediction. Scoring rules tend to require fewer voters.

5 Measuring the Approximation Ratio

5.1 Worst Case Study

In order to measure the quality of approximate voting rules whose definition is based on score maximization, a classical method consists in computing the worst-case approximation ratio between the scores (for the original rule) of the ‘true’ winner and of the winner of the approximate rule. Worst-case score ratios have been used for measuring the quality of approximate voting rules [7, 22], for defining the price of anarchy of a voting rule [6] and for measuring the distortion of a voting rule [4].

Worst-case score ratios particularly make sense if the score of a candidate is meaningful beyond its use for determining the winner. This is definitely the case for Borda, as the Borda count is often seen as a measure of social welfare (see [9]). This worst-case score ratio is called the price of top-k truncation.

Definition 3.

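Let $f$ be a voting rule defined as the maximization of a score $S$, and $f_k$ a top-k approximation of $f$. The price of top-k truncation for $f$ and $f_k$ is defined as the worst-case ratio, over profiles $P$, between the score (for the original rule) of the true winner and the score of the winner of the approximate rule: $R(f, f_k) = \max_{P} \frac{S(f(P), P)}{S(f_k(P), P)}$.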
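For intuition, the score ratio can be computed on a single profile, and its worst case can be found by brute force for very small elections. The sketch below does this for Borda and its average-score top-k approximation; the orientation of the ratio (true winner's score over approximate winner's score), the set of profiles over which the maximum is taken (all profiles with fixed m and n) and the lowest-index tie-breaking are assumptions.

```python
from fractions import Fraction
from itertools import permutations, product

def borda_scores(profile, m):
    scores = [0] * m
    for ranking in profile:
        for position, candidate in enumerate(ranking):
            scores[candidate] += m - 1 - position
    return scores

def topk_borda_av_winner(profile, m, k):
    """Winner when only the top k positions are known; unranked candidates get the average remaining score."""
    star = Fraction(sum(range(m - k)), m - k)  # requires k < m
    scores = [Fraction(0)] * m
    for ranking in profile:
        for position, candidate in enumerate(ranking):
            scores[candidate] += (m - 1 - position) if position < k else star
    return max(range(m), key=lambda c: (scores[c], -c))

def score_ratio(profile, m, k):
    """Borda score of the true winner divided by the Borda score of the top-k winner."""
    scores = borda_scores(profile, m)
    true_winner = max(range(m), key=lambda c: (scores[c], -c))
    approx_winner = topk_borda_av_winner(profile, m, k)
    return Fraction(scores[true_winner], scores[approx_winner])

def worst_case_ratio(m, n, k):
    """Brute force over all (m!)^n profiles: only feasible for tiny m and n."""
    rankings = list(permutations(range(m)))
    return max(score_ratio(profile, m, k) for profile in product(rankings, repeat=n))

# Candidates 0, 1, 2: three voters rank 0 > 1 > 2 and two voters rank 1 > 2 > 0.
profile = [(0, 1, 2)] * 3 + [(1, 2, 0)] * 2
ratio = score_ratio(profile, 3, 1)  # 7/6: Borda elects 1 (score 7), top-1 elects 0 (score 6)
```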

Positional Scoring Rules

Let be a positional scoring rule defined with scoring vector . Assume the tie-breaking priority favors . Let be a top- approximation of , associated with vector , with the same tie-breaking priority. Let , i.e., for . Obviously, . For instance, if is the average-score approximation of the Borda rule, then and .

Let be the score of for under and be the score of for under . From now on when we write scores we omit and , i.e., we write instead of , instead of etc. In the rest of Subsection 5.1 we assume . Let and .

Lemma 1.

Proof.

The total number of points given to candidates under is , therefore .

Let us write , where (resp. ) is the number of points that gets from the top (resp. bottom ) positions of the ballots in . Let be the number of ballots in which is not in the top positions. Then .

As appears in at least top- ballots, we have . Moreover we have . Now,

We now focus on the lower bound. We build the following pathological complete profile such that:

  • the winner for (resp. ) is (resp. ).

  • in , all candidates get the same number of points ( wins thanks to tie-breaking), and and get all their points from top-1 positions.

  • in , the score of is minimized by ranking it last everywhere where it was not in the top positions, and the score of is maximized by ranking it in position everywhere where it was not in the top positions.

  • is symmetric in .

Formally, is defined as follows:

  1. for each ranked list (resp. ) of (resp. ) candidates in : votes and votes (resp. votes ). and will be fixed later.

  2. and are chosen in such a way that all candidates get the same score .

Now, is obtained by completing as follows:

  1. each top- vote is completed into . “” means the remaining candidates are in an arbitrary order.

  2. each top- vote is completed into .

  3. each top- vote is completed into .

For instance, for and , is as follows:

Let and .

Lemma 2.

,

Proof.

In , and appear in top position in a number of votes equal to times the number of different permutations (ordered lists) of candidates out of , i.e. times. Thus . For similar reasons, for each ,

As a consequence, all candidates have the same score in if and only if

We fix and such that this equality holds. Thanks to the tie-breaking priority, the winner in is . In , the winner is and the scores of and are as follows:

Lemma 3.
Proof.

appears at the top of votes and at the bottom of all others, hence . appears times top position, and in position in the remaining votes, i.e., . Thus

Lemma 4.

Proof.

From Lemma 3 we get

Finally, using the expression of we get

From this we conclude:

Putting Lemmas 1 and 4 together we get

Proposition 1.

Note that the lower and upper bound coincide when , giving a tight worst-case approximation ratio for this class of approximations. This is however not guaranteed when (the reason being that the pathological profile used in the proof of Lemma 1 may not be the worst). Moreover, when , our (lower and upper) bound coincides with the optimal ratio given in [3] (Theorem 1) (see Footnote 1). Since the ratio in [3] is shown to be the best possible ratio, this shows that taking gives an optimal top-k approximation of a positional scoring rule (see Footnote 2).

Footnote 1: Note that the ratios in our paper are the inverse of the ratios in [3]. That is, the inverse of the ratio given in Theorem 1 of [3] coincides with our ratio for .

Footnote 2: Interestingly, [3] give another optimal rule (thus with the same worst-case ratio), which is much more complex, and which is not a top-k PSR. Comparing the average ratio of both rules is left for further study.

In particular:

  • for (, the lower and upper bounds coincide and are equal to .

  • for (), the lower bound is and the upper bound is .

  • for (, the lower and upper bounds are equal to .

Also, note that for -approval with and , the (exact) worst-case ratio does not depend on . As a corollary, we get the following order of magnitudes when grows:

  • .

  • .

  • .

Maximin

Let be the Maximin rule with tie-breaking priority , and be the -truncated version of the Maximin rule with the same tie-breaking priority order. Let