Aggregation of preferences is a central problem in the field of social choice, and has received a considerable amount of attention from the artificial intelligence research community (see, e.g., Coni10a). While the most-studied scenario is that of selecting a single candidate out of many, it is often the case that one needs to select a fixed-size set of winners (a committee): this includes domains such as parliamentary elections, the hiring of faculty members, or (automated) agents deciding on a set of plans [Elkind et al. 2011, LeGrand et al. 2007, Davis et al. 2014]. The study of the algorithmic complexity of voting rules that output committees is an active research direction (see, e.g., LMM07a, PSZ08a, MPRZ08a, CKM10a, bou-lu:c:chamberlin-courant, BSU13a, SFS13a, cor-gal-spa:c:sp-width, sko-yu-fal:c:mwsc).
Much of the prior work on properties of multi-winner rules focuses on the setting where voters’ preferences are total orders of the candidates. In contrast, in this paper we focus on approval-based rules, where each voter lists the subset of candidates that she approves of. One of the advantages of such rules is the simplicity of ballots: approval ballots reduce the cognitive burden on voters (rather than providing a full ranking of the candidates, a voter only needs to decide which candidates to approve) and are also easier to communicate to the election authority. The most straightforward way to aggregate approvals is to have every approval for a candidate contribute one point to that candidate’s score and select the candidates with the highest score. This rule is called Approval Voting (AV). AV has many desirable properties in the single-winner case [Brams et al. 2006, Endriss 2013], including its “simplicity, propensity to elect Condorcet winners (when they exist), its robustness to manipulation and its monotonicity” [Brams 2010, p. viii]. However, for the case of multiple winners, the merits of AV are “less clear” [Brams 2010, p. viii]. In particular, for the multi-winner case, AV does not address concerns such as proportional representation: if the goal is to select $k$ winners, a majority of the agents approve the same $k$ candidates, and the remaining agents approve a disjoint set of candidates, then the agents in the minority do not get any of their approved candidates selected.
As a consequence, over the years, several multi-winner rules based on approval ballots have been proposed (see, e.g., Kilg10a). Under Proportional Approval Voting (PAV), each agent’s contribution to the committee’s total score depends on how many candidates from the agent’s approval set have been elected. A sequential variant of this rule is known as Reweighted Approval Voting (RAV). Another way to modulate the approvals is through computing a satisfaction score for each agent based on the ratio of the number of their approved candidates appearing in the committee and their total number of approved candidates; this idea leads to Satisfaction Approval Voting (SAV). One could also use a distance-based approach: Minimax Approval Voting (MAV) selects a set of candidates that minimizes the maximum Hamming distance from the submitted ballots. All the rules informally described above have a more egalitarian objective than AV. For example, Steven Brams, the main proponent of AV in single-winner elections, has argued that SAV is more suitable for more equitable representation in multi-winner elections [Brams and Kilgour 2014].
Based on their relative merits, approval-based multi-winner rules have been examined in great detail in both the economics and the computer science literature in recent years [Brams and Fishburn 2007, LeGrand et al. 2007, Meir et al. 2008, Caragiannis et al. 2010]. The Handbook of Approval Voting discusses various approval-based multi-winner rules [Kilgour 2010]. However, there has been limited axiomatic analysis of these rules from the perspective of representation.
In this paper, we introduce and investigate the notion of justified representation (JR) in approval-based voting. Briefly, a committee is said to provide justified representation for a given set of ballots if every large enough group of voters with shared preferences is allocated at least one representative. A rule is said to satisfy justified representation if it always outputs a committee that provides justified representation. This concept is related to the Droop proportionality criterion [Droop 1881] and Dummett’s solid coalition property [Dummett 1984, Tideman and Richardson 2000, Elkind et al. 2014], but is specific to approval-based elections. A somewhat similar notion of representativeness was recently proposed by Dud14a; however, justified representation does not imply representativeness and, conversely, representativeness does not imply justified representation.
We show that every set of ballots admits a committee that provides justified representation; moreover, such a committee can be computed efficiently, and checking whether a given committee provides JR can be done in polynomial time as well. This shows that justified representation is a reasonable requirement. However, it turns out that very few of the existing multi-winner approval-based rules satisfy it. Specifically, we demonstrate that AV, SAV, MAV and the standard variant of RAV do not satisfy JR. On the positive side, JR is satisfied by PAV and some of its variants, as well as an extreme variant of RAV. Also, MAV satisfies JR for a restricted type of voters’ preferences. We then consider a strengthening of the JR axiom, which we call extended justified representation (EJR). This axiom captures the intuition that a very large group of voters with similar preferences may deserve not just one, but several representatives. EJR turns out to be a more demanding property than JR: of all voting rules considered in this paper, only PAV satisfies EJR. Moreover, it is computationally hard to check whether a given committee provides EJR. We also consider another strengthening of JR, which we call strong justified representation; however, it turns out that for some inputs strong justified representation is impossible to achieve. We conclude the paper by showing how JR can be used to formulate other attractive approval-based multi-winner rules, and by identifying several directions for future work.
We consider a social choice setting with a set of agents (voters) $N = \{1, \dots, n\}$ and a set of candidates $C = \{c_1, \dots, c_m\}$. Each agent $i \in N$ submits an approval ballot $A_i \subseteq C$, which represents the subset of candidates that she approves of. We refer to the list $\mathcal{A} = (A_1, \dots, A_n)$ of approval ballots as the ballot profile. We will consider approval-based multi-winner voting rules that take as input $(N, C, \mathcal{A}, k)$, where $k$ is a positive integer that satisfies $k \le |C|$, and return a subset $W \subseteq C$ of size $k$, which we call the winning set, or committee [Kilgour and Marshall 2012]. We omit $N$ and $C$ from the notation when they are clear from the context. Several such rules are defined below. Whenever the description of the rule does not uniquely specify a winning set, we assume that ties are broken according to a fixed priority order over size-$k$ subsets; however, most of our results do not depend on the tie-breaking rule.
Approval Voting (AV) Under AV, the winners are the candidates that receive the largest number of approvals. Formally, the approval score of a candidate $c \in C$ is defined as $|\{i \in N : c \in A_i\}|$, and AV outputs a set $W$ of size $k$ that maximizes $\sum_{c \in W} |\{i \in N : c \in A_i\}|$. AV has been adopted by several academic and professional societies such as the Institute of Electrical and Electronics Engineers (IEEE) and the International Joint Conference on Artificial Intelligence.
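As a small illustration of this rule, the following Python sketch counts approvals and returns the $k$ top-scoring candidates. The function name, the representation of ballots as Python sets, and the alphabetical tie-breaking are assumptions made for the example, not part of the definition above.

```python
def approval_voting(ballots, candidates, k):
    """Return a size-k committee with maximum total approval score.

    ballots: list of sets of candidate names (one set per voter).
    Ties are broken alphabetically (an assumption of this sketch).
    """
    # Approval score of c = number of ballots that contain c.
    score = {c: sum(c in b for b in ballots) for c in candidates}
    # Sort by descending score, then by name, and keep the first k.
    ranked = sorted(candidates, key=lambda c: (-score[c], c))
    return set(ranked[:k])


ballots = [{"a", "b"}, {"a"}, {"b", "c"}]
print(approval_voting(ballots, ["a", "b", "c"], 2))
```

Because the total score of a committee is the sum of its members' scores, picking the $k$ individually best candidates is enough; no search over committees is needed.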
Satisfaction Approval Voting (SAV) An agent’s satisfaction score is the fraction of her approved candidates that are elected. SAV maximizes the sum of agents’ satisfaction scores. Formally, SAV finds a set $W \subseteq C$ of size $k$ that maximizes $\sum_{i \in N} \frac{|W \cap A_i|}{|A_i|}$. The rule was proposed with the aim of “representing more diverse interests” than AV [Brams and Kilgour 2014].
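The sum of satisfaction scores can be regrouped by candidate: each elected candidate contributes one over the ballot size for every voter who approves it. The sketch below uses that observation; the function name, the set-based ballot representation, and the alphabetical tie-breaking are illustrative assumptions.

```python
from fractions import Fraction


def satisfaction_approval_voting(ballots, candidates, k):
    """Return a size-k committee maximizing the sum of satisfaction scores.

    Each voter who approves c contributes 1/|ballot| to c's score, so the
    k candidates with the highest such scores form an optimal committee.
    Ties are broken alphabetically (an assumption of this sketch).
    """
    score = {
        c: sum(Fraction(1, len(b)) for b in ballots if c in b)
        for c in candidates
    }
    ranked = sorted(candidates, key=lambda c: (-score[c], c))
    return set(ranked[:k])


# One voter approves only "a"; two voters approve "b", "c" and "d".
ballots = [{"a"}, {"b", "c", "d"}, {"b", "c", "d"}]
print(satisfaction_approval_voting(ballots, ["a", "b", "c", "d"], 1))
```

In this toy profile "a" has one approval and "b" has two, yet "a" scores 1 against 2/3 for "b", which shows how much weight a short ballot carries under this rule.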
Proportional Approval Voting (PAV) Under PAV, an agent is assumed to derive a utility of $1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{j}$ from a committee that contains exactly $j$ of her approved candidates, and the goal is to maximize the sum of the agents’ utilities. Formally, the PAV-score of a set $W \subseteq C$ is defined as $\sum_{i \in N} r(|W \cap A_i|)$, where $r(p) = \sum_{j=1}^{p} \frac{1}{j}$, and PAV outputs a set $W \subseteq C$ of size $k$ with the highest PAV-score. PAV was proposed by mathematician Forest Simmons in 2001, and captures the idea of diminishing returns: an individual agent’s preferences should count less the more she is satisfied. It has recently been shown that computing PAV is NP-hard [Aziz et al. 2014]. We can generalize the definition of PAV by using an arbitrary non-increasing score vector in place of $(1, \frac{1}{2}, \frac{1}{3}, \dots)$: for every vector $\mathbf{w} = (w_1, w_2, \dots)$, where $w_1, w_2, \dots$ are non-negative reals, $w_1 = 1$ and $w_1 \ge w_2 \ge \dots$, we define a voting rule $\mathbf{w}$-PAV that, given a ballot profile $\mathcal{A}$ and a target number of winners $k$, returns a set $W$ of size $k$ with the highest $\mathbf{w}$-PAV score, defined by $\sum_{i \in N} r_{\mathbf{w}}(|W \cap A_i|)$, where $r_{\mathbf{w}}(p) = \sum_{j=1}^{p} w_j$. (Footnote 1: It is convenient to think of $\mathbf{w}$ as an infinite vector; however, for an election with $m$ candidates only the first $m$ entries of $\mathbf{w}$ matter. To analyze the complexity of $\mathbf{w}$-PAV rules, one would have to place additional requirements on $\mathbf{w}$; however, we do not consider algorithmic properties of such rules in this paper.)
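The PAV objective can be illustrated by exhaustive search over all size-$k$ committees. This is only a sketch for tiny instances (the rule is NP-hard to compute, so the search is exponential); the function names, the set-based ballots, and the lexicographic tie-breaking are assumptions of the example.

```python
from fractions import Fraction
from itertools import combinations


def pav_score(ballots, committee):
    """Sum over voters of 1 + 1/2 + ... + 1/j, where j = |ballot & committee|."""
    return sum(
        sum(Fraction(1, j) for j in range(1, len(b & committee) + 1))
        for b in ballots
    )


def proportional_approval_voting(ballots, candidates, k):
    """Brute-force PAV: try every size-k committee (exponential time).

    max() keeps the first maximizer, so ties go to the lexicographically
    smallest committee (an assumption of this sketch).
    """
    best = max(
        combinations(sorted(candidates), k),
        key=lambda w: pav_score(ballots, set(w)),
    )
    return set(best)


# Four voters approve {a, b}; one approves {c}; one approves {d}.
ballots = [{"a", "b"}] * 4 + [{"c"}, {"d"}]
print(proportional_approval_voting(ballots, ["a", "b", "c", "d"], 3))
```

Replacing the harmonic weights inside `pav_score` by another non-increasing sequence would give the corresponding generalized rule.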
Reweighted Approval Voting (RAV) RAV converts PAV into a multi-round rule, by selecting a candidate in each round and then reweighting the approvals for the subsequent rounds. Specifically, it starts by setting $W = \emptyset$. Then in round $j$, $j = 1, \dots, k$, it computes the approval weight of each candidate $c$ as
$$\sum_{i : c \in A_i} \frac{1}{1 + |W \cap A_i|},$$
selects a candidate with the highest approval weight, and adds him to $W$. RAV was invented by the Danish polymath Thorvald Thiele in the early 1900s. RAV has also been referred to as “sequential proportional AV” [Brams and Kilgour 2014], and was used briefly in Sweden during the early 1900s. Just as for PAV, we can extend the definition of RAV to score vectors other than $(1, \frac{1}{2}, \frac{1}{3}, \dots)$: every vector $\mathbf{w} = (w_1, w_2, \dots)$ with $w_1 = 1$ and $w_1 \ge w_2 \ge \dots$ defines a sequential voting rule $\mathbf{w}$-RAV, which proceeds as RAV, except that it computes the approval weight of a candidate $c$ in round $j$ as $\sum_{i : c \in A_i} w_{|W \cap A_i| + 1}$, where $W$ is the winning set after the first $j - 1$ rounds.
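The round-by-round reweighting can be sketched as follows. The function name, the set-based ballots, and the alphabetical tie-breaking are assumptions of this example; it also assumes that $k$ does not exceed the number of candidates.

```python
from fractions import Fraction


def reweighted_approval_voting(ballots, candidates, k):
    """Sequentially pick k candidates; a ballot that already has j elected
    candidates counts with weight 1/(1+j) in the next round.

    Returns the winners in order of selection. Ties are broken
    alphabetically (an assumption of this sketch).
    """
    committee = []
    for _ in range(k):
        chosen = set(committee)
        remaining = [c for c in sorted(candidates) if c not in chosen]
        weight = {
            c: sum(Fraction(1, 1 + len(b & chosen)) for b in ballots if c in b)
            for c in remaining
        }
        # max() returns the first maximizer in alphabetical order.
        committee.append(max(remaining, key=lambda c: weight[c]))
    return committee


# Four voters approve {a, b}; three voters approve {c}.
ballots = [{"a", "b"}] * 4 + [{"c"}] * 3
print(reweighted_approval_voting(ballots, ["a", "b", "c"], 2))
```

In this toy profile "a" wins the first round with weight 4; the four ballots containing it are then halved, so "b" drops to weight 2 and "c" (weight 3) takes the second seat.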
Minimax Approval Voting (MAV) MAV returns a committee $W$ that minimizes the maximum Hamming distance between $W$ and the agents’ ballots. Formally, let $d(Q, T) = |Q \setminus T| + |T \setminus Q|$ and define the MAV-score of a set $W \subseteq C$ as $\max(d(W, A_1), \dots, d(W, A_n))$. MAV outputs a size-$k$ set with the lowest MAV-score. Minimax approval voting was proposed by BKS07a. Computing the outcome of MAV is known to be NP-hard [LeGrand et al. 2007].
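For very small elections the minimax objective can be evaluated by brute force. The sketch below treats the Hamming distance between two sets as the size of their symmetric difference; the function names and the lexicographic tie-breaking are assumptions, and the search is exponential in the number of candidates.

```python
from itertools import combinations


def hamming(q, t):
    """Hamming distance between two candidate sets (symmetric difference size)."""
    return len(q ^ t)


def minimax_approval_voting(ballots, candidates, k):
    """Brute-force search for a size-k committee minimizing the largest
    Hamming distance to any ballot. min() keeps the first minimizer, so
    ties go to the lexicographically smallest committee (an assumption).
    """
    def mav_score(w):
        return max(hamming(set(w), b) for b in ballots)

    return set(min(combinations(sorted(candidates), k), key=mav_score))


ballots = [{"a", "b"}, {"a", "b"}, {"c", "d"}]
print(minimax_approval_voting(ballots, ["a", "b", "c", "d"], 2))
```

Here the committee {a, b} matches two ballots exactly but is at distance 4 from the third, so the rule prefers a mixed committee at distance 2 from every ballot.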
3 Justified Representation
We will now define the main concept of this paper.
Definition 1 (Justified representation (JR))
Given a ballot profile $\mathcal{A} = (A_1, \dots, A_n)$ over a candidate set $C$ and a target committee size $k$, $k \le |C|$, we say that a set of candidates $W$ of size $|W| = k$ provides justified representation for $(\mathcal{A}, k)$ if there does not exist a set of voters $N^* \subseteq N$ with $|N^*| \ge \frac{n}{k}$ such that $\bigcap_{i \in N^*} A_i \neq \emptyset$ and $A_i \cap W = \emptyset$ for all $i \in N^*$. We say that an approval-based voting rule satisfies justified representation (JR) if for every profile $\mathcal{A} = (A_1, \dots, A_n)$ and every target committee size $k$ it outputs a winning set that provides justified representation for $(\mathcal{A}, k)$.
The intuition behind this definition is that if $k$ candidates are to be selected, then a set of $\frac{n}{k}$ voters that are completely unrepresented can demand that at least one of their unanimously approved candidates should be selected.
3.1 Existence and Computational Properties
We start our analysis of justified representation by observing that, for every ballot profile $\mathcal{A}$ and every value of $k$, there is a committee that provides justified representation for $(\mathcal{A}, k)$, and, moreover, such a committee can be computed efficiently given the voters’ ballots.
Theorem 1. There exists a polynomial-time algorithm that, given a ballot profile $\mathcal{A} = (A_1, \dots, A_n)$ over a candidate set $C$, and a target committee size $k$, $k \le |C|$, outputs a set of candidates $W$ such that $|W| = k$ and $W$ provides justified representation for $(\mathcal{A}, k)$.
Proof. Consider the following greedy algorithm, which we will refer to as Greedy Approval Voting (GAV). We start by setting $C' = C$, $\mathcal{A}' = \mathcal{A}$, and $W = \emptyset$. As long as $|W| < k$ and $\mathcal{A}'$ is non-empty, we pick a candidate $c \in C'$ that has the highest approval score with respect to $\mathcal{A}'$, and set $W := W \cup \{c\}$, $C' := C' \setminus \{c\}$. Also, we remove from $\mathcal{A}'$ all ballots $A_i$ such that $c \in A_i$. If at some point we have $|W| < k$ and $\mathcal{A}'$ is empty, we add an arbitrary set of $k - |W|$ candidates from $C'$ to $W$ and return $W$; if this does not happen, we terminate after having picked $k$ candidates.
Suppose for the sake of contradiction that for some $k$ and some ballot profile $\mathcal{A} = (A_1, \dots, A_n)$, GAV outputs a committee that does not provide justified representation for $(\mathcal{A}, k)$. Then there exists a set $N^* \subseteq N$ with $|N^*| \ge \frac{n}{k}$ such that $\bigcap_{i \in N^*} A_i \neq \emptyset$ and, when GAV terminates, every ballot $A_i$ such that $i \in N^*$ is still in $\mathcal{A}'$. Consider some candidate $c \in \bigcap_{i \in N^*} A_i$. At every point in the execution of GAV, $c$’s approval score is at least $|N^*| \ge \frac{n}{k}$. As $c$ was not elected, at every stage the algorithm selected a candidate whose approval score was at least as high as that of $c$. Since at the end of each stage the algorithm removed from $\mathcal{A}'$ all ballots containing the candidate added to $W$ at that stage, altogether the algorithm has removed at least $k \cdot \frac{n}{k} = n$ ballots from $\mathcal{A}'$. This contradicts the assumption that $\mathcal{A}'$ contains at least $\frac{n}{k}$ ballots when the algorithm terminates. ∎
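The greedy procedure from the proof above can be sketched in a few lines of Python. The function name, the set-based ballots, the alphabetical tie-breaking, and the choice to pad with the alphabetically first leftover candidates are assumptions of this example; the proof allows arbitrary choices at those points.

```python
def greedy_approval_voting(ballots, candidates, k):
    """Greedy rule from the proof: repeatedly elect the candidate approved
    by the most still-unrepresented voters, then drop those voters' ballots.

    Returns the winners in order of selection. Assumes k <= len(candidates).
    """
    remaining_ballots = list(ballots)
    pool = sorted(candidates)  # alphabetical tie-breaking (assumption)
    committee = []
    while len(committee) < k and remaining_ballots:
        c = max(pool, key=lambda x: sum(x in b for b in remaining_ballots))
        committee.append(c)
        pool.remove(c)
        # Voters who approve c are now represented; remove their ballots.
        remaining_ballots = [b for b in remaining_ballots if c not in b]
    # If every ballot was removed early, pad with leftover candidates.
    committee.extend(pool[: k - len(committee)])
    return committee


# Four voters approve {a, b}; one approves {c}; one approves {d}.
ballots = [{"a", "b"}] * 4 + [{"c"}, {"d"}]
print(greedy_approval_voting(ballots, ["a", "b", "c", "d"], 3))
```

Each round scans all candidates against all remaining ballots, so the running time is polynomial in the number of voters and candidates.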
Theorem 1 shows that it is easy to find a committee that provides justified representation for a given ballot profile. It is also not too hard to check whether a given committee $W$ provides JR. Indeed, while it may seem that we need to consider every subset of voters of size $\lceil \frac{n}{k} \rceil$, in fact it is sufficient to consider the candidates one by one, and, for each candidate $c$, compute $s(c) = |\{i \in N : c \in A_i, A_i \cap W = \emptyset\}|$; the set $W$ fails to provide justified representation for $(\mathcal{A}, k)$ if and only if there exists a candidate $c$ with $s(c) \ge \frac{n}{k}$. Thus, we obtain the following result.
Theorem 2. There exists a polynomial-time algorithm that, given a ballot profile $\mathcal{A}$ over a candidate set $C$, and a committee $W$, $|W| = k$, decides whether $W$ provides justified representation for $(\mathcal{A}, k)$.
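The candidate-by-candidate check described above translates directly into code. In this sketch the function name and the set-based ballots are assumptions; the comparison with $n/k$ is done in integers to avoid rounding.

```python
def provides_jr(ballots, candidates, committee, k):
    """Check justified representation candidate by candidate.

    The committee fails JR iff some candidate is approved by at least n/k
    voters whose ballots are disjoint from the committee.
    """
    n = len(ballots)
    unrepresented = [b for b in ballots if not (b & committee)]
    # supporters >= n/k is tested as supporters * k >= n (integers only).
    return all(
        sum(c in b for b in unrepresented) * k < n for c in candidates
    )


# Toy profile (an assumption for illustration): one voter approves only c0,
# two voters approve c1, c2 and c3; committee size k = 3.
ballots = [{"c0"}, {"c1", "c2", "c3"}, {"c1", "c2", "c3"}]
cands = ["c0", "c1", "c2", "c3"]
print(provides_jr(ballots, cands, {"c1", "c2", "c3"}, 3))
print(provides_jr(ballots, cands, {"c0", "c1", "c2"}, 3))
```

With $n = 3$ and $k = 3$ a single unrepresented voter already reaches the threshold $n/k = 1$, so the committee that omits c0 fails the check.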
3.2 JR and Unanimity
Another desirable property of approval-based voting rules is unanimity: we say that an approval-based rule is unanimous if, given a ballot profile $\mathcal{A} = (A_1, \dots, A_n)$ with $\bigcap_{i \in N} A_i \neq \emptyset$ and a target committee size $k$, it outputs a winning set $W$, $|W| = k$, such that $W \cap \bigcap_{i \in N} A_i \neq \emptyset$. While unanimity may appear to be similar to JR, the two properties are essentially unrelated. Specifically, for $k = 1$ unanimity implies JR (to see this, note that the only way to violate JR for $k = 1$ is to select a candidate that is not approved by any of the voters when there exists a candidate that is approved by all voters), but for $k > 1$ this is not the case.
Fix a . Let , , , for . On this profile, every voting rule that provides outputs a set with for . However, a unanimous rule may behave arbitrarily.
The following example shows that does not imply unanimity either; note that it works even for .
Fix a and define , , , for , . Consider a voting rule that outputs on this profile, and coincides with on every other profile. Clearly, this rule is not unanimous, but it is impossible to find a group of unrepresented voters for .
3.3 JR under Approval-based Rules
We have argued that justified representation is a reasonable condition: there always exists a committee that provides it, and, moreover, such a committee can be computed efficiently. It is therefore natural to ask whether prominent voting rules satisfy JR. In this section, we will answer this question for AV, SAV, MAV, PAV, and RAV, as well as identify conditions on $\mathbf{w}$ that are sufficient/necessary for $\mathbf{w}$-PAV and $\mathbf{w}$-RAV to satisfy JR.
In what follows, for each rule we will try to identify the range of values of $k$ for which this rule satisfies JR. Trivially, all considered rules satisfy JR for $k = 1$. It turns out that AV fails JR for $k \ge 3$, and for $k = 2$ the answer depends on the tie-breaking rule.
Theorem 5. For $k = 2$, AV satisfies JR if ties are broken in favor of sets that provide JR. For $k \ge 3$, AV fails JR.
Suppose first that . Fix a ballot profile . If every candidate is approved by less than voters in , is trivially satisfied. If some candidate is approved by more than voters in , then selects some such candidate, in which case no group of voters is unrepresented, so is satisfied in this case as well. It remains to consider the case where , some candidates are approved by voters, and no candidate is approved by more than voters. Then necessarily picks at least one candidate approved by voters; denote this candidate by . In this situation can only be violated if the voters who do not approve all approve the same candidate (say, ), and this candidate is not elected. But the approval score of is , and, by our assumption, the approval score of every candidate is at most , so this is a contradiction with our tie-breaking rule. This argument also illustrates why the assumption on the tie-breaking rule is necessary: it can be the case that voters approve and , and the remaining voters approve , in which case the approval score of is the same as that of .
For , we let , , and consider the profile where the first voter approves , whereas each of the remaining voters approves all of . requires to be selected, but selects . ∎
In contrast, SAV and MAV fail JR even for $k = 2$.
Theorem 6. SAV and MAV do not satisfy JR for $k \ge 2$.
We first consider . Fix , let , , , and consider the profile , where , , for . requires each voter to be represented, but will choose : the -score of is , whereas the -score of every committee with is at most . Therefore, the first voter will remain unrepresented.
For , we use the following construction. Fix , let , , , and consider the profile , where for , for . Every committee of size that provides for this profile contains . However, fails to select . Indeed, the -score of is : we have for and for . Now, consider some committee with , . We have for some , so . Thus, prefers to any committee that includes . ∎
The constructions used in the proof of Theorem 6 show that MAV and SAV may behave very differently: SAV appears to favor voters who approve very few candidates, whereas MAV appears to favor voters who approve many candidates.
Interestingly, we can show that MAV satisfies JR if we assume that each agent approves exactly $k$ candidates and ties are broken in favor of sets that provide justified representation.
Theorem 7. If the target committee size is $k$, $|A_i| = k$ for all $i \in N$, and ties are broken in favor of sets that provide JR, then MAV satisfies JR.
Consider a profile with for all .
Observe that if there exists a set of candidates with such that for all , then will necessarily select some such set. Indeed, for any such set we have for each , whereas if for some set with and some , then . Further, by definition, every set such that and for all provides justified representation for .
On the other hand, if there is no -element set of candidates that intersects each , , then the -score of every set of size is , and therefore can pick an arbitrary size- subset. Since we assumed that the tie-breaking rule favors sets that provide , our claim follows. ∎
While Theorem 7 provides an example of a setting where a well-known voting rule satisfies JR, this result is not entirely satisfactory: first, we had to place a strong restriction on voters’ preferences, and, second, we used a tie-breaking rule that was tailored to JR. In contrast, we will now show that another common voting rule, namely PAV, satisfies JR for all ballot profiles and irrespective of the tie-breaking rule.
Theorem 8. PAV satisfies JR.
Fix a ballot profile and a and let . Let be the output of on . Suppose for the sake of contradiction that there exists a set , , such that , but . Let be some candidate approved by all voters in .
For each candidate , define its marginal contribution as the difference between the -score of and that of . Let denote the sum of marginal contributions of all candidates in . Observe that if were to be added to the winning set, this would increase the -score by at least . Therefore, it suffices to argue that the marginal contribution of some candidate in is less than : this would mean that swapping this candidate with increases the -score, a contradiction. To this end, we will prove that ; as , our claim would then follow by the pigeonhole principle.
Consider the set ; we have , so . Pick a voter , and let . If , this voter contributes exactly to the marginal contribution of each candidate in , and hence her contribution to is exactly . If , this voter does not contribute to at all. Therefore, we have , which is what we wanted to prove. ∎
The reader may observe that the proof of Theorem 8 applies to all voting rules of the form $\mathbf{w}$-PAV where the weight vector satisfies $w_j \le \frac{1}{j}$ for all $j \ge 1$. In Section 4 we will see that this condition on $\mathbf{w}$ is also necessary for $\mathbf{w}$-PAV to satisfy JR (see Lemma 1 in the proof of Theorem 13).
Next, we consider RAV. As this voting rule can be viewed as a tractable approximation of PAV (recall that PAV is NP-hard to compute), one could expect that RAV satisfies JR as well. However, this turns out not to be the case, at least if $k$ is sufficiently large.
Theorem 9. RAV satisfies JR for $k = 2$, but fails it for $k \ge 10$.
For , we can use essentially the same argument as for (see the proof of Theorem 5); however, we do not need to assume anything about the tie-breaking rule. Indeed, just as in that proof, we only have to worry about the case where , no candidate is approved by more than voters, some candidate is approved by voters, another candidate is approved by a disjoint set of voters, and picks in the first round. But then the -score of in the second round is , and the only other candidates with this score are approved by the same group of voters, so one of them will necessarily be selected, irrespective of the tie-breaking rule.
Now, suppose that . Consider a profile over a candidate set with voters who submit the following ballots:
Candidates and are each approved by voters, the most of any candidate and these voters’ approvals do not overlap, so selects and first. This reduces the scores of and from to , so , whose score is , is selected next. Now, the scores of and become . The selection of any of or does not affect the score of the others, so all seven of these candidates will be selected before , who has approvals. Thus, after the selection of candidates, there are unrepresented voters who jointly approve .
To extend this construction to , we create additional candidates and additional voters such that for each new candidate, there are new voters who approve that candidate only. Note that we still have . will proceed to select , followed by additional candidates, and or one of the new candidates will remain unselected. ∎
While itself fails , one could hope that this can be fixed by modifying the weights, i.e., that - satisfies for a suitable weight vector . However, Theorem 9 can be extended to - for every weight vector such that .
For every vector with , there exists a value of such that - does not satisfy for .
Pick a positive integer such that . Let , where
For each and each we construct voters who approve only and voters who approve and only. Finally, we construct voters who approve only and voters who approve only (note that the number of voters who approve is positive by our choice of ).
Set . Note that the number of voters is given by
and hence .
Under - initially the score of each candidate in is , the score of each candidate in is , the score of is , and the score of is , so in the first rounds the candidates from get elected. After that, the score of every candidate in becomes , while the scores of and remain unchanged. Therefore, in the next rounds the candidates from get elected. At this point, candidates are elected, and is not elected, even though the voters who approve him do not approve of any of the candidates in the winning set.
To extend this argument to larger values of , we proceed as in the proof of Theorem 9: for , we add new candidates, and for each new candidate we construct new voters who approve that candidate only. Let the resulting number of voters be ; we have , so - will first select the candidates in , followed by the candidates in , and then it will choose winners among the new candidates and . As a result, either or one of the new candidates will remain unselected. ∎
Theorem 10 partially subsumes Theorem 9: it implies that fails , but the proof only shows that this is the case for , while Theorem 9 states that fails for already. We chose to include the proof of Theorem 9 because we feel that it is useful to know what happens for relatively small values of . We remark that it remains an open problem whether satisfies for .
As we require , the only weight vector not captured by Theorem 10 is . In fact, - satisfies : indeed, this rule is exactly the greedy rule ! We can extend this result somewhat, by allowing the entries of the weight vector to depend on the number of voters : the argument used to show that satisfies extends to - where the weight vector satisfies . In particular, the rule - is somewhat more appealing than : for instance, if and , will pick , and then behave arbitrarily, whereas - will also pick , but then it will continue to look for candidates approved by as many voters as possible.
4 Extended Justified Representation
We have identified two families of voting rules that satisfy for arbitrary ballot profiles: - with (this includes the rule) and - with (this includes the rule). The obvious advantage of the greedy rule is that its output can be computed efficiently, whereas computing the output of is NP-hard. However, arguably, puts too much emphasis on representing every voter, at the expense of ensuring that large sets of voters with shared preferences are allocated an adequate number of representatives. For instance, if , there are voters who approve and , while and are each approved by a single voter, the greedy rule would include both and in the winning set, whereas in many settings it would be more reasonable to choose both and (and one of and ).
This issue is not addressed by the axiom, as it does not care if a given voter is represented by one or more candidates. Thus, if we want to capture the intuition that large cohesive groups of voters should be allocated several representatives, we need a stronger condition. Recall that says that each group of voters that all approve the same candidate “deserves” at least one representative. It seems reasonable to scale this idea and say that, for every , each group of voters that all approve the same candidates “deserves” at least representatives. This approach can be formalized as follows.
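Definition 2 (Extended justified representation (EJR))
Consider a ballot profile $\mathcal{A} = (A_1, \dots, A_n)$ over a candidate set $C$, a target committee size $k$, $k \le |C|$, and a positive integer $\ell$, $\ell \le k$. We say that a set of candidates $W$, $|W| = k$, provides $\ell$-justified representation for $(\mathcal{A}, k)$ if there does not exist a set of voters $N^* \subseteq N$ with $|N^*| \ge \ell \cdot \frac{n}{k}$ such that $|\bigcap_{i \in N^*} A_i| \ge \ell$, but $|A_i \cap W| < \ell$ for each $i \in N^*$; we say that $W$ provides extended justified representation (EJR) for $(\mathcal{A}, k)$ if it provides $\ell$-JR for $(\mathcal{A}, k)$ for all $\ell$, $1 \le \ell \le k$. We say that an approval-based voting rule satisfies $\ell$-justified representation ($\ell$-JR) if for every profile $\mathcal{A} = (A_1, \dots, A_n)$ and every target committee size $k$ it outputs a committee that provides $\ell$-JR for $(\mathcal{A}, k)$. Finally, we say that a rule satisfies extended justified representation (EJR) if it satisfies $\ell$-JR for all $\ell$, $1 \le \ell \le k$.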
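For tiny instances, the definition can be tested by brute force: for each level $\ell$ and each set of $\ell$ candidates, count the voters who approve the whole set and have fewer than $\ell$ representatives. The sketch below does exactly that; the function name and the set-based ballots are assumptions, and the search is exponential, which is in line with the hardness of this check noted earlier in the paper.

```python
from itertools import combinations


def provides_ejr(ballots, candidates, committee, k):
    """Brute-force extended justified representation check.

    For each level ell and each set X of ell candidates, count the voters
    who approve all of X and have fewer than ell committee members on
    their ballot. The committee fails if that count reaches ell * n / k.
    Exponential in k; intended for tiny illustrative instances only.
    """
    n = len(ballots)
    for ell in range(1, k + 1):
        under = [b for b in ballots if len(b & committee) < ell]
        for x in combinations(sorted(candidates), ell):
            supporters = sum(set(x) <= b for b in under)
            # supporters >= ell * n / k, tested in integers.
            if supporters * k >= ell * n:
                return False
    return True


# Four voters approve {a, b}; one approves {c}; one approves {d}; k = 3.
ballots = [{"a", "b"}] * 4 + [{"c"}, {"d"}]
cands = ["a", "b", "c", "d"]
print(provides_ejr(ballots, cands, {"a", "c", "d"}, 3))
print(provides_ejr(ballots, cands, {"a", "b", "c"}, 3))
```

In this toy profile the four voters who agree on two candidates form a group of size $2 \cdot \frac{n}{k} = 4$, so a committee that gives each of them only one representative fails the check at level $\ell = 2$ even though it represents every voter.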
Observe that $1$-JR is simply JR. However, EJR is not implied by JR: this is illustrated by the example given earlier in this section. Further, although EJR is stronger than JR, it still does not imply unanimity; this follows from Example 4 in Section 3.
4.1 EJR under Approval-based Rules
It is natural to ask which of the voting rules that satisfy also satisfy . Our -voter -candidate example immediately shows that for the answer is negative. Consequently, no - rule such that the entries of do not depend on satisfies : if , this rule is , and if , this follows from Theorem 10.
The next example shows that fails even if each voter approves exactly candidates (recall that under this assumption satisfies ).
Let , , where and the sets are pairwise disjoint. Let , where for , and for . will select exactly one candidate from each of the sets and , but dictates that at least two candidates from are chosen.
It remains to consider PAV.
Theorem 12. PAV satisfies EJR.
Suppose that violates for some value of , and consider a ballot profile , a value of and a set of voters , , that witness this. Let , , be the winning set. We know that at least one of the candidates approved by all voters in is not elected; let be some such candidate. Each voter in has at most representatives in , so the marginal contribution of (if it were to be added to ) would be at least . On the other hand, the argument in the proof of Theorem 8 can be modified to show that the sum of marginal contributions of candidates in is at most .
Now, consider some candidate with the smallest marginal contribution; clearly, his marginal contribution is at most . If it is strictly less than , we are done, as we can improve the total -score by swapping and , a contradiction. Therefore suppose it is exactly , and therefore the marginal contribution of each candidate in is exactly . Since satisfies , we know that for some . Pick some candidate , and set . Observe that after is removed, adding increases the total -score by at least . Indeed, approves at most candidates in and therefore adding to contributes at least to her satisfaction. Thus, the -score of is higher than that of , a contradiction again. ∎
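Interestingly, Theorem 12 does not extend to weight vectors other than $(1, \frac{1}{2}, \frac{1}{3}, \dots)$: our next theorem shows that PAV is the unique $\mathbf{w}$-PAV rule that satisfies EJR.
Theorem 13. For every weight vector $\mathbf{w}$ with $\mathbf{w} \neq (1, \frac{1}{2}, \frac{1}{3}, \dots)$, the rule $\mathbf{w}$-PAV does not satisfy EJR.
Lemma 1. Consider a weight vector $\mathbf{w}$. If $w_j > \frac{1}{j}$ for some $j > 1$, then $\mathbf{w}$-PAV fails JR.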
Suppose that for some and . Pick so that divides ; let . Let , where , , and the sets are pairwise disjoint. Note that . Also, construct pairwise disjoint groups of voters so that , , and for each the voters in approve the candidates in only. Observe that the total number of voters is given by .
We have , so every committee that provides justified representation for this profile must elect . However, we claim that - elects all candidates in instead. Indeed, if we replace an arbitrary candidate in with , then under - the total score of our committee changes by
i.e., has a strictly higher score than any committee that includes . ∎
Consider a weight vector . If for some , then - fails -.
Suppose that for some and . Pick . Let , where , and . Note that . Also, construct pairwise disjoint groups of voters so that , , the voters in approve the candidates in only, and for each the voters in approve only. Note that the number of voters is given by .
We have and , so every committee that provides must select all candidates in . However, we claim that - elects all candidates from and candidates from instead. Indeed, let be some candidate in , let be some candidate in , and let , . The difference between the total score of and that of is