Achieving Fully Proportional Representation: Approximability Results

12/14/2013 ∙ by Piotr Skowron, et al.

We study the complexity of (approximate) winner determination under the Monroe and Chamberlin–Courant multiwinner voting rules, which determine the set of representatives by optimizing the total (dis)satisfaction of the voters with their representatives. The total (dis)satisfaction is calculated either as the sum of individual (dis)satisfactions (the utilitarian case) or as the (dis)satisfaction of the worst-off voter (the egalitarian case). We provide good approximation algorithms for the satisfaction-based utilitarian versions of the Monroe and Chamberlin–Courant rules, and inapproximability results for the dissatisfaction-based utilitarian versions of them and also for all egalitarian cases. Our algorithms are applicable and particularly appealing when voters submit truncated ballots. We provide experimental evaluation of the algorithms both on real-life preference-aggregation data and on synthetic data. These experiments show that our simple and fast algorithms can in many cases find near-perfect solutions.


1 Introduction

We study the complexity of (approximate) winner determination under the Monroe [32] and Chamberlin–Courant [10] multiwinner voting rules, which aim at selecting a group of candidates that best represent the voters. Multiwinner elections are important both for human societies (e.g., in indirect democracies for electing committees of representatives like parliaments) and for software multiagent systems (e.g., for recommendation systems [25]), and thus it is important to have good multiwinner rules and good algorithms for them. The Monroe and Chamberlin–Courant rules are particularly appealing because they create an explicit (and, in some sense, optimal) connection between the elected committee members and the voters; each voter knows his or her representative and each committee member knows to whom he or she is accountable. In the context of recommendation systems this means that every selected item is personalized, i.e., recommended to a particular user. Moreover, the Monroe rule ensures the proportionality of the representation. We assume that $m$ candidates participate in the election and that the society consists of $n$ voters, who each rank the candidates, expressing their preferences about who they would like to see as their representative.

When choosing a $K$-member committee, the Monroe and Chamberlin–Courant rules work as follows. They assign to each voter a single candidate as his or her representative, respecting the following rules:

  • altogether exactly $K$ candidates are assigned to the voters. For the Monroe rule, each candidate is assigned either to about $n/K$ voters or to none; for the Chamberlin–Courant rule there is no such restriction and each committee member might be representing a different number of voters. The committee should take this into account in its operation, i.e., by means of weighted voting.

  • the candidates are selected and assigned to the voters optimally minimizing the total (societal) dissatisfaction or maximizing the total (societal) satisfaction.

The total (dis)satisfaction is calculated on the basis of individual (dis)satisfactions. We assume that there is a function $\alpha$ such that $\alpha(i)$ measures how well a voter is represented by the candidate that this voter ranks as $i$'th best. The function $\alpha$ is the same for each voter. We can view $\alpha$ either as a satisfaction function (then it should be a decreasing one) or as a dissatisfaction function (then it should be an increasing one). For example, it is typical to use the Borda count scoring function whose $m$-candidate dissatisfaction variant is defined as $\alpha^m_{B,inc}(i) = i - 1$, and whose satisfaction variant is $\alpha^m_{B,dec}(i) = m - i$. In the utilitarian variants of the rules, the assignment should maximize (minimize) the total satisfaction (dissatisfaction) calculated as the sum of the voters’ individual satisfactions (dissatisfactions) with their representatives. In the egalitarian variants, the assignment should maximize (minimize) the total satisfaction (dissatisfaction) calculated as the satisfaction (dissatisfaction) of the worst-off voter.

The Monroe and Chamberlin–Courant rules create a useful connection between the voters and their representatives that makes it possible to achieve both candidates’ accountability to the voters, and proportional representation of voters’ views. Among common voting rules, the Monroe and Chamberlin–Courant rules seem to be unique in having both the accountability and the proportionality properties simultaneously. For example, First Past the Post system (where the voters are partitioned into districts with a separate single-winner Plurality election in each) can give very disproportionate results (forcing some of the voters to be represented by candidates they dislike). On the other side of the spectrum are the party-list systems, which achieve perfect proportionality. In those systems the voters vote for the parties, based on these votes each party receives some number of seats in the parliament, and then each party distributes the seats among its members (usually following a publicly available list of the party’s candidates). This makes the elected candidates feel more accountable to apparatchiks of their parties than to the voters. Somewhere between the First Past the Post system and the party-list systems, we have the single transferable vote rule (STV), but for STV it is difficult to tell which candidate represents which voters.

Unfortunately, the Monroe and Chamberlin–Courant rules have one crucial drawback that makes them impractical. It is NP-hard to tell who the winners are! Specifically, NP-hardness of winner determination under the Monroe and Chamberlin–Courant rules was shown by Procaccia et al. [37] and by Lu and Boutilier [25]. Worse yet, the hardness holds even if various natural parameters of the election are small [7]. Rare easy cases include those where the committee to be elected is small, or where we consider the Chamberlin–Courant rule and the voters have single-peaked [7] or single-crossing preferences [43].

Lu and Boutilier [25] proposed to use approximation algorithms and gave the first such algorithm for the Chamberlin–Courant system. Their procedure outputs an assignment that achieves no less than a $1 - 1/e$ fraction of the optimal voter satisfaction. However, the $1 - 1/e$ approximation ratio here means that it is possible that, on average, each agent is represented by a candidate that this agent prefers to only about 63% of the candidates, even if there is a perfect solution that assigns each agent to their most preferred candidate. Such issues, however, would not occur if we had a constant-factor approximation algorithm minimizing the total dissatisfaction. Indeed, if a perfect solution exists, then the optimal dissatisfaction is zero and a constant-factor approximation algorithm must also output this perfect solution.
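As an illustration of the kind of greedy procedure that attains a 1 − 1/e guarantee for monotone submodular objectives such as this one, the following Python sketch repeatedly adds the candidate with the largest marginal gain in total Borda satisfaction under the Chamberlin–Courant rule. It is a generic sketch under stated assumptions and not necessarily the exact procedure of Lu and Boutilier [25]: the function name greedy_cc, the list-of-rankings input format, and the tie-breaking by candidate order are hypothetical choices made here for illustration.

```python
def greedy_cc(profile, k):
    """Greedy committee selection for Chamberlin-Courant with Borda satisfaction.

    profile: list of rankings, each a list of all candidates, best first.
    k: committee size (assumed to be at most the number of candidates).
    Returns (committee, total_satisfaction).
    """
    m = len(profile[0])
    candidates = list(profile[0])
    # Borda satisfaction: the candidate ranked i-th (1-based) is worth m - i.
    sat = [{c: m - (ranking.index(c) + 1) for c in ranking} for ranking in profile]
    current = [0] * len(profile)  # satisfaction of each voter with the committee so far
    committee = []
    for _ in range(k):
        best, best_gain = None, -1
        for c in candidates:
            if c in committee:
                continue
            # Marginal gain: voters switch to c only if c improves on their current value.
            gain = sum(max(s[c] - cur, 0) for s, cur in zip(sat, current))
            if gain > best_gain:
                best, best_gain = c, gain
        committee.append(best)
        current = [max(cur, s[best]) for s, cur in zip(sat, current)]
    return committee, sum(current)


example_profile = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["d", "c", "b", "a"],
]
print(greedy_cc(example_profile, 2))
```

In this small example every voter ends up with his or her top candidate, so the greedy committee happens to be optimal; in general only the approximation guarantee applies.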

The use of approximation algorithms in real-life applications requires some discussion. For example, their use is naturally justified in the context of recommendation systems. Here striving for optimality is not crucial, since a good but not optimal recommendation still carries useful information and nobody would object if we replaced the exact recommendation with an approximate one (given that the exact one is hard to calculate). For example, Amazon.com may recommend you a book on gardening which may not be the best book for you on this topic, but is still full of useful advice. For such situations, Herbert Simon [41] used the term ‘satisficing,’ instead of optimizing, to explain the behavior of decision makers under circumstances in which an optimal solution cannot be easily determined. On page 129 he wrote: “Evidently, organisms adapt well enough to ‘satisfice’; they do not, in general, ‘optimize’.” Effectively, what Simon says is that the use of approximation algorithms fits well with human nature.

Still, the use of approximation algorithms in elections requires some care. It is conceivable that the electoral commission finds an allocation of voters to candidates with a certain value of (dis)satisfaction and one of the parties participating in the election finds an allocation with a better value. This can lead to a political deadlock. There are two ways of avoiding this. Firstly, an approximation algorithm can be fixed by law. In such a case, it becomes an acting voting rule and a new way to measure fairness in the society. Secondly, an electoral commission may calculate the allocation, but also publish the raw data and issue a call for submissions. If, within the period specified by law, nobody can produce a better allocation, then the committee goes ahead and announces the result. If someone produces a better allocation, then the electoral commission uses the latter one.

The use of approximation algorithms is even more natural in elections with partial ballots. Indeed, even if we use an exact algorithm to calculate the winners, the results will be approximate anyway since the voters provide us with approximations of their real preferences and not with their exact preferences.

1.1 Our Results

In this paper we focus on approximation algorithms for winner determination under the Monroe and Chamberlin–Courant rules. Our first goal is to seek algorithms that find assignments for which the dissatisfaction of voters is within a fixed bound of the optimal one. Unfortunately, we show that under standard complexity-theoretic assumptions such algorithms do not exist. Nonetheless, we found good algorithms that maximize voters’ satisfaction. Specifically, we have obtained the following results:

  1. The Monroe and Chamberlin–Courant rules are hard to approximate up to any constant factor for the dissatisfaction-based cases (both utilitarian and egalitarian ones; see Theorems 1, 2, 3 and 4) and for the satisfaction-based egalitarian cases (see Theorems 5 and 7).

  2. For the satisfaction-based utilitarian framework we show the following. For the Monroe rule with the Borda scoring function we give a -approximation algorithm (often, the ratio is much better; see Section 4). In case of an arbitrary positional scoring function we give a ()-approximation algorithm (Theorem 13). For the Chamberlin–Courant rule with the Borda scoring function we give a polynomial-time approximation scheme (that is, for each $\epsilon$, $0 < \epsilon < 1$, we have a polynomial-time $(1 - \epsilon)$-approximation algorithm; see Theorem 15).

  3. We provide empirical evaluation of our algorithms for the satisfaction-based utilitarian framework, both on synthetic and real-life data. This evaluation shows that in practice our best algorithms achieve at least approximation ratios, and even better results are typical (see Section 5).

  4. We show that our algorithms work very well in the setting where voters do not necessarily rank all the candidates, but only provide the so-called truncated ballots, in which they rank several most preferred candidates (usually at least three). We provide theoretical guarantees on the performance of our algorithms (Propositions 10 and 16) as well as empirical evaluation (see Section 5.4).

Our results show that, as long as one is willing to accept approximate solutions, it is possible to use the utilitarian variants of the Monroe and Chamberlin–Courant rules in practice. This view is justified both from the theoretical and from the empirical point of view. Due to our negative results, we did not perform empirical evaluation for the egalitarian variants of the rules, but we believe that this is an interesting future research direction.

1.2 Related Work

A large number of papers are related to our research in terms of methodology (the study of computational complexity and approximation algorithms for winner determination under various NP-hard election rules), in terms of perspective and motivation (e.g., due to the resource allocation view of Monroe and Chamberlin–Courant rules that we take), and in terms of formal similarity (e.g., winner determination under the Chamberlin–Courant rule can be seen as a form of the facility location problem). Below we review this related literature.

There are several single-winner voting rules for which winner determination is known to be NP-hard. These rules include, for example, Dodgson’s rule [3, 20, 6], Young’s rule [38, 6], and Kemeny’s rule [3, 21, 5]. For the single-transferable vote rule (STV), the winner determination problem becomes NP-hard if we use the so-called parallel-universes tie-breaking [12]. Many of these hardness results hold even in the sense of parameterized complexity theory (however, there are also a number of fixed-parameter tractability results; see the references above for details).

These hardness results motivated the search for approximation algorithms. There are now very good approximation algorithms for Kemeny’s rule [1, 13, 24] and for Dodgson’s rule [30, 22, 8, 16, 9]. In both cases the results are, in essence, optimal. For Kemeny’s rule there is a polynomial-time approximation scheme [24] and for Dodgson’s rule the achieved approximation ratio is optimal under standard complexity-theoretic assumptions [8] (unfortunately, the approximation ratio is not constant but depends logarithmically on the number of candidates). On the other hand, for Young’s rule it is known that no good approximation algorithms exist [8].

The work of Caragiannis et al. [9] and of Faliszewski et al. [16] on approximate winner determination for Dodgson’s rule is particularly interesting from our perspective. In the former, the authors advocate treating approximation algorithms for Dodgson’s rule as voting rules in their own right and design them to have desirable properties. In the latter, the authors show that a well-established voting rule (so-called Maximin rule) is a reasonable (though not optimal) approximation of Dodgson’s rule. This perspective is important for anyone interested in using approximation algorithms for winner determination in elections (as might be the case for our algorithms for the Monroe and Chamberlin–Courant rules).

The hardness of the winner determination problem for the Monroe and Chamberlin–Courant rules has been considered in several papers. Procaccia, Rosenschein and Zohar [37] were the first to show the hardness of these two rules for the case of a particular approval-style dissatisfaction function. Their results were complemented by Lu and Boutilier [25], Betzler, Slinko and Uhlmann [7], Yu, Chan, and Elkind [45], Skowron et al. [43], and Skowron and Faliszewski [42]. These papers show the hardness in the case of the Borda dissatisfaction function, obtain results on the parameterized hardness of the two rules, and give results on hardness (or easiness) for the cases where the profiles are single-peaked or single-crossing. Further, Lu and Boutilier [25] initiated the study of approximability for the Chamberlin–Courant rule (and were the first to use the satisfaction-based framework). Specifically, they gave a $(1 - 1/e)$-approximation algorithm for the Chamberlin–Courant rule. The motivation of Lu and Boutilier came from recommendation systems and, in that sense, our view of the rules is quite similar to theirs.

In this paper we take the view that the Monroe and Chamberlin–Courant rules are special cases of the following resource allocation problem. The alternatives are shareable resources, each with a certain capacity defined as the maximal number of agents that may share this resource. Each agent has preferences over the resources and is interested in getting exactly one. The goal is to select a predetermined number of resources and to find an optimal allocation of these resources (see Section 2 for details). This provides a unified framework for the two rules and reveals the connection of the proportional representation problem to other resource allocation problems. In particular, it closely resembles multi-unit resource allocation with single-unit demand [40, Chapter 11] (see also the work of Chevaleyre et al. [11] for a survey of the most fundamental issues in the multiagent resource allocation theory) and resource allocation with shareable indivisible goods [11, 2]. Below, we point out connections of the Monroe and Chamberlin–Courant rules to several other problems.

Facility Location Problems.

In the facility location problem, there are customers located in some area and an authority, say a city council, that wants to establish a fixed number of facilities to serve those customers. Customers incur certain costs (say transportation costs) of using the facilities. Further, setting up a facility has a cost as well (and this cost may depend on the facility’s location). The problem is to find locations for the facilities that would minimize the total (societal) cost. If these facilities have infinite capacities and can serve any number of customers, then each customer would use his/her most preferred (i.e., closest) facility and the problem is similar to finding the Chamberlin–Courant assignment. If the capacities of the facilities are finite and equal, the problem looks like finding an assignment in the Monroe rule. The essential differences between the two problems are the setup costs and the distance metric. The parameterized complexity of the Facility Location Problem was investigated by Fellows and Fernau [17]. The papers of Procaccia et al. [37] and of Betzler et al. [7] contain a brief discussion of the connection between the Facility Location Problem and the winner determination problem under the Chamberlin–Courant rule.

Group Activity Selection Problem.

In the group activity selection problem [14] we have a group of agents (say, conference attendees) and a set of activities (say, options that they have for a free afternoon such as a bus city tour or wine tasting). The agents express preferences regarding the activities and organisers try to allocate agents to activities to maximise their total satisfaction. If there are $m$ possible activities but only $K$ must be chosen by organisers, then we are in the Chamberlin–Courant framework if all activities can take all agents, and in the Monroe framework if all activities have the same capacities. The difference is that those capacities may be different and also that in the Group Activity Selection Problem we may allow expression of more complicated preferences. For example, an agent may express the following preference “I like wine-tasting best provided that at most people participate in it, and otherwise I prefer a bus city tour provided that at least people participate, and otherwise I prefer to not take part in any activity”. The Group Activity Selection Problem is more general than winner determination under the Monroe and Chamberlin–Courant rules. Some hardness and easiness results for this problem were obtained in [14], but the investigation of this problem has only started.

The above connections show that, indeed, the complexity of winner determination under the Monroe and Chamberlin–Courant voting rules is interesting, can lead to progress in several other directions, and may have impact on other applications of artificial intelligence.

2 Preliminaries

We first define basic notions such as preference orders and positional scoring rules. Then we present our Resource Allocation Problem in full generality and discuss which restrictions of it correspond to the winner determination problem for the Monroe and Chamberlin–Courant voting rules. Finally, we briefly recall relevant notions regarding computational complexity.

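Preferences. For each $n \in \mathbb{N}$, by $[n]$ we mean $\{1, \ldots, n\}$. We assume that there is a set $N = [n]$ of agents and a set $A = \{a_1, \ldots, a_m\}$ of alternatives. Each alternative $a \in A$ has the capacity $\mathrm{cap}_a \in \mathbb{N}$, which gives the total number of agents that can be assigned to it. Further, each agent $i$ has a preference order $\succ_i$ over $A$, i.e., a strict linear order of the form $a_{\pi(1)} \succ_i a_{\pi(2)} \succ_i \cdots \succ_i a_{\pi(m)}$ for some permutation $\pi$ of $[m]$. For an alternative $a$, by $\mathrm{pos}_i(a)$ we mean the position of $a$ in the $i$'th agent’s preference order. For example, if $a$ is the most preferred alternative for $i$ then $\mathrm{pos}_i(a) = 1$, and if $a$ is the least preferred one then $\mathrm{pos}_i(a) = m$. A collection $V = (\succ_1, \ldots, \succ_n)$ of agents’ preference orders is called a preference profile.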

We will often include subsets of the alternatives in the descriptions of preference orders. For example, if $A$ is the set of alternatives and $B$ is some nonempty strict subset of $A$, then by $B \succ A - B$ we mean that for the preference order $\succ$ all alternatives in $B$ are preferred to those outside of $B$.

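A positional scoring function (PSF) is a function $\alpha^m : [m] \to \mathbb{N}$. A PSF $\alpha^m$ is an increasing positional scoring function (IPSF) if for each $i, j \in [m]$, if $i < j$ then $\alpha^m(i) < \alpha^m(j)$. Analogously, a PSF $\alpha^m$ is a decreasing positional scoring function (DPSF) if for each $i, j \in [m]$, if $i < j$ then $\alpha^m(i) > \alpha^m(j)$.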

Intuitively, if $\alpha^m$ is an IPSF then $\alpha^m(i)$ can represent the dissatisfaction that an agent suffers when assigned to an alternative that is ranked $i$'th in his or her preference order. Thus, we assume that for each IPSF $\alpha^m$ it holds that $\alpha^m(1) = 0$ (an agent is not dissatisfied by his or her top alternative). Similarly, a DPSF measures an agent’s satisfaction and we assume that for each DPSF $\alpha^m$ it holds that $\alpha^m(m) = 0$ (an agent is not satisfied at all when assigned his or her least desired alternative). Sometimes we write $\alpha$ instead of $\alpha^m$, when it cannot lead to confusion.

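We will often speak of families $\alpha$ of IPSFs (DPSFs) of the form $\alpha = (\alpha^m)_{m \in \mathbb{N}}$, where $\alpha^m : [m] \to \mathbb{N}$, such that:

  1. For a family of IPSFs it holds that $\alpha^{m+1}(i) = \alpha^m(i)$ for all $m \in \mathbb{N}$ and $i \in [m]$.

  2. For a family of DPSFs it holds that $\alpha^{m+1}(i+1) = \alpha^m(i)$ for all $m \in \mathbb{N}$ and $i \in [m]$.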

In other words, we build our families of IPSFs (DPSFs) by appending (prepending) values to functions with smaller domains. To simplify notation, we will refer to such families of IPSFs (DPSFs) as normal IPSFs (normal DPSFs). We assume that each function $\alpha^m$ from a family can be computed in polynomial time with respect to $m$. Indeed, we are particularly interested in the Borda families of IPSFs and DPSFs defined by $\alpha^m_{B,inc}(i) = i - 1$ and $\alpha^m_{B,dec}(i) = m - i$, respectively.
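The append/prepend structure can be checked mechanically. The Python sketch below assumes the standard Borda scores (dissatisfaction i − 1 and satisfaction m − i for the i'th ranked alternative) and reads the two family conditions as "the m-candidate IPSF is a prefix of the (m+1)-candidate one" and "the m-candidate DPSF is a suffix of the (m+1)-candidate one". All function names are hypothetical and chosen only for this illustration.

```python
def borda_inc(m):
    """Borda dissatisfaction values for positions 1..m: [0, 1, ..., m - 1]."""
    return [i - 1 for i in range(1, m + 1)]


def borda_dec(m):
    """Borda satisfaction values for positions 1..m: [m - 1, ..., 1, 0]."""
    return [m - i for i in range(1, m + 1)]


def is_increasing(values):
    return all(a < b for a, b in zip(values, values[1:]))


def is_decreasing(values):
    return all(a > b for a, b in zip(values, values[1:]))


def extends_by_appending(smaller, larger):
    """IPSF family condition: the larger function agrees with the smaller on positions 1..m."""
    return larger[:len(smaller)] == smaller


def extends_by_prepending(smaller, larger):
    """DPSF family condition: the larger function at position i + 1 equals the smaller at i."""
    return larger[1:] == smaller


for m in range(1, 6):
    assert is_increasing(borda_inc(m)) and borda_inc(m)[0] == 0
    assert is_decreasing(borda_dec(m)) and borda_dec(m)[-1] == 0
    assert extends_by_appending(borda_inc(m), borda_inc(m + 1))
    assert extends_by_prepending(borda_dec(m), borda_dec(m + 1))
```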

Assignment functions. A $K$-assignment function is any function $\Phi : N \to A$, such that $\|\Phi(N)\| \le K$ (that is, it matches agents to at most $K$ alternatives), and such that for every alternative $a \in A$ we have that $\|\Phi^{-1}(a)\| \le \mathrm{cap}_a$ (i.e., the number of agents assigned to $a$ does not exceed $a$’s capacity $\mathrm{cap}_a$).

We will also consider partial assignment functions. A partial $K$-assignment function is defined in the same way as a regular one, except that it may assign a null alternative, $\bot$, to some of the agents. It is convenient to think that for each agent $i$ we have $\mathrm{pos}_i(\bot) = m$. In general, it might be the case that a partial $K$-assignment function cannot be extended to a regular one. This may happen, for example, if the partial assignment function uses $K$ alternatives whose capacities sum to less than the total number of voters. However, in the context of Chamberlin–Courant and Monroe rules it is always possible to extend a partial $K$-assignment function to a regular one.

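Given a normal IPSF (DPSF) $\alpha$, we may consider the following three functions, each assigning a positive integer to any assignment $\Phi$:

  $\ell_1^{\alpha}(\Phi) = \sum_{i=1}^{n} \alpha(\mathrm{pos}_i(\Phi(i)))$,

  $\ell_{\infty}^{\alpha}(\Phi) = \max_{i=1}^{n} \alpha(\mathrm{pos}_i(\Phi(i)))$,

  $\ell_{\min}^{\alpha}(\Phi) = \min_{i=1}^{n} \alpha(\mathrm{pos}_i(\Phi(i)))$.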

These functions are built from individual dissatisfaction (satisfaction) functions, so that they can measure the quality of the assignment for the whole society. In the utilitarian framework the first one can be viewed as a total (societal) dissatisfaction function in the IPSF case and a total (societal) satisfaction function in the DPSF case. The second and the third can be used, respectively, as the total dissatisfaction function (IPSF case) and the total satisfaction function (DPSF case) in the egalitarian framework. We will omit the word total if no confusion may arise.
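A minimal Python sketch of these three aggregates follows: the sum over voters for the utilitarian case, the maximum for egalitarian dissatisfaction, and the minimum for egalitarian satisfaction. The representation of a profile as a list of rankings and of an assignment as a list of alternatives (one per voter), as well as all function names, are assumptions made for this illustration.

```python
def assigned_positions(profile, assignment):
    """1-based position of each voter's assigned alternative in that voter's ranking."""
    return [ranking.index(alt) + 1 for ranking, alt in zip(profile, assignment)]


def utilitarian_total(alpha, profile, assignment):
    """Sum of individual values: total dissatisfaction (IPSF) or satisfaction (DPSF)."""
    return sum(alpha(p) for p in assigned_positions(profile, assignment))


def egalitarian_dissatisfaction(alpha_inc, profile, assignment):
    """Dissatisfaction of the worst-off voter (largest individual value)."""
    return max(alpha_inc(p) for p in assigned_positions(profile, assignment))


def egalitarian_satisfaction(alpha_dec, profile, assignment):
    """Satisfaction of the worst-off voter (smallest individual value)."""
    return min(alpha_dec(p) for p in assigned_positions(profile, assignment))


profile = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["d", "c", "b", "a"],
]
assignment = ["a", "a", "c"]  # voters 1 and 2 get a, voter 3 gets c
m = 4
borda_inc = lambda i: i - 1   # Borda dissatisfaction
borda_dec = lambda i: m - i   # Borda satisfaction

print(utilitarian_total(borda_inc, profile, assignment))            # total dissatisfaction
print(utilitarian_total(borda_dec, profile, assignment))            # total satisfaction
print(egalitarian_dissatisfaction(borda_inc, profile, assignment))
print(egalitarian_satisfaction(borda_dec, profile, assignment))
```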

For each subset $S \subseteq A$ of the alternatives such that $\|S\| \le K$, we denote as $\Phi_{\alpha}^{S}$ the partial $K$-assignment that assigns agents only to the alternatives from $S$ and such that $\Phi_{\alpha}^{S}$ maximizes the utilitarian satisfaction $\ell_1^{\alpha}(\Phi_{\alpha}^{S})$. (We introduce this notation only for the utilitarian satisfaction-based setting because it is useful to express appropriate algorithms for this case; for other settings we have hardness results only and this notation would not be useful.)

The Resource Allocation Problem. Let us now define the resource allocation problem that forms the base of our study. This problem stipulates finding an optimal $K$-assignment function, where the optimality is relative to one of the total dissatisfaction or satisfaction functions that we have just introduced. The former is to be minimized and the latter is to be maximized.

Definition 1.

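Let $\alpha$ be a normal IPSF. An instance of the $\alpha$-DU-Assignment problem (i.e., of the dissatisfaction-based utilitarian assignment problem) consists of a set of agents $N = [n]$, a set of alternatives $A = \{a_1, \ldots, a_m\}$, a preference profile $V$ of the agents, and a sequence $(\mathrm{cap}_{a_1}, \ldots, \mathrm{cap}_{a_m})$ of alternatives’ capacities. We ask for an assignment function $\Phi$ such that: (1) $\|\Phi(N)\| \le K$; (2) $\|\Phi^{-1}(a)\| \le \mathrm{cap}_a$ for all $a \in A$; and (3) $\ell_1^{\alpha}(\Phi)$ is minimized.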

Problem $\alpha$-SU-Assignment (the satisfaction-based utilitarian assignment problem) is defined identically except that $\alpha$ is a normal DPSF and condition (3) is replaced with “$\ell_1^{\alpha}(\Phi)$ is maximal.” If we replace $\ell_1^{\alpha}$ with $\ell_{\infty}^{\alpha}$ in $\alpha$-DU-Assignment then we obtain problem $\alpha$-DE-Assignment, i.e., the dissatisfaction-based egalitarian variant. If we replace $\ell_1^{\alpha}$ with $\ell_{\min}^{\alpha}$ in $\alpha$-SU-Assignment then we obtain problem $\alpha$-SE-Assignment, i.e., the satisfaction-based egalitarian variant.

Our four problems can be viewed as generalizations of the winner determination problem for the Monroe [32] and Chamberlin–Courant [10] multiwinner voting systems (see the introduction for their definitions). To model the Monroe system, it suffices to set the capacity of each alternative to be $n/K$ (for simplicity, throughout the paper we assume that $K$ divides $n$). [Footnote 2: In general, this assumption is not as innocent as it may seem. Often dealing with cases where $K$ does not divide $n$ requires additional insights and care. However, for our algorithms and results, the assumption simplifies notation and does not lead to obscuring any unexpected difficulties.] We will refer to the thus restricted variants of our problems as the Monroe variants. To represent the Chamberlin–Courant system, we set alternatives’ capacities to $n$. We will refer to the so-restricted variants of our problems as CC variants.
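To make the role of capacities concrete, here is a brute-force Python sketch of the satisfaction-based utilitarian assignment problem with a uniform capacity. Setting the capacity to n/K gives the Monroe variant and setting it to n gives the CC variant. The solver enumerates all assignments, so it is only meant for tiny instances; the function name best_assignment, the uniform-capacity parameter, and the input format are hypothetical choices for this illustration and not the paper's algorithms.

```python
from itertools import product


def best_assignment(profile, alternatives, capacity, k, alpha):
    """Exhaustively find an assignment maximizing total satisfaction.

    profile: list of rankings (best first); alternatives: list of all alternatives;
    capacity: maximal number of voters per alternative (same for every alternative);
    k: maximal number of distinct alternatives used; alpha: DPSF on 1-based positions.
    Returns (best_value, best_assignment) or (None, None) if nothing is feasible.
    """
    n = len(profile)
    best_value, best = None, None
    for assignment in product(alternatives, repeat=n):
        used = set(assignment)
        if len(used) > k:
            continue  # uses more than K alternatives
        if any(assignment.count(a) > capacity for a in used):
            continue  # violates a capacity constraint
        value = sum(alpha(r.index(a) + 1) for r, a in zip(profile, assignment))
        if best_value is None or value > best_value:
            best_value, best = value, assignment
    return best_value, best


profile = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["c", "b", "a"],
]
alternatives = ["a", "b", "c"]
n, m, K = len(profile), len(alternatives), 2
borda_dec = lambda i: m - i

monroe = best_assignment(profile, alternatives, n // K, K, borda_dec)  # capacity n/K
cc = best_assignment(profile, alternatives, n, K, borda_dec)           # capacity n
print(monroe, cc)
```

In this example the CC variant lets three voters share alternative a, while the Monroe capacity of two forces one of them onto a less preferred representative, which lowers the optimal total satisfaction.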

Computational Issues. For many normal IPSFs $\alpha$ and, in particular, for the Borda IPSF, even the above-mentioned restricted versions of the Resource Allocation Problem, namely, $\alpha$-DU-Monroe, $\alpha$-DE-Monroe, $\alpha$-DU-CC, and $\alpha$-DE-CC, are NP-complete [7, 37] (the same holds for the satisfaction-based variants of the problems). Thus we seek approximate solutions.

Definition 2.

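Let $\beta$ be a real number such that $\beta \ge 1$ ($0 < \beta \le 1$) and let $\alpha$ be a normal IPSF (a normal DPSF). An algorithm is a $\beta$-approximation algorithm for the $\alpha$-DU-Assignment problem (for the $\alpha$-SU-Assignment problem) if on each instance $I$ it returns a feasible assignment $\Phi$ such that $\ell_1^{\alpha}(\Phi) \le \beta \cdot \mathrm{OPT}$ (such that $\ell_1^{\alpha}(\Phi) \ge \beta \cdot \mathrm{OPT}$), where $\mathrm{OPT}$ is the optimal total dissatisfaction (satisfaction) $\ell_1^{\alpha}(\Phi_{\mathrm{OPT}})$.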

We define $\beta$-approximation algorithms for the egalitarian variants analogously. Lu and Boutilier [25] gave a $(1 - 1/e)$-approximation algorithm for the SU-CC family of problems.
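The two directions of the definition can be captured in a few lines of Python. The sketch below reads the definition as "value at most β times the optimum, with β at least 1" for dissatisfaction and "value at least β times the optimum, with β between 0 and 1" for satisfaction; the function name and its parameters are hypothetical.

```python
def is_beta_approximation(value, optimum, beta, minimizing):
    """Check one returned value against the optimum for a given ratio beta."""
    if minimizing:
        # Dissatisfaction: must not exceed beta times the optimum.
        return beta >= 1 and value <= beta * optimum
    # Satisfaction: must reach at least a beta fraction of the optimum.
    return 0 < beta <= 1 and value >= beta * optimum


# If a perfect solution exists (optimal dissatisfaction 0), any constant-factor
# approximation for dissatisfaction must itself return dissatisfaction 0.
print(is_beta_approximation(0, 0, 5.0, minimizing=True))
print(is_beta_approximation(1, 0, 5.0, minimizing=True))

# For satisfaction, a solution of value 7 against an optimum of 8 is a
# 0.75-approximation but not a 0.9-approximation.
print(is_beta_approximation(7, 8, 0.75, minimizing=False))
print(is_beta_approximation(7, 8, 0.9, minimizing=False))
```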

Throughout this paper, we will consider each of the Monroe and CC variants of the problem and for each we will either prove inapproximability with respect to any constant (under standard complexity-theoretic assumptions) or we will present an approximation algorithm. In our inapproximability proofs, we will use the following two classic NP-complete problems [19].

Definition 3.

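An instance of Set-Cover consists of set $U = [n]$ (called the ground set), family $\mathcal{F} = \{F_1, F_2, \ldots, F_m\}$ of subsets of $U$, and positive integer $K$. We ask if there exists a set $I \subseteq [m]$ such that $\|I\| \le K$ and $\bigcup_{i \in I} F_i = U$.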

Definition 4.

X3C is a variant of Set-Cover where $\|U\|$ is divisible by $3$, each member of $\mathcal{F}$ has exactly three elements, and $K = n/3$.

Set-Cover remains NP-complete even if we restrict each member of $U$ to be contained in at most two sets from $\mathcal{F}$ (it suffices to note that this restriction is satisfied by Vertex-Cover, which is a special case of Set-Cover). X3C remains NP-complete even if we additionally assume that $n$ is divisible by $2$ and each member of $U$ appears in at most $3$ sets from $\mathcal{F}$ [19].
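For readers who want to experiment with small instances, the following Python sketch decides Set-Cover by exhaustive search and treats X3C as the special case with 3-element sets and a budget of one third of the ground set. It runs in exponential time, as expected for an NP-complete problem, and the function names and input format are assumptions made for this illustration.

```python
from itertools import combinations


def set_cover(ground, family, k):
    """Return indices of at most k sets from `family` covering `ground`, or None."""
    ground = set(ground)
    for size in range(1, k + 1):
        for indices in combinations(range(len(family)), size):
            if set().union(*(family[i] for i in indices)) == ground:
                return list(indices)
    return None


def x3c(ground, family):
    """Exact cover by 3-sets: with budget n/3, any cover found is necessarily exact."""
    if len(ground) % 3 != 0 or any(len(f) != 3 for f in family):
        raise ValueError("not an X3C instance")
    return set_cover(ground, family, len(ground) // 3)


ground = {1, 2, 3, 4, 5, 6}
family = [{1, 2, 3}, {3, 4, 5}, {4, 5, 6}, {1, 2, 6}]
print(x3c(ground, family))  # indices of two disjoint sets covering the ground set
```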

We will also use results from the theory of parameterized complexity developed by Downey and Fellows [15]. This theory allows one to single out a particular parameter of the problem, say $k$, and analyze its ‘contribution’ to the overall complexity of the problem. An analogue of the class P here is the class FPT, which is the class of problems that can be solved in time $f(k) \cdot n^{O(1)}$, where $n$ is the size of the input instance and $f$ is some computable function (for a fixed $k$ everything gets polynomial). Parameterized complexity theory also operates with classes $W[1] \subseteq W[2] \subseteq \cdots$, which are believed to form a hierarchy of classes of hard problems (combined, they are analogous to the class NP). It holds that $FPT \subseteq W[1]$, but it seems unlikely that $FPT = W[1]$, let alone $FPT = W[2]$. We point the reader to the books of Niedermeier [34] and Flum and Grohe [18] for detailed overviews of parameterized complexity theory. Interestingly, while both Set-Cover and Vertex-Cover are NP-complete, the former is $W[2]$-complete and the latter belongs to FPT (see, e.g., the book of Niedermeier [34] for these now-standard results and their history).
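As a concrete illustration of fixed-parameter tractability, the Python sketch below uses the textbook bounded search tree for Vertex-Cover: some endpoint of any uncovered edge must be in the cover, so the algorithm branches on the two endpoints and decreases the budget k, giving at most 2^k branches. This is a standard technique shown for illustration only; it is not taken from the paper, and the function name and the edge-list input format are assumptions.

```python
def vertex_cover(edges, k):
    """Return a vertex cover of size at most k as a set, or None if none exists.

    edges: list of 2-tuples of vertices. The recursion depth is at most k and
    each call branches twice, so the search tree has at most 2**k leaves.
    """
    if not edges:
        return set()      # nothing left to cover
    if k == 0:
        return None       # edges remain but the budget is exhausted
    u, v = edges[0]
    for chosen in (u, v):
        # Taking `chosen` covers every edge incident to it.
        remaining = [e for e in edges if chosen not in e]
        cover = vertex_cover(remaining, k - 1)
        if cover is not None:
            return cover | {chosen}
    return None


path = [(1, 2), (2, 3), (3, 4)]
print(vertex_cover(path, 1))  # no single vertex covers all three edges
print(vertex_cover(path, 2))
```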

3 Hardness of Approximation

We now present our inapproximability results for the Monroe and Chamberlin–Courant rules. Specifically, we show that there are no constant-factor approximation algorithms for the dissatisfaction-based variants of the rules (both utilitarian and egalitarian) and for the satisfaction-based egalitarian ones.

Naturally, these inapproximability results carry over to more general settings. For example, unless P = NP, there are no polynomial-time constant-factor approximation algorithms for the general dissatisfaction-based Resource Allocation Problem. On the other hand, our results do not preclude good satisfaction-based approximation algorithms for the utilitarian case and, indeed, in Section 4 we provide such algorithms.

Theorem 1.

For each normal IPSF $\alpha$ and each constant factor $\beta > 1$, there is no polynomial-time $\beta$-approximation algorithm for $\alpha$-DU-Monroe unless P = NP.

Proof.

Let us fix a normal IPSF $\alpha$ and let us assume, aiming at getting a contradiction, that there is some constant $\beta > 1$ and a polynomial-time $\beta$-approximation algorithm $\mathcal{A}$ for $\alpha$-DU-Monroe.

Let $I$ be an instance of X3C with ground set $U = [n]$ and family $\mathcal{F} = \{F_1, F_2, \ldots, F_m\}$ of $3$-element subsets of $U$. Without loss of generality, we assume that $n$ is divisible by both $2$ and $3$ and that each member of $U$ appears in at most 3 sets from $\mathcal{F}$.

Using , we build instance of -DU-Monroe as follows. We set (that is, the elements of the ground set are the agents) and we set , where is a set of alternatives corresponding to the sets from the family and is a set of dummy alternatives of cardinality , needed for the construction. We let and rename the alternatives in so that . We set .

We build agents’ preference orders using the following algorithm. For each , set and . Set and . As the frequency of the elements from is bounded by 3, we have . For each agent we set his or her preference order to be of the form , where the alternatives in and are ranked in an arbitrary way and the alternatives from are placed at positions in the way described below (see Figure 1 for a high-level illustration of the construction).

Figure 1: The alignment of the positions in the preference orders of the agents. The positions are numbered from the left to the right. The left wavy line shows the positions , each no greater than . The right wavy line shows the positions , each higher than . The alternatives from (positions of one such an alternative is illustrated with the circle) are placed only between the peripheral wavy lines. Each alternative from is placed on the left from the middle wavy line exactly 2 times, thus each such alternative is placed on the left from the right dashed line no more than times (exactly two times at the figure).

We place the alternatives from in the preference orders of the agents in such a way that for each alternative there are at most two agents that rank among their top alternatives. The following construction achieves this effect. If , then alternative is placed at one of the positions in ’s preference order. Otherwise, is placed at a position with index higher than (and, thus, at a position higher than ). This construction can be implemented because for each agent there are exactly alternatives such that .

Let be an assignment computed by on . We will show that if and only if is a yes-instance of X3C.

(⇐) If there exists a solution for (i.e., an exact cover of with sets from ), then we can easily show an assignment in which each agent is assigned to an alternative from the top positions of his or her preference order (namely, one that assigns each agent to the alternative that corresponds to the set , from the exact cover of , that contains ). Thus, for the optimal assignment it holds that . In consequence, must return an assignment with total dissatisfaction at most .

(⇒) Let us now consider the opposite direction. We assume that found an assignment such that and we will show that is a yes-instance of X3C. Since we require each alternative to be assigned to either or agents, if some alternative from were assigned to some agents, at least one of them would rank at a position worse than . This would mean that . Analogously, no agent can be assigned to an alternative that is placed at one of the bottom positions of ’s preference order. Thus, only the alternatives in have agents assigned to them and, further, if agents , , , are assigned to some , then it holds that (we will call each set for which alternative is assigned to some agents selected). Since each agent is assigned to exactly one alternative, the selected sets are disjoint. Since the number of selected sets is , it must be the case that the selected sets form an exact cover of . Thus, is a yes-instance of X3C. ∎

One may wonder whether the hardness of approximation for -DU-Monroe is an artifact of the strict requirements regarding the number of chosen candidates. It turns out that unless P = NP, there is no --approximation algorithm that finds an assignment with the following properties: (1) the aggregated dissatisfaction is at most times higher than the optimal one, (2) the number of alternatives to which agents are assigned is at most , and (3) each selected alternative (i.e., each alternative that has agents assigned to it) is assigned to no more than and no less than agents. (The proof is similar to the one used for Theorem 1.) Thus, in our further study we do not consider such relaxations of the problem.

Theorem 2.

For each normal IPSF and each constant , there is no polynomial-time -approximation algorithm for -DE-Monroe unless P = NP.

Proof.

The proof of Theorem 1 applies to this case as well. In fact, it even suffices to take . ∎

Results analogous to Theorems 1 and 2 hold for the DU-CC family of problems as well.

Theorem 3.

For each normal IPSF and each constant factor , there is no polynomial-time -approximation algorithm for -DU-CC unless P = NP.

Proof.

Let us fix a normal IPSF . For the sake of contradiction, let us assume that there is some constant , and a polynomial-time -approximation algorithm for -DU-CC. We will show that it is possible to use to solve the NP-complete Vertex-Cover problem.

Let be an instance of Vertex-Cover, where is the ground set, is a family of subsets of (where each member of belongs to exactly two sets in ), and is a positive integer.

Given , we construct an instance of -DU-CC as follows. The set of agents is and the set of alternatives is , where each contains exactly (unique) alternatives. Intuitively, for each , the alternatives in correspond to the set . For each , , we pick one alternative, which we denote . For each agent , we set ’s preference order as follows: Let and , , be the two sets that contain . Agent ’s preference order is of the form (a particular order of alternatives in the sets and is irrelevant for the construction). We ask for an assignment of the agents to at most alternatives.

Let us consider a solution returned by on input . We claim that if and only if is a yes-instance of Vertex-Cover.

(⇐) If is a yes-instance then, clearly, each agent can be assigned to one of the top two alternatives in his or her preference order (if there is a size- cover, then this assignment selects at most candidates). Thus the total dissatisfaction of an optimal assignment is at most . As a result, the solution returned by has total dissatisfaction at most .

(⇒) If returns an assignment with total dissatisfaction no greater than , then, by the construction of the agents’ preference orders, we see that each agent was assigned to an alternative from a set such that . Since the assignment can use at most alternatives, this directly implies that there is a size- cover of with sets from . ∎

Theorem 4.

For each normal IPSF and each constant factor , there is no polynomial-time -approximation algorithm for -DE-CC unless P = NP.

Proof.

The proof of Theorem 3 is applicable in this case as well. In fact, it even suffices to take the groups of alternatives, , to contain alternatives each. ∎

The above results show that approximating the minimal dissatisfaction of the agents is difficult. On the other hand, if we focus on agents’ total satisfaction, then constant-factor approximation algorithms exist in many cases (see, e.g., the work of Lu and Boutilier [25] and the next section). Yet, if we focus on the satisfaction of the least satisfied voter, then, under standard complexity-theoretic assumptions, there are no efficient constant-factor approximation algorithms for the Monroe and Chamberlin–Courant systems. (However, note that our result for the Monroe setting is more general than the result for the Chamberlin–Courant setting; the latter is for the Borda DPSF only.)

Theorem 5.

For each normal DPSF (where each entry is polynomially bounded in the number of alternatives) and each constant factor , with , there is no -approximation algorithm for -SE-Monroe unless P = NP.

Proof.

Let us fix a DPSF , where each entry is polynomially bounded in the number of alternatives . For the sake of contradiction, let us assume that for some , , there is a polynomial-time -approximation algorithm for -SE-Monroe. We will show that the existence of this algorithm implies that X3C is solvable in polynomial time.

Let be an X3C instance with ground set and collection of subsets of . Each set in has cardinality three. Further, without loss of generality, we can assume that is divisible by three and that each appears in at most three sets from . Given , we form an instance of -SE-Monroe as follows. Let . The set of agents is partitioned into two subsets, and . contains agents (intuitively, corresponding to the elements of the ground set ) and contains agents (used to enforce certain properties of the solution). The set of alternatives is partitioned into two subsets, and . We set (members of correspond to the sets in ), and we set , where .

For each , , we set . For each , , we set the preference order of the ’th agent in to be of the form

Note that by our assumptions, . For each , , we set the preference order of the ’th agent in to be of the form

Note that each agent in ranks the alternatives from in positions . Finally, we set the number of candidates that can be selected to be .

Now, consider the solution returned by on . We will show that if and only if is a yes-instance of X3C.

(⇐) If there exists an exact set cover of with sets from , then it is easy to construct a solution for where the satisfaction of each agent is greater than or equal to . Let be a set such that and . We assign each agent from to the alternative such that (a) and (b) , and we assign each agent from to his or her most preferred alternative. Thus, Algorithm has to return an assignment with the minimal satisfaction greater than or equal to .

(⇒) For the other direction, we first show that . Since DPSFs are strictly decreasing, it holds that:

(1)

Then, by the definition of DPSFs, it holds that:

(2)

Using the fact that and using (2), we can transform inequality (1) to obtain the following:

This means that if the minimal satisfaction of an agent is at least , then no agent was assigned to an alternative that he or she ranked beyond position . If some agent from were assigned to an alternative from , then, by the pigeonhole principle, some agent from would be assigned to an alternative from . However, each agent in ranks the alternatives from beyond position and thus such an assignment is impossible. In consequence, it must be that each agent in was assigned to an alternative that corresponds to a set in that contains . Such an assignment directly leads to a solution for . ∎

Let us now move on to the SE-CC family of problems. Unfortunately, in this case our inapproximability argument holds for the Borda DPSF only (though we believe that it can be adapted to other DPSFs as well). Further, in our previous theorems we showed that the existence of a respective constant-factor approximation algorithm implies that NP collapses to P. In the following theorem we will show a seemingly weaker collapse of W[2] to FPT.

To prove hardness of approximation for -SE-CC, we first prove the following simple lemma.

Lemma 6.

Let be three positive integers and let be a set of cardinality . There exists a family of -element subsets of such that for each -element subset of , there is a set such that .

Proof.

Set and let be the family of all -element subsets of . Replace each element of with new elements (at the same time replacing with the same elements within each set in that contains ). As a result we obtain two new sets, and , that satisfy the statement of the lemma (up to the renaming of the elements). ∎
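The blow-up construction from the proof can be checked by brute force on small inputs. The sketch below is only an illustration: the parameter names `l`, `p`, and `K` and the helper names `covering_family` and `covers_all` are assumptions introduced here, with the base set taken to have `l*K` elements, each replaced by `p` fresh copies.

```python
from itertools import combinations

def covering_family(l, p, K):
    """Blow up every K-element subset of a base set of size l*K.

    l, p, K are hypothetical parameter names: each base element i is
    replaced by the p fresh elements (i, 0), ..., (i, p-1).
    """
    base = range(l * K)
    block = {i: [(i, c) for c in range(p)] for i in base}
    X = [x for i in base for x in block[i]]
    family = [
        frozenset(x for i in subset for x in block[i])
        for subset in combinations(base, K)
    ]
    return X, family

def covers_all(X, family, K):
    """Check that every K-element subset of X lies inside some family member."""
    return all(
        any(set(B) <= S for S in family)
        for B in combinations(X, K)
    )

X, family = covering_family(l=2, p=2, K=2)
print(len(X), len(family), covers_all(X, family, K=2))  # prints: 8 6 True
```

The check succeeds because any `K` elements of the blown-up set touch at most `K` blocks, and every choice of `K` blocks appears as one member of the family.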

Theorem 7.

Let be the Borda DPSF (). For each constant factor , , there is no -approximation algorithm for -SE-CC unless FPT = W[2].

Proof.

For the sake of contradiction, let us assume that there is some constant , , and a polynomial-time -approximation algorithm for -SE-CC. We will show that the existence of this algorithm implies that Set-Cover is fixed-parameter tractable for the parameter (since Set-Cover is known to be W[2]-complete for this parameter, this will imply FPT = W[2]).

Let be an instance of Set-Cover with ground set and family of subsets of . Given , we build an instance of -SE-CC as follows. The set of agents consists of subsets of agents, , where each group contains exactly agents. Intuitively, for each , , the agents in the set correspond to the element in . The set of alternatives is partitioned into two subsets, and , such that: (1) is a set of alternatives corresponding to the sets from the family , and (2) , , is a set of dummy alternatives needed for our construction. We set .

Before we describe the preference orders of the agents in , we form a family of preference orders over that satisfies the following condition: For each -element subset of , there exists in such that all members of are ranked among the bottom positions in . By Lemma 6, such a construction is possible (it suffices to take and ); further, the proof of the lemma provides an algorithmic way to construct .

We form the preference orders of the agents as follows. For each , , set . For each , , and each , , the ’th agent from has preference order of the form:

(we pick any arbitrary, polynomial-time computable order of candidates within and ).

Let be an assignment computed by on . We will show that if and only if is a yes-instance of Set-Cover.

(⇐) If there exists a solution for (i.e., a cover of with sets from ), then we can easily show an assignment where each agent is assigned to an alternative that he or she ranks among the top positions (namely, for each , , we assign all the agents from the set to the alternative such that and belongs to the alleged -element cover of ). Under this assignment, the least satisfied agent’s satisfaction is at least and, thus, has to return an assignment where .

(⇒) Let us now consider the opposite direction. We assume that found an assignment such that and we will show that is a yes-instance of Set-Cover. We claim that for each , , at least one agent in was assigned to an alternative from . If all the agents in were assigned to alternatives from , then, by the construction of , at least one of them would have been assigned to an alternative that he or she ranks at a position greater than . For we have:

(we skip the straightforward calculation) and, thus, this agent would have been assigned to an alternative that he or she ranks at a position greater than . As a consequence, this agent’s satisfaction would be lower than . Similarly, no agent from can be assigned to an alternative from . Thus, for each , , there exists at least one agent that is assigned to an alternative from . In consequence, the covering subfamily of consists simply of those sets , for which some agent is assigned to alternative .

The presented construction gives an exact algorithm for the Set-Cover problem running in time , where is polynomial in . The existence of such an algorithm means that Set-Cover is in FPT. On the other hand, we know that Set-Cover is W[2]-complete, and thus if existed then FPT = W[2] would hold. ∎

4 Algorithms for the Utilitarian, Satisfaction-Based Cases

We now turn to approximation algorithms for the Monroe and Chamberlin–Courant multiwinner voting rules in the satisfaction-based framework. Indeed, if one focuses on agents’ total satisfaction then it is possible to obtain high-quality approximation results. In particular, we show the first nontrivial (randomized) approximation algorithm for -SU-Monroe. We show that for each we can provide a randomized polynomial-time algorithm that achieves approximation ratio; the algorithm usually gives even better approximation guarantees. For the case of arbitrarily selected DPSF we show a -approximation algorithm. Finally, we show the first polynomial-time approximation scheme (PTAS) for -SU-CC. These results stand in sharp contrast to those from the previous section, where we have shown that approximation is hard for essentially all remaining variants of the problem.

The core difficulty in solving -Monroe/CC-Assignment problems lies in selecting the alternatives that should be assigned to the agents. Given a preference profile and a set of up to alternatives, using a standard network-flow argument, it is easy to find a (possibly partial) optimal assignment of the agents to the alternatives from .

Proposition 8 (Implicit in the paper of Betzler et al. [7]).

Let be a normal DPSF, be a set of agents, be a set of alternatives (together with their capacities; perhaps represented implicitly as for the case of the Monroe and Chamberlin–Courant rules), be a preference profile of over , and be a -element subset of (where divides ). Then there is a polynomial-time algorithm that computes a (possibly partial) optimal assignment of the agents to the alternatives from .
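To make the fixed-committee step concrete, the following sketch computes an optimal Monroe-style assignment for a given committee. It is not the network-flow algorithm the proposition refers to: it swaps in the Hungarian (Kuhn–Munkres) assignment method, which solves the same problem when every committee member must receive exactly the same number of agents. The function names `hungarian` and `monroe_assign`, the use of 1-based positions as costs (equivalent to maximizing Borda satisfaction), and the restriction to full rather than partial assignments are all assumptions made for illustration.

```python
def hungarian(cost):
    """Minimum-cost perfect assignment for a square cost matrix, O(n^3)."""
    n = len(cost)
    INF = float("inf")
    u = [0] * (n + 1)      # row potentials
    v = [0] * (n + 1)      # column potentials
    match = [0] * (n + 1)  # match[j] = 1-based row matched to column j
    way = [0] * (n + 1)
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        while j0:  # flip the augmenting path back to column 0
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    col_of_row = [0] * n
    for j in range(1, n + 1):
        col_of_row[match[j] - 1] = j - 1
    return col_of_row

def monroe_assign(profile, committee):
    """Give each committee member exactly n/K agents, minimizing total position."""
    n, K = len(profile), len(committee)
    assert n % K == 0, "this sketch assumes K divides n"
    # One column ("slot") per seat: each member is repeated n/K times.
    slots = [a for a in committee for _ in range(n // K)]
    cost = [[order.index(a) + 1 for a in slots] for order in profile]
    return [slots[c] for c in hungarian(cost)]

profile = [["a", "b", "c"], ["a", "c", "b"], ["a", "b", "c"], ["b", "a", "c"]]
print(monroe_assign(profile, ["a", "b"]))
```

Duplicating each committee member into equal-capacity slots keeps the sketch short; a flow-based formulation would handle general capacities and partial assignments more directly.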

Note that for the case of the Chamberlin–Courant rule the algorithm from the above proposition can be greatly simplified: To each voter we assign the candidate that he or she ranks highest among those from . For the case of Monroe, unfortunately, we need the expensive network-flow-based approach. Nonetheless, Proposition 8 allows us to focus on the issue of selecting the winning alternatives and not on the issue of matching them to the agents.
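The Chamberlin–Courant simplification noted above amounts to a single pass over each ranking. A minimal sketch, assuming each preference order is given as a list with the most preferred alternative first (the name `cc_assign` is hypothetical):

```python
def cc_assign(profile, committee):
    """Map every agent to the committee member he or she ranks highest."""
    members = set(committee)
    return [next(a for a in order if a in members) for order in profile]

profile = [
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["b", "a", "c"],
    ["c", "b", "a"],
]
print(cc_assign(profile, ["b", "c"]))  # prints ['b', 'c', 'b', 'c']
```

No capacity constraints apply here, which is why no matching or flow computation is needed.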

Below we describe our algorithms for -SU-Monroe and for -SU-CC. Formally speaking, every approximation algorithm for -SU-Monroe also gives feasible results for -SU-CC. However, some of our algorithms are particularly well-suited for both problems and some are tailored to only one of them. Thus, for each algorithm we clearly indicate if it is meant only for the case of Monroe, only for the case of CC, or if it naturally works for both systems.

4.1 Algorithm A (Monroe)

Perhaps the most natural approach to solving -SU-Monroe is to build a solution iteratively: In each step we pick some not-yet-assigned alternative (using some criterion) and assign it to those agents that (a) are not yet assigned to any other alternative and (b) derive maximal satisfaction from being matched with it. It turns out that this idea, implemented formally as Algorithm A (see the pseudocode in Figure 2), works very well in many cases. We provide a lower bound on the total satisfaction it guarantees in the next lemma. We remind the reader that the so-called $k$'th harmonic number $H_k = \sum_{i=1}^{k} 1/i$ has asymptotics $H_k = \Theta(\log k)$.

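Notation: Φ: a map defining a partial assignment, iteratively built by the algorithm.
Φ←: the set of agents for which the assignment is already defined.
Φ→: the set of alternatives already used in the assignment.
if K ≤ 2 then
      compute the optimal solution using an algorithm of Betzler et al. [7] and return.
Φ ← {}
for i ← 1 to K do
      score ← {}
      bests ← {}
      foreach a ∈ A \ Φ→ do
            agents ← sort N \ Φ← so that if agent j precedes agent j′ then pos_j(a) ≤ pos_j′(a)
            bests[a] ← choose first n/K elements from agents
            score[a] ← Σ_{j ∈ bests[a]} (m − pos_j(a))
      a_best ← argmax_{a ∈ A \ Φ→} score[a]
      foreach j ∈ bests[a_best] do
            Φ[j] ← a_best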
Figure 2: The pseudocode for Algorithm A.
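The greedy procedure of Algorithm A can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: satisfaction is taken to be the Borda value (number of alternatives minus the 1-based position), the committee size `K` is assumed to divide the number of agents, the exact computation for very small `K` is omitted, and ties are broken by the order of the first agent's ranking and by agent index. The name `algorithm_a` is hypothetical.

```python
def algorithm_a(profile, K):
    """Greedy Monroe assignment; profile[j] lists agent j's alternatives, best first."""
    n, m = len(profile), len(profile[0])
    assert n % K == 0, "this sketch assumes K divides n"
    quota = n // K
    # pos[j][a] = 1-based position of alternative a in agent j's ranking
    pos = [{a: r + 1 for r, a in enumerate(order)} for order in profile]
    assignment = {}  # agent index -> alternative (the partial map being built)
    used = set()     # alternatives already placed in the committee
    for _ in range(K):
        free = [j for j in range(n) if j not in assignment]
        best = None
        for a in profile[0]:
            if a in used:
                continue
            # The quota-many unassigned agents that rank a highest (stable sort).
            chosen = sorted(free, key=lambda j: pos[j][a])[:quota]
            score = sum(m - pos[j][a] for j in chosen)
            if best is None or score > best[0]:
                best = (score, a, chosen)
        _, a_best, chosen = best
        used.add(a_best)
        for j in chosen:
            assignment[j] = a_best
    return assignment

profile = [
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["b", "a", "c"],
    ["b", "c", "a"],
]
print(algorithm_a(profile, K=2))  # prints {0: 'a', 1: 'a', 2: 'b', 3: 'b'}
```

In this small example every agent receives his or her top choice, so the greedy assignment matches the ideal solution that the analysis in the next lemma uses as a benchmark.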
Lemma 9.

Algorithm A is a polynomial-time -approximation algorithm for -SU-Monroe.

Proof.

Our algorithm explicitly computes an optimal solution when so we assume that . Let us consider the situation in the algorithm after the ’th iteration of the outer loop (we have if no iteration has been executed yet). So far, the algorithm has picked alternatives and assigned them to agents (recall that for simplicity we assume that divides evenly). Hence, each agent has unassigned alternatives among his or her top-ranked alternatives. By the pigeonhole principle, this means that there is an unassigned alternative that is ranked among the top positions by at least agents. To see this, note that there are slots for unassigned alternatives among the top positions in the preference orders of unassigned agents, and that there are unassigned alternatives. As a result, there must be an alternative for which the number of agents that rank it among the top positions is at least:

In consequence, the agents assigned in the next step of the algorithm will have the total satisfaction at least . Thus, summing over the iterations, the total satisfaction guaranteed by the assignment computed by Algorithm A is at least the following value (to derive the fifth line from the fourth one we note that when ):

If each agent were assigned to his or her top alternative, the total satisfaction would be equal to . Thus we get the following bound:

This completes the proof. ∎

Note that in the above proof we measure the quality of our assignment against a perhaps-impossible perfect solution, where each agent is assigned to his or her top alternative. This means that for relatively large and , and small ratio, the algorithm can achieve a close-to-ideal solution irrespective of the voters’ preference orders. We believe that this is an argument in favor of using Monroe’s system in multiwinner elections. On the flip side, to obtain a better approximation ratio, we would have to use a more involved bound on the quality of the optimal solution. To see that this is the case, form an instance of -SU-Monroe with agents and alternatives, where all the agents have the same preference order, and where we seek to elect candidates (and where divides ). It is easy to see that each solution that assigns the universally top-ranked alternatives to the agents is optimal. Thus the total satisfaction of the agents in the optimal solution is: