 # New Candidates Welcome! Possible Winners with respect to the Addition of New Candidates

In voting contexts, some new candidates may show up in the course of the process. In this case, we may want to determine which of the initial candidates are possible winners, given that a fixed number k of new candidates will be added. We give a computational study of this problem, focusing on scoring rules, and we provide a formal comparison with related problems such as control via adding candidates or cloning.



## 1 Introduction

In many real-life collective decision making situations, the set of candidates (or alternatives) may vary while the voting process goes on, and may change at any time before the decision is final: some new candidates may join, whereas some others may withdraw. This, of course, does not apply to situations where the vote takes place in a very short period of time (such as, typically, political elections in most countries), and neither does the addition of new candidates during the process apply to situations where the law forbids new candidates to be introduced after the voting process has started (which, again, is the case for most political elections). However, there are quite many practical settings where this may happen, especially situations where votes are sent by email during an extended period of time. This is typically the case when making a decision about the date and time of a meeting. In the course of the process, we may learn that the room is taken at a given time slot, making this time slot no longer a candidate. The opposite case also occurs frequently; we thought the room was taken on a given date and then we learn that it has become available, making this time slot a new candidate.

The paper focuses on candidate addition only. More precisely, the class of situations we consider is the following. A set of voters have expressed their votes about a set of (initial) candidates. Then some new candidates declare their intention to participate in the election. The winner will ultimately be determined using some given voting rule and the voters’ preferences over the set of all candidates. In this class of situations, an important question arises: who among the initial candidates can still be a winner once the voters’ preferences about all candidates are known? This is important in particular if there is some interest to detect as soon as possible the candidates who are not possible winners: for instance, candidates for a job may have the opportunity to apply for different positions, and time slots may be released for other potential meetings.

This question is strongly related to several streams of work in the recent literature on computational social choice, especially the problem of determining whether the vote elicitation process can be terminated [10, 29]; the possible winner problem, and more generally the problem of applying a voting rule to incomplete preferences [22, 26, 30, 5, 6] or uncertain preferences with probabilistic information; swap bribery, encompassing the possible winner problem as a particular case; voting with an unknown set of available candidates; the control of a voting rule by the chair via adding candidates; and resistance to cloning. We shall come back to the latter two problems in more detail in the related work section.

Clearly, considering situations where new candidates are added is a specific case of voting under incomplete preferences, where incompleteness is of a very specific type: the set of candidates is partitioned in two groups (the initial and the new candidates), and the incomplete preferences consist of complete rankings on the initial candidates. This class of situations is, in a sense, dual to a class of situations that has been considered more often, namely, when the set of voters is partitioned in two groups: those voters who have already voted, and those who have not expressed their votes yet. The latter class of situations, while being a subclass of voting under incomplete preferences, has been more specifically studied as a coalitional manipulation problem [11, 18], where the problem is to determine whether it is possible for the voters who have not voted yet to make a given candidate win. Varying sets of voters have also been studied in the context of compiling the votes of a subelectorate [8, 31]: there, one is interested in summarizing a set of initial votes, while still being able to compute the outcome once the remaining voters have expressed their votes.

The layout of the paper is as follows. In Section 2 we recall the necessary background on voting and we introduce some notation. In Section 3 we state the problem formally, by defining voting situations where candidates may be added after the votes over a subset of initial candidates have already been elicited. In the following sections we focus on specific voting rules and we study the problem from a computational point of view. In Section 4, we focus on the family of $K$-approval rules, including plurality and veto as specific subcases, and give a full dichotomy result for the complexity of the possible winner problem with respect to the addition of $k$ new candidates; namely, we show that the problem is NP-complete as soon as $K \geq 3$ and $k \geq 3$, and polynomial if $K \leq 2$ or $k \leq 2$. In Section 5 we focus on the Borda rule and show that the problem is polynomial-time solvable regardless of the number of new candidates. We also exhibit a more general family of voting rules, including Borda, for which this result can be generalized. In Section 6 we show that the problem can be hard for some positional scoring rules even if only one new candidate is added. In Section 7 we discuss the relationship to the general possible winner problem, to the control of an election by the chair via adding candidates, and to candidate cloning. Section 8 summarizes the results and mentions further research directions.

## 2 Background and notation

Let $C$ be a finite set of candidates, and $N$ a finite set of voters. The number of voters is denoted by $n$, and the (total) number of candidates by $m$. A $C$-vote (called simply a vote when this is not ambiguous) is a linear order over $C$, denoted by $\succ$ or by $V$. We sometimes denote votes in the following way: $a \succ b \succ c$ is denoted by $abc$, etc. An $n$-voter $C$-profile is a collection $P = \langle V_1, \ldots, V_n \rangle$ of $C$-votes. Let $\mathcal{P}_C$ be the set of all $C$-votes and therefore $\mathcal{P}_C^n$ be the set of all $n$-voter $C$-profiles. We denote by $\mathcal{P}_C^*$ the set of all $n$-voter $C$-profiles for $n \geq 0$, i.e., $\mathcal{P}_C^* = \bigcup_{n \geq 0} \mathcal{P}_C^n$.

A voting rule on $C$ is a function $r$ from $\mathcal{P}_C^*$ to $C$. A voting correspondence is a function from $\mathcal{P}_C^*$ to $2^C \setminus \{\emptyset\}$. The most natural way of obtaining a voting rule from a voting correspondence is to break ties according to a fixed priority order on candidates. In this paper, we do not fix a priority order on candidates (one reason being that the complete set of candidates is not known to start with), which means that we consider voting correspondences rather than rules, and ask whether $x$ is a possible cowinner for a given profile $P$. This is equivalent to asking whether there exists a priority order for which $x$ is a possible winner, or else whether $x$ is a possible winner for the most favorable priority order (with $x$ having priority over all other candidates). This is justified in our context by the fact that specifying such a priority order is problematic when we don't know in advance the identities of the potential new candidates. With a slight abuse of notation we denote voting correspondences by $r$, just as voting rules. Let $r(P)$ be the set of cowinners for profile $P$.

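For $P \in \mathcal{P}_C^*$ and $x, y \in C$, let $n(P, i, x)$ be the number of votes in $P$ ranking $x$ in position $i$, $ntop(P, x) = n(P, 1, x)$ the number of votes in $P$ ranking $x$ first, and $N_P(x, y)$ the number of votes in $P$ ranking $x$ above $y$. Let

$$\vec{s} = \langle s_1, \ldots, s_m \rangle$$

be a vector of integers such that

$$s_1 \geq s_2 \geq \cdots \geq s_m$$

and $s_1 > s_m$. The scoring rule induced by $\vec{s}$ elects the candidate(s) maximizing $S_{\vec{s}}(P, x) = \sum_{i=1}^{m} s_i \cdot n(P, i, x)$.

If $K$ is a fixed integer then $K$-approval, $r_K$, is the scoring rule corresponding to the vector $\langle 1, \ldots, 1, 0, \ldots, 0 \rangle$ with $K$ 1's and $m - K$ 0's. The $K$-approval score of a candidate $x$ is denoted more simply by $S_K(P, x)$: in other words, $S_K(P, x)$ is the number of voters in $P$ who rank $x$ in the first $K$ positions, i.e., $S_K(P, x) = \sum_{i=1}^{K} n(P, i, x)$. When $K = 1$, we get the plurality rule $r_P$, and when $K = m - 1$ we get the veto (or antiplurality) rule. The Borda rule $r_B$ is the scoring rule corresponding to the vector $\langle m-1, m-2, \ldots, 0 \rangle$.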
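The scoring rules just described can be sketched in a few lines of Python. This is only an illustration: the function names (`scores`, `cowinners`, `k_approval`, `borda`) and the representation of a vote as a list of candidate names, best first, are assumptions made here and do not come from the paper.

```python
from typing import Dict, List, Sequence

def scores(profile: List[Sequence[str]], weights: Sequence[int]) -> Dict[str, int]:
    """Total score of each candidate under the scoring vector `weights`."""
    totals = {c: 0 for c in profile[0]}
    for vote in profile:
        for position, candidate in enumerate(vote):
            totals[candidate] += weights[position]
    return totals

def cowinners(profile: List[Sequence[str]], weights: Sequence[int]) -> List[str]:
    """Candidates with maximal score (no tie-breaking), in sorted order."""
    totals = scores(profile, weights)
    best = max(totals.values())
    return sorted(c for c, s in totals.items() if s == best)

def k_approval(m: int, K: int) -> List[int]:
    """Scoring vector with K ones followed by m - K zeros."""
    return [1] * K + [0] * (m - K)

def borda(m: int) -> List[int]:
    """Scoring vector (m - 1, m - 2, ..., 0)."""
    return list(range(m - 1, -1, -1))

# Hypothetical 3-voter profile over candidates a, b, c.
profile = [list("abc"), list("bac"), list("cab")]
print(cowinners(profile, k_approval(3, 1)))  # plurality: ['a', 'b', 'c']
print(cowinners(profile, borda(3)))          # Borda: ['a']
```

Returning the full set of top-scoring candidates, rather than a single winner, mirrors the choice of working with correspondences and cowinners.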

We now define formally situations where new candidates are added.

###### Definition

A voting situation with a varying set of candidates is a 4-tuple $\Sigma = \langle N, X, P_X, k \rangle$ where $N$ is a set of voters (with $|N| = n$), $X$ a set of candidates, $P_X$ an $n$-voter $X$-profile, and $k$ a positive integer, encoded in unary.

$X$ denotes the set of initial candidates, $P_X$ the initial profile, and $k$ the number of new candidates. Nothing is known a priori about the voters' preferences over the new candidates, hence their identity is irrelevant and only their number counts. The assumption that $k$ is encoded in unary ensures that the number of new candidates is polynomial in the size of the input. Most of our results would still hold if the number of new candidates were exponentially large in the size of the input, but for the sake of simplicity, and also because, in practice, $k$ will be small anyway, we prefer to exclude this possibility.

Because the number of candidates is not the same before and after the new candidates come in, we have to consider families of voting rules (for a varying number of candidates) rather than voting rules for a fixed number of candidates. While it is true that for many usual voting rules there is an obvious way of defining them for a varying number of candidates, this is not the case for all of them, especially scoring rules. Still, some natural scoring rules, including plurality, veto, more generally $K$-approval, as well as Borda, are naturally defined for any number of candidates. We shall therefore consider families of voting rules, parameterized by the number $m$ of candidates, $(r^m)_{m \geq 1}$. We slightly abuse notation and denote these families of voting rules by $r$, and consequently often write $r(P)$ instead of $r^m(P)$. The complexity results we give in this paper make use of such families of voting rules, where the number of candidates is variable.

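If $P$ is a $C$-profile and $C' \subseteq C$, then the projection of $P$ on $C'$, denoted by $P^{\downarrow C'}$, is obtained by deleting all candidates in $C \setminus C'$ in each of the votes of $P$, and leaving unchanged the ranking on the candidates of $C'$. For instance, if $P = \langle abcd, dcab \rangle$, then $P^{\downarrow \{a,b\}} = \langle ab, ab \rangle$ and $P^{\downarrow \{a,b,c\}} = \langle abc, cab \rangle$. In all situations, the set of initial candidates is denoted by $X$, the set of the new candidates is denoted by $Y = \{y_1, \ldots, y_k\}$. If $P_X$ is an $X$-profile and $P$ an $(X \cup Y)$-profile, then we say that $P$ extends $P_X$ if the projection of $P$ on $X$ is exactly $P_X$. For instance, let $X = \{a, b, c\}$, $Y = \{d\}$; the profile $\langle abcd, dcab \rangle$ extends the $X$-profile $\langle abc, cab \rangle$.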
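Projection and extension are easy to state operationally. The following sketch assumes votes are lists of candidate names, best first; the helper names `project` and `extends`, and the two-vote profile used to exercise them, are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List, Sequence

def project(profile: List[Sequence[str]], subset: Iterable[str]) -> List[List[str]]:
    """Delete every candidate outside `subset`, keeping the relative order."""
    keep = set(subset)
    return [[c for c in vote if c in keep] for vote in profile]

def extends(big: List[Sequence[str]], small: List[Sequence[str]],
            initial: Iterable[str]) -> bool:
    """True if projecting `big` on the initial candidates gives exactly `small`."""
    return project(big, initial) == [list(v) for v in small]

# Hypothetical profile over four candidates.
P = [list("abcd"), list("dcab")]
print(project(P, "ab"))                               # [['a', 'b'], ['a', 'b']]
print(extends(P, [list("abc"), list("cab")], "abc"))  # True
```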

## 3 Possible winners when new candidates are added

We recall from  that given a collection $\langle \succ_1, \ldots, \succ_n \rangle$ of partial strict orders on $C$ representing some incomplete information about the votes, a candidate $x$ is a possible winner if there is a profile $\langle T_1, \ldots, T_n \rangle$ where each $T_i$ is a ranking on $C$ extending $\succ_i$ in which $x$ wins. Reformulated for the case where $\succ_i$ is a ranking of the initial candidates (those in $X$), we get the following definition:

###### Definition

Given a voting situation $\Sigma = \langle N, X, P_X, k \rangle$, and a collection $r$ of voting rules, we say that $x^* \in X$ is a possible cowinner with respect to $\Sigma$ and $r$ if there is an $(X \cup Y)$-profile $P$ extending $P_X$ such that $x^* \in r(P)$, where $Y = \{y_1, \ldots, y_k\}$ is a set of $k$ new candidates.

Note that we do not have $Y$ in the input, because it would be redundant with $k$: it is enough to know the number of new candidates. Note also that all new candidates have to appear in the extended votes composing $P$.
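For very small instances, the definition can be checked directly by enumerating every extension. The sketch below does exactly that; it is exponential in the number of voters and candidates and is meant only to make the definition concrete. The names `vote_extensions`, `is_possible_cowinner`, and `plurality`, and the labels `y1, y2, ...` for the new candidates, are assumptions of this illustration.

```python
from itertools import permutations, product
from typing import Callable, Iterator, List, Sequence, Tuple

def vote_extensions(vote: Sequence[str], new: Sequence[str]) -> Iterator[Tuple[str, ...]]:
    """All rankings over vote + new that preserve the order of `vote`."""
    vote = list(vote)
    for perm in permutations(vote + list(new)):
        if [c for c in perm if c in vote] == vote:
            yield perm

def is_possible_cowinner(profile: List[Sequence[str]], target: str, k: int,
                         weights_for: Callable[[int], Sequence[int]]) -> bool:
    """Brute force: does some extension with k new candidates make `target` a cowinner?"""
    new = [f"y{j}" for j in range(1, k + 1)]   # hypothetical names for new candidates
    weights = weights_for(len(profile[0]) + k)  # scoring vector for m + k candidates
    options = [list(vote_extensions(v, new)) for v in profile]
    for extended in product(*options):
        totals = {}
        for vote in extended:
            for pos, c in enumerate(vote):
                totals[c] = totals.get(c, 0) + weights[pos]
        if totals[target] == max(totals.values()):
            return True
    return False

def plurality(m: int) -> List[int]:
    return [1] + [0] * (m - 1)

small = [list("abc"), list("abc"), list("bac"), list("cab")]
print(is_possible_cowinner(small, "c", 1, plurality))  # True
```

In the printed case, placing the new candidate on top of one of the two votes headed by `a` leaves every candidate with one first place, so `c` is a cowinner.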

Also, we do not consider the problem of deciding whether a new candidate is a possible cowinner, because it is trivial. Indeed, as soon as the voting correspondence satisfies the extremely weak property that a candidate ranked first by all voters is always a cowinner (which is obviously satisfied by all common voting rules), any new candidate is a possible cowinner.

We now define formally the problems we study in this paper.

###### Definition

Given a collection $r$ of voting rules, the possible cowinner problem with new candidates (or PcWNC) for $r$ is defined as follows:

Input

A voting situation $\Sigma = \langle N, X, P_X, k \rangle$ and a candidate $x^* \in X$.

Question

Is $x^*$ a possible cowinner with respect to $\Sigma$ and $r$?

Also, the subproblem of PcWNC where the number $k$ of new candidates is fixed will be denoted by $k$-PcWNC.

We can also define the notion of necessary cowinner with respect to $\Sigma$ and $r$: $x^*$ is a necessary cowinner with respect to $\Sigma$ and $r$ if for every $(X \cup Y)$-profile $P$ extending $P_X$ we have $x^* \in r(P)$. However, the study of necessary cowinners in this particular setting will almost never lead to any significant results. There may be necessary cowinners among the initial candidates, but this will happen rarely (and this case will be discussed for a few specific voting rules in the corresponding parts of the paper).

Now we are in position to consider specific voting rules.

## 4 K-approval

As a warm-up we start by considering the plurality rule.

### 4.1 Plurality

Let us start with an example: suppose $n = 13$, $X = \{a, b, c\}$, and the plurality scores in $P_X$ are $a \mapsto 6$, $b \mapsto 4$, $c \mapsto 3$. There is only one new candidate ($Y = \{y\}$). We have:

1. $a$ is a possible cowinner ($a$ will win in particular if the top candidate of every voter remains the same);

2. $b$ is a possible cowinner: to see this, suppose that 2 voters who had ranked $a$ first now rank $y$ first; the new scores are $a \mapsto 4$, $b \mapsto 4$, $c \mapsto 3$, $y \mapsto 2$;

3. $c$ is not a possible cowinner: to reduce the scores of $a$ (resp. $b$) to that of $c$, we need at least 3 (resp. 1) voters who had ranked $a$ (resp. $b$) first to now rank $y$ first; but this then means that $y$ gets at least 4 votes, while $c$ has only 3.

More generally, we have the following result:

###### Proposition

Let $P_X$ be an $n$-voter profile on $X$, and $x^* \in X$. The candidate $x^*$ is a possible cowinner for $P_X$ and plurality with respect to the addition of $k$ new candidates if and only if

$$ntop(P_X, x^*) \geq \frac{1}{k} \cdot \sum_{x_i \in X} \max\bigl(0,\; ntop(P_X, x_i) - ntop(P_X, x^*)\bigr)$$

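Proof: Suppose first that the inequality holds. We build the following $(X \cup Y)$-profile $P$ extending $P_X$:

1. for every candidate $x_i$ such that $ntop(P_X, x_i) > ntop(P_X, x^*)$ we simply take $ntop(P_X, x_i) - ntop(P_X, x^*)$ arbitrary votes ranking $x_i$ on top and place one of the $y_j$'s on top of the vote (and the other $y_j$'s anywhere), subject to the condition that no $y_j$ is placed on top of a vote more than $ntop(P_X, x^*)$ times. (This is possible because the inequality is satisfied.)

2. in all other votes (those not considered at step 1), place all $y_j$'s anywhere except on top.

We obtain a profile $P$ extending $P_X$. First, we have $ntop(P, x^*) = ntop(P_X, x^*)$, because in all the votes in $P_X$ where $x^*$ is on top, the new top candidate in the corresponding vote in $P$ is still $x^*$ (cf. step 2), and all the votes in $P_X$ where $x^*$ was not on top obviously cannot have $x^*$ on top in the corresponding vote in $P$. Second, let $x_i \neq x^*$. If $ntop(P_X, x_i) \leq ntop(P_X, x^*)$ then $ntop(P, x_i) = ntop(P_X, x_i) \leq ntop(P, x^*)$; and if $ntop(P_X, x_i) > ntop(P_X, x^*)$ then we have $ntop(P, x_i) = ntop(P_X, x_i) - (ntop(P_X, x_i) - ntop(P_X, x^*)) = ntop(P, x^*)$. Third, by construction every new candidate $y_j$ satisfies $ntop(P, y_j) \leq ntop(P_X, x^*)$. Therefore, $x^*$ is a cowinner for plurality in $P$.

Conversely, if the inequality is not satisfied, in order for $x^*$ to become a cowinner in $P$, the other $x_i$'s must lose globally an amount of $\sum_{x_i \in X} \max(0, ntop(P_X, x_i) - ntop(P_X, x^*))$ votes. But since we have $\sum_{x_i \in X} \max(0, ntop(P_X, x_i) - ntop(P_X, x^*)) > k \cdot ntop(P_X, x^*)$, for at least one of the $y_j$'s it must hold that $ntop(P, y_j) > ntop(P_X, x^*) \geq ntop(P, x^*)$; therefore $x^*$ cannot be a cowinner for plurality in $P$.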
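The inequality of the proposition translates into a one-line test. The sketch below multiplies both sides by the number of new candidates so that only integer arithmetic is needed; the function name and the dictionary of top counts (6, 4 and 3 first places for three hypothetical candidates) are assumptions made for illustration.

```python
from typing import Dict

def plurality_possible_cowinner(ntop: Dict[str, int], target: str, k: int) -> bool:
    """Check k * ntop(target) >= sum of positive gaps ntop(x) - ntop(target)."""
    base = ntop[target]
    excess = sum(max(0, count - base) for count in ntop.values())
    return excess <= k * base

# Hypothetical first-place counts for three initial candidates.
ntop = {"a": 6, "b": 4, "c": 3}
for candidate in ntop:
    print(candidate, plurality_possible_cowinner(ntop, candidate, k=1))
# a True
# b True
# c False
```

With two new candidates instead of one, `c` becomes a possible cowinner as well, since the four surplus votes can be split evenly between the newcomers.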

We do not need to pay much attention to the veto rule, since the characterization of possible cowinners is trivial. Indeed, by placing any of the new candidates below $x^*$ in every vote of $P_X$ where $x^*$ is ranked at the bottom position, we obtain a profile where no one vetoes $x^*$, so any candidate is a possible cowinner.

As a corollary, possible cowinners for plurality (and veto) with respect to candidate addition can be computed in polynomial time (which we already knew, since possible cowinners for plurality and veto can be computed in polynomial time).

### 4.2 K-approval, one new candidate

We start with the case where a single candidate $y$ is added. Recall that we denote by $S_K(P, x)$ the score of $x$ for $P$ and $K$-approval (i.e. the number of voters who rank $x$ among their top $K$ candidates); and by $n(P, i, x)$ the number of voters who rank $x$ exactly in position $i$.

###### Proposition

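Let $K$ be a positive integer, $P_X$ be an $n$-voter profile on $X$, and $x^* \in X$. The candidate $x^*$ is a possible cowinner for $P_X$ and $K$-approval with respect to the addition of one new candidate if and only if the following two conditions hold:

1. for every $x_i \neq x^*$, if $S_K(P_X, x_i) > S_K(P_X, x^*)$ then $n(P_X, K, x_i) \geq S_K(P_X, x_i) - S_K(P_X, x^*)$;

2. $\sum_{x_i \in X} \max\bigl(0,\; S_K(P_X, x_i) - S_K(P_X, x^*)\bigr) \leq S_K(P_X, x^*)$.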

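Proof: Assume conditions (1) and (2) are satisfied. Then, we build the following $(X \cup \{y\})$-profile $P$ extending $P_X$:

• for every $x_i$ such that $S_K(P_X, x_i) > S_K(P_X, x^*)$, we take $S_K(P_X, x_i) - S_K(P_X, x^*)$ arbitrary votes that rank $x_i$ in position $K$ in $P_X$ and place $y$ on top (condition (1) ensures that we can find enough such votes).

• in all other votes (those not considered at the previous step), place $y$ in the bottom position.

We obtain a profile $P$ extending $P_X$. First, we have $S_K(P, x^*) = S_K(P_X, x^*)$, because (a) all votes in $P_X$ ranking $x^*$ in position $K$ are extended in such a way that $y$ is placed in the bottom position, therefore $x^*$ gets a point in each of these votes in $P$ if and only if it got a point in $P_X$, and (b) in all the other votes (those where $x^*$ is not ranked in position $K$ in $P_X$), $x^*$ certainly gets a point in $P$ if and only if it got a point in $P_X$. This holds both in the case where $y$ was added at the top or the bottom of the vote. Second, for every $x_i$ such that $S_K(P_X, x_i) > S_K(P_X, x^*)$, $x_i$ loses exactly $S_K(P_X, x_i) - S_K(P_X, x^*)$ points when $P_X$ is extended into $P$, therefore $S_K(P, x_i) = S_K(P, x^*)$. Third, $S_K(P, y) = \sum_{x_i \in X} \max(0, S_K(P_X, x_i) - S_K(P_X, x^*)) \leq S_K(P_X, x^*)$ because of (2), hence $S_K(P, y) \leq S_K(P, x^*)$. Therefore, $x^*$ is a cowinner for $K$-approval in $P$.

Now, assume condition (1) is not satisfied, that is, there is an $x_i$ such that $S_K(P_X, x_i) > S_K(P_X, x^*)$ and such that $n(P_X, K, x_i) < S_K(P_X, x_i) - S_K(P_X, x^*)$. There is no way of having $x_i$ lose more than $n(P_X, K, x_i)$ points, therefore $x^*$ will never catch up with $x_i$'s advantage and is therefore not a possible cowinner. Finally, assume condition (2) is not satisfied, which means that we have $\sum_{x_i \in X} \max(0, S_K(P_X, x_i) - S_K(P_X, x^*)) > S_K(P_X, x^*)$. Then, in order for $x^*$ to reach the score of the $x_i$'s we must add $y$ in one of the top $K$ positions in a number of votes exceeding $S_K(P_X, x^*)$, therefore $S_K(P, y) > S_K(P_X, x^*) \geq S_K(P, x^*)$, and therefore $x^*$ is not a possible cowinner.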

Therefore, computing possible cowinners for -approval with respect to the addition of one candidate can be done in polynomial time.
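A minimal sketch of this polynomial-time test follows, reading the two conditions off the proof: a candidate ahead of the target must sit in position $K$ often enough to absorb its lead, and the total lead must not exceed the target's own score. The function name and the four-vote profile are hypothetical; votes are lists of candidate names, best first, and $K$ is assumed not to exceed the number of initial candidates.

```python
from typing import List, Sequence

def k_approval_one_new(profile: List[Sequence[str]], target: str, K: int) -> bool:
    """Possible-cowinner test for K-approval when exactly one candidate is added."""
    score = {c: 0 for c in profile[0]}   # K-approval scores in the initial profile
    at_K = {c: 0 for c in profile[0]}    # number of votes ranking c exactly K-th
    for vote in profile:
        for c in vote[:K]:
            score[c] += 1
        at_K[vote[K - 1]] += 1

    base = score[target]
    total_excess = 0
    for c, s in score.items():
        excess = s - base
        if c != target and excess > 0:
            if at_K[c] < excess:          # condition (1) fails for c
                return False
            total_excess += excess
    return total_excess <= base           # condition (2)

# Hypothetical profile; 2-approval scores are a: 2, b: 3, c: 2, d: 1.
profile = [list("abcd"), list("abcd"), list("cbad"), list("dcab")]
print(k_approval_one_new(profile, "a", K=2))  # True
print(k_approval_one_new(profile, "d", K=2))  # False
```

In the second call, `a` is one point ahead of `d` but is never ranked second, so a single new candidate cannot push it out of the top two positions.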

### 4.3 2-approval, any (fixed) number of new candidates

For each profile $P$ and each candidate $x$, we simply write $S(P, x)$ for the score of $x$ in $P$ under 2-approval, that is, $S(P, x) = S_2(P, x)$, i.e. the number of times that $x$ is ranked within the top two positions in $P$.

Let $P_X$ be an initial profile and $Y = \{y_1, \ldots, y_k\}$ the set of new candidates. Let $x^* \in X$. We want to know whether $x^*$ is a possible cowinner for 2-approval and $P_X$. Let us partition $P_X$ into $P_X^1$, $P_X^2$ and $P_X^3$, where $P_X^1$ consists of the votes in which $x^*$ is ranked in the top position, $P_X^2$ consists of the votes in which $x^*$ is ranked in the second position and $P_X^3$ consists of the votes in which $x^*$ is not ranked within the top two positions. Let $P$ be an extension of $P_X$ to $X \cup Y$. For each candidate $x \in X \setminus \{x^*\}$, we define the following three subsets of $P$:

• $HP(P, x)$ is the set of votes in $P$ where $x$ is ranked in the second position and neither $x^*$ nor any new candidate is ranked in the top position (HP stands for “high priority”).

• $MP(P, x)$ is the set of votes in $P$ where $x^*$ or any new candidate is ranked in the top position and $x$ is ranked in the second position (MP stands for “medium priority”).

• $LP(P, x)$ is the set of votes in $P$ where $x$ is ranked in the top position and some $x' \in X \setminus \{x^*\}$ is ranked in the second position (LP stands for “low priority”).

These definitions also apply to $P_X$; our definitions then simplify into: $HP(P_X, x)$ is the set of votes in $P_X$ where $x$ is ranked second and $x^*$ is not ranked first; $MP(P_X, x)$ is the set of votes in $P_X$ where $x^*$ is ranked first and $x$ is ranked second; $LP(P_X, x)$ is the set of votes in $P_X$ where $x$ is ranked first and $x^*$ is not ranked second. These definitions are summarized in Figure 1. Finally, for $x \in X \cup Y$, let $\Delta(P, x) = S(P, x) - S(P, x^*)$.

Let us compute these sets on a concrete example, which will be reused throughout the section.

###### Example

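Let $X = \{x^*, x_1, \ldots, x_6\}$ and consider the following profile $P_X$ consisting of 19 votes (we only mention the first two candidates in each vote):

| | $v_1$ | $v_2$ | $v_3$ | $v_4$ | $v_5$ | $v_6$ | $v_7$ | $v_8$ | $v_9$ | $v_{10}$ | $v_{11}$ | $v_{12}$ | $v_{13}$ | $v_{14}$ | $v_{15}$ | $v_{16}$ | $v_{17}$ | $v_{18}$ | $v_{19}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1st | $x^*$ | $x_1$ | $x_2$ | $x_3$ | $x_1$ | $x_1$ | $x_1$ | $x_2$ | $x_2$ | $x_2$ | $x_2$ | $x_2$ | $x_3$ | $x_3$ | $x_3$ | $x_3$ | $x_3$ | $x_3$ | $x_4$ |
| 2nd | $x_1$ | $x^*$ | $x^*$ | $x^*$ | $x_4$ | $x_4$ | $x_5$ | $x_1$ | $x_3$ | $x_4$ | $x_5$ | $x_5$ | $x_1$ | $x_2$ | $x_4$ | $x_4$ | $x_5$ | $x_6$ | $x_6$ |

We have $P_X^1 = \{v_1\}$, $P_X^2 = \{v_2, v_3, v_4\}$ and $P_X^3 = \{v_5, \ldots, v_{19}\}$. This is summarized together with the priority classification in the following table:

| | HP | MP | LP | $\Delta(P_X, x_i)$ |
|---|---|---|---|---|
| $x_1$ | $v_8, v_{13}$ | $v_1$ | $v_5, v_6, v_7$ | $3$ |
| $x_2$ | $v_{14}$ | | $v_8, v_9, v_{10}, v_{11}, v_{12}$ | $3$ |
| $x_3$ | $v_9$ | | $v_{13}, v_{14}, v_{15}, v_{16}, v_{17}, v_{18}$ | $4$ |
| $x_4$ | $v_5, v_6, v_{10}, v_{15}, v_{16}$ | | $v_{19}$ | $2$ |
| $x_5$ | $v_7, v_{11}, v_{12}, v_{17}$ | | | $0$ |
| $x_6$ | $v_{18}, v_{19}$ | | | $-2$ |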
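The classification in this example can be recomputed mechanically from the first two positions of the 19 votes. The sketch below does so for the initial profile only (no new candidates yet); the function name `classify` and the data layout (two parallel lists of top and second-ranked candidates) are assumptions made for illustration.

```python
from collections import defaultdict

TARGET = "x*"
# First- and second-ranked candidates of votes v1..v19, as in the example.
tops = "x* x1 x2 x3 x1 x1 x1 x2 x2 x2 x2 x2 x3 x3 x3 x3 x3 x3 x4".split()
seconds = "x1 x* x* x* x4 x4 x5 x1 x3 x4 x5 x5 x1 x2 x4 x4 x5 x6 x6".split()

def classify(tops, seconds, target=TARGET):
    """Return the HP, MP, LP vote indices and the 2-approval score gaps."""
    hp, mp, lp = defaultdict(list), defaultdict(list), defaultdict(list)
    score = defaultdict(int)
    for i, (first, second) in enumerate(zip(tops, seconds), start=1):
        score[first] += 1
        score[second] += 1
        if second != target:
            # x second: medium priority if the target is on top, else high.
            (mp if first == target else hp)[second].append(i)
            if first != target:
                # x first and the target not second: low priority.
                lp[first].append(i)
    base = score[target]
    delta = {c: s - base for c, s in score.items() if c != target}
    return hp, mp, lp, delta

hp, mp, lp, delta = classify(tops, seconds)
print(hp["x1"], mp["x1"], lp["x1"], delta["x1"])  # [8, 13] [1] [5, 6, 7] 3
```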

If $P$ is an extension of $P_X$ to $X \cup Y$ then we write $P = \langle v'_1, \ldots, v'_n \rangle$, where $v'_i$ is the vote over $X \cup Y$ extending $v_i$. We now establish a useful property of the extensions of $P_X$ for which $x^*$ is a cowinner. Without loss of generality, we assume that in every vote $v'_i$, every new candidate is ranked either in the first two positions, or below all candidates of $X$.

###### Proposition

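If there exists an extension $P$ of $P_X$ such that $x^* \in r_2(P)$, then there exists an extension $P' = \langle v'_1, \ldots, v'_n \rangle$ of $P_X$ such that $x^* \in r_2(P')$, and satisfying the following conditions:

1. For each $v_i \in P_X$, if $x^*$ is ranked within the top two positions in $v_i$, then $x^*$ is also ranked within the top two positions in $v'_i$.

2. For each $v'_i$, if the top candidate of $v'_i$ is not in $Y$ then the second-ranked candidate of $v'_i$ is not in $Y$ either.

3. For each $x \in X \setminus \{x^*\}$ and each $v_j \in MP(P_X, x) \cup LP(P_X, x)$, if $x$ is not ranked within the top two positions in $v'_j$, then for each $v_i \in HP(P_X, x)$, $x$ is not ranked within the top two positions in $v'_i$.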

Proof: We consider in turn the different conditions:

1. This is because if there exists $v'_i$ such that $x^*$ is not in the top two positions whereas $x^*$ is in the top two positions in its original vote $v_i$, then we can simply move all of the candidates in $Y$ ranked higher than $x^*$ to the bottom positions. Let $v''_i$ denote the vote obtained this way. By replacing $v'_i$ with $v''_i$, we increase the score of $x^*$ by $1$, and the score of each other candidate by no more than $1$, which means that $x^*$ is still a cowinner.

2. If there exists $v'_i$ such that $x \in X$ is ranked in the top position and $y \in Y$ is ranked in the second position, then we simply obtain $v''_i$ by switching $x$ and $y$.

3. The condition states that for each candidate $x$, whenever we want to reduce its score, we should first try to reduce it by putting a new candidate on top of some vote in $HP(P_X, x)$. This is because by putting $y \in Y$ on top of some vote $v$ in $HP(P_X, x)$, we may use only one extra candidate to reduce by one unit the score of the candidate ranked at the top position of $v$. Formally, suppose there exist $v_i \in HP(P_X, x)$ and $v_j \in MP(P_X, x) \cup LP(P_X, x)$ such that $x$ is within the top two positions of $v'_i$ (the extension of $v_i$) but not within the top two positions of $v'_j$ (the extension of $v_j$). Let $y \in Y$ be any candidate ranked within the top two positions of $v'_j$. Let $v''_j$ denote the vote obtained from $v'_j$ by moving $y$ to the bottom, and let $v''_i$ denote the vote obtained from $v'_i$ by moving $y$ to the top position. Next, we replace $v'_i$ and $v'_j$ by $v''_i$ and $v''_j$, respectively. It follows that the score of each candidate does not change, which means that $x^*$ is still a cowinner. We repeat this procedure until statement (3) is satisfied for every $x$. Since after each iteration there is at least one additional vote that will never be modified again, this procedure ends after $O(n)$ iterations.

Proposition 4.3 simply tells us that when looking for an extension that makes $x^*$ a cowinner, it suffices to restrict our attention to the extensions that satisfy conditions (1) to (3). Moreover, using (1) of Proposition 4.3, we deduce that $S(P', x^*) = S(P_X, x^*)$. Hence, for votes $v_i \in P_X^2$ (the votes in which $x^*$ is ranked in the second position), we can assume that the new candidates of $Y$ are put in bottom positions in $v'_i$.

Define $X^+$ as the set of all candidates $x$ in $X \setminus \{x^*\}$ such that $\Delta(P_X, x) > 0$. Our objective is to reduce all score differences to $0$ for $x \in X^+$, while keeping the score differences of each new candidate non-positive. (We do not have to care about the candidates in $X \setminus (X^+ \cup \{x^*\})$.)

The intuition underlying our algorithm is that when trying to reduce $\Delta(P, x)$ on the current profile $P$, we first try to use the votes in $HP(P, x)$, then the votes in $MP(P, x)$, and finally the votes in $LP(P, x)$. This is because putting some candidates from $Y$ in the top positions in the votes of $HP(P, x)$ not only reduces $\Delta(P, x)$ by one unit, but also creates an opportunity to “pay” one extra candidate from $Y$ to reduce $\Delta(P, x')$ by one unit, where $x'$ is the candidate ranked on top of this vote. For the votes in $MP(P, x)$, we can only reduce $\Delta(P, x)$ by one unit without any other benefit. For the votes in $LP(P, x)$ we will have to use two candidates from $Y$ to bring down $\Delta(P, x)$ by one unit; however, if we already put some $y \in Y$ in the top position in order to reduce $\Delta(P, x')$, where $x'$ is the candidate ranked in the second position in the original vote, then we only need to pay one extra candidate in $Y$ to reduce $\Delta(P, x)$ by one unit. Therefore, the major issue consists in finding the most efficient way to choose the votes in $LP(P, x)$ to reduce $\Delta(P, x)$, when $\Delta(P_X, x) > |HP(P_X, x)| + |MP(P_X, x)|$. We will solve this problem by reducing it to a max-flow problem.

The algorithm is composed of a main function CheckCowinner(.) which comes together with two sub-functions AddNewAlternativeOnTop(.) and BuildMaxFlowGraph(.) that we detail first.

The procedure AddNewAlternativeOnTop simply picks new candidates to be put on top of votes, and then updates the profile. Note that in this procedure, the candidates from $Y$ to be added on top of the votes are those with the lowest score (or the lowest index, in case of ties). This results in choosing new candidates in a cyclic order.

As for the function BuildMaxFlowGraph, it builds the weighted directed graph defined as follows:

• ;

• contains the following weighted edges:

• for each , an edge with weight ;

• for each and each : an edge with weight ; plus, if the candidate in second position in is in , an edge with weight ;

• for each , an edge with weight .

We refer the reader to Figure 2 for an illustration. (Once this graph is constructed, any standard function to compute a flow of maximal value can of course be used). We are now in a position to detail the main function CheckCowinner(.).

###### Proposition

Given a profile $P_X$ on $X$, a candidate $x^* \in X$ and a set of new candidates $Y$, a call to algorithm CheckCowinner returns in polynomial time the answer true if and only if there exists an extension of $P_X$ in which $x^*$ is a cowinner.

Proof: Algorithm 2 starts by partitioning $X \setminus \{x^*\}$ into $X^+$ and $X^-$: an alternative $x$ is in $X^+$ if $\Delta(P_X, x) > 0$ and in $X^-$ if $\Delta(P_X, x) \leq 0$.

Let $x \in X^+$. Then by item (3) of Proposition 4.3, for each vote $v$ in $HP(P_X, x)$, we can safely put one candidate from $Y$ in the top position of $v$; this is done in the first phase of Algorithm 2. Note that after adding a new candidate on top of a vote $v$ and after updating $P$, the modified vote will no longer belong to $HP(P, x)$. Instead, it will now belong to $MP(P, x')$ for some other candidate $x'$.

When Phase 1 is over, the score of $x$ may still need to be lowered, which can be done next by using votes from $MP(P_X, x)$. This is what Phase 2 does. There are three possibilities:

1. $\Delta(P_X, x) \leq |HP(P_X, x)|$. In this case, the votes in $HP(P_X, x)$ are sufficient to make $x^*$ catch up with $x$: after Phase 1, we have $\Delta(P, x) \leq 0$ and Phase 2 is void; we are done with $x$.

2. $\Delta(P_X, x) > |HP(P_X, x)|$ and $\Delta(P_X, x) \leq |HP(P_X, x)| + |MP(P_X, x)|$: in this case, to make $x^*$ catch up with $x$, it is enough to take $\Delta(P_X, x) - |HP(P_X, x)|$ arbitrary votes in $MP(P_X, x)$ and add one new candidate on top of them; this is what Phase 2 does, and after that we are done with $x$.

3. $\Delta(P_X, x) > |HP(P_X, x)| + |MP(P_X, x)|$: in this case, because of Proposition 4.3, we know that it is safe to add one new candidate on top of all votes of $MP(P_X, x)$; this is what Phase 2 does; after that, we still need to lower the score of $x$, which will require adding new candidates on top of votes of $LP(P_X, x)$.

If at this point a newly added candidate has a score higher than that of $x^*$, then $x^*$ cannot win, and we can stop the program.

For readability, let us denote by $P$ the profile obtained after Phases 1 and 2. For each $x$ satisfying condition 3, the only way to reduce $\Delta(P, x)$ is to put two candidates of $Y$ within the top two positions in a vote $v$ of $LP(P, x)$, because in Phases 1 and 2 we have used up all the votes in $HP(P_X, x)$ and $MP(P_X, x)$. Now, reducing $\Delta(P, x)$ by one unit will cost us two candidates in $Y$, but meanwhile, $\Delta(P, x')$ is also reduced by one unit, where $x'$ is the candidate ranked in the second position in $v$. We must have . We note that . Choosing optimally the votes in $LP(P, x)$ for each $x$ can be done by solving an integral max-flow instance which is built by algorithm BuildMaxFlowGraph (note that in the case where either or is empty, we just assume that the flow has a null value).

Let us show that is a possible cowinner if and only if the value of the flow from to is at least