1 Introduction and Motivation
Gerrymandering, namely the creation of district plans with highly asymmetric electoral outcomes in order to disenfranchise voters, has long been a curse to the fairness of electoral systems in the USA despite general public disdain for it. Even though the US Supreme Court ruled in  that gerrymandering is justiciable, it could not agree on an effective way of measuring it. Indeed, a major impediment to removing gerrymandering lies in formulating an effective and precise measure for it that will be acceptable in courts.
In , the US Supreme Court opined  that a measure of partisan symmetry may be a helpful tool to understand and remedy gerrymandering. Partisan symmetry is a standard for defining partisan gerrymandering that involves the computation of counterfactuals, typically under the assumption of uniform swings. To illustrate a lack of partisan symmetry, consider a two-party voting district and suppose that Party A wins by getting  of the total votes and  of the total seats. In such a case, a partisan symmetry standard would hold if Party B would also win  of the seats had it won  of the votes in a hypothetical election. Two frequently cited indicators of a lack of partisan symmetry are cracking, namely dividing the supporters of a specific party among two or more districts when they could be a majority in a single district, and packing, namely filling a district with more and more supporters of a specific party as long as this does not make that party the winner in that district.
There have been many theoretical and empirical attempts at remedying the lack of partisan symmetry by “quantifying” gerrymandering and devising redistricting methods to optimize such quantifications using well-known notions such as compactness and symmetry [25, 7, 6, 8, 26, 18, 3, 16]. Since it is often simply not possible to go over every possible redistricting map to optimize a gerrymandering measure due to rapid combinatorial explosion, researchers such as [22, 30, 7] have also investigated designing efficient algorithmic approaches for this purpose. In particular, a popular gerrymandering measure in the literature is symmetry, which attempts to quantify the discrepancy between the share of votes and the share of seats of a party [26, 18, 3, 16]. In spite of such efforts, success in convincing courts to adopt one or more of these measures has unfortunately been somewhat limited to date.
Recently, researchers Stephanopoulos and McGhee introduced in two papers [23, 29] a new gerrymandering measure called the “efficiency gap”. Informally speaking, the efficiency gap measures the absolute difference in total wasted votes between the parties in a two-party electoral system, and redistricting based on it attempts to minimize this difference. This measure is very promising in several respects. Firstly, it provides a mathematically precise measure of gerrymandering with many desirable properties. Equally importantly, at least from a legal point of view, this measure was found legally convincing by a US appeals court in a case claiming that the legislative map of the state of Wisconsin is gerrymandered; the case is now pending before the US Supreme Court .
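For concreteness, the wasted-vote arithmetic behind the measure can be sketched in a few lines of Python. This is only an illustrative reading of the informal definition above (all votes for a district's losing party are wasted, as are the winner's votes beyond the half needed to win, with ties going to Party A); the precise optimization problem is formalized in Section 2.

```python
def wasted_votes(district):
    """Return (wasted_A, wasted_B) for one district given as (votes_A, votes_B).

    All of the losing party's votes are wasted; the winner's votes beyond
    half the district total are wasted.  Ties go to Party A, mirroring the
    convention used in Section 2.
    """
    a, b = district
    half = (a + b) / 2.0
    if a >= b:
        return a - half, float(b)
    return float(a), b - half


def efficiency_gap(districts):
    """Absolute efficiency gap of a district map: the absolute difference
    of the parties' total wasted votes, normalized by all votes cast."""
    wasted_a = sum(wasted_votes(d)[0] for d in districts)
    wasted_b = sum(wasted_votes(d)[1] for d in districts)
    total_votes = sum(a + b for a, b in districts)
    return abs(wasted_a - wasted_b) / total_votes
```

For example, the 3-district map [(60, 40), (60, 40), (40, 60)] wastes 60 votes of Party A and 90 of Party B out of 300 cast, giving an efficiency gap of 0.1.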
1.1 Informal Overview of Our Contribution and Its Significance
Redistricting based on minimizing the efficiency gap measure, however, requires one to solve a combinatorial optimization problem. To this effect, the contributions of this article are as follows:
As a necessary first step towards investigating the efficiency gap measure, in Section 2 we first formalize the optimization problem that corresponds to minimizing the efficiency gap measure.
Subsequently, in Section 3 we study the mathematical properties of the formalized version of the measure. Specifically, Lemma 1 and Corollary 2 show that the efficiency gap measure attains only a finite discrete set of rational values; these properties are of considerable importance in understanding the sensitivity of the measure and in designing efficient algorithms for computing this measure.
Next, in Sections 4 and 5 we investigate the computational complexity and algorithm design issues of redistricting based on the efficiency gap measure. Although Theorem 4 shows that in theory one can construct artificial pathological examples for which designing efficient algorithms is provably hard, Theorem 10 and Theorem 11 provide justification for why the result in Theorem 4 is overly pessimistic for real data that do not necessarily correspond to these pathological examples. For example, assuming that the districts are geometrically compact (-convex in our terminology), Theorem 11 shows how to find, in polynomial time, a district map that minimizes the efficiency gap.
Finally, to show that it is indeed possible in practice to minimize the efficiency gap, in Section 6 we design a fast randomized algorithm for this problem based on the local search paradigm in combinatorial optimization (cf. Fig. 4). The resulting software was tested on electoral data for the election of the (federal) House of Representatives for the US states of Wisconsin [36, 35], Texas [38, 37], Virginia [33, 34] and Pennsylvania [31, 32]. The results computed by our fast algorithm are truly outstanding: the final efficiency gap was lowered to , , and  from , , and  for Wisconsin, Texas, Virginia and Pennsylvania, respectively, in a small amount of time. Our empirical results clearly show that it is very much possible to design and implement a very fast algorithm that can “un-gerrymander” (with respect to the efficiency gap measure) the gerrymandered US House districts of four US states.
Based on these empirical results, we also provide some interesting insights into three practical issues related to the efficiency gap measure, namely issues pertaining to seat gain vs. efficiency gap, compactness vs. efficiency gap, and the naturalness of the original gerrymandered districts.
To the best of our knowledge, our results are the first algorithmic analysis and implementation of minimization of the efficiency gap measure. Our results show that it is practically feasible to redraw district maps in a small amount of time to remove gerrymandering based on the efficiency gap measure. Thus, should the Supreme Court uphold the ruling of the lower court, our algorithm and its implementation will be a necessary and valuable asset for removing partisan gerrymandering.
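The local search paradigm underlying our approach can be illustrated generically. The sketch below is a hypothetical minimal skeleton of randomized local search (it is not the algorithm of Fig. 4, and the function names are invented for exposition): it repeatedly samples a random neighbor of the current solution and moves to it only when the objective strictly improves.

```python
import random

def local_search(initial, neighbor, cost, max_iters=10000, seed=0):
    """Generic randomized local search: repeatedly sample a random
    neighbor of the current solution and keep it only if it strictly
    lowers the cost.  Returns the best solution found and its cost."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    for _ in range(max_iters):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        if candidate_cost < current_cost:
            current, current_cost = candidate, candidate_cost
    return current, current_cost
```

For Min-wvp, a solution would be a district map, a neighbor would be obtained by reassigning a boundary cell between adjacent districts, and the cost would be the total absolute efficiency gap.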
1.2 Beyond scientific curiosity: impact on US judicial system
Beyond its implications for the science of gerrymandering, we expect our algorithmic analysis and results to have a beneficial impact on the US judicial system as well. Some justices, whether at the Supreme Court level or in lower courts, seem reluctant to take mathematics, statistics and computing seriously [28, 12]. For example, during the hearing in the previously cited most recent US Supreme Court case on gerrymandering , some justices opined that the math was unwieldy, complicated, and newfangled. One justice called it baloney, and Chief Justice John Roberts dismissed the attempts to quantify partisan gerrymandering by saying
“It may be simply my educational background, but I can only describe it as sociological gobbledygook.”
Our theoretical and computational results show that the math, whether complicated or not (depending on one’s background), can in fact yield fast accurate computational methods that can indeed be applied to un-gerrymander the currently gerrymandered maps.
1.3 Some Remarks and Explanations Regarding the Technical Content of This Paper
To avoid any possible misgivings or confusion regarding the technical content of the paper, as well as to help the reader understand the remaining content of this article, we believe the following comments and explanations may be relevant. We encourage the reader to read this section and explore the references mentioned therein before proceeding further.
We employ a randomized local-search heuristic for combinatorial optimization in our algorithm in Fig. 4. Our algorithmic paradigm is quite different from Markov Chain Monte Carlo simulation, simulated annealing, Bayesian methods and other related approaches (e.g., [19] by Fifield et al., or by Cho and Liu ).
While we do provide several non-trivial theoretical algorithmic results, we do not provide any theoretical analysis of the randomized algorithm in Fig. 4. The justification for this is that, due to Theorem 4 and Lemma 12, no such non-trivial theoretical algorithmic complexity results exist in general assuming P for deterministic local-search algorithms or assuming RP for randomized local-search algorithms. One can attribute this to the usual “difference between theory and practice” doctrine.
For readers unfamiliar with the complexity-theoretic assumptions P and RP, these are core complexity-theoretic assumptions that have been routinely used for decades in the field of algorithmic complexity analysis. For example, starting with the famous Cook’s theorem  in and Karp’s subsequent paper in , the P assumption is the central assumption in structural complexity theory and algorithmic complexity analysis. For a detailed technical coverage of the basic structural complexity field, we refer the reader to the excellent textbook .
In this article we use data at the county level as opposed to data at a finer (more granular) level such as the “Voting Tabulation District” (VTD) level (VTDs are the smallest units in a state for which election data are available). The reason for this is as follows. Note that our algorithmic approach already returns an efficiency gap of below  for three states (namely, WI, TX and VA), and for PA it cuts down the current efficiency gap by a factor of about  (cf. Table 2). This, together with the observation in [29, pp. 886-888] that the efficiency gap should not be minimized to a very low value in order to avoid unintended consequences, shows that even just by using county-level data our algorithm can already output almost desirable (if not truly desirable) values of the efficiency gap measure. Thus, by the Occam’s razor principle widely used in computer science, we should not use more data at finer levels. (Occam’s razor principle  states that “Entia non sunt multiplicanda praeter necessitatem”, i.e., more things should not be used than are necessary; it is also known as the rule of parsimony in a biological context , and overfitting is an example of a violation of this principle.) In fact, using more data at a finer level may lead to what is popularly known as “overfitting” in the context of machine learning and elsewhere, which may hide an algorithm’s true performance on yet unexplored maps. In this context, our suggestion to future algorithmic researchers in this direction is to use the minimal amount of data that is truly necessary to generate an acceptable solution.
In this article we do not compare our approach empirically to those in the existing literature such as [19, 13, 8]. The reason is that, to the best of our knowledge, there is currently no other published work that provides software to optimize the efficiency gap measure; in fact, it would be grossly unfair to other existing approaches to compare our results with theirs. For example, suppose we consider an optimal result using an approach from  and find that it gives an efficiency gap of  whereas the approach in this article gives an efficiency gap of . It would be grossly unfair to say that, based on this comparison, our algorithm is better than the one in , since the authors of  never intended to minimize the efficiency gap. Furthermore, even the two maps cannot be compared directly by geometric methods, since no court has so far established a firm and unequivocal ground truth on gerrymandering by issuing a ruling of the following form:
[court]: “a district map is gerrymandered if and only if such-and-such conditions are satisfied”
(the line above is crossed out simply to doubly clarify that such a ruling does not exist).
For certain scientific research problems, algorithmic comparisons are possible because of the existence of ground truths (also called “gold standards” or “benchmarks”). For example, different algorithmic approaches for reverse engineering causal relationships between components of a biological cellular system can be compared by evaluating how close the methods under investigation come to recovering known gold-standard networks, using widely agreed-upon metrics such as recall rates or precision values . Unfortunately, for gerrymandering this is not the case and, in our opinion, comparisons of gerrymandering algorithms that optimize substantially different objectives should be viewed with a grain of salt.
The main research goal of this paper is to minimize the efficiency gap measure exactly as introduced by Stephanopoulos and McGhee in [23, 29]. However, should future researchers wish to introduce additional computable constraints or objectives, such as compactness or respect for community boundaries, on top of our efficiency gap minimization algorithm, it is a conceptually easy task to modify our algorithm in Fig. 4 for this purpose. For example, to introduce compactness on top of minimization of the efficiency gap measure, the following two lines in Fig. 4
should be changed to something like (changes are indicated in bold):
and each of are compact then
and appropriate minor changes can be made to other parts of the algorithm for consistency with this modification.
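As a purely hypothetical illustration of such a modification (the function name and signature below are invented for exposition and do not appear in Fig. 4), the acceptance test of a local-search move could be extended with a user-supplied compactness predicate:

```python
def accept_move(candidate_gap, best_gap, districts, is_compact):
    """Hypothetical acceptance test for a local-search move: accept a
    candidate map only if it strictly improves the efficiency gap AND
    every district passes the supplied compactness predicate."""
    return candidate_gap < best_gap and all(is_compact(d) for d in districts)
```

Any computable predicate (e.g., a threshold on a standard compactness score) could be plugged in without otherwise changing the search loop.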
2 Formalization of the Optimization Problem to Minimize the Efficiency Gap Measure
Based on [23, 29], we abstract our problem in the following manner. We are given a rectilinear polygon  without holes. Placing  on a unit grid of size , we identify the individual unit square (a “cell”) in the  row and  column of  by  for  and  (see Fig. 1). For each cell , we are given the following three integers:
an integer (the “total population” inside ), and
two integers (the total number of voters for Party A and Party B, respectively) such that .
Let  denote the “size” (number of cells) of . For a rectilinear polygon  included in the interior of  (i.e., a connected subset of the interior of ), we define the following quantities:
- Party affiliations in :
- Population of :
- Efficiency gap of :
Note that if then , i.e., in case of a tie, we assume Party A is the winner. Also, note that if and only if either or .
Our problem can now be defined as follows.
Problem name: -district Minimum Wasted Vote Problem (Min-wvp). Input: a rectilinear polygon  with  for every cell , and a positive integer . Definition: a -equipartition of  is a partition of the interior of  into exactly  rectilinear polygons, say , such that . Assumption:  has at least one -equipartition. Valid solution: any -equipartition of . Objective: minimize the total absolute efficiency gap (note that our notation uses the absolute value for  but not for the individual ’s). Notation: .
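A validity check for a candidate solution follows directly from this definition: every district must have the required population and must be a connected set of cells. The sketch below (an illustrative checker assuming 4-neighbor grid adjacency, not code from our implementation) verifies both conditions:

```python
from collections import deque

def is_valid_equipartition(districts, required_pop, pop):
    """Check that each district is a connected set of grid cells under
    4-neighbor adjacency and has exactly the required total population.
    `districts` is a list of cell lists; `pop` maps a cell (i, j) to its
    population."""
    for cells in districts:
        cells = set(cells)
        if sum(pop[c] for c in cells) != required_pop:
            return False
        # Breadth-first search to verify connectivity of the district.
        start = next(iter(cells))
        seen, queue = {start}, deque([start])
        while queue:
            i, j = queue.popleft()
            for nb in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if nb in cells and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        if seen != cells:
            return False
    return True
```

For instance, splitting a 2×2 grid of unit-population cells into its two rows is a valid 2-equipartition, while pairing the two diagonals is not (the diagonal pairs are disconnected).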
3 Mathematical Properties of Efficiency Gaps: Set of Values Attainable by the Efficiency Gap Measure
The following lemma sheds some light on the set of rational numbers that the total efficiency gap of a -equipartition can take. As an illustrative example, if we just partition the polygon into regions, then can only be one of the following possible values:
For any -equipartition of , always assumes one of the values of the form for .
If for some and some -equipartition of , then .
Proof. To prove (a), consider any -equipartition of with . Note that for any we have where
Letting be the number of ’s that are equal to , it follows that
To prove (b), note that, since for any , we have and . ❑
Using the reverse triangle inequality of norms, the absolute difference between two successive values of is given by
4 Approximation Hardness Result for Min-wvp
Recall that, for any , an approximation algorithm with an approximation ratio of  (or simply an -approximation) is a polynomial-time algorithm that produces a solution of value at most  times the value of an optimal solution .
Assuming , for any rational constant , the Min-wvp problem for a rectilinear polygon does not admit a -approximation algorithm for any and all .
Since the PARTITION problem is not strongly -complete (i.e., it admits a pseudo-polynomial-time algorithm), the approximation-hardness result in Theorem 4 does not hold if the total population is polynomial in  (i.e., if  for some positive constant ). Indeed, if  is polynomial in , then it is easy to design a polynomial-time exact algorithm via dynamic programming for those instances of the Min-wvp problem that appear in the proof of Theorem 4.
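For readers unfamiliar with pseudo-polynomial-time algorithms, the standard dynamic program for PARTITION (sketched below; this is the textbook subset-sum technique, not the Min-wvp algorithm itself) illustrates the idea: its running time is proportional to the number of items times their total sum, which is polynomial whenever the numbers involved are polynomially bounded.

```python
def partition_exists(nums):
    """Standard pseudo-polynomial dynamic program for PARTITION: decide
    whether `nums` splits into two subsets of equal sum.  The set of
    reachable subset sums (up to half the total) is grown one item at a
    time, so the running time is O(len(nums) * sum(nums))."""
    total = sum(nums)
    if total % 2 == 1:
        return False
    target = total // 2
    reachable = {0}
    for x in nums:
        reachable |= {s + x for s in reachable if s + x <= target}
    return target in reachable
```

For example, {1, 5, 11, 5} splits as {11} and {1, 5, 5}, while {1, 2, 5} admits no equal split.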
Proof of Theorem 4. We reduce from the -complete PARTITION problem , which is defined as follows: given a set of positive integers , decide whether there exists a subset  such that . Note that we can assume without loss of generality that  is sufficiently large and that each of  is a multiple of any fixed positive integer. For notational convenience, let .
Let be such that (we will later show that is at most the constant ). Our rectilinear polygon , as illustrated in Fig. 2 (A), consists of a rectangle of size with additional cells attached to it in any arbitrary manner to make the whole figure a connected polygon without holes. For convenience, let be the set of the additional cells. The relevant numbers for each cell are as follows:
First, we show how to select a rational constant such that any integer in the range can be realized. Assume that for some . Since the following calculations hold:
Claim 4.1 for each , and moreover each must be a separate partition by itself in any -equipartition of .
Proof. By straightforward calculation, . Since and , each partition in any -equipartition of must have a population of and thus each of population must be a separate partition by itself. ❑
Using Claim 4.1, we can simply ignore all  in the calculation of the efficiency gap of a valid solution of , and it follows that the total efficiency gap of a -equipartition of  is identical to that of a -equipartition of . The proof of the theorem then follows once we prove the following two claims.
If the PARTITION problem does not have a solution then .
If the PARTITION problem has a solution then .
Proof of soundness (refer to Fig. 2 (B))
Suppose that there exists a valid solution (i.e., a -equipartition) of Min-wvp for with , and let . Then,
and thus is a valid solution of PARTITION, a contradiction!
Thus, assume that both and belong to the same partition, say . Then, since , every must belong to . Moreover, every with must belong to since otherwise will not be a connected region. This provides , showing that is indeed a valid solution (i.e., a -equipartition) of Min-wvp for . The total efficiency gap of this solution can be calculated as
Proof of completeness (refer to Fig. 2 (C))
Suppose that there is a valid solution  of PARTITION and consider the two polygons
By straightforward calculation, it is easy to verify the following:
, , and thus is a valid solution (i.e., a -equipartition) of Min-wvp for .
5 Efficient Algorithms Under Realistic Assumptions
Although Theorem 4 seems to render the problem Min-wvp intractable in theory, our empirical results show that the problem is computationally tractable in practice. This is because in real-life applications, many constraints in the theoretical formulation of Min-wvp are often relaxed. For example:
- (i) Restricting district shapes:
Individual partitions of the -equipartition of may be restricted in shape. For example, states require their legislative districts to be reasonably compact and states require congressional districts to be compact .
- (ii) Variations in district populations:
A partition of  is only approximately a -equipartition, i.e.,  are approximately, but not exactly, equal to . For example, the usual federal standards require equal population “as nearly as is practicable” for congressional districts but allow a more relaxed “substantially equal population” standard (e.g., no more than  deviation between the largest and smallest district) for state and local legislative districts .
- (iii) Bounding the efficiency gap measure away from zero:
A -equipartition of  is a valid solution only if  for some . Indeed, the authors who originally proposed the efficiency gap measure provided in [29, pp. 886-887] several reasons for not requiring  to be either zero or too close to zero.
In this section, we explore algorithmic implications of these types of relaxations of constraints for Min-wvp.
5.1 The Case of Two Stable and Approximately Equal Partitions
This case considers constraints (ii) and (iii). The following definition of “near partitions” formalizes the concept of variations in district populations.
Definition 5 (Near partitions)
Let , and let be an instance of Min-wvp. Let be such that is a partition of , such that for each , we have
for some . Then we say that is a -near partition.
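As an illustrative reading of this definition (the exact inequality is the one displayed above; the sketch below assumes it means that each district population lies within a multiplicative (1 ± ε) factor of the ideal district population), an ε-near condition can be checked as follows:

```python
def is_near_partition(district_pops, total_pop, k, eps):
    """Check an assumed reading of the epsilon-near condition: every
    district population lies within a (1 +/- eps) factor of the ideal
    district population total_pop / k."""
    ideal = total_pop / k
    return all((1 - eps) * ideal <= p <= (1 + eps) * ideal
               for p in district_pops)
```

For example, with a total population of 200 split into 2 districts and ε = 0.1, populations (95, 105) are acceptable while (80, 120) are not.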
The next definition of “stability” formalizes the concept of bounding away from zero the efficiency gap of each partition.
Definition 6 (Stability)
Let , let be an instance of Min-wvp, and let be a partition for . We say that is -stable, for some , if for all , we have
Definition 7 (Canonical solution)
Let  be an instance of Min-wvp. Let . Let  be the partition of  obtained as follows: we partition  into rectangles, where each rectangle consists of the intersection of  rows with  columns, except possibly for the rectangles that are incident to the right-most and bottom-most boundaries of . We refer to the rectangles in  as -basic (see Fig. 3). Let  be the set of cells consisting of the union of the left-most column of , the top row of , and, for each -basic rectangle , the bottom row of  and the right-most column of , except for the cell that is next to the top cell of that column (see Fig. 3). We refer to  as the -basic tree. For each -basic rectangle , we define its interior to be the set of cells in  that are at distance at least  from . A solution  of  is called -canonical if it satisfies the following properties.
(2) Let  be a -basic rectangle, and let  be its interior. For each , let  be the set of connected components of . Then, for each , there exists a unique cell in  that is adjacent to both  and . Moreover, all other cells in  are in  (see Fig. 3).
Let be an instance of Min-wvp. Suppose that there are no empty cells and the maximum number of people in any cell of is . Let , and suppose that there exists a -stable solution for . Then, for any , at least one of the following conditions hold:
(1) Either or is contained in some -basic rectangle.
(2) There exists a -canonical -near solution of , for some , such that for all , we have , and .
Proof. It suffices to show that if condition (1) does not hold, then condition (2) does. We define a partition  of  as follows. We initialize  to be empty. Let  be the -basic tree. We add  to . For each -basic rectangle , let  be its interior. For each , let  be the set of connected components of . Since  is a valid solution, we have that  is connected. Since condition (1) does not hold, it follows that  intersects at least two -basic rectangles. Therefore, each component  must contain some cell on the boundary of . By construction,  must be incident to some cell that is incident to . We add  to . We repeat this process for all basic rectangles and for all components as above. Finally, we define . This completes the definition of the partition  of . It remains to show that this is the desired solution.
First, we need to show that is a valid solution. To that end, it suffices to show that both and are connected. The fact that is connected follows directly from its construction. To show that is connected we proceed by induction on the construction of . Initially, consists of just the cells in , and thus its complement is clearly connected. When we consider a component , we add to . Since we add only a single cell that is incident to both and , it follows inductively that remains simply connected (that is, it does not contain any holes), and therefore its complement remains connected. This concludes the proof that both and are connected, and therefore is a valid solution.
The solutions and can disagree only on cells that are not in the interior of any basic rectangle. All these cells are contained in the union of rows and columns. Thus, the total number of voters in these cells is at most . It follows that for each , we have , and .
Since there are no empty cells, we have . It follows that for all , we have
Thus is -near, for some , which concludes the proof. ❑
Let . Let be an instance of Min-wvp. Suppose that there are no empty cells and the maximum number of people in any cell of is . Suppose that there exists a -stable partition of . Then, for any fixed , there exists an algorithm which given computes some -near partition of , for some , such that for all , we have , and , in time .
Proof. We can check whether there exists a partition  satisfying the conditions and such that either  or  is contained in the interior of a single -basic rectangle. This can be done by trying all -basic rectangles and all possible subsets of the interior of each -basic rectangle, in time .
It remains to consider the case where neither of and is contained in the interior of any -basic rectangle. It follows that condition (2) of Lemma 8 holds. That is, there exists some -canonical -near solution of , for some , such that for all , we have , and . We can compute such a partition via dynamic programming, as follows. Let be the union of the interiors of all -basic rectangles. By the definition of a canonical partition, it suffices to compute and . Since , it suffices to compute . Let , and . Clearly, . Thus there are at most different values for the pair