Assertion-based Approaches to Auditing Complex Elections, with application to party-list proportional elections

07/25/2021
by Michelle Blom, et al.
The University of Melbourne

Risk-limiting audits (RLAs), an ingredient in evidence-based elections, are increasingly common. They are a rigorous statistical means of ensuring that electoral results are correct, usually without having to perform an expensive full recount – at the cost of some controlled probability of error. A recently developed approach for conducting RLAs, SHANGRLA, provides a flexible framework that can encompass a wide variety of social choice functions and audit strategies. Its flexibility comes from reducing sufficient conditions for outcomes to be correct to canonical ‘assertions’ that have a simple mathematical form. Assertions have been developed for auditing various social choice functions including plurality, multi-winner plurality, super-majority, Hamiltonian methods, and instant runoff voting. However, there is no systematic approach to building assertions. Here, we show that assertions with linear dependence on transformations of the votes can easily be transformed to canonical form for SHANGRLA. We illustrate the approach by constructing assertions for party-list elections such as Hamiltonian free list elections and elections using the D'Hondt method, expanding the set of social choice functions to which SHANGRLA applies directly.


1 Introduction

Risk-limiting audits (RLAs) test reported election outcomes statistically by manually inspecting random samples of paper ballots. An RLA terminates either by endorsing the reported outcome or by proceeding to a full manual count if the evidence is inconclusive. The outcome according to the full count corrects the reported outcome if they differ. The risk limit is an upper bound on the probability that a wrong election outcome will not be corrected—this is set in advance, typically between 1% and 10%.

SHANGRLA [5] is a general framework for conducting RLAs of a wide variety of social choice functions. (Any social choice function that is a scoring rule, i.e. one that assigns ‘points’ to candidates on each ballot, sums the points across ballots, and declares the winner(s) to be the candidate(s) with the most ‘points’, can be audited using SHANGRLA, as can some social choice functions that are not scoring rules, such as super-majority and IRV.) SHANGRLA involves reducing the correctness of a reported outcome to the truth of a set of quantitative assertions about the set of validly cast ballots, which can then be tested using statistical methods. The assertions are either true or false depending on the votes on the ballots. If all the assertions are true, the reported outcome is correct.

This paper shows how to use the SHANGRLA RLA method to audit some complex social choice functions not addressed in the SHANGRLA paper. We give a recipe for translating sufficient conditions for a reported outcome to be correct into canonical form for SHANGRLA, when those conditions are the intersection of a set of linear inequalities involving transformations of the votes on each ballot. We focus on European-style party-list proportional representation elections, with the German state of Hesse as a case study.

1.1 Assertion-based auditing: Properties and challenges

For some social choice functions, the reduction to assertions is obvious. For instance, in plurality (first-past-the-post) elections, common in the United States, Alice won the election if and only if Alice’s tally was higher than that of each of the other $n - 1$ candidates (where $n$ is the total number of candidates). That set of assertions is clearly a set of linear inequalities among the vote totals for the candidates.

In general, assertions involve not only the votes but also the reported results—the reported outcome and possibly the voting system’s interpretation of individual ballots (CVRs) or tallies of groups of ballots.

SHANGRLA [5, Sec 2.5] shows how to make assorters for any ‘scoring rule’ (e.g. Borda, STAR-voting, and any weighted scheme). For more complex social choice functions, constructing sufficient sets of assertions may be much less obvious. Blom et al. [2] use a heuristic method, RAIRE, to derive assertions for Instant Runoff Voting (IRV) from the CVRs. RAIRE allows the RLA to test an IRV outcome (the claim that Alice won) without checking the entire IRV elimination. RAIRE’s assertions are sufficient: if all of the assertions in the set $\mathcal{A}$ that it generates are true, then the announced election outcome is correct. However, the set of assertions might not be necessary: even if one of the assertions in $\mathcal{A}$ is false, Alice may still have won, but for reasons not checked by the audit.

A social choice function might be expensive to audit for two different reasons: it might require a very large sample for reasonable confidence, even when there are no errors (for instance, if it tends to produce small margins in practice); alternatively, it might be so complex that it is difficult to generate assertions that are sufficient to prove the reported election outcome is correct. Pilots and simulations suggest that IRV elections do not have small margins any more often than first-past-the-post elections, and RAIRE generates sufficient assertions for them. Hence IRV is feasible to audit in both senses.

Below, the sets of assertions we consider are conjunctive: the election outcome is correct if all the assertions in $\mathcal{A}$ are true. Although it is possible to imagine an audit method that tests more complex logical structures (for example, the announced outcome is correct if either all the assertions in $\mathcal{A}_1$ or all the assertions in $\mathcal{A}_2$ are true), this is not currently part of the SHANGRLA framework.

Summary:

An audit designer must devise a set $\mathcal{A}$ of assertions.

  • $\mathcal{A}$ generally depends on the social choice function and the reported electoral outcome, and may also depend on the CVRs, vote subtotals, or other data generated by the voting system.

  • If every assertion in $\mathcal{A}$ is true, then the announced electoral result is correct.

  • The announced electoral result may be correct even if not every assertion in $\mathcal{A}$ is true.

SHANGRLA relies on expressing assertions in terms of assorters.

1.2 Assorters

The statistical part of SHANGRLA is agnostic about the social choice function. It simply takes a collection of sets of numbers that are zero or greater (with a known upper bound), and decides whether to reject the hypothesis that the mean of each set is less than or equal to $1/2$; this is the assorter null hypothesis.

An assorter for some assertion $A \in \mathcal{A}$ assigns a nonnegative value to each ballot, depending on the selections the voter made on the ballot and possibly other information (e.g. reported vote totals or CVRs). The assertion is true iff the mean of the assorter (over all ballots) is greater than $1/2$. Generally, ballots that support the assertion score higher than $1/2$, ballots that cast doubt on it score less than $1/2$, and neutral ballots score exactly $1/2$. For example, in a simple first-past-the-post contest, $A$ might assert that Alice’s tally is higher than Bob’s. The corresponding assorter would assign 1 to a ballot if it has a vote for Alice, 0 if it has a vote for Bob, and $1/2$ if it has no valid vote or a vote for some other candidate.
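The Alice-versus-Bob example can be sketched in a few lines of Python. This is only an illustration: the function name `pairwise_assorter`, the encoding of a ballot as a candidate name (or `None` for no valid vote), and the ten made-up ballots are assumptions, not part of SHANGRLA.

```python
from statistics import mean

def pairwise_assorter(vote, winner, loser):
    """Score one single-vote ballot for the assertion 'winner beat loser'.

    `vote` is the name of the chosen candidate, or None if the ballot
    has no valid vote (hypothetical encoding, for illustration only).
    """
    if vote == winner:
        return 1.0   # supports the assertion
    if vote == loser:
        return 0.0   # casts doubt on the assertion
    return 0.5       # neutral: other candidate or no valid vote

# Ten made-up ballots: 5 for Alice, 3 for Bob, 1 for Carol, 1 invalid.
ballots = ["Alice"] * 5 + ["Bob"] * 3 + ["Carol"] + [None]

assorter_mean = mean(pairwise_assorter(v, "Alice", "Bob") for v in ballots)
assertion_holds = assorter_mean > 0.5
```

Here the assorter mean is (5 + 0 + 0.5 + 0.5)/10 = 0.6, which exceeds 1/2 exactly because Alice has more votes than Bob.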

The audit designer’s first job is to generate a set of assertions which, if all true, imply that the announced electoral outcome (the winner or winners) is correct. Then they need to express each as an assorter. Finally, they need to test the hypothesis that any assorter mean is less than or equal to 1/2. If all those hypotheses are rejected, the audit concludes that the reported outcome is correct. The chance this conclusion is erroneous is at most the risk limit.

Section 2 gives a more precise definition of an assorter and a general technique for transforming linear assertions into assorters.

1.3 Risk-limiting audits using SHANGRLA: Pulling it all together

An overview of the workflow for a sequential SHANGRLA RLA is:

  1. Generate a set of assertions $\mathcal{A}$.

  2. Express the assertions as assorters.

  3. Test every assertion in $\mathcal{A}$, in parallel:

    1. Retrieve a ballot or set of ballots selected at random.

    2. Apply each assorter to every retrieved ballot.

    3. For each assertion in $\mathcal{A}$, test its corresponding assorter null hypothesis (i.e. that the assorter mean is $\le 1/2$) using a sequentially valid test. (It can be more efficient to sample ballots in ‘rounds’ rather than singly; SHANGRLA can accommodate any valid test of the assorter nulls.)

    4. If the assorter null is rejected for $A \in \mathcal{A}$, remove $A$ from $\mathcal{A}$.

    5. If $\mathcal{A}$ is empty (i.e. all of the null hypotheses have been rejected), stop the audit and certify the electoral outcome.

    6. Otherwise, continue to sample more ballots.

    7. At any time, the auditor can decide to ‘cut to the chase’ and conduct a full hand count: anything that increases the chance of conducting a full hand count cannot increase the risk.
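The loop above can be sketched as follows. This is a toy under stated assumptions, not the SHANGRLA implementation: ballots are drawn with replacement, and each assorter null is tested with a simple fixed-bet test martingale (multiply the running "wealth" by 1 + bet·(x − 1/2) and reject once it reaches 1/risk limit), which is a stand-in for the Kaplan–Kolmogorov test used later in the paper. The names `run_audit` and `alice_beats`, and the `max_draws` cut-off that triggers a full hand count, are hypothetical.

```python
import random

def alice_beats(loser):
    """Assorter for 'Alice's tally exceeds loser's' in a single-vote contest."""
    def h(vote):
        return 1.0 if vote == "Alice" else 0.0 if vote == loser else 0.5
    return h

def run_audit(ballots, assorters, risk_limit=0.05, bet=1.0, max_draws=1000, seed=0):
    """Sequentially test every assorter null (mean <= 1/2) in parallel."""
    assert 0.0 <= bet <= 2.0            # keeps every factor non-negative
    rng = random.Random(seed)
    wealth = {name: 1.0 for name in assorters}   # open assertions
    draws = 0
    while wealth and draws < max_draws:
        vote = rng.choice(ballots)               # sample with replacement
        draws += 1
        for name in list(wealth):
            x = assorters[name](vote)
            wealth[name] *= 1.0 + bet * (x - 0.5)
            if wealth[name] >= 1.0 / risk_limit: # null rejected
                del wealth[name]                 # assertion confirmed
    outcome = "certify" if not wealth else "full hand count"
    return outcome, draws

assorters = {"Alice>Bob": alice_beats("Bob"), "Alice>Carol": alice_beats("Carol")}
outcome, draws = run_audit(["Alice"] * 10, assorters)
```

With every ballot for Alice, each draw multiplies both wealths by 1.5, so both nulls are rejected at the eighth draw (1.5^8 ≈ 25.6 ≥ 20). Giving up after `max_draws` and counting by hand can only increase the chance of a full count, so it does not increase the risk.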

As with any RLA, the audit may not confirm the reported result (for example, that Alice’s tally is the highest) even if all assertions are true (Alice’s tally may actually be higher than Bob’s, but the audit may not gather enough evidence to conclude so). This may happen because there are many tabulation errors or because one or more margins are small. When the audit proceeds to a full hand count, its result replaces the reported outcome if the two differ.

Conversely, the audit may mistakenly confirm the result even if the announced result is wrong. The probability of this kind of failure is not more than the risk limit. This is a parameter to SHANGRLA; setting it to a smaller value generally entails examining more ballots.

1.4 Party-list proportional representation contests

Party-list proportional representation contests allocate seats in a parliament (or delegates to an assembly) in proportion to the entities’ popularity within the electorate. The first step is (usually) rounding the party’s fraction down to the nearest integer number of seats. Complexity arises from rounding, when the fractions determined by voters do not exactly match integer numbers of seats. Largest Remainder Methods, also called Hamiltonian methods, successively allocate leftover seats to the entities with the largest fractional parts until all seats are allocated. Highest Averages Methods, such as the D’Hondt method (also called Jefferson’s method), weight this extra allocation by divisors involving a fraction of the seats already allocated to that party; they are hence more likely to allocate the leftover seats to small parties. The Sainte-Laguë method (also called Webster’s method) is mathematically similar but its divisors penalise large parties even more. (Another source of complexity is the opportunity for voters to select, exclude, or prioritise individual candidates within the party.)

1.5 Related work and our contribution

Blom et al. [1] showed how to construct a SHANGRLA RLA for preferential Hamiltonian elections with a viability threshold, applicable to many US primaries. Stark and Teague [4] showed how to construct an RLA for highest averages party-list proportional representation elections. Their method was not directly based on assertions and assorters, but it reduces the correctness of the reported seat allocation to a collection of two-entity plurality contests, for which it is straightforward to construct assorters, as we show below.

This paper shows how to extend SHANGRLA to additional social choice functions. We use party-list proportional representation elections as an example, showing how the assorter from [1] can be derived as a special case of the solution for more general Hamiltonian elections. We have simulated the audit on election data from the German state of Hesse; results are shown in Section 4. Auditing the allocation of integer portions of seats involves inspecting a reasonable number of ballots, but the correctness of the allocations based on the fractional remainders and the correctness of the particular candidates who receive seats within each party generally involve very small margins, which in turn require large audit sample sizes. We also show how to apply the construction to highest averages methods such as D’Hondt and Sainte-Laguë. Our contributions are:

  • A guide to developing assertions and their corresponding SHANGRLA assorters, so that audits for contest types that are not already supplied can be derived, when correctness can be expressed as the intersection of a set of linear inequalities (Section 3).

  • New SHANGRLA-based methods for auditing largest remainder methods that allow individual candidate selection (no audit method was previously known for this variant of largest remainder method) (Section 3.1).

  • Simulations to estimate the average sample sizes of these new methods in the German state of Hesse (Section 4).

  • SHANGRLA assorters for highest averages methods (RLAs for these methods were already known, but had not been expressed as assorters) (Section 5).

2 Preliminaries

2.1 Nomenclature and notation for assertion-based election audits

An election contest is decided by a set of ‘ground truth’ ballots $\mathcal{L}$ (of cardinality $|\mathcal{L}|$). Many social choice functions are used in political elections. Some yield a single winner; others multiple winners. Some only allow voters to express a single preference; others allow voters to select or rank multiple candidates or parties.

Here, we focus on elections that allow voters to select (but not rank) one or more ‘entities,’ which could be candidates or parties. (Below, in discussing assorters, we use the term ‘entity’ more abstractly. For instance, when voters may rank a subset of entities, the assorters may translate ranks into scoring functions in a nonlinear manner, as in [2]; we do not detail that case here.)

Let $S$ be the number of ‘seats’ (positions) to be filled in the contest, of which $a_e$ were awarded to entity $e$. Each ballot might represent a single vote for an entity, or multiple votes for multiple entities. Important quantities for individual ballots include:

  • $m$, the maximum permitted number of votes for any entity.

  • $M$, the maximum permitted number of votes in total (across all entities).

  • $b_e$, the total number of (valid) votes for entity $e$ on the ballot.

  • $b_T := \sum_e b_e$, the total number of (valid) votes on the ballot.

Any of these may be greater than one, depending on the social choice function. Validity requires $0 \le b_e \le m$ and $b_T \le M$. If ballot $b$ does not contain the contest in question or is deemed invalid, $b_e := 0$ for all entities $e$, and $b_T := 0$.

Important quantities for the set $\mathcal{L}$ of ballots include:

  • $T_e := \sum_{b \in \mathcal{L}} b_e$, the tally of votes for entity $e$.

  • $T_{\mathcal{L}} := \sum_e T_e$, the total number of valid votes in the contest.

  • $p_e := T_e / T_{\mathcal{L}}$, the proportion of votes for entity $e$.
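As a concrete reading of these quantities, the sketch below computes the per-entity tallies, the total number of valid votes, and the vote proportions from a handful of made-up multi-vote ballots. The encoding of a ballot as a dictionary from entity to number of votes (empty for an invalid ballot) and the variable names `T`, `T_L` and `p` are assumptions chosen for illustration.

```python
from collections import Counter

def tally(ballots):
    """Sum the votes each entity receives across all ballots."""
    totals = Counter()
    for ballot in ballots:
        totals.update(ballot)   # adds this ballot's votes per entity
    return totals

# Four made-up ballots; the empty one is invalid and contributes nothing.
ballots = [{"A": 2, "B": 1}, {"B": 3}, {}, {"A": 1, "C": 2}]

T = tally(ballots)                     # tally per entity
T_L = sum(T.values())                  # total number of valid votes
p = {e: T[e] / T_L for e in T}         # proportion of votes per entity
```

Note that the number of ballots (4) differs from the number of valid votes (9), which is why later formulas distinguish between the two.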

2.2 Assertion-based auditing: Definitions

Here we formalize assertion-based auditing sketched in Section 1 and introduce the relevant mathematical notation. An assorter is a function that assigns a non-negative number to each ballot depending on the votes reflected on the ballot and other election data (e.g. the reported outcome, the set of CVRs, or the CVR for that ballot). Each assertion in the audit is equivalent to ‘the average value of the assorter for all the cast ballots is greater than 1/2.’ In turn, each assertion is checked by testing the complementary null hypothesis that the average is less than or equal to 1/2. If all the complementary null hypotheses are false, the reported outcome of every contest under audit is correct.

Definition 1

An assertion is a statement $A$ about the set of paper ballots $\mathcal{L}$ of the contest. An assorter for assertion $A$ is a function $h$ that maps selections on a ballot $b$ to $[0, u]$ for some known constant $u$, such that assertion $A$ holds for $\mathcal{L}$ iff $\bar{h} > 1/2$, where $\bar{h}$ is the average value of $h(b)$ over all $b \in \mathcal{L}$.

A set of assertions is sufficient if their conjunction implies that the reported electoral outcome is correct.

2.3 Example assertions and assorters

Example 1

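First-past-the-post voting. Consider a simple first-past-the-post contest, where the winner $w$ is the candidate with the most votes and each valid ballot records a vote for a single candidate. The result is correct if the assertions $p_w > p_\ell$ for each losing candidate $\ell$ all hold.

We can build an assorter $h$ for the assertion $p_w > p_\ell$ as follows:

$$h(b) = \begin{cases} 1 & \text{if } b_w = 1 \text{ and } b_\ell = 0, \\ 0 & \text{if } b_w = 0 \text{ and } b_\ell = 1, \\ 1/2 & \text{otherwise.} \end{cases}$$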

Example 2

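Majority contests. Consider a simple majority contest, where the winner $w$ is the candidate achieving over 50% of the votes, assuming again each valid ballot holds a single vote (if there is no winner, a runoff election is held). The result can be verified by the assertion $p_w > 1/2$.

We can build an assorter $h$ for the more general assertion $p_w > t$ as follows:

$$h(b) = \begin{cases} 1/(2t) & \text{if } b_w = 1 \text{ and } b_\ell = 0 \text{ for all } \ell \ne w, \\ 0 & \text{if } b_w = 0 \text{ and } b_\ell = 1 \text{ for exactly one } \ell \ne w, \\ 1/2 & \text{otherwise.} \end{cases}$$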
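A small numerical check of the threshold assorter may help. The sketch assumes the form used in SHANGRLA [5] for super-majority contests: a valid vote for the winner scores 1/(2t), a valid vote for anyone else scores 0, and a ballot with no valid vote scores 1/2. The function name and the made-up ballots are hypothetical.

```python
from statistics import mean

def threshold_assorter(vote, winner, t):
    """Score a single-vote ballot for the assertion p_winner > t."""
    if vote is None:                 # no valid vote: neutral
        return 0.5
    return 1.0 / (2.0 * t) if vote == winner else 0.0

# 7 valid votes for Alice, 3 for Bob, 2 invalid ballots: p_Alice = 0.7.
ballots = ["Alice"] * 7 + ["Bob"] * 3 + [None] * 2

mean_at_60 = mean(threshold_assorter(v, "Alice", 0.60) for v in ballots)
mean_at_75 = mean(threshold_assorter(v, "Alice", 0.75) for v in ballots)
```

With 70% of the valid votes, Alice clears a 60% threshold (assorter mean above 1/2) but not a 75% threshold (assorter mean below 1/2). Note that the upper bound of this assorter is 1/(2t), which exceeds 1 whenever t < 1/2.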

3 Creating assorters from assertions

In this section we show how to transform generic linear assertions, i.e. inequalities of the form $a_1 T_1 + a_2 T_2 + \cdots + a_k T_k > 0$, into canonical assertions using assorters as required by SHANGRLA. There are three steps:

  1. Construct a set of linear assertions that imply the correctness of the outcome. (Constructing such a set is outside the scope of this paper; we suspect there is no general method. Moreover, there may be social choice functions for which there is no such set.)

  2. Determine a ‘proto-assorter’ based on each assertion.

  3. Construct an assorter from the proto-assorter via an affine transformation.

We work with social choice functions where each valid ballot can contribute a non-negative (zero or more) number of ‘votes’ or ‘points’ to various tallies (we refer to these as votes henceforth). For example, in plurality voting we have a tally for each candidate and each ballot contributes a vote of 1 to the tally of a single candidate and a vote of 0 to all other candidates’ tallies. The tallies can represent candidates, groups of candidates, political parties, or possibly some more abstract groupings of candidates as might be necessary to describe an assertion (see below); we refer to them generically as entities.

Let the various tallies of interest be $T_1, T_2, \ldots, T_k$ for $k$ different entities. These represent the total count of the votes across all valid ballots.

A linear assertion is a statement of the form

$$a_1 T_1 + a_2 T_2 + \cdots + a_k T_k > 0$$

for some constants $a_1, \ldots, a_k$.

Each assertion makes a claim about the ballots, to be tested by the audit. For most social choice functions, the assertions are about proportions rather than tallies. Typically these proportions are of the total number of valid votes, $T_{\mathcal{L}}$, in which case we can restate the assertion in terms of tallies by multiplying through by $T_{\mathcal{L}}$.

For example, a pairwise majority assertion is usually written as $p_A > p_B$, stating that candidate $A$ got a larger proportion of the valid votes than candidate $B$. We can write this in linear form as follows. Let $T_A$ and $T_B$ be the tallies of votes in favour of candidates $A$ and $B$ respectively. Then:

$$p_A > p_B \iff \frac{T_A}{T_{\mathcal{L}}} > \frac{T_B}{T_{\mathcal{L}}} \iff T_A - T_B > 0.$$

Another example is a super/sub-majority assertion, $p_A > t$, for some threshold $t$. We can write this in linear form similar to above, as follows:

$$p_A > t \iff \frac{T_A}{T_{\mathcal{L}}} > t \iff T_A - t\,T_{\mathcal{L}} > 0.$$

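For a given linear assertion, we define the following function on ballots, which we call a proto-assorter:

$$g(b) = a_1 b_1 + a_2 b_2 + \cdots + a_k b_k,$$

where $b$ is a given ballot, and $b_1, b_2, \ldots, b_k$ are the votes contributed by that ballot to the tallies $T_1, T_2, \ldots, T_k$ respectively. (Note that $g(b) = 0$ for any invalid ballot $b$, based on previous definitions.)

Summing this function across all ballots, $\sum_{b \in \mathcal{L}} g(b)$, gives the left-hand side of the linear assertion. Thus, the linear assertion is true iff $\sum_{b \in \mathcal{L}} g(b) > 0$. The same property holds for the average across ballots, $\bar{g} = |\mathcal{L}|^{-1} \sum_{b \in \mathcal{L}} g(b)$; the linear assertion is true iff $\bar{g} > 0$.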

To obtain an assorter in canonical form, we apply an affine transformation to $g$ such that it never takes negative values and also so that comparing its average value to $1/2$ determines the truth of the assertion. One such transformation is

$$h(b) = c \cdot g(b) + 1/2 \qquad (1)$$

for some constant $c > 0$. (Note that $h(b) = 1/2$ if ballot $b$ has no valid vote in the contest.) There are many ways to choose $c$. We present two here. First, we determine a lower bound for the proto-assorter, a value $a$ such that $g(b) \ge a$ for all $b$. (If the votes $b_j$ are bounded above by $m$ and below by zero, then a bound, not necessarily the sharpest, on $g$ is given by taking just the votes that contribute negative values to $g$, setting all of those votes to $m$, and setting the other votes to 0: $a = m \sum_{j : a_j < 0} a_j$.)

Note that $a < 0$ in all interesting cases: if not, the assertion would be trivially true ($a > 0$) or trivially false ($a = 0$, with $a_j = 0$ for all $j$). If $a \ge -1/2$, simply setting $c = 1$ produces an assorter: we have $h(b) \ge 0$, and $\bar{h} > 1/2$ iff $\bar{g} > 0$. Otherwise, we can choose $c = -1/(2a)$, giving

$$h(b) = \frac{g(b) - a}{-2a}. \qquad (2)$$

(See [5, Sec. 2.5].) To see that $h$ is an assorter, first note that $h(b) \ge 0$ since the numerator is always non-negative and the denominator is positive. Also, the sum and mean across all ballots are, respectively:

$$\sum_{b \in \mathcal{L}} h(b) = -\frac{1}{2a} \sum_{b \in \mathcal{L}} g(b) + \frac{|\mathcal{L}|}{2} \quad \text{and} \quad \bar{h} = -\frac{1}{2a}\,\bar{g} + \frac{1}{2}.$$

Therefore, $\bar{h} > 1/2$ iff $\bar{g} > 0$.
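The recipe can be sketched generically. The code below is an illustration under stated assumptions rather than the SHANGRLA implementation: a linear assertion is given as a dictionary of coefficients per entity, each vote count on a ballot is assumed to lie between 0 and `max_vote`, the lower bound is the simple (not necessarily sharpest) one obtained by maxing out every negatively weighted vote, and the scaling is always c = −1/(2a). The name `make_assorter` is hypothetical.

```python
from statistics import mean

def make_assorter(coeffs, max_vote):
    """Build (g, h, a) for the linear assertion sum_j coeffs[j] * T_j > 0."""
    # Lower bound on g: every negatively weighted vote at its maximum.
    a = max_vote * sum(c for c in coeffs.values() if c < 0)
    if a >= 0:
        raise ValueError("no negative coefficient: assertion is trivial")

    def g(ballot):                       # proto-assorter
        return sum(c * ballot.get(e, 0) for e, c in coeffs.items())

    def h(ballot):                       # affine map with c = -1/(2a)
        return (g(ballot) - a) / (-2 * a)

    return g, h, a

# Pairwise majority T_A - T_B > 0 with single-vote ballots.
g, h, a = make_assorter({"A": 1, "B": -1}, max_vote=1)
ballots = [{"A": 1}] * 5 + [{"B": 1}] * 3 + [{}] * 2
g_bar = mean(g(b) for b in ballots)
h_bar = mean(h(b) for b in ballots)
```

For this input the bound is a = −1, so the construction recovers the familiar scores 1, 0 and 1/2, and the two means satisfy the relation between the assorter mean and the proto-assorter mean: 0.6 = 0.2/2 + 1/2.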

3.1 Example: Pairwise difference assorter

To illustrate the approach, we will now create an assorter for a fairly complex assertion for quite complicated ballots. We consider a contest where each ballot can have multiple votes for multiple entities; the votes are simple (not ranks or scores). Let $m$ be the maximum number of votes a single ballot can contain for that contest. We can use the above general technique to derive an assorter for the assertion $p_A > p_B + d$. In Section 4 we will use this for auditing Hamiltonian free list contests, where $A$ and $B$ will be parties. This assertion checks that the proportion of votes $A$ has is greater than that of $B$ plus a constant, $d$. This constant may be negative.

We start with the assertion $p_A > p_B + d$. We can rewrite this in terms of tallies as we did in the previous examples, giving the following linear form:

$$\frac{T_A}{T_{\mathcal{L}}} > \frac{T_B}{T_{\mathcal{L}}} + d \iff T_A - T_B - d\,T_{\mathcal{L}} > 0.$$

The corresponding proto-assorter is

$$g(b) = b_A - b_B - d \cdot b_T.$$

If the votes are bounded above by $m$ then this has lower bound given by

$$g(b) \ge -m - d \cdot m = -m(1 + d).$$

Therefore, an assorter is given by

$$h(b) = \frac{b_A - b_B - d \cdot b_T + m(1 + d)}{2m(1 + d)}.$$

When $m = 1$ this reduces to the pairwise difference assorter for ‘simple’ Hamiltonian contests, where each ballot can only cast a single vote [1]. When $d = 0$ this reduces to the pairwise majority assorter in the more general context where we can have multiple votes per ballot.
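The pairwise difference assorter can be checked numerically. The sketch assumes the assorter has the form (b_A − b_B − d·b_T + m(1 + d)) / (2m(1 + d)), where b_A and b_B are a ballot's votes for the two entities, b_T is its total number of votes, m is the per-ballot maximum and d > −1. The dictionary encoding of ballots and the function name are hypothetical.

```python
from statistics import mean

def pairwise_difference_assorter(ballot, A, B, d, m):
    """Score a multi-vote ballot for the assertion p_A > p_B + d."""
    b_A = ballot.get(A, 0)
    b_B = ballot.get(B, 0)
    b_T = sum(ballot.values())          # all valid votes on the ballot
    return (b_A - b_B - d * b_T + m * (1 + d)) / (2 * m * (1 + d))

# Three made-up ballots with up to m = 12 votes each; the last is invalid.
ballots = [{"A": 8, "B": 4}, {"A": 2, "B": 6, "C": 4}, {}]
# Tallies: T_A = 10, T_B = 10, total 24, so p_A - p_B = 0.

mean_neg = mean(pairwise_difference_assorter(b, "A", "B", -0.05, 12) for b in ballots)
mean_pos = mean(pairwise_difference_assorter(b, "A", "B", 0.05, 12) for b in ballots)
```

Because the two parties are tied, the assertion holds for d = −0.05 (mean above 1/2) and fails for d = 0.05 (mean below 1/2). A ballot giving all 12 votes to B scores 0, the lowest possible value, and an invalid ballot scores exactly 1/2.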

4 Case study: 2016 Hesse local elections

In the local elections in Hesse, Germany, each ballot allows the voter to cast $S$ direct votes, where $S$ is the number of seats in the region. Each party can have at most $S$ candidates on the ballot. Voters can assign up to three votes to any individual candidate; they can spread these votes amongst candidates from different parties as they like. Voters can cross out candidates, meaning none of their votes will flow to such candidates. Finally, a voter can select a single party. The effect of this selection is that remaining votes not assigned to individual candidates are given to the party. At the low level these votes are then spread amongst the candidates of the party (that have not been crossed out) by assigning one vote to the next (uncrossed out) candidate in the selected party, starting from the top, and wrapping around to the top once we hit the bottom, until all the remaining votes are assigned. Budurushi [3] provides a detailed description of the vote casting and vote tallying rules. (The description is based on the official information from Hesse, available in German only; see https://wahlen.hessen.de/kommunen/kommunalwahlen-2021/wahlsystem, last accessed 24.07.2021.)

Example 3

Consider a contest in a region with 12 seats, and a ballot with 4 parties. The Greens have five candidates appearing in the order Arnold, Beatrix, Charles, Debra, and Emma. Consider a ballot that has 3 votes assigned directly to Beatrix, Charles crossed out, three votes assigned directly to Fox (a candidate for another party), and the Greens party selected.

Since 6 votes are directly assigned, the Greens receive the remaining 6 votes. We start by assigning one vote of the 6 to the top candidate, Arnold, then one to Beatrix, none to Charles, one to Debra, one to Emma, another to Arnold, and another to Beatrix. In total, the ballot assigns 2 votes to Arnold, 5 to Beatrix, 1 to Debra, 1 to Emma, and 3 to Fox. ∎
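The distribution of leftover votes in Example 3 can be reproduced with a short sketch. It follows only the description given in this section (direct votes first, then one vote at a time down the selected party's list, skipping crossed-out candidates and wrapping around); it does not check ballot validity, and the function name and data layout are assumptions made for illustration.

```python
from collections import Counter
from itertools import cycle

def interpret_ballot(seats, direct_votes, crossed_out, selected_party, party_lists):
    """Return the votes each candidate receives from one ballot."""
    votes = Counter(direct_votes)
    remaining = seats - sum(direct_votes.values())
    if selected_party is not None and remaining > 0:
        eligible = [c for c in party_lists[selected_party] if c not in crossed_out]
        if eligible:
            # One vote per candidate, top to bottom, wrapping around.
            for candidate, _ in zip(cycle(eligible), range(remaining)):
                votes[candidate] += 1
    return votes

party_lists = {"Greens": ["Arnold", "Beatrix", "Charles", "Debra", "Emma"]}
votes = interpret_ballot(
    seats=12,
    direct_votes={"Beatrix": 3, "Fox": 3},
    crossed_out={"Charles"},
    selected_party="Greens",
    party_lists=party_lists,
)
```

The result matches the example: 2 votes for Arnold, 5 for Beatrix, 1 each for Debra and Emma, and 3 for Fox, for a total of 12.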

The social choice function involves two stages. In the first stage, the entities we consider are the parties. This stage determines how many seats are awarded to each party. Each party is awarded the total votes assigned on a ballot to that party via individual candidate votes and the party selection remainder. There is a Hamiltonian election to determine the number of seats awarded to each party. Given $S$ seats in the region, we award $\lfloor S p_e \rfloor$ to each party $e$. The remaining seats are awarded to the parties with greatest remainders $S p_e - \lfloor S p_e \rfloor$. Let $a_e$ be the total number of seats awarded to party $e$ (which is either $\lfloor S p_e \rfloor$ or $\lfloor S p_e \rfloor + 1$).

In the second stage, seats are awarded to individual candidates. For each party $e$ awarded $a_e$ seats, those $a_e$ candidates in the party receiving the most votes are awarded a seat.
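The first-stage (Hamiltonian, or largest remainder) allocation can be sketched as follows. The tallies are made up, exact fractions are used to avoid rounding problems, and ties between remainders are broken arbitrarily by sort order; real election rules specify a tie-break that this sketch does not model. The function name is hypothetical.

```python
from fractions import Fraction
from math import floor

def largest_remainder(tallies, seats):
    """Allocate seats by the Hamiltonian (largest remainder) method."""
    total = sum(tallies.values())
    quotas = {e: Fraction(seats * t, total) for e, t in tallies.items()}
    alloc = {e: floor(q) for e, q in quotas.items()}       # integer parts
    leftover = seats - sum(alloc.values())
    # Hand the leftover seats to the largest fractional remainders.
    by_remainder = sorted(quotas, key=lambda e: quotas[e] - alloc[e], reverse=True)
    for e in by_remainder[:leftover]:
        alloc[e] += 1
    return alloc

tallies = {"A": 47000, "B": 16000, "C": 15800, "D": 12000, "E": 6100, "F": 3100}
alloc = largest_remainder(tallies, seats=10)
```

Here the integer parts give 4, 1, 1, 1, 0, 0 seats, and the three leftover seats go to A, E and B, whose remainders (0.70, 0.61, 0.60) are the largest. Party C, with remainder 0.58, narrowly misses out, which illustrates how small the margins of remainder comparisons can be.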

Performing a risk-limiting audit on a Hesse local election involves a number of assertions. The first stage is a Hamiltonian election. The assertions required to verify the result are described by Blom et al. [1]. For each pair of parties $A \ne B$ we need to test the assertion

$$p_A > p_B + \frac{a_A - a_B - 1}{S}. \qquad (3)$$

While Blom et al. [1] define an assorter for this assertion, it is made under the assumption that each ballot contains a vote for at most one entity. The assorter defined in Section 3.1, with $m = S$ and $d = (a_A - a_B - 1)/S$, is more general and allows for multiple votes per ballot.

These (All-Seats) assertions may require large samples to verify. We can verify a simpler assertion—that each party deserved to obtain at least seats—using the assertion . We check this with an ‘All-But-Remainder’ audit.

The second stage of the election is a multi-winner first-past-the-post contest within each party: party $e$’s seats are allocated to the individual candidates with highest tallies. An audit would require comparing each winner’s tally to each loser’s. The margins are often very small (the example data includes margins of only one vote), so these allocations are likely to require a full recount, and we have not included them in our simulations.

For experiments we consider a collection of 21 local district-based elections held in Hesse, Germany, on March 6, 2016. An ‘All-But-Remainder’ audit checks that each party deserved the seats awarded to it in the first phase of distribution ($\lfloor S p_e \rfloor$), excluding those assigned to parties on the basis of their ‘remainder’. An ‘All-Seats’ audit checks $a_e$, i.e. all of the seats awarded to party $e$, including their last seat awarded on the basis of their remainder (if applicable).

Across the 21 district contests in our case study, the number of seats available varied from 51 to 87, the number of parties from 6 to 11, and the number of voters from 39,839 to 157,100. For each assertion, we estimate the number of ballot checks required to audit it, assuming there are no discrepancies between each paper ballot and its electronic record. Table 1 shows the number of ballot checks required to audit the most difficult assertion in each of these contests as the contest’s ASN (average sample number) for the two levels of auditing (All-But-Remainder and All-Seats). An ASN of $\infty$ indicates that a full manual recount would be required. We record the ASN for risk limits of 5% and 10%. The Kaplan–Kolmogorov risk function (with parameter $g = 0.1$) was used to compute ASNs, given the margin for an assertion, following the process outlined in Section 4.1.

Table 1 shows that an All-Seats audit can be challenging in terms of the sample size required, but that an All-But-Remainder audit is usually quite practical. The estimated sample size required in an audit depends on the margin of each assertion being checked. Where these margins are small—for example, where two parties receive a similar remainder—the average sample size is likely to be large. This is an inherent property of the auditing, not a failure of our method. For example, the All-Seats audit for Limburg-Weilburg has an infinite ASN. The vote data shows why: the lowest remainder to earn an extra seat is the CDU Party’s, with a remainder of 24,267 votes; the highest remainder not to earn an extra seat is the FW Party’s, with 24,205 votes. An audit would need to check that the FW did not, in fact, gain a higher remainder than the CDU. However, a single ballot can contain up to 71 votes, so this comparison (and hence the electoral outcome) could be altered by a single misrecorded ballot. An electoral outcome that can be altered by the votes on one ballot requires a full manual count in any election system, regardless of the auditing method.

Even the All-Seats audit is quite practical when the margins represent a relatively large fraction of ballots. This is consistent with prior work ([1]) on US primaries, showing that an All-Seats audit is quite practical in that context.

4.1 Estimating an initial sample size using a risk function

We use the margin of the assorter for each assertion to estimate the number of ballot checks required to confirm that an assertion holds in an audit. As defined in [5], the margin for assertion $A$ is 2 times its assorter mean, $\bar{h}$, minus 1, i.e. $2\bar{h} - 1$.

Let $V$ be the total number of valid ballots and $I$ be the total number of invalid ballots cast in the contest. Note that the sum $V + I$ may differ from the total number of votes, $T_{\mathcal{L}}$, since there may be multiple votes expressed on each ballot.

For an All-But-Remainder assertion indicating that party received more than proportion of the total vote, , the assorter mean is

where is the total number of votes for all candidates in party . We compute for a given assertion as follows:

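For an All-Seats comparative difference assertion between two parties, $A$ and $B$, we need to test a pairwise difference assertion $p_A > p_B + d$ where the difference is given by

$$d = \frac{a_A - a_B - 1}{S}.$$

The assorter mean for testing this assertion is given by

$$\bar{h} = \frac{1}{2} + \frac{T_A - T_B - d\,T_{\mathcal{L}}}{2S(1 + d)\,|\mathcal{L}|},$$

where $|\mathcal{L}|$ is the total number of ballots cast, valid and invalid, and each ballot can carry at most $S$ votes.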

Once we have computed the assorter mean for an assertion, we estimate the required sample size using functionality from the SHANGRLA software implementation (TestNonnegMean.initial_sample_size() from https://github.com/pbstark/SHANGRLA/blob/main/Code/assertion_audit_utils.py, last accessed 24.07.2021), using the Kaplan–Kolmogorov risk function with $g = 0.1$ and an error rate of 0.
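The inputs to such a sample-size routine can be computed directly from reported tallies. The sketch below assumes the pairwise difference assorter of Section 3.1 has the form (b_A − b_B − d·b_T + m(1 + d)) / (2m(1 + d)), so that its mean over all ballots is 1/2 + (T_A − T_B − d·T_L) / (2m(1 + d)·n), where n counts every ballot cast, valid or invalid. The function name and all numbers are hypothetical, and the sketch does not call the SHANGRLA library.

```python
def pairwise_difference_margin(T_A, T_B, T_L, n_ballots, d, m):
    """Assorter mean and margin for the assertion p_A > p_B + d.

    T_A, T_B: reported tallies; T_L: total valid votes;
    n_ballots: all ballots cast (valid plus invalid);
    m: maximum number of votes on one ballot.
    """
    mean = 0.5 + (T_A - T_B - d * T_L) / (2 * m * (1 + d) * n_ballots)
    margin = 2 * mean - 1
    return mean, margin

# Made-up contest: 10 seats, 1000 ballots of up to 10 votes each,
# 9000 valid votes; party A holds 5 seats and party B holds 3.
S = 10
d = (5 - 3 - 1) / S
mean_h, margin = pairwise_difference_margin(
    T_A=4000, T_B=3000, T_L=9000, n_ballots=1000, d=d, m=S
)
```

In this example the margin is 1/110, below one percent, even though party A leads party B by more than eleven percentage points of the vote. The factor m in the denominator is what makes multi-vote ballots comparatively expensive to audit.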

| District | Seats | Ballots cast | Parties | Valid ballots | All-But-Rem. assertions | All-But-Rem. ASN (RL 5%) | All-But-Rem. ASN (RL 10%) | All-Seats assertions | All-Seats ASN (RL 5%) | All-Seats ASN (RL 10%) |
|---|---|---|---|---|---|---|---|---|---|---|
| Marburg-Biedenkopf | 81 | 92k | 8 | 88k | 8 | 128 | 99 | 56 | 2,004 | 1,544 |
| Fulda | 81 | 95k | 8 | 91k | 8 | 27 | 20 | 56 | 34,769 | 28,142 |
| Wetterau | 81 | 122k | 11 | 115k | 11 | 26 | 20 | 110 | 12,570 | 9,790 |
| Groß Gerau | 71 | 85k | 11 | 80k | 11 | 291 | 224 | 110 | 7,844 | 6,101 |
| Limburg-Weilburg | 71 | 67k | 7 | 64k | 7 | 879 | 677 | 42 | ∞ | ∞ |
| Kassel | 81 | 100k | 7 | 95k | 7 | 1,180 | 909 | 42 | 4,580 | 3,540 |
| Darmstadt-Dieburg | 71 | 113k | 8 | 107k | 8 | 39 | 30 | 56 | 86,480 | 76,879 |
| Bergstrasse | 71 | 101k | 9 | 96k | 9 | 19 | 14 | 72 | 5,329 | 4,123 |
| Werra-Meißner | 61 | 45k | 6 | 42k | 6 | 8 | 6 | 30 | 3,252 | 2,522 |
| Hersfeld-Rotenburg | 61 | 52k | 7 | 50k | 7 | 29 | 23 | 42 | 5,173 | 4,026 |
| Offenbach | 87 | 119k | 9 | 113k | 9 | 35 | 27 | 72 | 25,691 | 20,323 |
| Rheingau Taunus | 81 | 78k | 7 | 74k | 7 | 27 | 21 | 42 | 4,382 | 3,392 |
| Lahn-Dill | 81 | 88k | 8 | 83k | 8 | 50 | 38 | 56 | 2,752 | 2,124 |
| Waldeck-Frankenberg | 71 | 65k | 8 | 62k | 8 | 234 | 180 | 56 | 1,508 | 1,162 |
| Main-Taunus | 81 | 95k | 8 | 91k | 8 | 66 | 51 | 56 | 23,669 | 18,808 |
| Schwalm-Eder | 71 | 82k | 8 | 78k | 8 | 24 | 18 | 56 | 35,724 | 29,301 |
| Odenwald | 51 | 40k | 7 | 38k | 7 | 74 | 57 | 42 | 933 | 719 |
| Main-Kinzig | 87 | 157k | 10 | 148k | 10 | 15 | 12 | 90 | 4,105 | 3,165 |
| Landkreis Gießen | 81 | 103k | 8 | 98k | 8 | 41 | 24 | 56 | 8,324 | 6,464 |
| Hochtaunus | 71 | 94k | 8 | 90k | 8 | 83 | 64 | 56 | 36,978 | 30,069 |
| Vogelsberg | 61 | 50k | 7 | 47k | 7 | 10 | 8 | 42 | 9,668 | 7,624 |

Table 1: Estimates of audit sample sizes for each local district election held in Hesse on March 6th, 2016. We record the number of assertions to be checked in an All-But-Remainder and All-Seats audit, alongside the estimated number of ballot checks (ASN) required to complete these audits for risk limits (RL) of 5% and 10%, assuming no discrepancies are found between paper ballots and their electronic records. The columns give the number of seats, the total number of ballots cast, the total number of parties, and the total number of valid ballots. Ballots cast and valid ballots are recorded to the nearest thousand. An ASN of ∞ indicates that a full manual count would be required.

5 Example: Assorters for D’Hondt and related methods

Risk-limiting audits for D’Hondt and other highest averages methods were developed by Stark and Teague [4]. In this section we show how to express those audits in the form of assertions, and develop the appropriate assorters.

5.1 Background on highest averages methods

Highest averages methods are used by many parliamentary democracies in Europe, as well as elections for the European Parliament (which uses D’Hondt); see https://www.europarl.europa.eu/RegData/etudes/BRIE/2019/637966/EPRS_BRI(2019)637966_EN.pdf, last accessed 24.07.2021.

Highest averages methods are similar to Hamiltonian methods in that they allocate seats to parties in approximate proportion to the fraction of the overall vote they won. They differ in how they allocate the last few seats when the voting fractions do not match an integer number of seats.

A highest averages method is parameterized by a set of divisors $d(1), d(2), \ldots, d(S)$, where $S$ is the number of seats. The seats are allocated by forming a table in which each party’s votes are divided by each of the divisors, then choosing the $S$ largest numbers in the whole table; the number of selected entries in a party’s row is the number of seats that party wins. The divisors for D’Hondt are $d(i) = i$, $i = 1, 2, \ldots, S$. Sainte-Laguë has divisors $d(i) = 2i - 1$, for $i = 1, 2, \ldots, S$.

Let $f_{e,s} = T_e / d(s)$ for entity $e$ and seat $s$, where $T_e$ is the tally of entity $e$. The Winning Set is

$$\mathcal{W} = \{(e, s) : f_{e,s} \text{ is one of the } S \text{ largest}\}.$$

This can be visualised in a table by writing out, for each entity $e$, the sequence of numbers $T_e/d(1), T_e/d(2), \ldots, T_e/d(S)$, and then selecting the $S$ largest numbers in the table. Each party receives a number of seats equal to the number of selected values in its row.
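The table-based allocation described above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the paper or from SHANGRLA: the function names (`allocate_seats`, `dhondt_divisor`, `sainte_lague_divisor`) and the example tallies are hypothetical, and the sketch assumes there is no tie at the cut-off for the last seat.

```python
from fractions import Fraction


def dhondt_divisor(i):
    """D'Hondt divisors: d(i) = i."""
    return i


def sainte_lague_divisor(i):
    """Sainte-Lague divisors: d(i) = 2i - 1."""
    return 2 * i - 1


def allocate_seats(tallies, num_seats, divisor=dhondt_divisor):
    """Return (seats per party, winning set) for a highest averages method.

    tallies maps each party to its vote total. The winning set holds the
    (party, seat) pairs whose quotients tally / d(seat) are the num_seats
    largest in the table. Assumption: no tie at the last-seat cut-off.
    """
    quotients = [
        (Fraction(tally, divisor(s)), party, s)
        for party, tally in tallies.items()
        for s in range(1, num_seats + 1)
    ]
    # Largest quotients first; exact fractions avoid rounding artefacts.
    quotients.sort(key=lambda q: q[0], reverse=True)
    winning_set = {(party, s) for _, party, s in quotients[:num_seats]}
    seats = {party: 0 for party in tallies}
    for party, _ in winning_set:
        seats[party] += 1
    return seats, winning_set


# Hypothetical tallies for four parties contesting 8 seats.
tallies = {"A": 100_000, "B": 80_000, "C": 30_000, "D": 20_000}
dhondt_seats, dhondt_winning_set = allocate_seats(tallies, 8, dhondt_divisor)
sainte_lague_seats, _ = allocate_seats(tallies, 8, sainte_lague_divisor)
print(dhondt_seats)        # {'A': 4, 'B': 3, 'C': 1, 'D': 0}
print(sainte_lague_seats)  # {'A': 3, 'B': 3, 'C': 1, 'D': 1}
```

With these tallies the two divisor sequences give different outcomes: Sainte-Laguë moves one seat from the largest party to the smallest.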

Like Hamiltonian methods, highest averages methods can be used in a simple form in which voters choose only their favourite party, or in a variety of more complex forms in which voters can express approval or disapproval of individual candidates. We deal with the simple case first.

5.2 Simple D’Hondt: Party-only voting

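In the simplest form of highest averages methods, seats are allocated to each entity (party) based on individual entity tallies. Let $W_e$ be the number of seats won and $L_e$ the number of the first seat lost by entity $e$. That is:

$$W_e = \max\{s : (e, s) \in \mathcal{W}\}, \qquad L_e = \min\{s : (e, s) \notin \mathcal{W}\},$$

with $W_e$ undefined if $e$ won no seats and $L_e$ undefined if $e$ won every seat.

If $e$ won some, but not all, seats, then $L_e = W_e + 1$.

The inequalities that define the winners are, for all parties $A$ with at least one winner, for all parties $B$ (different from $A$) with at least one loser, as follows:

$$f_{A, W_A} > f_{B, L_B}. \tag{4}$$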
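To make the set of inequalities concrete, the sketch below enumerates one instance of Equation 4 for every ordered pair of distinct parties in which the first won at least one seat and the second lost at least one, and checks each against the tallies. It is an illustration only: the function names and example numbers are hypothetical, and it assumes strictly increasing divisors, so that a party's winning seats are exactly seats 1 to its seat count and its first lost seat is the next one (seat 1 for a party with no seats).

```python
from fractions import Fraction
from itertools import permutations


def pairwise_assertions(seats_won, num_seats):
    """List (A, W_A, B, L_B) for each ordered pair of distinct parties where
    A won at least one seat and B lost at least one seat."""
    assertions = []
    for a, b in permutations(seats_won, 2):
        if seats_won[a] >= 1 and seats_won[b] < num_seats:
            w_a = seats_won[a]      # number of the last seat won by A
            l_b = seats_won[b] + 1  # number of the first seat lost by B
            assertions.append((a, w_a, b, l_b))
    return assertions


def assertion_holds(tallies, assertion, divisor=lambda i: i):
    """Check T_A / d(W_A) > T_B / d(L_B); the default divisor is D'Hondt."""
    a, w_a, b, l_b = assertion
    return Fraction(tallies[a], divisor(w_a)) > Fraction(tallies[b], divisor(l_b))


# Hypothetical reported result: 8 seats allocated by D'Hondt.
tallies = {"A": 100_000, "B": 80_000, "C": 30_000, "D": 20_000}
seats_won = {"A": 4, "B": 3, "C": 1, "D": 0}

assertions = pairwise_assertions(seats_won, 8)
print(len(assertions))  # 9
print(all(assertion_holds(tallies, a) for a in assertions))  # True
```

Party D contributes no assertion as a winner because it holds no seat, but it still appears as the loser in three assertions.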

Converting this into the notation of Section 3, expressing Equation 4 as a linear assertion gives us

$$\frac{T_A}{d(W_A)} - \frac{T_B}{d(L_B)} > 0.$$

From this, we define the proto-assorter for any ballot $b$ as

$$g_{A,B}(b) = \frac{b_A}{d(W_A)} - \frac{b_B}{d(L_B)},$$

where $b_A$ (resp. $b_B$) is 1 if there is a vote for party $A$ (resp. $B$), 0 otherwise.

The lower bound is clearly $-1/d(L_B)$. Substituting into Equation 2 gives

$$h_{A,B}(b) = \frac{b_A \, d(L_B)/d(W_A) - b_B + 1}{2}.$$

Note that order matters: in general, both $h_{A,B}$ and $h_{B,A}$ are necessary. The first checks that party $A$’s lowest winner beat party $B$’s highest loser; the second checks that party $B$’s lowest winner beat party $A$’s highest loser.
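The following sketch evaluates a single-vote assorter of this kind on a hypothetical election and confirms that its mean exceeds 1/2 exactly when the corresponding inequality holds. The names `make_assorter` and `assorter_mean` are illustrative and not part of SHANGRLA; a ballot is represented simply by the party it votes for, or `None` for a ballot with no valid vote; the formula coded is (b_A · d(L_B)/d(W_A) − b_B + 1)/2.

```python
from fractions import Fraction


def make_assorter(party_a, w_a, party_b, l_b, divisor=lambda i: i):
    """Build the assorter for 'A's lowest winner beat B's highest loser'.

    The default divisor is D'Hondt, d(i) = i.
    """
    ratio = Fraction(divisor(l_b), divisor(w_a))

    def h(vote):
        b_a = 1 if vote == party_a else 0
        b_b = 1 if vote == party_b else 0
        return (b_a * ratio - b_b + 1) / 2

    return h


def assorter_mean(h, tallies, num_ballots):
    """Mean of h over all ballots, given per-party tallies.

    Ballots not counted in any tally are treated as having no valid vote.
    """
    no_vote = num_ballots - sum(tallies.values())
    total = sum(t * h(party) for party, t in tallies.items()) + no_vote * h(None)
    return total / num_ballots


# Hypothetical D'Hondt result: A won 4 seats (W_A = 4), D won none (L_D = 1).
tallies = {"A": 100_000, "B": 80_000, "C": 30_000, "D": 20_000}
h_ad = make_assorter("A", 4, "D", 1)

print(h_ad("A"), h_ad("D"), h_ad("C"))  # 5/8 0 1/2
mean = assorter_mean(h_ad, tallies, 230_000)
print(mean, mean > Fraction(1, 2))      # 47/92 True
```

A vote for neither party scores 1/2, so only ballots for A or D move the mean away from 1/2. The reverse pair (D, A) generates no assertion here because D won no seat.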

5.3 More complex methods: Multi-candidate voting

Like some Hamiltonian elections, many highest averages elections also allow voters to select individual candidates. A party’s tally is the total of its candidates’ votes. Then, within each party, the won seats are allocated to the candidates with the highest individual tallies. The main entities are still parties, allocated seats according to Equation 4, but the assorter must be generalised to allow one ballot to contain multiple votes for various candidates.

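The proto-assorter for entities (parties) $A \neq B$ is very similar to the single-party case, but votes for each party ($b_A$ and $b_B$) count the total, over all that entity’s candidates, and may be larger than one:

$$g_{A,B}(b) = \frac{b_A}{d(W_A)} - \frac{b_B}{d(L_B)}.$$

Writing $m$ for the maximum number of votes a single ballot may contain, the lower bound is $-m/d(L_B)$. Again substituting into Equation 2 gives

$$h_{A,B}(b) = \frac{b_A \, d(L_B)/d(W_A) - b_B + m}{2m}.$$

Note this reduces to the single-vote assorter when $m = 1$ (each ballot contains at most one vote).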
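A multi-vote assorter along these lines can be sketched as follows. The representation is an assumption made for illustration: a ballot is a dictionary mapping each party to the total number of votes the ballot gives that party's candidates, `max_votes` is the maximum number of votes one ballot may contain, and the function name is hypothetical. The formula coded is (b_A · d(L_B)/d(W_A) − b_B + max_votes)/(2 · max_votes).

```python
from fractions import Fraction


def make_multivote_assorter(party_a, w_a, party_b, l_b, max_votes,
                            divisor=lambda i: i):
    """Assorter for ballots that may carry several votes per party.

    The default divisor is D'Hondt, d(i) = i.
    """
    ratio = Fraction(divisor(l_b), divisor(w_a))

    def h(ballot):
        if sum(ballot.values()) > max_votes:
            raise ValueError("ballot contains more votes than allowed")
        b_a = ballot.get(party_a, 0)  # votes for A's candidates on this ballot
        b_b = ballot.get(party_b, 0)  # votes for B's candidates on this ballot
        return (b_a * ratio - b_b + max_votes) / (2 * max_votes)

    return h


# Hypothetical contest: up to 3 votes per ballot, W_A = 2, L_B = 1.
h = make_multivote_assorter("A", 2, "B", 1, max_votes=3)
print(h({}))                # 1/2  (no votes for A or B)
print(h({"B": 3}))          # 0    (the lower bound is attained)
print(h({"A": 3}))          # 3/4
print(h({"A": 2, "B": 1}))  # 1/2

# With max_votes = 1 the value matches the single-vote formula.
h1 = make_multivote_assorter("A", 2, "B", 1, max_votes=1)
print(h1({"A": 1}))         # 3/4
```

A ballot that gives every available vote to B's candidates scores 0, which is why dividing by `2 * max_votes` keeps the assorter nonnegative.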

6 Conclusion & future work

SHANGRLA reduces RLAs for many social choice functions to a canonical form involving ‘assorters.’ This paper shows how to translate general linear assertions into canonical assorter form for SHANGRLA, illustrated by developing the first RLA method for Hamiltonian free list elections and the first assertion-based approach for D’Hondt style elections.

We show that party-list proportional representation systems can be audited using simple assertions that are both necessary and sufficient for the reported outcome to be correct. In some settings, including in Hesse, elections are inherently expensive to audit because margins are frequently small, both between parties vying for the seats allocated by remainder, and between candidates in the same party.

There are social choice functions for which no set of linear assertions guarantees the reported winner really won: for instance, social choice functions in which the order in which the votes are tabulated matters, or that involve a random element. Some variants of Single Transferable Vote (STV) have one or the other of those properties.

Other variants of STV might be amenable to RLAs and to SHANGRLA in particular: the question is open. We conjecture that STV is inherently hard to audit. Although a sufficient set of conditions is easy to generate—simply check every step of the elimination and seat-allocation sequence—this is highly likely to have very small margins and hence to require impractical sample sizes. We conjecture that it is hard to find a set of conditions that imply an STV outcome is correct and that requires reasonable sample sizes to audit. Of course, this was also conjectured for IRV and turns out to be false.

References

  • [1] M. Blom, P. B. Stark, P. J. Stuckey, V. Teague, and D. Vukcevic (2021) Auditing Hamiltonian elections. arXiv 2102.08510.
  • [2] M. Blom, P. J. Stuckey, and V. Teague (2019) RAIRE: risk-limiting audits for IRV elections. arXiv 1903.08804.
  • [3] J. Budurushi (2016-02) Usable security evaluation of easyvote in the context of complex elections. Ph.D. Thesis, Technische Universität Darmstadt, Darmstadt.
  • [4] P. B. Stark, V. Teague, and A. Essex (2014-12) Verifiable European elections: risk-limiting audits for D’Hondt and its relatives. USENIX Journal of Election Technology and Systems (JETS) 3 (1), pp. 18–39.
  • [5] P. B. Stark (2020) Sets of half-average nulls generate risk-limiting audits: SHANGRLA. In Financial Cryptography and Data Security, M. Bernhard, A. Bracciali, L. J. Camp, S. Matsuo, A. Maurushat, P. B. Rønne, and M. Sala (Eds.), Cham, pp. 319–336. ISBN 978-3-030-54455-3.