1 Introduction
Voting is an essential element of current social and political systems. An individual's vote is private information that voters may be unwilling to release publicly, especially to supporters of the losing candidate or proposal. The prevalence of electronic voting systems, which leave traceable electronic records, increases the concern about privacy disclosure.
Currently there are many popular privacy-preserving techniques, such as multi-party computation and differential privacy. However, in real-world voting, the security of the former relies on splitting an authority into multiple parties, which usually lack effective supervision. Meanwhile, the decision-making of a voting process is threshold-based. It is therefore hard to deploy differential privacy-based methods, which suit settings where the outcome is less sensitive to the accuracy of individual votes.
In this paper, we present two approaches towards privacy-preserving electronic voting systems. The first approach is blind signature based voting (BSV), where we apply blind signatures to ensure that each vote comes only from an eligible voter and that the government is not able to access its content. The second approach is homomorphic encryption based voting (HEV), where we use homomorphic encryption to compute on the encrypted votes directly and hide individual votes. We also extend HEV into homomorphic encryption voting with sampling (HEVS), which resists malicious voters who try to ruin the result.
2 Background
In our setting, voting involves at least two parties: voters and the government. Voters are the main participants, who submit ballots. The government is the authority that determines the eligibility of voters, counts ballots, and publishes the results. We assume that the government is able to identify all eligible voters and ignores votes from ineligible voters.
Privacy-preserving voting aims at protecting each voter's vote (e.g. for or against) from everyone else, including the government. The government gets the overall result only, not individual results.
Our voting environment is electronic: voters and the government communicate through electronic devices (computers, databases, etc.). During the communication, everything, including ballots, can be recorded. Thus, anyone who has recorded individual data should never be able to reveal the corresponding vote.
Threat Models
Our solution assumes two levels of adversaries. Semi-honest voters and a semi-honest government may cooperate to obtain information but do not deviate from the protocol specification. Malicious voters and a malicious government may deviate arbitrarily from the protocol execution.
Before introducing our protocols, we note that both are privacy-preserving under both levels of adversaries. However, malicious adversaries have ways to halt the whole process. We discuss countermeasures in Section 4.
3 Two Privacy-preserving Approaches
3.1 Blind Signature Based Voting
A blind signature is a form of digital signature in which a message is signed without revealing its content to the signer [chaum1983blind]. The user blinds the message with a secret blinding factor before requesting the signature. The signer generates a signature for the blinded message. The user then uses the blinding factor to unblind the result, obtaining the original message with a legitimate signature. Throughout the process, the signer learns nothing about the message without access to the blinding factor. Blind signatures are supported by many cryptographic schemes, including RSA and elliptic-curve schemes.
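The blind, sign, unblind, verify cycle described above can be sketched with textbook RSA. This is only an illustration under stated assumptions: the key sizes are toy values, no hashing or padding is applied, and the function names (`blind`, `sign`, `unblind`, `verify`) are hypothetical rather than taken from the protocol in Appendix A.

```python
from math import gcd

# Toy RSA key (assumption: textbook-sized primes, insecure, for illustration).
p, q = 61, 53
n = p * q                               # public modulus
e = 17                                  # public exponent
d = pow(e, -1, (p - 1) * (q - 1))       # private exponent (Python 3.8+)

def blind(message: int, r: int) -> int:
    """User side: hide the message with the secret blinding factor r."""
    assert gcd(r, n) == 1, "r must be invertible modulo n"
    return (message * pow(r, e, n)) % n

def sign(blinded: int) -> int:
    """Signer side: sign the blinded value without seeing the message."""
    return pow(blinded, d, n)

def unblind(blinded_signature: int, r: int) -> int:
    """User side: strip the blinding factor from the signature."""
    return (blinded_signature * pow(r, -1, n)) % n

def verify(message: int, signature: int) -> bool:
    """Anyone: check the signature against the public key (n, e)."""
    return pow(signature, e, n) == message % n

ballot = 42        # hypothetical encoded ballot, smaller than n
r = 7              # hypothetical blinding factor, kept secret by the voter
signature = unblind(sign(blind(ballot, r)), r)
```

The signer only ever sees `blind(ballot, r)`, yet the unblinded `signature` verifies against the original `ballot`.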
Protocol Blind Signature Voting (BSV)
Notation.
is the group of eligible voters. , voter holds a vote .
denoted the government who can identify these voters.
denoted a decentralized blockchain.
The protocol details are given in Appendix A.
In the BSV protocol (Appendix A), every voter runs a blind signature protocol with the government. Voters then publish their unblinded votes, together with the government's signature, to a blockchain through an anonymous channel. The blockchain makes the published ballots unmodifiable. After a fixed period of time, ballot collection stops and anyone can check the blockchain to learn the voting result.
Many secure voting systems based on blind signatures and blockchains have been proposed [8726645] [liu2017voting].
3.2 Homomorphic Encryption Based Voting
Homomorphic encryption is a form of encryption that allows computation on ciphertexts [gentry2009fully]. A privacy-preserving electronic voting system needs to encrypt ballots, and ballot counting is an additive computation. Therefore, homomorphic encryption can be used to compute on the ciphertexts (the ballots) directly, with every ballot encrypted, which inherently protects voters' privacy. Moreover, additively homomorphic encryption is the only encryption involved.
In Protocol B, we use a modified version of ElGamal encryption with threshold decryption [10.1007/3-540-39568-7_2]. Specifically, eligible voters generate secret key pieces and use them to generate a public key with the cooperation of the government . Then voters use the public key to submit their vote, or . The government aggregates the encrypted votes and starts a threshold decryption to decrypt the aggregated result. During the entire process, no individual secret key is exposed to anyone. Therefore, each individual vote stays safe and private in all circumstances.
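The flow above (key pieces, a joint public key, encrypted ballots, aggregation, and joint decryption) can be sketched with exponential ElGamal. This is a minimal sketch, not the protocol of Appendix B: the group parameters are toy values, votes are assumed to be 0 or 1, and all function names are hypothetical.

```python
import random

# Toy parameters (assumption): P = 2*Q + 1 is a small safe prime and G
# generates the subgroup of prime order Q. Real groups are far larger.
P, Q, G = 2039, 1019, 4

def make_key_piece(rng):
    """Voter side: a secret key piece and the matching public piece."""
    secret = rng.randrange(1, Q)
    return secret, pow(G, secret, P)

def combine_key_pieces(public_pieces):
    """Multiply public pieces, giving G**(sum of secrets) mod P."""
    key = 1
    for piece in public_pieces:
        key = key * piece % P
    return key

def encrypt_vote(vote, public_key, rng):
    """Exponential ElGamal: the vote sits in the exponent of G."""
    r = rng.randrange(1, Q)
    return pow(G, r, P), pow(G, vote, P) * pow(public_key, r, P) % P

def aggregate(ciphertexts):
    """Component-wise product; the plaintext exponents add up."""
    c1, c2 = 1, 1
    for a, b in ciphertexts:
        c1, c2 = c1 * a % P, c2 * b % P
    return c1, c2

def partial_decrypt(c1, secret):
    """Voter side: a decryption share that does not reveal the secret."""
    return pow(c1, secret, P)

def recover_tally(c2, shares, n_voters):
    """Remove the mask, then search the small range of possible tallies."""
    mask = 1
    for share in shares:
        mask = mask * share % P
    g_to_tally = c2 * pow(mask, -1, P) % P      # Python 3.8+
    for tally in range(n_voters + 1):
        if pow(G, tally, P) == g_to_tally:
            return tally
    return None

def run_election(votes, seed=0):
    rng = random.Random(seed)
    keys = [make_key_piece(rng) for _ in votes]
    public_key = combine_key_pieces(pub for _, pub in keys)
    ballots = [encrypt_vote(v, public_key, rng) for v in votes]
    c1, c2 = aggregate(ballots)
    shares = [partial_decrypt(c1, secret) for secret, _ in keys]
    return recover_tally(c2, shares, len(votes))
```

For example, `run_election([1, 0, 1, 1, 0])` recovers only the number of supporting votes; no single ballot is ever decrypted on its own.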
Protocol Homomorphic Encryption Voting (HEV)
Notation.
is the group of eligible voters. , voter holds a vote .
denoted the government who can identify these voters.
is a cyclic group of order with generator .
is the aggregated result from .
The protocol details are given in Appendix B.
Theorem 3.1.
If the protocol is strictly followed, (Proof in appendix C).
After the threshold decryption 5, the government collects the result , and what we want is . As a discrete log problem, this is a -bounded problem. However, the result can be verified by matching the result with . Generally, is bounded by the population, and the computation and comparison can be parallelized. Thus the result can be retrieved at reasonable cost.
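Because the tally is bounded by the number of voters, the discrete log can be recovered by trying each candidate exponent in turn. The sketch below illustrates this under assumptions: the group parameters in the example are toy values, and the names `search_range` and `bounded_dlog` are hypothetical. Each chunk is independent, so chunks could be handed to separate workers.

```python
def search_range(target, g, p, lo, hi):
    """Return m in [lo, hi) with g**m % p == target, or None."""
    power = pow(g, lo, p)
    for m in range(lo, hi):
        if power == target:
            return m
        power = power * g % p      # step from g**m to g**(m + 1)
    return None

def bounded_dlog(target, g, p, bound, chunk=1000):
    """Search exponents 0..bound chunk by chunk (chunks are independent)."""
    for lo in range(0, bound + 1, chunk):
        hi = min(lo + chunk, bound + 1)
        m = search_range(target, g, p, lo, hi)
        if m is not None:
            return m
    return None
```

With the toy group `p = 2039`, `g = 4`, calling `bounded_dlog(pow(4, 37, 2039), 4, 2039, bound=100, chunk=16)` searches at most 101 candidates.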
4 Discussion and Improvements
In this section we address potential issues in the design and possible solutions.
4.1 Potential Attacks on BSV
Traffic Analysis
In the BSV protocol, the voter has to publish the ballot to the blockchain for it to be counted. This creates a risk of traffic analysis, in which an adversary correlates the ballot with the voter by monitoring the network traffic from the voter to the blockchain. There are several ways to evade such an attack. A mixnet can be used in conjunction with the blockchain. A DC-net can achieve a similar effect, but its limited throughput makes it less suitable for large-scale voting. To further increase the anonymity set, one can require all ballots to be cast within a certain period of time, so that a sufficient number of voters are posting their ballots at any moment. Depending on the implementation, the voting authority can reveal voters' identities through timing analysis if ballots are cast immediately after the voters request signatures from it. It is therefore recommended to introduce a large, random delay between the signing and the posting of a ballot. This can be done by making the ballot signing timeframe and the voting timeframe non-overlapping.
4.2 Potential Attacks on HEV
The stability of the homomorphic encryption voting protocol relies on the honesty of all parties involved. The design of HEV protects the privacy of voters against a dishonest authority. On the other hand, the voting result can easily be invalidated by repeated voting from malicious voters. Tentative countermeasures against these threats are explained below.
Extra Votes
In the basic encryption setting, each individual voter is supposed to encrypt for against and for support. However, in principle, voters are free to encrypt other numbers like or to earn extra votes. For example, in a voter group of 3, a voter who encrypts 3 in its voting message would dominate the aggregated result if all the other voters are honest. Since each message is encrypted by the individual voter, there is no way for the government to detect this kind of cheating.
One way to combat this is to add a zero-knowledge proof on top of the voting procedure, proving that the voter is really sending a or instead of some other invalid number. To make the failure probability negligible, multiple rounds of the zero-knowledge proof are needed, at the cost of extra network traffic.
Cooperation Interruption
The original HEV requires all voters to remain cooperative through the entire protocol. However, this may not hold, due to various active and passive actions. For instance, voters may go offline after submitting their votes and during threshold decryption (one potential reason is a network disconnection). Moreover, some voters can decide not to send the decryption result in threshold decryption, or even send something random instead of . In either case the final result is hard to obtain (or is inaccurate), because the government requires decryption results from all voters to compute . We believe a stochastic approach in the protocol helps mitigate the effects of interruption.
Assumption
For simplicity, we assume that malicious voters behave like honest voters and vote against () before threshold decryption, but interrupt the cooperation in threshold decryption by providing fake data (instead of using their own secret keys). We also assume that the result becomes unreliable if any malicious voter is involved in the communication. A result is considered reliable if it matches the result that the same group of voters would produce if all of them were honest.
Partial Public Key Generation
A primary solution is based on partial public key generation. Specifically, the generated public key relies only on a sampled subset of the voter group, instead of the entire group as in the original HEV protocol. To achieve this, the government randomly samples public key pieces from pieces collected from voters, and generates the public key based only on those pieces. Therefore, even though there are voters being interrupted, the probability of having a reliable result is . For instance, when , (), on expectation , , which means even if malicious voters are not the majority, the probability of getting a reliable result is almost .
K-Sampling
To improve the performance of the HEV protocol with sampling, we can upgrade the primary solution. In Protocol 4, instead of sampling once and generating one public key, samples times and generates public keys. After the threshold decryption gets results, the mode is selected as the final result. We argue that if the honest voters are the majority, a reliable result can be obtained with almost probability by adjusting the parameter .
Protocol 1 HEV with Sampling |
---|
Additional Notations. is the total number times of sampling does, also being the number of public key pieces uses |
is the number of keys chosen from voters in the -th sampling (with replacement). |
is the set of chosen public key pieces in the -th sampling, where , can equal for some . |
Procedure:
|
Theorem 4.1.
If there are more (semi-)honest voters than malicious voters, the Mode of is the reliable result.
Proof.
Instead of a rigorous proof, we present the idea of the proof.
According to assumption 4, if the -th sampling contains honest voters only, then the result is reliable; otherwise, if the -th sampling contains any malicious voter, we assume the result is unreliable (we ignore false-positive cases, for example when two malicious voters happen to perform the decryption for each other). The idea is that, even if the same malicious voter participated in two generation steps of public keys and and the corresponding fake threshold decryption responses and are identical, since and are different, their results and map to two different elements in the group :
Suppose and , where is a random number, then the two results don’t equal.
Thus we can assume that the results of different failing sampling groups (those whose results are unreliable) are different. If two results coincide, the probability that they are related to malicious voters is negligible. Thus, even two or three consistent results can be considered the reliable result. Therefore choosing the Mode of yields a reliable result. ∎
5 Experimental Results
According to our assumption 4 and Theorem 4.1, we run simulations parameterized by the number of voters , the probability that a voter is malicious , and the number of samplings . Figure 1 shows the performance when requiring at least two consistent results, and Figure 2 shows the performance when requiring at least three consistent results. The result is averaged over random seeds. Our protocol achieves almost accuracy for no more than voters when when sampling more than times. The protocol is almost accurate for no more than voters when if sampling more than times, which improves on the almost-zero performance of the primary solution instance. When more than half of the voters are malicious, the protocol is no longer effective.
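A simulation of this kind can be sketched as follows. It is an illustration under stated assumptions, not the experimental code behind the figures: a sampling is treated as reliable exactly when it contains no malicious voter, unreliable results are assumed never to coincide, and the parameter names (`n_voters`, `p_malicious`, `pieces_per_sample`, `n_samplings`, `min_consistent`) are hypothetical.

```python
import random

def run_trial(n_voters, p_malicious, pieces_per_sample, n_samplings,
              min_consistent, rng):
    """One simulated election; True if enough samplings agree on the result."""
    # Each voter is malicious independently with probability p_malicious.
    malicious = [rng.random() < p_malicious for _ in range(n_voters)]
    reliable = 0
    for _ in range(n_samplings):
        # Draw key pieces with replacement, so a voter may appear twice.
        chosen = [rng.randrange(n_voters) for _ in range(pieces_per_sample)]
        # Assumption: a sampling is reliable only if every chosen voter is honest.
        if not any(malicious[i] for i in chosen):
            reliable += 1
    # Assumption: unreliable results never coincide, so the mode is the
    # reliable result once at least min_consistent samplings are reliable.
    return reliable >= min_consistent

def estimate_accuracy(n_voters, p_malicious, pieces_per_sample, n_samplings,
                      min_consistent=2, trials=1000, seed=0):
    """Fraction of simulated elections that yield a reliable result."""
    rng = random.Random(seed)
    hits = sum(
        run_trial(n_voters, p_malicious, pieces_per_sample, n_samplings,
                  min_consistent, rng)
        for _ in range(trials)
    )
    return hits / trials
```

Setting `min_consistent` to 2 or 3 corresponds to the two settings described above, and varying `seed` gives the independent runs to average over.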
6 Conclusion
This work provides two approaches to privacy-preserving electronic voting. Both are privacy-preserving under a semi-honest government. The BSV protocol, widely accepted in the current literature, is stable under semi-honest and malicious voters. The HEV protocol is stable under semi-honest voters, and the updated HEVS provides stability under malicious voters. There is no solution for a malicious government that rejects all communications.
References
Appendix A Protocol BSV
Protocol 2 Blind Signature Voting (BSV) |
---|
Notation. is the group of eligible voters. , voter holds a vote . denoted the government who can identify these voters. |
denoted a decentralized blockchain |
Objective. Parties jointly compute without revealing any . |
Procedure:
|
Appendix B Protocol HEV
Protocol 3 Homomorphic Encryption Voting (HEV) |
---|
Notation. is the group of eligible voters. , voter holds a vote . denoted the government who can identify these voters. |
is a cyclic group of order with generator . |
is the aggregated result from . |
Objective. Parties jointly compute without revealing any . |
Procedure:
|
Appendix C Proof of Theorem 3.1
Proof.
Then if
∎
Appendix D Experiments
[Figure 1: performance of getting at least two consistent results.]
[Figure 2: performance of getting at least three consistent results.]