zksk: A Library for Composable Zero-Knowledge Proofs

11/06/2019 ∙ by Wouter Lueks, et al.

Zero-knowledge proofs are an essential building block in many privacy-preserving systems. However, implementing these proofs is tedious and error-prone. In this paper, we present zksk, a well-documented Python library for defining and computing sigma protocols: the most popular class of zero-knowledge proofs. In zksk, proofs compose: programmers can convert smaller proofs into building blocks that can then be combined into bigger proofs. zksk features a modern Python-based domain-specific language. This makes it possible to define proofs without learning a new custom language, and to benefit from the rich Python syntax and ecosystem.


1. Introduction

Privacy-preserving systems use zero-knowledge proofs to prove that outputs have been computed correctly, without revealing sensitive information about inputs. In online voting systems, voters prove that they correctly encrypted their vote, without revealing any information about the selected candidate (AdidaMPQ09). In anonymous authentication systems, users prove that they have access to a resource, without revealing any information that could reveal their identity or make accesses linkable (DavidsonGSTV18; HenryG13).

Implementing zero-knowledge proofs is tedious. Academic papers often use high-level Camenisch-Stadler notation (CamenischS97) to succinctly specify the intent of the proofs. The concrete implementations of proofs, however, often require hundreds or thousands of lines of code. While not necessarily difficult, these implementations require implementing the same primitives repeatedly.

Implementations are also error-prone. To simplify deployment, most systems use the Fiat-Shamir heuristic (FiatS86). Incorrectly applying this heuristic, however, can lead to serious vulnerabilities. For instance, in both the Helios and SwissPost/Scytl voting systems, incorrect application of the Fiat-Shamir heuristic led to accepting incorrect encrypted votes (BernhardPW12; LewisPT19outcome; LewisPT19outcomeaddendum).

We propose zksk, the Zero-Knowledge Swiss Knife, a Python library for defining and computing sigma protocols, the most popular class of zero-knowledge proofs. The library provides a simple API to define proofs inspired by the Camenisch-Stadler notation (CamenischS97). Additionally, the zksk library protects the programmer against mistakes. It automatically applies the Fiat-Shamir construction correctly, and it refuses to compute or-proofs that would reveal secrets. Consider the zero-knowledge proof that an additive ElGamal ciphertext $(c_1, c_2) = (g^r, g^m h^r)$ (ElGamal85) for a public key $h$ encrypts the value $m = 0$ or $m = 1$. In Camenisch-Stadler notation, we would write:

$$PK\{(r) : (c_1 = g^r \land c_2 = h^r) \lor (c_1 = g^r \land c_2 g^{-1} = h^r)\}$$

to denote the proof of knowledge of a secret $r$ such that the expression after the colon holds. In our library we write (using additive notation, and G and H for $g$ and $h$):

r = Secret()
enc0 = DLRep(c1, r * G) & DLRep(c2, r * H)
enc1 = DLRep(c1, r * G) & DLRep(c2 - G, r * H)
stmt = enc0 | enc1

to define the statement stmt of the same proof.
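To make the two disjuncts concrete, the following sketch re-creates the relations that enc0 and enc1 express, using plain modular arithmetic rather than zksk. The toy group parameters (P, Q, G, H) and the helper names encrypt_bit and true_disjunct are assumptions introduced for illustration; the group is far too small to be secure.

```python
import secrets

# Toy Schnorr group (illustrative only): P = 2*Q + 1, and G, H both
# generate the subgroup of prime order Q. H plays the ElGamal public key.
P, Q = 2039, 1019
G, H = 4, 9

def encrypt_bit(m, r):
    """Exponential ElGamal: (c1, c2) = (G^r, G^m * H^r) mod P."""
    return pow(G, r, P), (pow(G, m, P) * pow(H, r, P)) % P

def true_disjunct(c1, c2, r):
    """Return 0 or 1 for the disjunct that witness r satisfies, else None."""
    if c1 != pow(G, r, P):
        return None
    if c2 == pow(H, r, P):                         # enc0: c2 = H^r
        return 0
    if (c2 * pow(G, -1, P)) % P == pow(H, r, P):   # enc1: c2 / G = H^r
        return 1
    return None

r = secrets.randbelow(Q)
results = [true_disjunct(*encrypt_bit(m, r), r) for m in (0, 1)]
```

An encryption of 0 satisfies only the first disjunct and an encryption of 1 only the second, which is why the prover always has exactly one disjunct to prove honestly.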

Systems often compose simpler zero-knowledge proofs to achieve their purpose. In voting systems, the voter’s vote is usually represented by a vector of ciphertexts, each corresponding to a candidate. To ensure correctness, voters prove that each ciphertext encrypts a bit and that not too many bits are set. In anonymous authentication systems, users prove that they have a credential and that this credential has not yet been revoked. Zero-knowledge proofs defined in zksk support composition. For example, the statement stmt in the example composes the two disjuncts enc0 and enc1.

The zksk library also makes it easy to define new building blocks that can themselves be composed. For example, zksk already defines a range-proof construction. Consider a voting scheme in which the voter can select at most 4 candidates. Then the voter must show that the sum of votes, another ElGamal ciphertext $(c_1, c_2) = (g^r, g^m h^r)$, encrypts $m$ such that $0 \le m < 5$. We express this in zksk as:

r = Secret()
m = Secret()
enc_stmt = DLRep(c1, r * G) & DLRep(c2, m * G + r * H)
# Prove that c2 commits to m (bases G, H) and 0 <= m < 5
range_stmt = RangeStmt(c2, G, H, 0, 5, m, r)
stmt = enc_stmt & range_stmt

Existing libraries and compilers. In this work, we focus on sigma protocols. The tools for defining and computing generic zero-knowledge proofs, such as zk-SNARKs (Ben-SassonCTV14), are thus out of scope. See Table 1 for a comparison of sigma-protocol libraries.

The Secure Computation API (SCAPI) (EjgenbergFLL12; SCAPI) is a C++ library that provides a small set of sigma-protocol primitives. Programmers write C++ code to define and compute proofs. The high-level interface is well-documented, but to use individual primitives programmers must read the source code. Primitives can be composed using AND and OR constructions, but SCAPI’s composition is limited: the programmer cannot specify that the same variable occurs in multiple conjuncts. Instead, the programmer must define a new primitive from scratch. The emmy Go library also implements several sigma-protocol primitives. Programmers write Go code to define and compute proofs, but the primitives cannot be combined with conjunctions or disjunctions to form bigger proofs, and documentation is minimal.

YAZKC (Yet Another Zero-Knowledge Compiler) (AlmeidaBBKSS10) and Cashlib (MeiklejohnEKHL10) instead use a custom language for defining zero-knowledge proofs, and provide a compiler that transforms the specification into code to compute and verify proofs. The YAZKC and Cashlib DSLs resemble the notation of Camenisch and Stadler. Cashlib does not support OR constructions, whereas YAZKC does. In both cases, the DSL does not support defining new high-level building blocks, requiring proofs instead to be (re)written in their entirety in terms of the DSL’s building blocks. We could not fully evaluate YAZKC as the source is no longer available online. Both YAZKC and Cashlib support hidden order groups. Our zksk library does not, as these are nowadays usually replaced by pairings, which we do support. The zkp Rust library (dalek) provides a Rust DSL to define simple proofs. These proofs cannot be combined in disjunctions or conjunctions.

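Library | AND | OR | Cmp | Int | FS | Lang | DSL | Docs
SCAPI (EjgenbergFLL12; SCAPI) | | | ~ | | | C++ | C++ |
Emmy (emmy) | | | | | | Go | - | min.
YAZKC (AlmeidaBBKSS10) | | | | | | C | Custom | ~
Cashlib (MeiklejohnEKHL10) | | | | | | C++ | Custom | ~
zkp (dalek) | | | | | | Rust | Rust | min.
zksk | | | | | | Python | Python |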
Table 1. Comparison between different zero-knowledge proof libraries. Columns: AND, OR – support for conjunct and disjunct statements, Composing (Cmp) – defined statements compose into bigger statements, Interactive (Int) – interactive prove/verification mode, FS – non-interactive proofs through Fiat-Shamir heuristic, Language (Lang) – language in which the tool is implemented, DSL – language in which proofs can be defined, Documentation (Docs) – available documentation.

Contributions. In this paper we make the following contributions.

We present zksk, a well-documented Python library that provides an API for defining and computing zero-knowledge proofs.

zksk protects programmers against common errors and supports full composition (conjunctions and disjunctions). It comes with useful building blocks that can be used to instantiate many existing zero-knowledge proofs: proofs of signatures, range proofs, and inequality proofs. Users can also define their own building blocks.

2. Background

Throughout this paper, let $\mathbb{G}$ be a cyclic group of prime order $q$ generated by $g$. Let $\mathbb{Z}_q$ be the integers modulo $q$. We write $a \| b$ for the concatenation of two strings, and $a \in_R A$ to denote that $a$ is drawn uniformly at random from the finite set $A$. We call an expression $y = g_1^{x_1} \cdots g_n^{x_n}$ a discrete logarithm representation of $y$ with respect to the bases $g_1, \ldots, g_n$.

In this paper, we focus on sigma protocols (Damgard10): 3-move zero-knowledge proofs (GoldwasserMR89) that provide honest-verifier zero-knowledge. The most well-known example is Schnorr’s proof of identification (Schnorr89), see Figure 1. It proceeds in three phases: (1) the prover sends a commitment, the value $R = g^r$ for a random $r \in_R \mathbb{Z}_q$; (2) the verifier sends a challenge $c$; and (3) the prover sends a response $s = r + c \cdot x \bmod q$. Finally, the verifier checks the correctness of the response. Every sigma protocol follows this structure. In Camenisch-Stadler notation (CamenischS97), we express the proof statement as $PK\{(x) : y = g^x\}$ to denote that the prover proves knowledge of the (secret) value $x$ such that $y = g^x$.

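Common input: group $\mathbb{G}$ and $y = g^x$
Prover input: $x$
1. Prover → Verifier: commitment $R = g^r$ for $r \in_R \mathbb{Z}_q$
2. Verifier → Prover: challenge $c \in_R \mathbb{Z}_q$
3. Prover → Verifier: response $s = r + c \cdot x \bmod q$
Verify: $g^s = R \cdot y^c$
Figure 1. Schnorr’s proof of identity. A simple sigma protocol that proves knowledge of $x$ such that $y = g^x$.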

The protocol is zero-knowledge because transcripts that are accepted by the verifier can be simulated without knowledge of the secret (Schnorr89). Intuitively, the verifier can therefore not convince anybody else of the veracity of the statement.
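The three moves and the simulation argument can be sketched in a few lines of plain Python. This is an illustrative toy, not zksk code: the group parameters P, Q, G, the response convention s = r + c*x mod Q, and the function names are assumptions chosen for the example, and the group is far too small to be secure.

```python
import secrets

# Toy group (illustrative only): G generates the order-Q subgroup of Z_P^*.
P, Q, G = 2039, 1019, 4

def commit():
    """Prover's first move: pick randomizer r and send R = G^r."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)

def respond(x, r, c):
    """Prover's last move: s = r + c*x mod Q."""
    return (r + c * x) % Q

def verify(y, R, c, s):
    """Verifier's check: G^s == R * y^c."""
    return pow(G, s, P) == (R * pow(y, c, P)) % P

def simulate(y, c):
    """Build an accepting transcript for challenge c without knowing x."""
    s = secrets.randbelow(Q)
    R = (pow(G, s, P) * pow(y, -c, P)) % P
    return R, c, s

x = secrets.randbelow(Q)          # prover's secret
y = pow(G, x, P)                  # public value
r, R = commit()
c = secrets.randbelow(Q)          # verifier's challenge
s = respond(x, r, c)
honest_ok = verify(y, R, c, s)
simulated_ok = verify(y, *simulate(y, c))
```

The simulator chooses the response first and solves for the commitment, so it never touches x; both transcripts pass the same verification equation.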

Sigma protocols can be combined using conjunctions, e.g., prove knowledge of the discrete logarithms of $y_1$ and $y_2$ (with respect to $g_1$ and $g_2$), and in fact that they are the same:

$$PK\{(x) : y_1 = g_1^x \land y_2 = g_2^x\}$$

To do so, run two Schnorr identification protocols in parallel, using the same randomizer for the secret $x$. This approach enables proving arbitrary conjunctions. See the full paper (fullversion) for the details.
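As a hedged illustration of the conjunction, the sketch below runs two Schnorr proofs in parallel with one shared randomizer and one shared response. The toy parameters P, Q, G1, G2 and the helper verify_equal are assumptions made for this example and are not part of zksk.

```python
import secrets

# Toy group (illustrative only): G1 and G2 both have prime order Q mod P.
P, Q = 2039, 1019
G1, G2 = 4, 9

def verify_equal(y1, y2, R1, R2, c, s):
    """Check both Schnorr equations against the same response s."""
    ok1 = pow(G1, s, P) == R1 * pow(y1, c, P) % P
    ok2 = pow(G2, s, P) == R2 * pow(y2, c, P) % P
    return ok1 and ok2

x = secrets.randbelow(Q)
y1, y2 = pow(G1, x, P), pow(G2, x, P)

r = secrets.randbelow(Q)                  # one randomizer for both conjuncts
R1, R2 = pow(G1, r, P), pow(G2, r, P)
c = 1 + secrets.randbelow(Q - 1)          # nonzero, so the check below is deterministic
s = (r + c * x) % Q                       # one response covers both equations

accepted = verify_equal(y1, y2, R1, R2, c, s)

# If the second value used a different exponent, the shared response fails.
y2_other = pow(G2, (x + 1) % Q, P)
rejected = not verify_equal(y1, y2_other, R1, R2, c, s)
```

Because a single response must satisfy both equations, the verifier learns that the two discrete logarithms are equal, not merely that both are known.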

Sigma protocols can also be combined using disjunctions, e.g., we can prove knowledge of the discrete logarithm of $y_1$ or $y_2$:

$$PK\{(x) : y_1 = g_1^x \lor y_2 = g_2^x\}$$

The OR construction (Damgard10) simulates the untrue disjunct, while honestly proving the true disjunct. See the full paper (fullversion) for the details.
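A minimal sketch of this construction, assuming the common formulation in which the two sub-challenges must add up to the verifier's challenge: the prover simulates the disjunct it cannot prove, then answers the remaining challenge honestly. The toy parameters and all function names are illustrative assumptions, and for brevity both disjuncts use the same base G.

```python
import secrets

P, Q, G = 2039, 1019, 4   # toy group, illustrative only

def check(y, R, c, s):
    return pow(G, s, P) == R * pow(y, c, P) % P

def or_commit(y_sim):
    """Honest commitment for the true disjunct, simulated transcript for the other."""
    r = secrets.randbelow(Q)
    c_sim, s_sim = secrets.randbelow(Q), secrets.randbelow(Q)
    R_true = pow(G, r, P)
    R_sim = pow(G, s_sim, P) * pow(y_sim, -c_sim, P) % P
    return r, R_true, (R_sim, c_sim, s_sim)

def or_respond(x, r, c, c_sim):
    """The true disjunct gets whatever challenge is left: c_true = c - c_sim."""
    c_true = (c - c_sim) % Q
    return c_true, (r + c_true * x) % Q

def or_verify(y1, y2, t1, t2, c):
    (R1, c1, s1), (R2, c2, s2) = t1, t2
    return (c1 + c2) % Q == c and check(y1, R1, c1, s1) and check(y2, R2, c2, s2)

x = secrets.randbelow(Q)
y1 = pow(G, x, P)                        # prover knows this discrete log
y2 = pow(G, secrets.randbelow(Q), P)     # exponent discarded: treated as unknown

r, R1, (R2, c2, s2) = or_commit(y2)
c = secrets.randbelow(Q)                 # verifier's challenge
c1, s1 = or_respond(x, r, c, c2)
accepted = or_verify(y1, y2, (R1, c1, s1), (R2, c2, s2), c)
```

The verifier sees two valid transcripts whose challenges sum to c and cannot tell which one was simulated.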

The Fiat-Shamir heuristic (FiatS86) turns interactive protocols into non-interactive proofs. With this heuristic, the prover computes the challenge by hashing its commitments together with the proof statement. For Schnorr’s protocol in Figure 1, she would compute the challenge as $c = H(y \| R)$, where $R$ is her commitment and $y$ is the public value. Including the proof statement (in this case including $y$ suffices) is essential; otherwise the prover can fake proofs (BernhardPW12).
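The following sketch shows one way to derive the challenge from a hash that covers both the statement and the commitment. The use of SHA-256, the string encoding of the hash input, the toy parameters, and the function names are assumptions made for illustration; they do not describe how zksk encodes statements.

```python
import hashlib
import secrets

P, Q, G = 2039, 1019, 4   # toy group, illustrative only

def challenge(*parts):
    """Hash the given values to a challenge in Z_Q (illustrative encoding)."""
    data = "|".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(x):
    y = pow(G, x, P)
    r = secrets.randbelow(Q)
    R = pow(G, r, P)
    c = challenge(G, y, R)        # statement (G, y) is hashed with the commitment R
    return R, (r + c * x) % Q

def verify(y, proof):
    R, s = proof
    c = challenge(G, y, R)        # verifier recomputes the same challenge
    return pow(G, s, P) == R * pow(y, c, P) % P

x = secrets.randbelow(Q)
y = pow(G, x, P)
proof = prove(x)
valid = verify(y, proof)
tampered = verify(y, (proof[0], (proof[1] + 1) % Q))
```

Since the challenge depends on y, a prover cannot first fix a transcript and afterwards choose the statement it should apply to.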

3. zksk Design And Implementation

In this section we give an overview of the core functionalities of zksk from a user’s perspective, and then outline how they are implemented. Code examples are written in Python. The library is open source (https://github.com/spring-epfl/zksk) and is extensively documented. It relies on the petlib (https://github.com/gdanezis/petlib) Python bindings to OpenSSL to support elliptic curves and pairings.

3.1. Components

Discrete logarithm representations. The zksk library makes it easy to express equations about discrete logarithm representations, the basic building block of sigma protocols. Listing 1 shows how to express the statement $PK\{(x, r) : C = g^x h^r\}$, and how to construct and verify the corresponding proof. We assume the values C, G, and H are defined and in scope. First, the prover defines the values of which it will prove knowledge (lines 2–3). Note that it passes in the real values of the secrets. Then it expresses the proof statement (line 4). The first argument of DLRep is the left-hand side of the discrete logarithm representation; the second argument, the right-hand side, expresses the left-hand side in terms of the bases and secrets. Finally, it constructs the (non-interactive) proof (line 5). The verifier first defines the proof statement (lines 8–9), and uses it to verify the proof (line 10).

1  # Prover
2  x = Secret(20)
3  r = Secret(1337)
4  stmt = DLRep(C, x * G + r * H)
5  proof = stmt.prove()
6
7  # Verifier
8  x_prime, r_prime = Secret(), Secret()
9  stmt_prime = DLRep(C, x_prime * G + r_prime * H)
10 assert stmt_prime.verify(proof)
Listing 1: Using zksk to prove and verify the statement $PK\{(x, r) : C = g^x h^r\}$. In the code C is the commitment $C = g^x h^r$.
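For intuition about what a proof of this statement involves, here is a plain-Python sketch of a non-interactive proof for a two-secret representation, written multiplicatively as C = G^x * H^r mod P. It does not use zksk or petlib; the toy parameters, the hash encoding, and the (challenge, responses) proof format are assumptions for this example only.

```python
import hashlib
import secrets

P, Q, G, H = 2039, 1019, 4, 9   # toy group, illustrative only

def challenge(*parts):
    data = "|".join(str(p) for p in parts).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(C, x, r):
    kx, kr = secrets.randbelow(Q), secrets.randbelow(Q)   # one randomizer per secret
    R = pow(G, kx, P) * pow(H, kr, P) % P                 # commitment
    c = challenge(G, H, C, R)
    return c, (kx + c * x) % Q, (kr + c * r) % Q

def verify(C, proof):
    c, sx, sr = proof
    # Recompute the commitment from the responses, then re-derive the challenge.
    R = pow(G, sx, P) * pow(H, sr, P) * pow(C, -c, P) % P
    return c == challenge(G, H, C, R)

x, r = 20, 1337
C = pow(G, x, P) * pow(H, r, P) % P
proof = prove(C, x, r)
verified = verify(C, proof)
```

As in Listing 1, the verifier needs only the public values and the proof; the secrets x and r never leave the prover.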

Conjunctions and disjunctions. We can combine statements into conjunctions or disjunctions. The & (and) operator combines statements into a conjunction: the library will prove that secrets that appear in multiple statements are the same. Similarly, the | (or) operator combines statements into a disjunction. Checking which disjunct is true is computationally expensive. Thus, zksk requires the prover to indicate whether a disjunct is true or simulated:

x = Secret(12345)
stat = (DLRep(X1, x * G1, simulated=True) |
        DLRep(X1, x * G2, simulated=False))

Defining and using primitives. The zksk library includes several useful primitives to define more complicated statements: proofs of knowledge of a BBS+ signature (AuSM06), inequality of discrete logarithms (HenryG13), and range proofs (BellareG97; Schoenmakers05). The syntax is as follows:

# Possession of BBS+ signature over messages:
msgs = [Secret() for _ in range(4)]
stmt1 = BBSPlusSignatureProof(msgs, pk)
# Inequality of discrete logs (see below):
x = Secret()
stmt2 = DLNotEqual([Y1, G1], [Y2, G2], x)
# Let com = x * G + r * H be a commitment to x
# Proof that x lies in the range [a, b):
x, r = Secret(), Secret()
stmt3 = RangeStmt(com, G, H, a, b, x, r)

Users can easily define new primitives. These primitives could require extra computations and verifications. We take as an example the proof of inequality of two discrete logarithms by Henry and Goldberg (HenryG13), proving the statement $PK\{(x) : y_1 = g_1^x \land y_2 \neq g_2^x\}$. This statement cannot be directly translated into the primitives we have defined before. Instead, we follow Henry and Goldberg’s approach. The prover first picks a randomizer $r \in_R \mathbb{Z}_q$ and computes the value $C = (g_2^x / y_2)^r$, and then proves:

$$PK\{(\alpha, \beta) : 1 = g_1^{\alpha} y_1^{\beta} \land C = g_2^{\alpha} y_2^{\beta}\}$$

where $\alpha = x r \bmod q$ and $\beta = -r \bmod q$. When verifying the proof, the verifier needs to additionally check that $C \neq 1$.
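The algebra behind this transformation can be checked numerically. The sketch below, in multiplicative notation with toy parameters, computes the blinded value and the two derived secrets following the same formulas as the precommit step in Listing 2, and confirms that both relations hold and that the blinded value is the identity exactly when the two discrete logarithms coincide. All names and parameters here are illustrative assumptions.

```python
import secrets

P, Q = 2039, 1019      # toy group, illustrative only
G1, G2 = 4, 9

def precommit(x, y2):
    """Return (alpha, beta, C) with alpha = x*b, beta = -b, C = (G2^x / y2)^b."""
    b = 1 + secrets.randbelow(Q - 1)                   # nonzero blinder
    alpha, beta = (x * b) % Q, (-b) % Q
    C = pow(pow(G2, x, P) * pow(y2, -1, P) % P, b, P)
    return alpha, beta, C

def relations_hold(y1, y2, alpha, beta, C):
    eq1 = pow(G1, alpha, P) * pow(y1, beta, P) % P == 1
    eq2 = pow(G2, alpha, P) * pow(y2, beta, P) % P == C
    return eq1 and eq2

x = 123
y1 = pow(G1, x, P)
y2_diff = pow(G2, x + 1, P)     # discrete log differs from x
y2_same = pow(G2, x, P)         # discrete log equals x

a_d, b_d, C_d = precommit(x, y2_diff)
a_s, b_s, C_s = precommit(x, y2_same)
```

When the logarithms differ, C_d is a non-identity element and the extra check passes; when they are equal, C_s collapses to 1 and the verifier rejects.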

Primitives in zksk extend the ExtendedProofStmt class, provide a constructor, and override construct_stmt to return a proof statement. Moreover, they can override precommit to compute a precommitment, and validate to perform post-validation.

Listing 2 shows how to implement the DLNotEqual proof. First, we define the constructor. It stores the arguments (lines 3–5), computes some convenience values (lines 7–8), and defines the secrets alpha and beta (line 11). Then, we override the precommit function to compute the value that acts as a precommitment (lines 13–21). This function also sets the values of the secrets alpha and beta. The function construct_stmt returns the proof statement defined above (lines 23–29). Finally, validate (lines 31–33) verifies that the precommitment is not the unity element.

1  class DLNotEqual(ExtendedProofStmt):
2    def __init__(self, valid_pair, invalid_pair, x):
3      self.lhs = [valid_pair[0], invalid_pair[0]]
4      self.bases = [valid_pair[1], invalid_pair[1]]
5      self.x = x
6
7      self.infty = self.bases[0].group.infinite()
8      self.order = self.bases[0].group.order()
9
10     # The internal proof uses two constructed secrets
11     self.alpha, self.beta = Secret(), Secret()
12
13   def precommit(self):
14     blinder = self.order.random()
15
16     # Set the value of the two internal secrets
17     self.alpha.value = self.x.value * blinder % self.order
18     self.beta.value = -blinder % self.order
19
20     precommitment = blinder * (self.x.value * self.bases[1] - self.lhs[1])
21     return precommitment
22
23   def construct_stmt(self, precom):
24     p1 = DLRep(self.infty,
25                self.alpha * self.bases[0] +
26                self.beta * self.lhs[0])
27     p2 = DLRep(precom, self.alpha * self.bases[1] +
28                        self.beta * self.lhs[1])
29     return p1 & p2
30
31   def validate(self, precommitment):
32     if precommitment == self.infty:
33       raise ValidationError("Invalid precommitment")
Listing 2: Full implementation of the DLNotEqual primitive.

New primitives created by extending ExtendedProofStmt compose as any other proof statement. However, they cannot be themselves used in the constructed proof of other new primitives using ExtendedProofStmt. We aim to add this functionality soon.

Input: a (compositional) proof statement stmt
Output: a non-interactive proof
1 Recursively call precommit() on all parts of stmt
2 Let precommitment be the combined precommitments
3 Create constructed proofs for all parts of stmt
4 Verify that secrets inside OR clause do not appear elsewhere
5 Let $x_1, \ldots, x_n$ be the unique secrets in all parts of stmt
6 Pick randomizers $r_1, \ldots, r_n$ for the secrets
7 Compute commitment $R$ for stmt recursively using $r_1, \ldots, r_n$
8 Let chal $= H(\text{stmt} \| \text{precommitment} \| R)$
9 Compute response $s_i$ for each secret $x_i$ using chal and $r_i$.
10 Return the proof, consisting of precommitment, chal, and the responses $s_1, \ldots, s_n$
Listing 3: Computing a non-interactive proof

3.2. Implementation

Robust statement identifiers. The correct application of the Fiat-Shamir heuristic mandates including a representation of the statement in the hash function. Computing such a representation, however, is difficult. Consider the two statements:

   x,  y =  Secret(),  Secret()
   p1 =  DLRep( A,  x *  G) &  DLRep( B,  x *  H) &  DLRep( C,  y *  Z)
   p2 =  DLRep( A,  x *  G) &  DLRep( B,  y *  H) &  DLRep( C,  y *  Z)

The zksk library must differentiate between the secret x appearing twice, and the secret y appearing twice. Moreover, this method must be robust even if the prover’s and verifier’s statement definitions execute on different machines. Hence, we cannot use Python’s built-in object identifiers, as they can change across executions.

One way to differentiate the secrets would be to assign a canonical unique name to each secret. Our initial experiments showed, however, that manually assigning names to secrets results in cumbersome code. Therefore, zksk automatically assigns identifiers to secrets in the order in which they first occur in the statement. See the full paper (fullversion) for the details.
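The idea of numbering secrets by first occurrence can be sketched as follows. This is a simplified stand-in rather than zksk's actual representation: the Secret placeholder class, the clause encoding as tuples, and the statement_id function are hypothetical names introduced for this example.

```python
class Secret:
    """Stand-in for a secret variable; it carries no name of its own."""

def statement_id(clauses):
    """Render a conjunction as a string, numbering secrets by first occurrence.

    clauses: list of (lhs_name, [(secret, base_name), ...]).
    """
    ids = {}
    rendered = []
    for lhs, terms in clauses:
        named = []
        for secret, base in terms:
            # id() is used only as a per-process lookup key and never
            # appears in the output, so the result is machine-independent.
            idx = ids.setdefault(id(secret), len(ids))
            named.append(f"s{idx}*{base}")
        rendered.append(f"{lhs}=" + "+".join(named))
    return "&".join(rendered)

x, y = Secret(), Secret()
p1 = [("A", [(x, "G")]), ("B", [(x, "H")]), ("C", [(y, "Z")])]
p2 = [("A", [(x, "G")]), ("B", [(y, "H")]), ("C", [(y, "Z")])]

# A verifier building the same statement from fresh Secret objects.
u, v = Secret(), Secret()
p1_verifier = [("A", [(u, "G")]), ("B", [(u, "H")]), ("C", [(v, "Z")])]
```

Here statement_id(p1) yields "A=s0*G&B=s0*H&C=s1*Z" while statement_id(p2) yields "A=s0*G&B=s1*H&C=s1*Z", so the two statements hash differently, and the verifier's independently constructed copy of p1 produces the same string as the prover's.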

Computing proofs. A call to stmt.prove(), see Listing 1, computes a non-interactive proof as in Listing 3. First we compute the precommitments and the concrete constructed proofs for the custom primitives (lines 1–3). Lines 5–10 then execute the proof, similarly to Figure 1.

The library tries to actively prevent programmer errors. Line 8 applies the strong Fiat-Shamir heuristic (BernhardPW12): it adds the statement’s representation as input to the hash function.

The library also detects dangerous OR proofs. Consider a simplification of the statement from the introduction that encrypts a bit:

$$PK\{(r) : c_1 = g^r \land (c_2 = h^r \lor c_2 g^{-1} = h^r)\}$$

A naïve application of steps 5–10, which picks one randomizer for $r$ and uses that randomizer in both conjuncts, results in a proof that reveals $r$ itself. Let $c$ be the challenge received from the verifier. Suppose the first disjunct is true. To prove the full statement, the naïve approach first simulates the second disjunct, obtaining a transcript with a challenge $c_2$. The challenge for the first disjunct is then $c_1 = c - c_2$. As a result, the naïve approach uses the challenge for the first conjunct ($c$) and the challenge for the first disjunct ($c_1$). However, given responses for secret $r$ for two different challenges with the same randomizer, an attacker can trivially extract $r$, violating the zero-knowledge property.
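The extraction step is simple arithmetic and can be reproduced directly. The sketch below assumes responses of the form s = k + c*r mod Q, where k is the reused randomizer; the toy modulus and the variable names are illustrative assumptions.

```python
import secrets

Q = 1019                               # toy prime group order, illustrative only

r = secrets.randbelow(Q)               # the secret the proof is supposed to hide
k = secrets.randbelow(Q)               # single randomizer reused for r

c = secrets.randbelow(Q)               # verifier's challenge, used by the conjunct
c2 = 1 + secrets.randbelow(Q - 1)      # challenge of the simulated disjunct
                                       # (nonzero except with negligible probability
                                       # in a real proof)
c1 = (c - c2) % Q                      # challenge left for the true disjunct

s_conj = (k + c * r) % Q               # response published for the conjunct
s_disj = (k + c1 * r) % Q              # response published for the true disjunct

# Subtracting cancels k: s_conj - s_disj = (c - c1) * r (mod Q).
recovered = (s_conj - s_disj) * pow((c - c1) % Q, -1, Q) % Q
```

Anyone who sees the proof knows both responses and both challenges, so recovered equals r without any knowledge of k.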

The zksk library prevents this flaw by requiring that secrets that appear in OR clauses do not also appear elsewhere. It is up to the programmer to resolve this problem when detected. In the introduction, we moved the first conjunct inside the disjunction, creating a disjunctive normal form. An alternative is to bind the offending secret to a Pedersen commitment, and then repeat that commitment inside the OR clause.

4. Evaluation

To determine the overhead of using a Python library when computing proofs, we compare it to the time of computing the proofs with the underlying cryptographic library. We first measure the running time of proving knowledge of a BBS+ signature using zksk. We then compute a lower bound on the cost without zksk by counting the number of group operations the proof takes and multiplying it by the cost of group operations in the underlying cryptographic library. This lower bound includes neither hash computations nor modular arithmetic. We find that more than 90% of the running time for zksk is due to the group operations. We conclude that the overhead of using a Python library is small. See the full paper (fullversion) for more details.

We did a small literature study to determine the usefulness of zksk. We explored papers in the last two editions of relevant academic conferences: PETS, ACM CCS, WPES, and NDSS, and found 7 papers that use sigma protocols. All of these protocols can be implemented with zksk. See the full paper (fullversion) for more details.

5. Conclusions

We presented zksk, a Python-based library for defining and computing zero-knowledge proofs based on sigma protocols. Unlike existing libraries, zksk does not rely on a custom language to define proofs, but on an easy-to-use Python-based DSL. It provides several high-level primitives, and makes it easy to define new high-level primitives, all of which can be composed to construct bigger proofs.

A small literature study shows that zksk is indeed sufficient to implement sigma protocols encountered in real research papers. We hope that zksk will be a valuable tool to make defining and evaluating such protocols easier.

Acknowledgements

We thank Ian Goldberg and Nick Hopper for pointing out the problem with naïvely composing OR proofs.

References

Appendix A Details of evaluation

Overhead in computing proof of knowledge of a BBS+ signature. We consider a BBS+ signature with 10 messages, and construct a proof of knowledge of that signature that hides all 10 messages using zksk. For a zksk-compatible implementation of cryptographic pairings, we use the bplib library (https://github.com/gdanezis/bplib). Proving takes about 146 ms, whereas verification takes about 160 ms.

BBS+ signatures require a pairing setting with groups and . Based on a manual implementation of the zero-knowledge proof in C, we conclude that zksk must compute 8 exponentiations in , 14 exponentiations in and 15 pairing computations to compute a proof; and 6 exponentiations in , 15 in , and 13 pairings to verify the proof. Based on measurements on the underlying bplib library, we estimate the lower bound on the running time of proving and verifying at 139 ms and 146 ms respectively. Therefore, we conclude that the raw cryptographic operations account for more than 90% of the running time in zksk.
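The "more than 90%" figure follows directly from the timings quoted above. The short calculation below only restates that arithmetic; the dictionary names are chosen for this example.

```python
# Timings quoted in this appendix, in milliseconds.
measured_ms = {"prove": 146, "verify": 160}      # zksk end to end
lower_bound_ms = {"prove": 139, "verify": 146}   # raw group operations only

# Fraction of the measured time explained by raw cryptographic operations.
share = {k: lower_bound_ms[k] / measured_ms[k] for k in measured_ms}

# Time not accounted for by group operations (hashing, modular arithmetic,
# and Python overhead together).
overhead_ms = {k: measured_ms[k] - lower_bound_ms[k] for k in measured_ms}
```

This gives a share of roughly 95% for proving and 91% for verifying, leaving 7 ms and 14 ms respectively for everything else.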

Sigma protocols in recent papers. We explored published papers in the last two years of PETS, ACM CCS, WPES, and NDSS. We found 7 papers that use sigma protocols. All of them can be implemented using zksk. Privacy Pass (DavidsonGSTV18), however, uses an optimized batch verification protocol that zksk currently does not support. The zksk library does support the basic version of the protocol, and can be used to define a new primitive that supports batch verification.