Zero-knowledge proofs are powerful cryptographic tools by which one party (the prover) can convince another party (the verifier) that an assertion about some secret information is true, without revealing the secret information itself or any other information beyond the fact that the assertion is true. Consider, for example, the assertion "I know a secret number that is a quadratic non-residue mod ". The prover can convince the verifier that he knows such a number without giving any additional knowledge to the verifier.
Zero-knowledge proofs were initially introduced in the 1980s, in  and . Since then, a lot of effort has been dedicated to making them more efficient. For example, Kilian  introduced a succinct interactive zero-knowledge proof system, in which the communication load between the prover and the verifier can be less than the size of the corresponding arithmetic circuit, a diagram representing the original computation. Micali  developed the zero-knowledge succinct non-interactive argument of knowledge (zkSNARK), a type of zero-knowledge proof system in which the prover sends just one message to the verifier. For surveys of zero-knowledge proof systems, see  and .
zkSNARKs have been extensively explored in the literature [4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]. zkSNARKs have also been used in many applications, for example in authentication systems , the construction of various types of cryptographic protocols , privacy-preserving cryptocurrencies , smart contracts [26, 27, 28], miscellaneous applications in Blockchain technology [29, 30, 31, 32], verifiable outsourcing of computation [33, 34], and many other areas . In addition, various toolboxes have been created to implement zkSNARKs, e.g., [36, 37, 38, 39].
In , an efficient zkSNARK is introduced, based on a specific encoding of the arithmetic circuit into a polynomial form called a quadratic arithmetic program (QAP). QAP-based zkSNARKs have several properties that make them attractive in practice. In particular, the size of the proof is always constant, e.g., 8 elements in  and 3 elements in , regardless of the size of the arithmetic circuit. In addition, the verifier's work (the number of addition and multiplication operations) is of the order , where  is the aggregate number of public inputs and outputs of the arithmetic circuit (see Section II). However, the prover's work in QAP-based zkSNARKs is dominated by a computation cost of , where  is the number of multiplication gates in the arithmetic circuit. When the arithmetic circuit is large, e.g., with more than several billion gates, a prover with limited computing resources cannot generate the proof himself, and he may need to offload this computation to other servers.
Several papers have investigated the problem of outsourcing the task of generating the proof.
Nectar : Nectar, a smart-contract protocol, uses zkSNARKs to verify the correct execution of smart contracts. In Nectar, the prover delegates his task to a powerful trusted worker, and sends all of his secret inputs to that worker to generate the proof. The disadvantage of this system is the need to trust the external server. In addition, that server must be powerful enough to handle the computation.
DIZK : This work proposes an algorithm for delegating the prover's task to several trusted machines using a MapReduce framework. The advantage is that each server is responsible for executing a part of the prover's task, assigned to it based on its processing and storage resources. However, if some of the servers are untrusted, the DIZK framework cannot be used.
SPARKs : This work breaks the computation into a sequence of smaller sub-tasks, and delegates the proof generation for the correctness of each sub-task to one server. The weakness is that the servers must be trusted. On top of that, SPARKs must deal with the computation task in detail in order to split it into sub-tasks.
A fundamental approach for offloading computation on private data to external untrusted nodes is multi-party computation. Multi-party computation was initially introduced in the early 1980s by Yao , and has been followed up in many works [48, 49, 46, 50, 51]. For a survey of multi-party computation, see . Multi-party computation has been used in many areas such as machine learning, secure voting , securing databases , and Blockchain technology .
In this paper, our objective is to design an offloading mechanism that has two properties:
The load of computation per server is a fraction of the full load of generating the proof. The reason is clear: the computation load of the prover's task is beyond what one server can afford.
The servers are not required to be trusted. In particular, we assume that a subset of size  of the servers may collude to gain information about the input of the prover.
Current solutions have only one of the above properties. In particular, Trinocchio  works even if some of the servers are curious and colluding. However, the computation load per server is the same as, or even more than, the load of the prover's task. In DIZK , on the other hand, the computation load per server can be a fraction of the load of the prover's task. However, it assumes that all the servers are trusted. Our approach is based on ideas from multi-party computation. However, we cannot use an off-the-shelf multi-party computation scheme and apply it to the computation task of the prover. The reason is that the prover's task has been hand-designed to be very efficient. If we apply an MPC scheme blindly, we lose this efficiency, and the computation task of each server becomes even larger than the original computation. In this paper, we design a multi-party scheme for the prover's task that works with  servers, such that even if  of them collude, for some , they gain no information about the secret inputs (the second property). In addition, the computation load of each server is  (the first property).
In Section III, we review the fast Fourier transform () and some secret sharing schemes. In Section IV, we explain the main challenge and then detail the proposed scheme. Section V is dedicated to discussion and conclusion.
We denote vectors by lowercase bold letters such as . The notation  means  is a vector of length  and  is its th coordinate. Some vectors in this paper carry a large mathematical symbol representing their construction path. To show the th coordinate of these vectors, we use the notation . For example,  is the th coordinate of the vector .
We denote matrices by bold uppercase letters, e.g., . We denote sets by uppercase calligraphic letters and use  to show their elements; for example, in ,  is a set containing elements , , and . We use double brackets for encryption. For more detail, see Cryptographic operations in Section II-B.
II Background on QAP-based zkSNARK
II-A The story of zkSNARK
Suppose that there is a globally known function  consisting of only multiplication and addition operations. The prover has calculated this function with inputs  and , obtaining . The input  and the output  are publicly available. The prover wants to convince the verifier that he knows  such that . However, one of the main constraints is that  is a private parameter, and the prover does not want to reveal it to the verifier. The second constraint is that the verifier wants to verify this computation with a negligible load of computation and communication.
A non-interactive zero-knowledge proof system allows the prover to produce a string , called the proof, such that if the prover sends it along with the public input and output to the verifier, the verifier will be convinced that the prover knows a  as the input of  such that the calculation of  has been done correctly, without obtaining any other information about .
In order to generate and verify proofs, a process must be performed in advance based on the structure of the function . This process is called the setup phase. A third party, often called the trusted party, runs the setup phase and generates two public parameters, the evaluation key () and the verification key ().  and  depend on the structure of the function , and are independent of , , and . The prover then generates the proof  using  and the result of the calculation . The verifier verifies the prover's claim using  and the proof . The size of the proof and the computation load of verifying should be negligible.
We note that the setup phase is a one-time process. In other words,  and  can be used many times as long as the function  remains the same. As a result, the computation cost of the setup phase amortizes over many zkSNARK sessions about , possibly by different provers. It is worth mentioning that in the setup phase, the trusted party uses some intermediate parameters to develop  and . These parameters are used only once, and must be deleted afterwards; otherwise, anyone with access to these parameters can cheat and generate counterfeit proofs (see Fig. 1).
As an example, in the Zcash Blockchain, which uses zkSNARKs to support anonymous transactions, the setup phase was run before the network started, and the two sets  and  were made available to everyone. Whenever someone wants to make an anonymous transaction, he generates a proof using  (to prove he has enough money, etc.); the miners in the network then verify the proof using . For more details see .
zkSNARKs have some properties that are informally mentioned below:
Zero-knowledge: Verifier obtains no information about beyond the fact that .
Succinctness: Size of the proof is constant, no matter the size of . This feature makes zkSNARK a good tool in practice (e.g. cloud computing).
Non-interactive: Verifier doesn’t send anything to the prover.
Publicly verifiable: This property allows many people to check the proof and it is useful for applications such as Blockchain.
Correctness: If the zkSNARK is executed honestly and a proof is generated honestly, the verifier(s) will always accept it.
Knowledge soundness: A polynomial-time adversary who does not know some  for which  holds in  cannot generate a valid proof.
II-B Main components of zkSNARK
Arithmetic circuit: This is a diagram, consisting of wires and multiplication and addition gates, that represents the function  and its intermediate calculations. For example, Fig. 1(a) corresponds to the function . As it turns out, this diagram represents not only the function, but also the process of its computation. For the computation to be correct, every operation in this graph must be correct. As you will see later, this structure allows us to develop the proof of correct execution of the function . The arithmetic circuit corresponding to a function is not unique, but any valid representation is enough for us.
Equivalent Quadratic Arithmetic Program (QAP): In this step, we represent the structure of the arithmetic circuit using some polynomials. Suppose that the arithmetic circuit of  has  multiplication gates. We assume  is a power of 2; if it is not, we add some operations to the arithmetic circuit to make  a power of 2. To develop the corresponding polynomials, we need to label the multiplication gates and wires of the circuit. Let  be a primitive th root of unity in , i.e., . We label the multiplication gates of the arithmetic circuit by the set  in an arbitrary order.
Now consider the wires that are inputs of the arithmetic circuit and the wires that are outputs of the multiplication gates. We index them in an arbitrary order by the set . As a convention, there is a wire in the arithmetic circuit that always carries 1; we assign index  to that wire as well. As you can see, we neither label the addition gates nor index their output wires. Later, we will explain how to treat those.
Now we are ready to represent the structure of the arithmetic circuit with polynomials. For each wire, indexed by , , we define three polynomials of degree at most , denoted by , , and , as follows:
The symbol means the value of the polynomial does not matter at this point.
These polynomials can be developed simply by Lagrange interpolation.
Recall that we do not label the output wires of the addition gates. To cover the addition gates and their outputs, in the above definitions we extend the notion of being a right input or left input of a multiplication gate as follows: if an indexed wire goes through one or more addition gates, and eventually becomes the right (left) input of a multiplication gate, then we also consider that indexed wire a right (left) input of that multiplication gate. By going through an addition gate, we mean it is an input of that addition gate. For example, in Fig. 2, we say that wire 1 and wire 2 are both left inputs of the multiplication gate.
We also define a polynomial , called the target polynomial, which is of degree  and vanishes at the label of each multiplication gate. In other words, if , we have  for some , so .
A QAP over the finite field  consists of the target polynomial and the three sets of polynomials. We note that the QAP of an arithmetic circuit completely describes the structure of that circuit.
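As a concrete illustration of how such wire polynomials can be developed by Lagrange interpolation, consider the following Python sketch. All parameters here are illustrative assumptions, not values from the paper: we work over the small field F_17, with n = 4 gate labels chosen as the powers of a primitive 4th root of unity, and we invent a wire that feeds two of the gates.

```python
# Toy sketch: building a QAP wire polynomial by Lagrange interpolation over
# the gate labels. Field F_17, labels = powers of a primitive 4th root of
# unity mod 17 (all illustrative; real systems use a ~255-bit field).
P = 17
G = [4, 16, 13, 1]  # 4^1, 4^2, 4^3, 4^4 mod 17; 4 is a primitive 4th root of unity

def interpolate(points, x, p=P):
    """Evaluate at x the unique Lagrange interpolant through (xi, yi) over F_p."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

# Suppose (hypothetically) wire j is a left input of the gates labeled G[0]
# and G[2] only: then v_j must take value 1 at those labels and 0 elsewhere.
v_j_points = [(G[0], 1), (G[1], 0), (G[2], 1), (G[3], 0)]
assert interpolate(v_j_points, G[0]) == 1
assert interpolate(v_j_points, G[1]) == 0
```

The interpolant is the unique polynomial of degree at most n - 1 through the prescribed points, which is exactly how the wire polynomials of the QAP are defined.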
Fig. 2: The arithmetic circuit of the function and the corresponding QAP.
Polynomial Representation of the Correctness of the Operations: Recall that the advantage of the arithmetic circuit of a function  is that it represents all the intermediate operations in calculating . Let us assume that we calculate , and in this process we also calculate the value carried by each indexed wire , denoted by , . For the final result to be correct, we need the calculation in each multiplication gate to be correct. To verify that, one would need to check the operations one by one, which would be very costly. An interesting aspect of the QAP is that we can use it to represent all of these operations with one polynomial equation, as follows. A polynomial equation, in turn, can easily be verified, as will be explained later.
Recall that all of the polynomials , , and  are of degree . We define the polynomials  of degree  and  of degree  as,
where  is the value of the indexed wire . An important observation is as follows. Let  be the label of multiplication gate . Then one can see that  is equal to (the summation of) the values of the wires that are left inputs of gate . Similarly,  and  are equal to (the summation of) the values of the wires that are right inputs and the output wire of gate , respectively. Thus, for the calculation at gate  to be correct, we need . As a result, if the prover wants to prove that the calculation of the entire arithmetic circuit has been done correctly, it is sufficient to show that , . Equivalently, it is sufficient to show that the target polynomial  divides . In other words, the prover needs to show that there is a polynomial  of degree at most  such that .
The main idea behind zkSNARK is that (i) the prover finds the polynomial , and then (ii) the verifier checks the equation at a point  chosen uniformly at random from . If the equation does not hold identically, the verifier will detect this with high probability. This is because two different polynomials of degree  can have equal values at no more than  points, and assuming , the probability that  is one of those points is , which is negligible. The important point is that the prover must not know the value of ; otherwise, he could introduce invalid polynomials , , and  such that the identity holds only at . This is why some cryptographic operations are needed to verify the equation over encrypted numbers.
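The random-point check described above can be sketched in Python. Everything here is a toy assumption: the Mersenne prime standing in for the large field, the small polynomials, and the single corrupted coefficient (a real system uses a ~255-bit field and, as explained next, runs the check over encrypted values).

```python
# Sketch of the random-point identity check p(x) = h(x) t(x): instead of
# comparing coefficients, evaluate both sides at a random point s.
import random

P = 2**31 - 1  # a Mersenne prime standing in for the large field (toy size)

def poly_eval(coeffs, x, p=P):
    """Evaluate a polynomial (coefficients low-to-high) at x via Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def poly_mul(a, b, p=P):
    """Multiply two coefficient vectors over F_p."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

h = [3, 0, 5]            # h(x) = 3 + 5x^2        (toy)
t = [1, 2]               # t(x) = 1 + 2x          (toy)
good_p = poly_mul(h, t)  # p(x) = h(x) t(x) holds identically
bad_p = good_p[:]
bad_p[1] = (bad_p[1] + 1) % P   # corrupt one coefficient

s = random.randrange(1, P)
assert poly_eval(good_p, s) == poly_eval(h, s) * poly_eval(t, s) % P
# good_p and bad_p differ by the polynomial x, which has a single root (0),
# so any s >= 1 exposes the mismatch; in general two distinct polynomials of
# degree d agree on at most d points, giving failure probability <= d / P.
assert poly_eval(bad_p, s) != poly_eval(good_p, s)
```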
Cryptographic operations: In QAP-based zkSNARKs we rely on elliptic-curve cryptography to protect the private data and the soundness of the algorithm. Let  be an additive group based on an elliptic curve defined over the finite field , and let  be a generator of this group. To encrypt a scalar , we calculate the sum of  copies of  in the additive group , and denote it as  or . We note that recovering the number  from  is computationally infeasible. In addition, different inputs lead to different outputs. Moreover, the encryption operation is linear, i.e., , for two integers  and .
Consider three integers , , and , and assume that we only have access to . Suppose we aim to verify whether . This can simply be done by checking whether . Now suppose that we want to check whether . This is not straightforward, and is done through the notion of a pairing .
Let  be a non-trivial bilinear map from two groups  and  to a group , and let  and  be generators of  and , respectively. It has three properties:
, where is ,
is efficiently computable.
Now if we have , , and , the encrypted versions of , , and  respectively, we can check whether  by checking the equation . We note that  and  can be the same group, with  as the generator. In that case, we check whether .
As mentioned, we use double brackets to denote the encrypted version of scalars; vectors can also be represented in this notation, for example . Using this notation, we also have .
For more details see .
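The linearity that the additive check above relies on can be mimicked with a toy stand-in: exponentiation in a multiplicative group modulo a prime, which has the same one-way, linear behavior (written multiplicatively instead of additively). The modulus and base below are illustrative assumptions only; real zkSNARKs use elliptic-curve groups, which additionally support the pairing operation.

```python
# Toy stand-in for the linear encryption [[x]]: here [[x]] = g^x mod q in a
# multiplicative group, so the group operation plays the role of addition.
# q and g are illustrative, not cryptographically sized parameters.
q = 2**61 - 1   # Mersenne prime modulus (toy)
g = 3           # assumed base, for illustration

def enc(x):
    """'Encrypt' a scalar: one-way in the discrete-log sense, and linear."""
    return pow(g, x, q)

a, b = 123456, 654321
# Linearity: combining [[a]] and [[b]] with the group operation gives [[a + b]].
assert enc(a) * enc(b) % q == enc(a + b)
# So a verifier holding only enc(a), enc(b), enc(c) can test whether c = a + b:
c = a + b
assert enc(c) == enc(a) * enc(b) % q
```

Checking a multiplicative relation between encrypted values is exactly what this toy group cannot do, which is why the pairing is needed.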
II-C Three main algorithms in zkSNARK
There exist various versions of zkSNARK. Here we focus on the version proposed by Groth in , which is one of the most efficient and popular QAP-based zkSNARKs. However, the schemes proposed in this paper can be applied to other variants of QAP-based zkSNARKs. In the following, we review the three algorithms included in a zkSNARK: the setup phase algorithm, run only once by an entity called the trusted party; the prover algorithm, run by the prover; and the verifier algorithm, run by the verifier.
Setup phase algorithm: This algorithm takes the function  and the security parameter  as input, and outputs  and . The security parameter specifies the size of the finite field . If  is large, the algorithm is more secure, at the cost of an increased computation load.
The setup phase algorithm is presented in Algorithm 1. We note that in Line 3, some random parameters are chosen. These random parameters are used to develop  and , and will be deleted at the end of the setup phase. , in Line 4, is the set of indices of the wires that are public inputs or outputs of the arithmetic circuit. , in Line 5, is the set of indices of the wires that are neither public inputs nor outputs of the arithmetic circuit. It is obvious that .
In Lines 9 and 10,  and  are generated, respectively. All values generated during the algorithm, except  and , are known as toxic waste, and must be deleted forever at the end of the algorithm, because anyone who has access to them can produce fake proofs.
It is worth noting that Algorithm 1 is heavy in terms of computation load. However, this phase is done only once for the function , and is not a function of the values of the wires, inputs, or outputs. This means that the cost of this algorithm amortizes over many proof generations about . Thus, in this paper we do not deal with the setup phase. To see how the setup phase calculations can be done in a multi-party protocol, refer to .
Prover algorithm: The prover algorithm is presented in Algorithm 2. This algorithm consists of three main parts. In the first part, the prover builds the arithmetic circuit and calculates the values of all wires. Remember that the multiplication gate labeled  multiplies  and , and outputs .
In the second part, the prover runs the function  to calculate the coefficients of the polynomial . In the last part, the prover runs the function  to generate the proof.
The function , which calculates the coefficients of the polynomial , needs some explanation. Recall that . If the prover has the values of  and  at some  distinct points , , he can calculate  at  by simply dividing  by  at those points. Thus, he can recover the coefficients of  by interpolation. In this algorithm, we set , where . We will explain the reason for this choice later. Recall that the prover does not even have the coefficients ; instead, he has the values , , in . Thus, we take the following steps:
The coefficients of , , and  are calculated by interpolation over the values of , , and  in . This is done efficiently by taking the  of the vectors , , and , which contain the values of , , and  in , respectively (see Lines 4-9 of Algorithm 2).
The values of , , and  on the set  are obtained by taking the  of their coefficients (see Lines 10-12 of Algorithm 2).
is calculated at points of .
For each , calculate (see Line 13 of Algorithm 2).
Take the  of the values of  on the set  to obtain the coefficients of  (see Line 14 of Algorithm 2).
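The steps above can be sketched end-to-end in Python. Everything here is a toy instance of the pipeline, under our own assumptions: the field F_257, n = 4 gates labeled by the 4th roots of unity (so the target polynomial is x^n - 1), a shift eta = 3, and made-up gate values; the naive O(n^2) DFT below stands in for the O(n log n) FFT used in Algorithm 2.

```python
# Toy sketch of computing the coefficients of h(x) = p(x) / t(x) by
# interpolating over Omega and re-evaluating on the shifted set eta * Omega,
# where t(x) = x^N - 1 vanishes on Omega but not on eta * Omega.
P = 257
N = 4
W = 241     # primitive 4th root of unity mod 257
ETA = 3     # shift with ETA^N != 1, so eta * Omega avoids Omega

def dft(vec, w, p=P):
    """Naive DFT over F_p (a real prover uses the O(n log n) FFT instead)."""
    n = len(vec)
    return [sum(vec[j] * pow(w, i * j, p) for j in range(n)) % p for i in range(n)]

def idft(vec, w, p=P):
    n = len(vec)
    inv_n = pow(n, -1, p)
    return [x * inv_n % p for x in dft(vec, pow(w, -1, p), p)]

# Made-up gate values: left/right inputs and outputs of the 4 gates, chosen
# consistent (L * R = O at every gate), so t(x) divides p(x).
L = [2, 3, 5, 7]
R = [4, 6, 1, 2]
O = [(l * r) % P for l, r in zip(L, R)]

def shift_eval(values):
    """Interpolate over Omega, then evaluate the interpolant on eta * Omega.
    Evaluating a(x) on eta * Omega equals evaluating a(eta * x) on Omega,
    so scale coefficient k by ETA^k and apply the forward DFT."""
    coeffs = idft(values, W)
    scaled = [c * pow(ETA, k, P) % P for k, c in enumerate(coeffs)]
    return dft(scaled, W)

Ls, Rs, Os = shift_eval(L), shift_eval(R), shift_eval(O)
# t(x) = x^N - 1 is constant on eta * Omega, since (eta * w^i)^N = eta^N.
t_shift = (pow(ETA, N, P) - 1) % P
# Pointwise division gives the values of h on the shifted set.
h_vals = [(l * r - o) * pow(t_shift, -1, P) % P for l, r, o in zip(Ls, Rs, Os)]
# Interpolate back and undo the shift to recover the coefficients of h.
h_coeffs = [c * pow(ETA, -k, P) % P for k, c in enumerate(idft(h_vals, W))]
```

Since deg h <= N - 2, the N shifted evaluation points determine h exactly, and h_coeffs satisfies p(x) = h(x)(x^N - 1) as polynomials.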
We note that, by definition,  and  are both zero on the set . Therefore, we need ; otherwise, calculating  in  becomes undefined. Choosing  for some  has the following advantages:
. The reason is that, if not, there is at least one member , where  and . Therefore , and so , which implies . This means that , a contradiction.
Recall that . So it would take less than operations to compute at points , because given , can be computed by one multiplication and one addition.
For each , we have . The reason is that  is of degree , so it can have at most  distinct roots. On the other hand, we know that  on the set , which has nothing in common with .
Computing fast Fourier transform on and is easy. See Subsection III-A.
Now we focus on the computational cost of each step. In Algorithm 2, Lines 1 and 2 incur computation of the order , where  is the number of multiplication gates. Lines 18-21 incur  operations, where  is the number of indexed wires and  is the security parameter. If the security parameter is very large, these lines can be dominant in terms of computational cost. Recall that it takes at most  multiplications to calculate  on the set , so Line 13 needs  operations. Lines 7-12 and 14 are usually the bulk of the computation, and incur a computation cost of the order . The other lines of Algorithm 2 incur a small amount of computation. So the computation cost of the prover is of the order .
In this paper we propose a scheme by which the prover can delegate his task to semi-honest servers, where at most of them may collude. Each machine will have computation cost of the order and the prover’s computation cost will be of the order .
Verifier algorithm: Verifier algorithm is presented in Algorithm 3. As you can see, computation cost of Line 1 in this algorithm is proportional to which is the number of public wires. Line 2 incurs constant computation cost.
III-A Fast Fourier transform () over a finite field
The definitions in this subsection are taken from .
is a primitive th root of unity in a computation structure (e.g., a finite field) if , but  for no  such that .
Let  be a primitive th root of unity. The Fourier transform of an -dimensional vector  over  is equal to , where .
The specific structure of the Fourier transform allows us to compute it with complexity , known as the fast Fourier transform and denoted by .
The Fourier transform can be shown in matrix form as,
The inverse of Fourier transform, denoted by , is equal to,
where is equal to
Equation (3) can also be represented by,
Let us define  and ; then the Fourier transform of  can be written as:
where  and the symbol  denotes element-wise multiplication. This recursive structure has been used to develop algorithms that compute the Fourier transform with complexity .
Calculating , where , also incurs  operations. This is because, if  is a diagonal matrix whose th diagonal entry is , we have the following,
Calculating the elements of or requires multiplications, so the computation complexity of and is of the order .
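The even/odd recursion above leads directly to a radix-2 FFT over a prime field (a number-theoretic transform). The following minimal Python sketch uses the toy field F_257 and the root 241 as illustrative assumptions; any prime p with n | p - 1 and a primitive n-th root of unity works the same way.

```python
# Minimal radix-2 FFT over a prime field, following the even/odd split:
# A(w^k) = E((w^2)^k) + w^k * O((w^2)^k), and A(w^{k+n/2}) uses w^{n/2} = -1.
P = 257

def ntt(a, w, p=P):
    """Fourier transform of a (length a power of 2) at the powers of w over F_p."""
    n = len(a)
    if n == 1:
        return a[:]
    even = ntt(a[0::2], w * w % p, p)   # transform of even-index coefficients
    odd = ntt(a[1::2], w * w % p, p)    # transform of odd-index coefficients
    out = [0] * n
    wk = 1
    for k in range(n // 2):
        t = wk * odd[k] % p
        out[k] = (even[k] + t) % p
        out[k + n // 2] = (even[k] - t) % p
        wk = wk * w % p
    return out

def intt(a, w, p=P):
    """Inverse transform: use w^{-1} and scale by n^{-1}, as in Equation (3)."""
    n = len(a)
    inv_n = pow(n, -1, p)
    return [x * inv_n % p for x in ntt(a, pow(w, -1, p), p)]

coeffs = [1, 2, 3, 4]            # a(x) = 1 + 2x + 3x^2 + 4x^3
vals = ntt(coeffs, 241)          # a evaluated at the 4th roots of unity mod 257
assert intt(vals, 241) == coeffs # the round trip recovers the coefficients
```

Each level of the recursion does O(n) work over log n levels, giving the O(n log n) cost cited above.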
III-B Lagrange Sharing 
Consider a system including a master and a cluster of  servers. The master aims to share a set of private vectors , , for some finite field , with those servers. The sharing must be such that if any subset of  servers colludes, they gain no information about the input data. Various approaches, such as ramp sharing  and Lagrange sharing , have been proposed for such sharing. In this paper, we use Lagrange sharing , which works as follows:
Let and be two sets of some publicly known distinct non-zero points in the finite field such that .
To encode the secret inputs  for , the master first chooses  vectors , , independently and uniformly at random from . Then it forms the Lagrange coding polynomial , defined as,
Finally in this stage, the master sends to Server .
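The sharing procedure above can be sketched in Python for scalars (vectors are shared coordinate-wise in the same way). The field F_257, the numbers of secrets and colluding servers, and the interpolation and evaluation points below are all illustrative assumptions.

```python
# Toy Lagrange sharing: K = 2 secrets, privacy against T = 1 colluding server,
# 4 servers, over F_257. The coding polynomial has degree K + T - 1 and takes
# the secrets at the first K points and a uniform random mask at the last T.
import random

P = 257
K, T = 2, 1
BETAS = [1, 2, 3]            # K + T interpolation points (the last T carry masks)
ALPHAS = [10, 11, 12, 13]    # servers' evaluation points, disjoint from BETAS

def lagrange_eval(points, z, p=P):
    """Evaluate at z the unique degree-(len(points)-1) interpolant over F_p."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (z - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

secrets = [42, 99]
masks = [random.randrange(P) for _ in range(T)]   # uniform random padding
points = list(zip(BETAS, secrets + masks))        # u(beta_k) = x_k

shares = [lagrange_eval(points, a) for a in ALPHAS]  # share sent to each server
# Any K + T shares determine u, so the secrets can be reconstructed from them:
rec = [lagrange_eval(list(zip(ALPHAS[:K + T], shares[:K + T])), b) for b in BETAS[:K]]
assert rec == secrets
```

The T uniformly random mask points are what make any T shares statistically independent of the secrets, which is the privacy guarantee stated above.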
IV The Proposed Scheme
Consider a system including a prover,  semi-honest servers, and a globally known function  about which the prover wants to generate a zkSNARK proof (see Section II). Suppose that the function  has a large arithmetic circuit, and it is therefore a difficult task to produce a proof about . The prover may thus not be able to do the task alone, and needs to delegate it to the servers. By semi-honest, we mean that the servers follow the algorithms correctly, but a subset of up to  of them may collude to gain information about the secret data. Note that if some of the servers are adversarial, meaning that they do not follow the algorithm, the generated proof will fail verification. Thus the prover himself can detect this, using the inherent verification ability of the zkSNARK. In other words, if for any reason the generated proof is not valid, the prover will find out by running the verifier algorithm on his own.
As mentioned in Section I, the advantage of the Trinocchio  algorithm is that the prover can delegate his task to several untrusted servers. On the downside, in Trinocchio the computation complexity of each server is equal to that of the main task, which is assumed to be large. On the other hand, the advantage of the DIZK  algorithm is that it partitions the prover's task and gives a part to each server. The main disadvantage of DIZK is that the servers must be trusted. In this section, we design an algorithm that has the advantages of both DIZK and Trinocchio.
Recall that the computation complexity of the zkSNARK is , which is driven by computing , , ,  in Lines 7-12, and also  in Line 14. We note that  can easily be of the order of  or . Thus, the factor  is equal to , which is considerable. If we could replace this factor with a constant number, say 5, we would reduce the execution time six-fold, which is very important.
IV-A The multiparty algorithm for computing 
Suppose that the prover has a large secret vector  of dimension , and aims to compute  using the cluster of  semi-honest servers, of which up to  may collude. One approach would be to consider  as a matrix multiplication problem , as defined in Subsection III-A. Then we could use secure multi-party computation for massive data or secure matrix multiplication methods [62, 63], partitioning and securely sharing the vector  and the matrix  with the servers. Each server would then simply multiply what it received and send the result back. However, using this approach, we lose the Fourier transform structure, and the computation complexity would be of the order  for each server.
Here we propose an alternative approach in Algorithm 4, such that the complexity of computation at each server is equal to , and the complexity of computation at the prover is equal to .
In step 1, vector of length is partitioned into vectors of length . We have assumed that and are powers of 2, so