The Power of Distributed Verifiers in Interactive Proofs

12/28/2018
by   Moni Naor, et al.

We explore the power of interactive proofs with a distributed verifier. In this setting, the verifier consists of n nodes and a graph G that defines their communication pattern. The prover is a single entity that communicates with all nodes by short messages. The goal is to verify that the graph G belongs to some language in a small number of rounds and with a small communication bound, i.e., the proof size. This interactive model was introduced by Kol, Oshman and Saxena (PODC 2018) as a generalization of non-interactive distributed proofs. They demonstrated the power of interaction in this setting by constructing protocols for problems such as Graph Symmetry and Graph Non-Isomorphism, both of which require proofs of Ω(n^2) bits without interaction. In this work, we provide a new general framework for distributed interactive proofs that allows one to translate standard interactive protocols into ones where the verifier is distributed and the proof size is short. We show the following:

* Every (centralized) computation that can be performed in time O(n) can be translated into a three-round distributed interactive protocol with O(log n) proof size. This implies that many graph problems for sparse graphs have succinct proofs.
* Every (centralized) computation implemented by either a small-space algorithm or a uniform NC circuit can be translated into a distributed protocol, with O(1) rounds and O(log n) bits of proof size for the low-space case, and polylog(n) rounds and proof size for NC.
* For Graph Non-Isomorphism, there is a 4-round protocol with O(log n) proof size, improving upon the O(n log n) proof size of Kol et al.
* For many problems we show how to reduce the proof size below the seemingly natural barrier of log n. We get 5-round protocols with proof size O(log log n) for a family of problems.


1 Introduction

That rug really tied the room together.

The Big Lebowski, Coen Brothers, 1998

Interactive proofs are an extension of non-determinism and have proven to be a fundamental tool in complexity theory and cryptography. Their development has led us, among other things, to the exciting notions of zero-knowledge proofs [GMR89, GMW91] and probabilistically checkable proofs (PCPs).

An interactive proof is a protocol between a randomized verifier and a powerful but untrusted prover. The goal of the prover is to convince the verifier of the validity of a statement, usually stated as membership of an instance in a language. The two main requirements of the protocol are completeness and soundness. Completeness: the verifier should accept a true statement with high probability (or with probability one if we want perfect completeness) when the prover is honest. Soundness: if the statement is false, then for any dishonest (unbounded) prover participating in the protocol the verifier should reject with high probability (over its internal random coins). In the classical case, the prover is computationally all-powerful and the verifier runs in polynomial time. In a celebrated result, interactive proofs were shown to be very powerful, allowing efficient verification of any language in PSPACE with a polynomial-time verifier [LFKN92, Sha92]. Another striking result was the [GMW91] protocol for Graph Non-isomorphism (GNI).

Interactive proofs are largely concerned with verifiers that are computationally bounded, but they are relevant for verifiers with any sort of limitation (e.g., finite automata [DS92, Con92]). They have also been studied in other settings such as communication complexity [BFS86, GPW18], and through their connections to circuit complexity [KW90, AW09, Wil16] and property testing [RVW13, GR18]. Of particular interest to us are interactive proofs for graph problems in P with a presumably weaker verifier (e.g., in NC [GKR15, RRR16]) and a polynomial prover (i.e., a prover restricted to polynomial-time computation). Our results, however, also capture problems that go beyond P.

One schism in interactive proofs is whether the verifier has private coins that the prover does not get to see (as in the original [GMR89]) or whether all coins are public (as in [BM88]); the latter is usually denoted AM, for Arthur-Merlin. Goldwasser and Sipser [GS89] gave a compiler for converting private-coin protocols into public-coin ones that is relevant for polynomial-time verifiers. When applied to the protocol of [GMW91] for Graph Non-isomorphism, it yields a two-round public-coin (AM) protocol for showing that two graphs are not isomorphic.

In this work we study interactive proofs where the verifier is a distributed system: a network of nodes that interact with a single untrusted prover. The prover sees the entire network graph while each node in the network has only a local view (i.e., sees only its immediate neighbors) in the graph. The goal of the prover is to convince the nodes of a global statement regarding the network. The two main complexity measures of the protocol (which we aim to minimize) are the number of rounds, and the size of the proof (i.e., communication bound between the network and the prover). In this context, we ask:

What is the power of interactive proofs with a distributed verifier?

The notion of interactive proofs with a distributed verifier was introduced recently by Kol, Oshman and Saxena [KOS18] as a generalization of its non-interactive version, known as “distributed NP” proofs (in its various versions, e.g., [KKP10, GS16, FKP11]). The prover interacts with the nodes of the network in rounds. In each round, every node sends the prover a random challenge, and the prover then responds by sending each node its response. Nodes can exchange their proofs only with their immediate neighbors in the network in order to decide whether to accept the proof. A proof is accepted only if all nodes accept; a single rejecting node suffices to reject it.

A simple example of a “distributed NP” proof is 3-coloring of a graph: the prover gives each node in the graph its color, and nodes exchange colors with their neighbors to verify the validity of the coloring. In such a case, we say that the proof size is a constant (each color can be described using two bits). Korman et al. [KKP10] introduced this notion as a “proof labeling scheme” and showed that there is a long list of problems for which a short distributed proof exists. Other problems (see also, e.g., [GS16]) require proofs with Ω(n^2) bits, and thus cannot be distributed in any non-trivial manner.
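The 3-coloring example can be sketched in a few lines of Python. This is only an illustration: the function names, the dictionary-based graph, and the encoding of colors as 0, 1, 2 are assumptions made here and do not come from [KKP10].

```python
def verify_coloring(adj, color):
    """Return each node's local verdict on the labels handed out by the prover."""
    verdicts = {}
    for v, neighbours in adj.items():
        # A node only looks at its own label and at its neighbours' labels.
        valid_label = color[v] in (0, 1, 2)
        no_conflict = all(color[u] != color[v] for u in neighbours)
        verdicts[v] = valid_label and no_conflict
    return verdicts


def network_accepts(verdicts):
    """The proof is accepted only if every node accepts."""
    return all(verdicts.values())


# A 4-cycle, with one valid and one invalid labeling.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
good = {0: 0, 1: 1, 2: 0, 3: 1}
bad = {0: 0, 1: 0, 2: 1, 3: 2}

print(network_accepts(verify_coloring(adj, good)))
print(network_accepts(verify_coloring(adj, bad)))
```

In the invalid labeling only the two endpoints of the conflicting edge reject, which is enough for the network as a whole to reject.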

There is a long line of research on the power of distributed proofs focusing on different notions of “proof”. For example, Göös and Suomela [GS16] studied distributed proofs that can be verified with a constant-round verification algorithm. Baruch et al. [BFP15] studied the power of a randomized verifier in distributed proofs and Fraigniaud et al. [FHK12] studied the effect of such proofs when nodes are anonymous. Feuilloley et al. [FFH16] considered the first interactive proof system which consists of three players: a centralized prover, disprover and a distributed network verifier. We further discuss these works in Section 1.3.

Kol et al. [KOS18] took an important step towards understanding the power of interaction in distributed proofs. As an analog of the class AM (Arthur-Merlin), they defined the class dAM[f(n)] to contain all n-vertex graph problems that admit a two-message protocol where the communication between the prover and each node in the network is bounded by f(n). As in AM, the protocols in this class must be “public-coin”, that is, the nodes’ messages to the prover are simply independent random bits (no other randomness is allowed). The class dMAM[f(n)] is defined similarly for three-message protocols (and so forth), and in general one can define the analogous class for any number of rounds and any bound on the communication complexity.

Their main positive results are for two problems, Sym and GNI, which have an Ω(n^2)-bit lower bound in the non-interactive setting [GS16]. In the Sym problem, the network should decide whether the network graph has a non-trivial automorphism. In the GNI problem, the goal is to decide whether the network graph is not isomorphic to an additional input graph. Specifically, they show that Sym ∈ dMAM[O(log n)] and that GNI ∈ dAMAM[O(n log n)]. This is a huge improvement over the Ω(n^2) lower bound for the non-interactive version of these problems. On the hardness/impossibility side, they show an (unconditional) lower bound for the Sym problem for two-message protocols: if Sym ∈ dAM[f(n)] then f(n) ∈ Ω(log log n). (Footnote 1: The authors of [KOS18] also reported an improvement, see [Osh18].)

1.1 Our Results

The Model. We follow the distributed interactive proof model of Kol et al. [KOS18]: the protocol proceeds in rounds in which nodes exchange (short) messages with the prover as well as with their neighbors in the graph. The messages that the nodes send to each other are essentially the proofs they received from the prover. Thus, in the model of [KOS18], the nodes are assumed to get from the prover their own proof as well as the proofs of their neighbors. Note that in the distributed interactive setting, a proof size of O(n^2) bits is a trivial upper bound for all graph problems, since the nodes are computationally unbounded (i.e., only their information about the network is bounded). Our key results demonstrate that many natural graph problems on sparse graphs admit logarithmic-size proofs. In addition, it is also possible to go below the log n regime (even for dense graphs) and obtain O(log log n)-size proofs for a wide class of problems.

General Compiler for RAM-Verifiers.

One of our key contributions is a set of general methods for converting “standard” interactive proofs (i.e., proofs where the verifier is a centralized algorithm) into protocols with a distributed verifier. The cost of this transformation in terms of proof size depends on the computational complexity of the centralized verification algorithm. Our first result concerns a RAM-verifier (i.e., the verification algorithm runs on a RAM machine). We show a general compiler that takes any public-coin protocol with a RAM-verifier and transforms it into a distributed interactive protocol whose proof size is governed by the verification complexity, that is, by the number of word operations performed by the verification algorithm. Specifically, for a verifier that runs in time O(n) we get a three-round distributed protocol with O(log n) proof size.

Theorem 1.

Let be an -round public-coin protocol for languages of -vertex graphs where the verifier is a RAM program with running time , then . In particular, if and the verifier runs in time then .

The benefit of this compiler is its generality: the transformation works for any problem while paying only in the running time of the verifier. This is particularly useful when the graph is sparse. For instance, it is possible to verify whether a graph is planar with O(log n) proof size using the linear-time planarity algorithm [HT74]. Any other linear-time algorithm on sparse graphs can be applied as well. As we will see next, this compiler is used as a basic building block in many of our protocols, even those that concern dense graphs and even those that go below the log n regime. One of the most notable examples of the usefulness of this compiler is the problem of graph non-isomorphism and related variants.

Graph Isomorphism and Asymmetry with O(log n)-bit Proofs.

We combine our linear-RAM compiler with the well-known Goldwasser-Sipser GNI protocol [GS89, GMW91]. Note that the GNI problem involves two graphs, and its definition in the distributed setting might be interpreted in two ways (either the second graph is also a communication graph, or not). For this reason, we start by considering an almost equivalent problem of “graph asymmetry” (Asym), where the prover wishes to prove that the communication graph has no (non-trivial) automorphism. The protocol for GNI can be naturally adapted to this problem as well. Since the running time of the verifier in the (centralized) protocol is linear in the size of the graph, applying our compiler immediately yields protocols for Asym and for GNI (for any definition of GNI) with O(n log n) proof size, which matches the result of [KOS18] for the same problem.

To achieve the desired bound of O(log n) proof size, we do not use the compiler as a black box. Instead, our strategy is to first reduce the problem to one that is verifiable in linear time (in the number of vertices) using O(log n)-bit proofs. Then, in the second phase, we apply the RAM compiler to this reduced problem, again using proofs of O(log n) bits. Our end result is a protocol for Asym with O(log n) proof size, an exponential improvement over the protocol of Kol et al. This also applies to the GNI problem where both graphs are part of the communication network (the case where one of the graphs does not correspond to the communication graph is discussed below). In contrast, for proof labeling schemes there is an Ω(n^2) lower bound [GS16].

Theorem 2.

, and .

One of the tools used for the compiler is a protocol for the permutation problem, Permutation. Here, each node has a value and we need to verify that these values form a permutation of {1, ..., n}. We give a dAM protocol for this problem using proofs of size O(log n). This was posed as an open problem by the authors of [KOS18]. (Footnote 2: The problem was posed in the Interactive Complexity workshop at the Simons Institute [Osh18].)

Theorem 3.

.

Compilers for small space and low depth verifiers.

If we allow even more rounds of communication, then we can capture a richer class of languages. Specifically, we show how to leverage the RAM compiler to transform the protocols of Goldwasser, Kalai and Rothblum [GKR15] and Reingold, Rothblum and Rothblum [RRR16] into distributed protocols. The result is that any low-space (and polynomial-time) computation can be compiled into a constant-round distributed protocol with O(log n) proof size, and any “uniform NC” computation (circuits of polylog depth, polynomial size and unbounded fan-in) can be compiled into a distributed protocol with polylog(n) rounds and proof size. The main work performed by the verifier in both of these protocols is interpreting the input as a function and evaluating its low-degree extension at a random point. We show how to implement this using a distributed verifier; see Section 6 for more details. The same holds when the verified computation can be performed by a low-depth (uniform) circuit, but in this case we need a number of rounds proportional to the depth of the circuit.

Theorem 4.

Let be a language.

  1. There exists a constant such that if can be decided in time and space then .

  2. If is in uniform NC then .

This can be used in turn for the GNI problem to obtain a protocol even for the case where one of the graphs does not correspond to the communication graph. Another example is verifying that a tree is a minimum spanning tree (MST). One can verify that a tree is an MST by a centralized algorithm with small space. Thus, we get a constant-round distributed protocol for MST verification with O(log n) proof size. Without interaction, there are matching upper and lower bounds of Θ(log n log W), where W is an upper bound on the weights [KK07].

1.2 Below the log n Regime

At this point, there is still a gap between our results above and the Ω(log log n) lower bound of [KOS18]. One reason for this gap is that constructing protocols with o(log n)-bit proofs seems quite hard. The prover is somewhat limited, as basic operations such as pointing to a neighboring node, counting, or specifying a particular node ID all require log n bits.

Perhaps surprisingly, we show that using our RAM compiler with additional rounds of interaction can lead to an exponential improvement in the proof size for a large family of graph problems. Obtaining these improved protocols calls for developing a totally new infrastructure that replaces the basic primitives with logarithmic proof size (e.g., verifying a spanning tree) by equivalent primitives with O(log log n) proof size. While these do not yield a full RAM compiler, they are quite general and can be easily adapted to classical graph problems. Two notable examples are DSym and problems that can be verified by computing an aggregate function of the vertices.

The DSym problem is similar to the Sym problem except that the automorphism is fixed and given to all nodes. This problem was studied by [KOS18], who showed that DSym ∈ dAM[O(log n)] but that any distributed NP proof for DSym requires a proof of size Ω(n^2). We show that using a five-message protocol we can reduce the proof size to O(log log n):

Theorem 5.

.

Depending on the problem, our techniques can be used to get even smaller proofs. In particular, if the aggregate function is over constant-size elements, then the proof can be of constant size. For example, we show that the Clique problem can be solved using a proof of size O(1) in only three rounds. In contrast, without interaction, there is an Ω(log n) lower bound [KKP10].

Corollary 1.

.

For instance, we show a protocol for proving that the graph is not two-colorable, in contrast to the non-interactive setting [GPW18], where larger proofs are required for this problem. Another interesting example is the “leader election” problem, where it is required to verify that exactly one node in the network is marked as a leader. As this problem can also be cast as an aggregate function of constant-size elements, we get:

Corollary 2.

.

Argument Labeling Schemes.

Can the interaction be eliminated? The simple answer is no! We have by now seen several examples where a few rounds of interaction break the non-interactive lower bounds (e.g., for Symmetry and Asymmetry). However, this does not seem to be the end of the story. In the centralized setting there are various techniques for eliminating interaction from protocols, especially public-coin ones. A “standard” such technique is the Fiat-Shamir transformation (or heuristic), which converts a public-coin interactive protocol into a non-interactive one. Here, we assume that the parties have access to a random oracle and that the prover is computationally limited: it can only perform a bounded number of queries to the random oracle. In such a case, we end up with an “argument” system rather than a “proof” system. In an argument system, proofs of false statements exist but are computationally hard to find. Therefore, such protocols do not contradict the lower bounds for proof labeling schemes. We call such a protocol an “argument labeling scheme”. These systems can yield significant savings in distributed verification. More details are in Section 8.

1.3 Related Work

The concept of distributed-NP is quite broad and contains (at least) three frameworks. This area was first introduced by Korman-Kutten-Peleg [KKP10] that formalized the model of proof-labeling schemes (PLS). In their setting, communications are restricted to happen exactly once between neighbors. A more relaxed variant is locally checkable proofs (LCP) [GS16] introduced by Göös and Suomela which allows several rounds of verification in which nodes can also exchange their inputs. The third notion which is also the weakest is non-deterministic local decision (NLD) introduced by Fraigniaud-Korman-Peleg [FKP11]. In NLD the prover cannot use the identities of the nodes in its proofs, that is the proofs given to the nodes are oblivious to their identity assignment.

We note that when allowing prover–verifier interaction, some of the differences between these models disappear. At least in the O(log n)-proof regime, using more rounds of interaction allows the nodes to send their IDs to the prover, and the prover can use these IDs in its proofs. Our protocols with sub-logarithmic proof size are not based on the actual identity assignment of the nodes, but rather only on their port ordering.

Prior to the distributed interactive model of [KOS18], Feuilloley, Fraigniaud and Hirvonen [FFH16] considered the first interactive proof system, which consists of three players: a centralized prover, a decentralized disprover and a distributed verifier (the network). This model gives considerably more power to the verifier, as it can get some help from the strong disprover. [FFH16] showed that such interaction between a prover and a disprover can considerably reduce the proof size. The most dramatic effect is for the non-trivial automorphism problem, which requires Ω(n^2) bits with no interaction but can be verified with O(log n) bits using two prover–disprover rounds.

Very recently, Feuilloley et al. [FFH18] considered another generalization of [KOS18] where, instead of allowing several rounds of interaction between the prover and the verifier, they allow several verification rounds. That is, the prover gives each node a proof in the first round and then disappears, and the nodes continue to communicate for many rounds. They showed that for several “simple” graph families such as trees, grids, etc., every proof labeling scheme with k-bit proofs can be turned into one with O(k/t)-bit proofs when allowing t verification rounds. Note that our distributed protocols can simulate such a scheme, but since our protocols use a small number of interactive rounds, the reduction in the proof size that we get from the framework of [FFH18] is negligible.

2 Our Techniques

2.1 The RAM Program Compiler

Many non-interactive distributed proofs (known as “proof labels”) [KKP10] are based on the basic primitive of verifying that a given marked subgraph is a spanning tree [AKY97]. In particular, in most of these applications, the subgraph itself is given to the nodes as part of the proof (i.e., each vertex is given its parent). In other words, the prover computes a spanning tree for the vertices to facilitate the verification of the problem at hand (e.g., cliques, leader election, etc.). Indeed, throughout the paper we use the prover to help the network carry out various computations that facilitate this verification. We start by briefly explaining the proof labeling of spanning trees, which is useful in our compiler as well.

A Spanning Tree.

The proof consists of several fields, which will be explained one by one along with their roles. The first field in the proof given to a node is its parent in the tree. This can be indicated by sending the port number that points to the parent. Consider the graph defined by these parent pointers. We must verify that it is indeed a tree (i.e., contains no cycles) and that it spans the whole network. To verify that there are no cycles, the second field in the proof of a node contains its distance to the root in the tree. The root should be given distance 0, and each node verifies with its parent in the tree that its own distance is exactly one more than its parent’s distance. If the pointers contain a cycle, then no assignment of distances can satisfy this requirement for all nodes on the cycle. Finally, to be able to verify that the tree spans the network, the third field in the proof is the ID of the root. Nodes verify with their neighbors that they have the same root ID. If the pointers do not span the network, then there must be two trees with two different roots. Since the graph is connected, there must be an edge from one tree to the other, which will expose the inconsistency of the root IDs.
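The three fields and their local checks can be made concrete with a small Python sketch. The representation is an assumption for illustration: labels are tuples (parent, distance, root ID), the parent is given by node name rather than by port number, and the root is marked by a parent of None.

```python
def check_node(v, adj, label):
    """Local check of node v on the labels (parent, distance, root_id)."""
    parent, dist, root = label[v]
    # Third field: all neighbours must agree on the root ID.
    if any(label[u][2] != root for u in adj[v]):
        return False
    # The root has no parent, distance 0, and its own ID as root ID.
    if parent is None:
        return dist == 0 and root == v
    # First and second fields: the parent is a neighbour and is one step closer.
    return parent in adj[v] and dist == label[parent][1] + 1


def network_accepts(adj, label):
    return all(check_node(v, adj, label) for v in adj)


# A 4-cycle 0-1-2-3-0.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

# Honest labels: a spanning tree rooted at node 0.
honest = {0: (None, 0, 0), 1: (0, 1, 0), 2: (1, 2, 0), 3: (0, 1, 0)}

# Cheating labels: the parent pointers form the cycle 0 -> 1 -> 2 -> 3 -> 0.
cyclic = {0: (1, 3, 0), 1: (2, 2, 0), 2: (3, 1, 0), 3: (0, 0, 0)}

print(network_accepts(adj, honest))
print(network_accepts(adj, cyclic))
```

In the cyclic labeling the distances decrease along the pointers until node 3, which would need a distance one larger than that of node 0 and therefore rejects.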

The tree is used as a basic component in many protocols, as it allows summing values held by the nodes (or computing other aggregate functions). For example, suppose we want to compute the sum of all node values, where each node knows only its own value. We can use the prover to help us in this computation. Since the prover is untrusted, we also need to verify the computation. This is done as follows. The prover sends each node the sum of the values in the subtree rooted at that node. Then, each node verifies that its sum is consistent with the sums given to its children in the tree; that is, its sum must equal its own value plus the sums of its children (the leaves have no children). If all values are consistent, then we know that the root of the tree holds the desired total. We call such a procedure “summing up the tree”, as it will be useful later on in different contexts.
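The consistency check behind “summing up the tree” is short enough to write out. The following Python sketch assumes a tree given by parent pointers; the names `parent`, `x` and `s` are hypothetical and chosen only for this illustration.

```python
def children(v, parent):
    """Children of v in the tree defined by the parent pointers."""
    return [u for u, p in parent.items() if p == v]


def check_sum(v, parent, x, s):
    """Node v checks that its claimed subtree sum matches its own value
    plus the claimed subtree sums of its children."""
    return s[v] == x[v] + sum(s[u] for u in children(v, parent))


def all_consistent(parent, x, s):
    return all(check_sum(v, parent, x, s) for v in parent)


# Tree: 0 is the root, 1 and 3 are children of 0, and 2 is a child of 1.
parent = {0: None, 1: 0, 2: 1, 3: 0}
x = {0: 5, 1: 2, 2: 7, 3: 1}           # value known to each node

honest = {2: 7, 1: 9, 3: 1, 0: 15}     # subtree sums sent by an honest prover
cheat = {2: 7, 1: 10, 3: 1, 0: 16}     # inflated sum, propagated to the root

print(all_consistent(parent, x, honest), honest[0] == sum(x.values()))
print(all_consistent(parent, x, cheat))
```

A cheating prover can keep the root's check consistent, but then some node further down (here node 1) sees a mismatch with its children.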

A Reduction to Set Equality.

Our main observation is that obtaining a general RAM compiler reduces to one specific problem, Set Equality. Consider a standard interactive protocol (with a centralized verifier). We construct a distributed protocol as follows. First, we let the prover compute a spanning tree of the graph as described above, and assign the nodes in the graph IDs in the range 1 to n. The correctness of the spanning tree computation is verified as in the labeling scheme described above. We later describe how to also verify the correctness of the consecutive IDs; this, too, is solved by a reduction to set equality.

The high-level idea is to use the fact that the protocol is public-coin, which allows the prover to run the centralized verifier on its own. We now need the prover to convince the network that it simulated the verification algorithm correctly. For that purpose, the verification of the RAM computation made by the prover is distributed among the nodes. Since the centralized RAM program consists of O(n) steps, each vertex can be in charge of locally verifying a constant number of steps of this program. To verify that the computation is correct globally, we reduce the problem to Set Equality.

We now explain this in more detail. Consider the nodes ordered by their assigned IDs. Given this ordering, we can split the communication between the prover and the verifier into n equal parts, where the i-th node is responsible for communicating and storing the responses of the i-th chunk of the messages. Since the protocol is public-coin, the messages from each node to the prover are simply random coins. Finally, we need to simulate the verifier of the protocol by a distributed protocol. We assume that the verifier is implemented by a RAM program.

Consider a RAM program. An execution of the program can be described as a sequence of read and write instructions to a memory, where each operation consists of a short local state and a triplet (v, a, t), where v is the value (read from memory or written to memory), a is the address in the memory, and t is the timestamp of the memory cell (i.e., when it was last changed). We set the size of a cell to be O(log n) bits, so that the state and the triplet of each operation can be represented by O(log n) bits.

Let R be the set of all read triplets and let W be the set of all write triplets. Note that in general it might be that R ≠ W, e.g., if a cell is written once but read multiple times. Following the steps of [BEG94] in the context of memory checking, we can transform any program into a canonical form in which an honest execution satisfies R = W, while paying only a constant factor in the running time. We assume from here on that the program is given in this canonical form. Thus, R and W describe an honest execution of the program if and only if R = W.

With this in mind, we can design the final step. The prover runs the verifier and writes down the list of triplets and local states of the program. Each node is responsible for its share of the steps of the program, and the prover distributes the triplets and states of each instruction to the nodes accordingly. Each node that is responsible for a step verifies that the states and triplets are consistent with the instruction of that step in the program. What the node cannot verify locally is that the values read from the memory are consistent with the program. That is, we are left to verify that the set R of read triplets and the set W of write triplets are equal (as multisets). In other words, we need a protocol for the problem SetEquality.
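The role of the read and write multisets can be illustrated with a small Python sketch in the spirit of offline memory checking. The canonical form used here is an assumption made for the illustration (every cell is written once at time 0, every access reads a triplet and writes one back, and every cell is read once at the end); the exact transformation used via [BEG94] may differ. The function name `run_trace` and the `tamper_step` parameter are hypothetical.

```python
from collections import Counter


def run_trace(m, ops, tamper_step=None):
    """Produce the multisets R and W of (value, address, timestamp) triplets.

    ops is a list of (address, new_value); new_value None means a pure read,
    which writes back the value it read. If tamper_step is set, the value
    reported as read at that step is wrong.
    """
    mem = {a: (0, 0) for a in range(m)}      # address -> (value, timestamp)
    R, W = Counter(), Counter()
    for a in range(m):                        # initial write of every cell
        W[(0, a, 0)] += 1
    for t, (a, new_value) in enumerate(ops, start=1):
        value, stamp = mem[a]
        if t == tamper_step:
            value += 1                        # claimed value was never written
        R[(value, a, stamp)] += 1
        if new_value is None:
            new_value = value
        mem[a] = (new_value, t)
        W[(new_value, a, t)] += 1
    for a in range(m):                        # final read of every cell
        value, stamp = mem[a]
        R[(value, a, stamp)] += 1
    return R, W


ops = [(0, 5), (1, 7), (0, None), (0, 9), (1, None)]

R, W = run_trace(2, ops)
print(R == W)

R_bad, W_bad = run_trace(2, ops, tamper_step=3)
print(R_bad == W_bad)
```

In the honest trace every written triplet is read back exactly once, so the two multisets coincide. In the tampered trace the triplet written at step 1 is never read, and a triplet that was never written appears among the reads.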

A Protocol for Set Equality.

As we have shown, a protocol for Set Equality is the basis for the compiler. This protocol is in fact used for other problems as well, so we describe it in full generality. Assume each node v has two inputs a_v and b_v, which are bit strings, and let A = {a_v : v ∈ V} and similarly B = {b_v : v ∈ V}. We want to verify that A = B as multisets. We describe here a three-message protocol for this problem, which captures the main ideas. In Section 4.1 we show how this protocol can be compressed to a two-message dAM protocol, and we also show how to support each node holding two lists of several elements (instead of the single elements a_v and b_v).

In the first message, we let the prover compute a spanning tree T of the graph along with a proof as described above. Then, to check that A = B, we define two polynomials P_A and P_B over a sufficiently large field as follows:

P_A(x) = ∏_{v ∈ V} (x − a_v),    P_B(x) = ∏_{v ∈ V} (x − b_v).

As we show in the analysis, A = B as multisets if and only if P_A ≡ P_B. Moreover, since the polynomials have low degree compared to the field size (their degree is n), in order to check whether they are equal it suffices to compare them on a random field element: if the two polynomials are different, they can agree on at most n elements of the field.

We let the root of the tree sample a random field element s and send it to the prover. The prover sends s to all nodes of the graph. Nodes compare s with their neighbors to verify that everyone has the same element. Then, we are left with evaluating the two polynomials P_A(s) and P_B(s). To compute these evaluations we use the spanning tree T and compute them “up the tree”. We let the prover give each node v the evaluations at s of the polynomials restricted to the subtree T_v rooted at v, denoted P_A^{T_v}(s) and P_B^{T_v}(s). Nodes check consistency with their children in the tree to ensure that all partial evaluations are correct. That is, they check that

P_A^{T_v}(s) = (s − a_v) · ∏_{i} P_A^{T_{u_i}}(s),    P_B^{T_v}(s) = (s − b_v) · ∏_{i} P_B^{T_{u_i}}(s),

where u_1, u_2, ... are the children of v in the tree. Finally, the root of the tree holds the two complete evaluations P_A(s) and P_B(s) and verifies that P_A(s) = P_B(s). If all verifications pass, then we know that with high probability A = B.
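The “up the tree” evaluation can be tried out with the following Python sketch. It assumes the standard fingerprint in which each input contributes one linear factor (s minus the input) to a product, and it works over the prime field of size 2^61 − 1; both the concrete polynomial and the field are assumptions of this sketch, as are all function names.

```python
import random

P = 2**61 - 1  # a prime far larger than the number of nodes (assumed field size)


def honest_labels(kids, values, s, root):
    """Prover side: the evaluation at s of the polynomial of every subtree."""
    labels = {}

    def visit(v):
        acc = (s - values[v]) % P
        for u in kids[v]:
            acc = acc * visit(u) % P
        labels[v] = acc
        return acc

    visit(root)
    return labels


def node_accepts(v, kids, values, labels, s):
    """Node side: the label of v must equal (s - value of v) times its children's labels."""
    expected = (s - values[v]) % P
    for u in kids[v]:
        expected = expected * labels[u] % P
    return labels[v] == expected


def set_equality_accepts(kids, root, a, b, s):
    """All local checks pass and the root sees equal evaluations."""
    pa = honest_labels(kids, a, s, root)
    pb = honest_labels(kids, b, s, root)
    local = all(node_accepts(v, kids, a, pa, s) and node_accepts(v, kids, b, pb, s)
                for v in kids)
    return local and pa[root] == pb[root]


# Tree rooted at 0 with children lists; a and b are the per-node inputs.
kids = {0: [1, 3], 1: [2], 2: [], 3: []}
a = {0: 3, 1: 1, 2: 2, 3: 1}
b_same = {0: 1, 1: 2, 2: 1, 3: 3}     # same multiset as a
b_diff = {0: 1, 1: 2, 2: 2, 3: 3}     # different multiset

s = random.randrange(P)               # the root's random challenge
print(set_equality_accepts(kids, 0, a, b_same, s))
print(set_equality_accepts(kids, 0, a, b_diff, s))
```

For equal multisets the check passes for every challenge. For different multisets it fails unless the challenge happens to be one of the few points on which the two polynomials agree.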

Assigning IDs.

In the description above we assumed that unique IDs in the range 1 to n are honestly generated. We show that this assumption is without loss of generality. Each node verifies that its ID lies between 1 and n (n is known to all nodes). We want to verify that the IDs are all distinct. That is, we want to verify that the IDs are a permutation of {1, ..., n}. This is also called the Permutation problem.

This is solved by reducing it to the Set Equality problem. Each node contributes its ID to one set, and the other set is {1, ..., n}. Our key observation here is that the IDs are distinct if and only if the two sets are equal. Thus, we run the set equality protocol on these two sets. Note that this can be performed in parallel to the compiler’s protocol, and thus does not add to the round complexity.
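A centralized Python sketch of this reduction is given below. It assumes that the second set is {1, ..., n} and uses a product-of-linear-factors fingerprint over the prime field of size 2^61 − 1; how the nodes' inputs are actually set in the protocol is not spelled out in the text above, so this is only one plausible instantiation with hypothetical function names.

```python
P = 2**61 - 1  # assumed prime field size


def fingerprint(values, s):
    """Evaluate the product of (s - x) over all x in values, modulo P."""
    acc = 1
    for x in values:
        acc = acc * (s - x) % P
    return acc


def looks_like_permutation(ids, s):
    """Range check plus a fingerprint comparison against {1, ..., n}."""
    n = len(ids)
    in_range = all(1 <= i <= n for i in ids)
    return in_range and fingerprint(ids, s) == fingerprint(range(1, n + 1), s)


s = 987654321  # stands in for the random challenge
print(looks_like_permutation([3, 1, 4, 2], s))
print(looks_like_permutation([3, 1, 4, 4], s))
```

A repeated ID changes the multiset, so the two fingerprints differ for all but a few challenges.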

2.2 Asymmetry and Graph Non-Isomorphism

We first give a short description of a standard (centralized) interactive protocol for Asym, which is a simple adaptation of the public-coin protocol for graph non-isomorphism [GS89, GMW91] (see also [BM88]). From here on we refer to this protocol as the “GNI protocol”. Then we show how to transform it into a distributed protocol.

Let S be the set of all graphs that are isomorphic to G. The main observation of the GNI protocol, which carries over directly, is that if G has no (non-trivial) automorphism then |S| = n!, while if G does have a non-trivial automorphism then |S| ≤ n!/2. Thus, the focus of the protocol is on estimating the size of S.
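The gap between the two cases can be checked by brute force on small graphs. The Python sketch below enumerates all relabelings of a 6-vertex graph and counts the distinct edge sets; the two example graphs and the function name are chosen here for illustration only.

```python
from itertools import permutations


def isomorphic_copies(n, edges):
    """All distinct labeled graphs on {0, ..., n-1} isomorphic to the given one."""
    copies = set()
    for perm in permutations(range(n)):
        relabeled = frozenset(frozenset((perm[u], perm[v])) for u, v in edges)
        copies.add(relabeled)
    return copies


# A 6-vertex graph with no non-trivial automorphism:
# the path 0-1-2-3-4 plus a vertex 5 adjacent to 1 and 2.
asymmetric = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (2, 5)]

# The path on 6 vertices, which has one non-trivial automorphism (reversal).
path = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

print(len(isomorphic_copies(6, asymmetric)))  # 6! = 720
print(len(isomorphic_copies(6, path)))        # 6!/2 = 360
```

The number of copies equals n! divided by the number of automorphisms, which is why a single non-trivial automorphism already halves the size of the set.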

The verifier samples a hash function h whose range has size roughly n!, and sends it to the prover. The prover seeks a graph in S that h maps to a prescribed target value. The main observation is that the probability that such a graph exists is higher when S is larger, which allows the verifier to distinguish between the two cases for the size of S. That is, the verifier accepts if the prover exhibits such a graph.

Let us begin with an immediate solution for sparse graphs. Suppose that the graph is sparse (has O(n) edges) and thus can be represented by O(n log n) bits. One can observe that in this case the total communication of the “GNI protocol” is linear in the input size, and thus it can be distributed among the nodes such that each node gets O(log n) bits. Finally, the verifier is required to compute the hash function h. We need a very fast (linear-time) pairwise hash function for this. Luckily, Ishai et al. [IKOS08] (see Corollary 3) constructed such a hash function that can be computed in a linear number of word operations. Thus, applying our RAM compiler with this hash function gives a protocol for the problem: the first message sends h, and messages 2-3 send the graph found by the prover and verify its hash value.

The protocol above of course works only for sparse graphs as they have a small representation. While graphs in general have representation of size roughly , since the size of the set is at most , any graph in can be indexed to have size . Thus, we want to hash the set using a hash function to a set such that and each element in is represented using bits. While this approach is simple, it has a major caveat: computing is exactly the task we wished to avoid! However, there is an important difference: the function had exactly bits of output where has for a large constant . This slackness in the constant lets us compose a special hash function that can be computed locally. Then, we will apply to the smaller elements of and compute it using the RAM compiler as before. Together, we will verify that .

In more detail, our hash function will be composed of hash functions. Each node chooses a seed for an -almost-pairwise hash function where . The seed length of is bits. Let be the chosen hash functions ordered by the index of the nodes. Let where is the indicator vector for the neighbors of node in . Then, we define a hash function as

Using we can define the set . It is easy to see that . The fact that is locally computable means that it has a very bad collision probability. If two inputs differ only on a single bit then the probability that they collide depends only on a single , which is rather small compared to the total range of . To show that there will be no collisions under we exploit the specific properties of . The key point is that contains only graphs that are all isomorphic to each other and hence there are not many isomorphic graphs that differ only on a small part. This lets us bound the collision probability of two graphs as a function of their Hamming distance and union bound over the number of isomorphic graphs of distance . We show that with high probability we have that and thus we can apply the protocol for instead of .
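To make "locally computable" concrete, here is a minimal sketch in which every node hashes only its own adjacency row with its own seed and the outputs are concatenated. The affine hash modulo a prime is a stand-in chosen for brevity, not the almost-pairwise family used in the construction, and the names `sample_seed`, `row_as_int` and `local_hash` are hypothetical.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; an illustrative modulus

def sample_seed(rng):
    """Seed (a, b) of a stand-in affine hash x -> (a*x + b) mod P."""
    return (rng.randrange(1, P), rng.randrange(P))

def row_as_int(v, edges):
    """Encode the indicator vector of v's neighbours as an integer."""
    x = 0
    for a, b in edges:
        if a == v:
            x |= 1 << b
        elif b == v:
            x |= 1 << a
    return x

def local_hash(n, edges, seeds, out_bits=8):
    """Concatenate per-node hashes; node v needs only its own row and seed."""
    out = []
    for v in range(n):
        a, b = seeds[v]
        out.append(((a * row_as_int(v, edges) + b) % P) % (1 << out_bits))
    return tuple(out)

rng = random.Random(0)
n = 6
seeds = [sample_seed(rng) for _ in range(n)]
g1 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
g2 = g1 + [(0, 5)]            # differs from g1 in the single edge (0, 5)
h1, h2 = local_hash(n, g1, seeds), local_hash(n, g2, seeds)
changed = [v for v in range(n) if h1[v] != h2[v]]
```

Two graphs that differ in one edge can differ in at most two output coordinates, which is exactly why the collision probability of such a composed function is poor for close inputs.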

Graph Non-Isomorphism.

The end result is a protocol for Asym in . In Section 5.1 we show how to adapt this protocol for GNI, where we assume that in the GNI problem formulation nodes can communicate on both graphs and . We note that while this improves upon the of [KOS18], our protocol works only when the GNI problem is defined such that nodes can communicate on both graphs and . The protocol of [KOS18] works also for the definition of GNI where only is the communication graph and is given as input to the nodes. That is, each node is given a list of its neighbors in but cannot communicate with them directly. This is not an issue when the proof complexity is as the prover can send each node its neighbors in the graph . However, when restricting the communication size to this raises many difficulties, which seem hard to overcome.

2.3 A Compiler for Small Space and Low Depth

We describe how to get a compiler for small space computation (Item 1 in Theorem 4). The main tool behind the construction is the interactive protocol of Reingold, Rothblum and Rothblum [RRR16]. They show that for every statement that can be evaluated in polynomial time and bounded-polynomial space there exists a constant-round (public-coin) interactive protocol with an (almost) linear verifier. This is an excellent starting point for us, as our RAM compiler is most efficient for linear verifiers.

There is a subtle point here, however. Linear time in [RRR16] is with respect to the size of the graph, i.e., , whereas linear time for our RAM compiler is with respect to the number of vertices . To handle this, we first reduce the running time of the centralized verifier to before applying our RAM compiler. Indeed, as already observed in [RRR16], the running time of the verifier can be made sublinear (e.g.,  for some small constant ) if the verifier is given oracle access to a low degree extension of the input (the input is the graph and possibly additional individual inputs held by each node). Our protocol will run the RAM compiler on this sublinear version of the verifier while providing it this query access. Luckily, evaluating a point of a low degree extension of the input is a task that is well suited for a distributed system, as it is a linear function of the input and hence can be computed “up the tree” using the prover. Thus, the [RRR16] protocol can be compiled to a distributed one with constant number of rounds and proof size.
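The linearity that makes this work can be seen in a small sketch. It uses the multilinear extension, which is one particular low degree extension and is an assumption made here for simplicity; the field size and all names are illustrative. Each node contributes its input times a publicly computable weight, and the contributions are summed along a spanning tree. In the protocol the prover would supply the partial sums and each node would check its own; the code below simply computes the honest partial sums.

```python
P = 2_147_483_647  # prime field size; an illustrative choice

def basis_weight(index, point, num_vars):
    """Multilinear Lagrange basis polynomial of `index` evaluated at `point`."""
    w = 1
    for j in range(num_vars):
        bit = (index >> j) & 1
        w = w * (point[j] if bit else (1 - point[j])) % P
    return w

def extension_via_tree(children, root, inputs, point, num_vars):
    """Sum each node's contribution inputs[v] * weight(v) up the tree."""
    def subtree_sum(v):
        total = inputs[v] * basis_weight(v, point, num_vars) % P
        for c in children.get(v, []):
            total = (total + subtree_sum(c)) % P
        return total
    return subtree_sum(root)

inputs = [1, 0, 1, 1]                      # one input bit per node
children = {0: [1, 2], 2: [3]}             # a spanning tree rooted at node 0
at_boolean_point = extension_via_tree(children, 0, inputs, (1, 1), 2)
at_other_point = extension_via_tree(children, 0, inputs, (5, 7), 2)
```

At a Boolean point the extension agrees with the original input (here the point (1, 1) selects node 3), and at any other point it is the same weighted sum regardless of how the tree is shaped.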

A protocol with the same properties is given by Goldwasser, Kalai and Rothblum [GKR15] in the context of low depth circuits (as opposed to small space). Let the class “uniform NC” be the class of all languages computable by a family of -space uniform circuits of size and depth . They showed that for any language computable in “uniform NC” there is a public-coin interactive protocol where the verifier runs in time given oracle access to a low degree extension of the input and the communication complexity is . Using the same approach as we did for the [RRR16] protocol, we can also compile this protocol to a distributed one with polylogarithmic number of rounds and proof size.

2.4 Below the Barrier

To construct protocols with proofs, we need to re-develop the basic “distributed NP” primitives only with a proof size in the required regime. Similar to the generality of the basic tree construction in distributed NP proofs, these tools are useful for many problems.

Constructing a Spanning Tree.

We begin by showing how to compute a spanning tree in the graph using only bits. We let the prover compute a BFS tree in the graph. However, the prover cannot even give a node its parent in the graph, let alone prove its validity.

We take a different approach, using the specific properties of a BFS tree. If a node is in level in the BFS tree, then its neighbors are all in level , or . Thus, we let the prover give each node its distance from the root modulo 3. This gives each node sufficient information to divide its neighbors into three groups: neighbors in the same level as , neighbors that are one level closer to the root, , and neighbors that are one level below, . The node defines its parent to be its neighbor in level with the minimal port number (all neighbors of each node are ordered by an arbitrary port numbering that is known to the prover). This way, each node has a defined parent in the graph, except if it has no neighbors of level , which means that it is the root.
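A minimal sketch of this parent rule, assuming each adjacency list is given in port order and that nodes learn their neighbours' labels by a local exchange; the function names and the example graph are illustrative only.

```python
from collections import deque

def bfs_levels_mod3(adj, root):
    """Honest prover: BFS distance from the root, reduced modulo 3."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return {v: d % 3 for v, d in dist.items()}

def derive_parent(adj, label, v):
    """Node v picks its first neighbour (in port order) one level closer to the root."""
    for w in adj[v]:                       # adj[v] is listed in port order
        if label[w] == (label[v] - 1) % 3:
            return w
    return None                            # no such neighbour: v is the root

adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
label = bfs_levels_mod3(adj, 0)
parent = {v: derive_parent(adj, label, v) for v in adj}
```

Because a node's neighbours lie in three consecutive BFS levels, the three residues modulo 3 are enough to tell the groups apart, so two bits per node suffice for the label.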

Let be the graph defined by . As in the standard proof labeling scheme for verifying a spanning tree, we first verify that is a tree (has no cycles), and then verify that it is also spanning.

First, we verify that there are no cycles in . Towards this end, we let each node sample a uniform bit and send it to the prover. Let be the path in the tree that the prover computed from to the root. The prover sends each node the number , that is, the sum of the ’s on the path from to the root modulo 2. Nodes exchange this value with their parent in the tree. Each node verifies that . In the analysis, we show that if contains a cycle, then with probability the nodes will reject (this happens when the sum of the values on a cycle is odd).
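The parity argument can be checked exhaustively on a small example. The sketch below assumes the local check is that a node's value equals its parent's value plus its own bit modulo 2, and that the root's path consists of the root alone; both are assumptions about details not spelled out here, and the names are illustrative.

```python
from itertools import product

def local_checks_pass(parent, bits, sums):
    """Every node checks sums[v] == sums[parent[v]] + bits[v] (mod 2)."""
    for v, p in parent.items():
        if p is None:
            if sums[v] != bits[v]:          # assumed: the root's path is just itself
                return False
        elif sums[v] != (sums[p] + bits[v]) % 2:
            return False
    return True

def prover_can_pass(parent, bits):
    """Brute force: is there any labelling that passes all local checks?"""
    nodes = sorted(parent)
    for choice in product((0, 1), repeat=len(nodes)):
        if local_checks_pass(parent, bits, dict(zip(nodes, choice))):
            return True
    return False

cycle = {0: 1, 1: 2, 2: 0}                  # parent pointers forming a 3-cycle
odd = prover_can_pass(cycle, {0: 1, 1: 0, 2: 0})    # bits sum to 1 on the cycle
even = prover_can_pass(cycle, {0: 1, 1: 1, 2: 0})   # bits sum to 0 on the cycle
tree = prover_can_pass({0: None, 1: 0, 2: 1}, {0: 1, 1: 0, 2: 1})
```

Summing the local checks around a cycle forces the bits on the cycle to have even parity, so a cheating prover is caught whenever the sampled bits on the cycle sum to an odd number.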

By now, we know that contains no cycles. However, it might still be the case that is a forest. In such a case it will contain more than one root node. To eliminate this, we have the prover broadcast the value where is the root of the tree it computed. If there is more than one root in , then with probability their values will be different and thus nodes will detect this inconsistency. This ensures that has no cycles and a single root, and thus it must be a spanning tree of . Of course, the soundness can be amplified by standard (parallel) repetition.

A corollary of constructing such a tree is that the root of the tree is a uniquely chosen node in the network. Thus, this protocol also solves the “Leader Election” problem (LeaderElection) with a constant-size proof in three rounds.

Super Protocols.

Our next step is to show how to run what we call “super protocols”. A super protocol simulates a protocol with proof size using only bits, by performing the computation on a super graph that contains super-nodes. The super graph is defined by decomposing the graph into blocks of size roughly such that each block will simulate a single node in the protocol. The benefit of this approach is that a block has a proof capacity of by having each node get only a single bit. In other words, a super-node (that corresponds to the block of nodes) can be given a proof of size in a distributed manner: giving a single-bit proof to each node in that block.

This brings along several challenges as no node knows the proof, but rather it is distributed among several nodes. To be able to work with these “fragmented proofs” we will need to come up with protocols that work on the super graph. Suppose a node in the super graph represents a block . To simulate a local verification of in the super graph , we need all nodes to cooperate to perform this verification. Towards this end, we will use the RAM compiler on a program that performs the verification, but we run the compiler only on the block , as if it were the entire graph. Since the size of the block is the cost of this compiler is only ! Furthermore, the node performs consistency checks with its neighbors in . Here again we use the RAM compiler, but on a graph that contains and a child of . The graph of these two blocks is connected, and of size . This is carefully performed in parallel for all children .

This was a very high level overview, and we proceed with formally explaining how to define the blocks and the corresponding super graph. The spanning tree (whose construction was described before) is partitioned into edge-disjoint subtrees , which we call blocks. The precise protocol for this decomposition is given in Section 7.3. The main point here is that at the end of the protocol, each node knows its neighbors within the block.

Using the block decomposition, we show how to reduce the proof size in the protocol for SetEquality to , albeit at the expense of more rounds. The prover orders the nodes inside each block and sends each node its index inside the block. Since the blocks are of size the index requires only bits. To verify that the indexes are indeed a permutation, we apply the permutation protocol described above. However, we run it on each block separately as if the block was the whole graph. Since each block is of size the final cost of this protocol within each block is only !

We wish to run this protocol in parallel for all blocks in the graph. This works if the blocks are vertex disjoint; however, the blocks we have are only edge disjoint. Nodes that participate in several blocks will get a proof for each block, which blows up the proof size. Instead, we show how such nodes can divide their proofs among the blocks. At the end, we are able to run the protocols in parallel without paying an additional cost for these nodes.

The next step of the SetEquality protocol is to have the root choose a field element described by bits. Let be the root of the tree and let be the block containing . We let the block choose distributively, where each node picks a single bit. The prover reconstructs and can continue with the protocol. The main challenge now is that no individual node knows , only the prover.

After has been chosen and sent to the prover, the next step of the protocol is to compute the products and and verify that they are equal. First, we compute each product within a block. Let be a block rooted at , then we want the block to compute . Thus, we let the prover compute and send it to the block . To verify this, we can run the RAM compiler on the block for a program that reconstructs , computes and finally compares it to (and similarly for the ’s). Again, this is performed for all blocks in parallel and has a cost of bits.

Each node in the super graph now has the value , and we verified that is indeed the product of all elements inside this block. Now, the prover computes the values where is the subtree of rooted at , and sends to the block (and similar for ). Now, node needs to verify this value by computing the product of for all its children .

We note that the block of and its children blocks are connected. Assume for simplicity, that has only a constant number of children blocks. Let be the graph that contains all these blocks. Then, we have that consists of vertices. Thus, we run the RAM compiler on this graph, for a program that on input all the values of the nodes, collects the bits of and reconstructs it, then reconstructs and for all the children blocks, and verifies . The size of the graph is and thus again running this will cost bits.

This worked since we assumed that there are only a few child blocks; however, the number of such blocks in general might be large. In such a case, we compute by computing them in pairs , such that for each pair the graph is always of size . This requires some delicate care of details. While this process is sequential and takes as many iterations as the number of children, we show how to parallelize it using the prover.

There are many technical challenges in making this plan go through, and we refer the reader to Section 7 for the full details. The result is a five-message protocol: first the prover sends the tree (and it is verified in messages 2-3), then the network chooses and then we run the RAM compiler in messages 3-5.

Once we have a protocol for SetEquality using bits of proof, we immediately get a protocol for DSym. In this problem, the nodes know a permutation and need to verify that it is an automorphism. We simply run the SetEquality protocol on the two sets of edges for and .

A Protocol for Clique.

We describe a protocol for the clique problem, where the goal is to prove that the graph contains a clique of size where is known to all. The prover marks a clique of size and selects one of the nodes in the clique to be a leader. We run the leader protocol described above to verify that indeed a single leader is selected. Finally, each marked node verifies that indeed of its neighbors are marked and that one of them is the leader. This assures that there are exactly marked nodes and that they form a clique.
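The local check performed by the marked nodes can be sketched as follows. The sketch assumes that the uniqueness of the leader has already been verified, that the clique size is passed as a parameter `k`, and that the leader itself is exempt from the "one of them is the leader" condition; the function name and the example graph are illustrative.

```python
def clique_checks_pass(adj, marked, leader, k):
    """Each marked node checks its marked neighbours; True if all accept."""
    for v in marked:
        marked_neighbours = [w for w in adj[v] if w in marked]
        if len(marked_neighbours) != k - 1:
            return False
        if v != leader and leader not in marked_neighbours:
            return False
    return True

# A triangle {0, 1, 2} with a path 2 - 3 - 4 hanging off it.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
good = clique_checks_pass(adj, {0, 1, 2}, leader=0, k=3)
bad = clique_checks_pass(adj, {2, 3, 4}, leader=2, k=3)
```

Since every marked node is the leader or adjacent to it, and the leader sees exactly k - 1 marked neighbours, there are exactly k marked nodes, each adjacent to all the others.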

3 Definitions

3.1 Interactive Proofs with a Distributed Verifier

Our definition follows the definition in [KOS18]. An interactive proof is a protocol between a verifier and a powerful prover, where the goal of the prover is to convince the verifier that for some common instance and language . Usually, the verifier and prover are Turing machines with different computational power. Here, we consider the case where the verifier is distributed.

Our model consists of a network of computation units that communicate in synchronous rounds. The communication pattern between the units is defined by an -vertex graph . In addition, each node may hold an additional input . Let be the set of all inputs. Then, the graph and the inputs define an instance , and the goal of the network is to determine if for some language , where is a family of vertex graphs and is a set of inputs where is the input of node .

The network is equipped with one extra entity, , which we call the prover. This prover is connected to all the vertices in , and knows the entire input instance . Roughly speaking, the goal of this powerful prover is to convince the network that , where if we ask that the network will not be convinced no matter what the prover does. The prover knows the entire graph: it knows the ordering of the neighbors for each node in the graph.

The Complexity Measures.

Our primary goal in this paper is to minimize the bandwidth, that is, the size of messages sent in each round (within the network and also between the nodes and the prover). The total amount of messages sent is called the proof size (or proof complexity) of the protocol.

The class :

Let be a language of graphs and inputs and let be two parameters. For a verifier and a prover we let denote the protocol between them and we let be the final output of the vertex in the protocol. We say that if there exists an -round protocol (i.e.,  messages) with verifier with the following properties:

  1. Completeness: For every , there exists a prover such that for it holds that .

  2. Soundness: For every and every prover , for it holds that .

The probabilities are taken over the random coins of the nodes of the distributed verifier in the protocol between verifier and the prover .

When and the prover goes first, this is the standard notion of distributed proofs (or proof labeling schemes). When the verifier sends the first message this is the analog of the AM class and is denoted as . Similarly, we define for three rounds and for four messages and so on.

3.2 Limited Independence

A family of functions mapping domain to range is -almost pairwise independent if for every , , we have

Theorem 6.

There exists a family of -almost pairwise independent functions from to such that choosing a random function from requires bits.

Circuits.

Some of our results use the notion of circuits. In this work, we consider circuits of constant fan-in and fan-out. The term “linear size” circuits refers to circuits whose size is linear in the sum of their input size and output size.

Linear Hash Functions.

Ishai et al. [IKOS08] showed how to construct a pairwise independent hash function that can be computed by a linear-sized circuit. Specifically:

Corollary 3.

[IKOS08, Follows from Theorem 3.3] Let be a field of size . There exists a family of pairwise independent hash functions from to such that choosing a random function from requires field elements and evaluating any can be performed by an -sized circuit with gates that operate over .

Definition 1 (Aggregate Function).

We say that a function is an aggregate function if there exists a function such that where for and , and is computable in by a RAM program with operations over words of length .

3.3 Graph Definitions

We usually denote the graph by where is the set of vertices and is the set of edges. We let denote the neighborhood of in . We also call the vertices in nodes.

Definition 2 (Isomorphism).

We say that two graphs and are isomorphic if there exists a bijection between and such that for any two nodes it holds that if and only if . We denote this by .

Definition 3 (Automorphism).

A graph has an automorphism if there exists a non-trivial permutation such that for every it holds that if and only if (we call such a graph Symmetric).

4 A RAM Program Compiler

In this section we show our RAM program compiler. We take standard interactive protocols over -vertex graphs and transform them into distributed protocols. The cost of the distributed protocol depends on the running time of the verifier in the protocol when implemented as a RAM program.

A construction of a spanning tree in the graph is a basic tool in distributed proofs in general [KKP10] and in our context as well. Here, we let the prover compute a spanning tree rooted at an arbitrary node and send each node its parent in the tree (the parent of the root is ). Note that once each node knows its parent in the tree, it also knows its children in the tree.

Then, to prove that this is indeed a tree, the prover additionally gives each node its distance from the root, in the tree . Each node verifies consistency with its parent, i.e.,  (the root verifies that ). One can observe that verifying the distances from the root assures that there are no cycles in as otherwise there must be a node and its parent with inconsistent distances. Finally, to prove that the tree is spanning the prover gives each node the ID of the root where nodes verify consistency of the ID with their neighbors.
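The checks in this labeling scheme are purely local and can be sketched as a single pass over the nodes. The sketch assumes the consistency condition is that a node's distance is one more than its parent's and that the root has distance zero and recognises its own ID; the names and the example graph are illustrative.

```python
def tree_checks_pass(adj, parent, dist, root_id):
    """Run every node's local check; True only if all nodes accept.

    parent[v] is None for a node that claims to be the root.
    root_id[v] is the root identity announced to node v.
    """
    for v in adj:
        p = parent[v]
        if p is None:
            if dist[v] != 0 or root_id[v] != v:
                return False
        elif p not in adj[v] or dist[v] != dist[p] + 1:
            return False
        if any(root_id[w] != root_id[v] for w in adj[v]):
            return False
    return True

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
honest = tree_checks_pass(
    adj, parent={0: None, 1: 0, 2: 0, 3: 2},
    dist={0: 0, 1: 1, 2: 1, 3: 2}, root_id={v: 0 for v in adj})
cyclic = tree_checks_pass(
    adj, parent={0: 1, 1: 2, 2: 0, 3: 2},
    dist={0: 1, 1: 2, 2: 3, 3: 4}, root_id={v: 0 for v in adj})
```

On a cycle of parent pointers the distances would have to increase strictly all the way around, so at least one node on the cycle rejects.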

Using this tree, we develop an interactive protocol for a new problem we call SetEquality (defined next). This protocol will be used several times in our compiler (and later on) and in particular is used in a protocol for the Distinctness problem and the Permutation problem (also defined next). Next, we describe the SetEquality problem.

4.1 SetEquality

The SetEquality problem checks the equality of two (multi)sets and is formally defined as follows.

Definition 4 (SetEquality).

In this problem each node holds two lists of elements and where for all it holds that for some constant and . Let and be two multisets. The goal of the SetEquality problem is to prove that as multisets.

Let be an -vertex graph and let be a field of size . We interpret the elements of and as elements in the field . To check that (as multisets) we define a polynomial and according to the elements of and respectively. That is, we define

Note that and are polynomials of degree at most . We show that if and only if . Since the polynomials have low degree (compared to the field size), in order to check if they are equal it suffices to compare them on a random field element. For clarity of presentation, let us assume that nodes have shared randomness. At the end, we show how to sample this shared randomness using the prover.

Thus, let be a random field element defined from the shared randomness. Then, we are left with evaluating the two polynomials and . To compute these polynomials we use a spanning tree construction, as described above. We let the prover compute a spanning tree and prove its validity. We use the tree to compute the two polynomials on . Towards this end, the prover sends each node the evaluation of the polynomials on the subtree : and . Nodes check consistency with their children in the tree to ensure that all partial evaluations are correct. That is, they check that

where are the children of in the tree. Finally, the root of the tree holds the two complete evaluations of polynomials and and verifies that .
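The tree-based evaluation can be sketched as follows. The sketch assumes the polynomial of a multiset is the product of (a - x) over its elements, taken modulo an illustrative prime, and it plays both roles at once: it computes the honest prover's subtree products and then runs each node's local check. All names are hypothetical.

```python
P = 2_147_483_647  # prime field size; an illustrative choice

def product_at(values, x):
    """Evaluate prod (a - x) over the multiset `values`, modulo P."""
    result = 1
    for a in values:
        result = result * (a - x) % P
    return result

def honest_subtree_products(children, root, lists, x):
    """Honest prover: product over the subtree rooted at each node."""
    proof = {}
    def visit(v):
        val = product_at(lists[v], x)
        for c in children.get(v, []):
            val = val * visit(c) % P
        proof[v] = val
        return val
    visit(root)
    return proof

def node_accepts(v, children, lists, x, proof):
    """Local check: own factor times the children's values matches the claim."""
    expected = product_at(lists[v], x)
    for c in children.get(v, []):
        expected = expected * proof[c] % P
    return proof[v] == expected

def set_equality_accepts(children, root, a_lists, b_lists, x):
    pa = honest_subtree_products(children, root, a_lists, x)
    pb = honest_subtree_products(children, root, b_lists, x)
    local_ok = all(node_accepts(v, children, a_lists, x, pa) and
                   node_accepts(v, children, b_lists, x, pb) for v in a_lists)
    return local_ok and pa[root] == pb[root]

children = {0: [1, 2]}
same = set_equality_accepts(children, 0, {0: [1, 1], 1: [2], 2: [7]},
                            {0: [7], 1: [1], 2: [2, 1]}, x=5)
different = set_equality_accepts(children, 0, {0: [1, 1], 1: [2], 2: [7]},
                                 {0: [7], 1: [2], 2: [2, 1]}, x=5)
```

The second instance differs from the first only in multiplicities, which is the case a plain set comparison would miss; the evaluation point is fixed here for reproducibility, whereas the protocol draws it at random.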

This completes the description of the protocol assuming the element is shared randomness. To construct such shared randomness we do the following. We let each node sample at random, along with a random number . The node with the minimal “wins” in the sense that we set and (observe that we cannot have the prover decide who wins, as otherwise could be biased). The prover will announce to everyone the winning and . Nodes verify the consistency of and with their neighbors, and thus ensure that all nodes in the graph have the exact same elements and . We are left to verify that indeed is the minimal value.

To verify this, each node will check that indeed where we expect exactly a single node to have equality. We count the number of such nodes by having the prover send each node the number of nodes that have equality in its subtree. That is, the prover sends node the value where if and 0 otherwise. The nodes check consistency of the with their children in the tree and finally the root verifies that . This assumes a common random string . The formal protocol is given in Figure 1.
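A minimal sketch of these checks, under stated assumptions: each node holds a pair (random number, field element), the announced pair has already been made consistent across neighbours, and the winning node additionally checks that the announced field element is its own. That last check and all names are assumptions made for the illustration.

```python
def randomness_checks_pass(children, root, samples, announced, counts):
    """samples[v] = (r_v, x_v); announced = (r, x) as broadcast by the prover.

    counts[v] is the prover's claim for the number of nodes in v's subtree
    whose r_v equals the announced r.
    """
    r_star, x_star = announced
    for v, (r_v, x_v) in samples.items():
        if r_v < r_star:                       # announced value is not minimal
            return False
        is_winner = 1 if r_v == r_star else 0
        if is_winner and x_v != x_star:        # assumed check: winner knows its x
            return False
        if counts[v] != is_winner + sum(counts[c] for c in children.get(v, [])):
            return False
    return counts[root] == 1

children = {0: [1, 2]}
samples = {0: (40, 11), 1: (17, 23), 2: (58, 5)}
honest = randomness_checks_pass(children, 0, samples, (17, 23), {0: 1, 1: 1, 2: 0})
too_small = randomness_checks_pass(children, 0, samples, (3, 99), {0: 1, 1: 1, 2: 0})
```

Announcing a value smaller than every sampled number fails because no node can truthfully contribute to the count, so the root never sees a total of one.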

A protocol for set equality.

Input: each node has elements and

V → P (message 1): Each node samples , and and sends it to the prover.

P → V (message 2): The prover sends a spanning tree along with a proof.

P → V (message 2): Let and let . The prover sends each node the following: the values and ; the values and (computed over ); the value .

Local: nodes exchange their proofs and verify that proofs for . Let be the children of in the tree . Then, verifies that and that . , and the root verifies that .

Figure 1: A distributed AM protocol for checking the equality of two multi-sets.

We show correctness and soundness of the protocol.

Correctness.

The protocol succeeds as long as the is uniquely the minimal value. However, it is easy to see that . Thus, we continue the analysis as if all the ’s are distinct. Assume that as multisets. Then for any it holds that . For any tree with root it holds that and also that . Thus, the root will output 1, and in addition all intermediate nodes will output 1 after their local verification.

Soundness.

Assume that as multisets. Suppose that . In order for the prover to cheat, it must give the root values such that either or , since otherwise the node will output 0. However, since the node performs the local check with its neighbors in the tree, it holds that the prover must give wrong values to one of its children as well. This continues until the prover gives a wrong value to a leaf, where the leaf can verify locally and output 0 indicating that it received a wrong proof.

Thus, we bound the probability that the two products collide (notice that the sets are fixed before the choice of ). Consider the polynomial , which is of degree at most over the field .

Claim 1.

is not the zero polynomial.

Proof.

We know that . Suppose that there exists an element . Then, we get that and , therefore and thus is not the zero polynomial. A similar argument holds if . Since and are multisets there is a third possibility that the multisets share the same elements only with different multiplicities. Let be the multiset of their intersection . Define

It suffices to show that is not the zero function. Define and . For these subsets we know that there must be an element that is in one set and not in the other. Assume without loss of generality that there must exist an element . Then, since we get that and therefore is not the zero polynomial. ∎

The polynomial has at most roots and since the field is of size we get that

Communication Complexity.

Computing the tree and its proof takes proof size, as shown in [KKP10]. Elements in the field are represented using and each node is given a constant number of elements (). We have that and thus also has a short representation. The are a constant number of bits. Altogether, each node sends and receives bits.

4.2 Distinctness

In the Distinctness problem each node has a single value and the goal is to verify that all values are distinct. That is, the output of the protocol is 1 if and only if it holds that for all such that we have that .

We show that this problem can be actually reduced to the SetEquality problem. Assume that the values are sorted such that . The prover sends node the value . Denote by the actual value received by a node . Then, node sets a bit to be 1 if and only if .

Let be all the original values and let be the set of all values given by the prover. Then, we run the protocol for SetEquality to verify that . Moreover, we run a sum protocol to verify that .
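The reduction can be sketched under one reading of the stripped formulas: the honest prover gives each node the next value in cyclic sorted order, a node raises its bit exactly when the received value is not larger than its own, and the bits must sum to one. This reading, the use of `sorted` as a stand-in for the SetEquality protocol, and the function names are all assumptions.

```python
from itertools import permutations

def honest_successors(values):
    """Honest prover: give each node the next value in cyclic sorted order."""
    ordered = sorted(values)
    nxt = {a: ordered[(i + 1) % len(ordered)] for i, a in enumerate(ordered)}
    return [nxt[a] for a in values]

def distinctness_accepts(values, received):
    """Accept iff the multisets agree and exactly one node sees no increase."""
    flags = [1 if b <= a else 0 for a, b in zip(values, received)]
    return sorted(values) == sorted(received) and sum(flags) == 1

distinct = [30, 10, 20]
accepts_distinct = distinctness_accepts(distinct, honest_successors(distinct))

repeated = [2, 2, 3, 5]
fooled = any(distinctness_accepts(repeated, list(b))
             for b in permutations(repeated))
```

The exhaustive search over all assignments with the right multiset shows that, for this input with a repeated value, no prover strategy passes both checks.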

A protocol for distinctness.

Input: each node has a value , and assume .

Output: 1 if and only if all values are distinct.

P → V (message 1): prover gives node the value . Let be the received value.

Local: node sets if and only if .

P → V (message 1): prover sends a proof that .

P ↔ V (messages 2-3): prover and verifier interact to assure that , where and .

Figure 2: A distributed MAM protocol for checking that each node in the graph has a unique identity .

Completeness.

If all values are distinct then the honest prover will set . Thus, we will have that for all and and therefore . Moreover, we have that the values of are exactly the values of shifted by 1. That is, as sets we have that and thus the SetEquality protocol will pass as well.

Soundness.

To show soundness we define an -vertex directed graph with nodes being . Since the SetEquality protocol has passed successfully, we know that as multi-sets. It therefore holds that for any there exists a such that . We then add the directed edge to the graph.

By the construction, we get that the in-degree and the out-degree of each node in this graph are exactly the node’s multiplicity in , which is at least 1. Thus, by an Euler argument, the graph can be decomposed into edge-disjoint cycles. However, since we know that all but one of the edges are strictly increasing in value, the decomposition can contain only a single cycle, and thus the in-degree and out-degree are exactly 1, which means that the values of are all distinct.

The Permutation Problem.

A specific instance of the Distinctness problem is when for