In the simultaneous message passing (SMP) model of communication, introduced by Yao [Yao79], Alice and Bob separately receive inputs $x$ and $y$ to a function $f$. They send messages to a third party, called the referee, who knows $f$ and must output $f(x,y)$ (with high probability) using the messages. But what if the referee doesn’t know $f$? Can they still compute $f(x,y)$? Yes: Alice can include in her message a description of $f$, and then the referee knows it; however, if $f$ is restricted, they can sometimes do much better. Here is a simple example: the players receive vertices $x, y$ in a graph $G$ of maximum degree 2, and want to decide if $(x,y)$ is an edge in $G$. Sharing a source of randomness, Alice and Bob randomly label each vertex of $G$ with a number up to 200; Alice sends the labels of both neighbors of $x$ and Bob sends the label of $y$. The referee says yes if one of Alice’s labels matches the label of $y$, no otherwise. They will be correct with probability at least $99/100$ (an error occurs only when $x, y$ are not adjacent and one of the at most two neighbor labels collides with the label of $y$, which happens with probability at most $2/200$), and the referee never needs to learn $G$. This is also an example where the referee can decide many problems using only one strategy. In this work we will see that more interesting families of graphs, such as trees, planar graphs, and distributive lattices, also exhibit these phenomena, even when we wish to compute distances instead of just adjacency.
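The degree-2 example can be simulated in a few lines. The following is a minimal sketch, not the paper's formal protocol: the function name `run_protocol`, the dictionary representation of the graph, and the use of a seeded pseudorandom generator as the shared randomness are all assumptions made for illustration.

```python
import random


def run_protocol(adj, x, y, seed, num_labels=200):
    """One round of the random-label protocol for graphs of maximum degree 2.

    adj maps each vertex to the list of its neighbors (at most two).
    seed stands in for the randomness shared by Alice and Bob (an assumption).
    """
    rng = random.Random(seed)
    # Both players derive the same labels from the shared seed.
    label = {v: rng.randrange(num_labels) for v in sorted(adj)}
    alice_msg = [label[u] for u in adj[x]]  # labels of the neighbors of x
    bob_msg = label[y]                      # label of y
    # The referee sees only the two messages, never the graph or the seed.
    return bob_msg in alice_msg
```

If the two vertices are adjacent, the referee always answers yes; if they are not, a wrong answer requires a label collision, so over many seeds the empirical error rate should stay near or below 1/100.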
To study this, we introduce the universal SMP model, which operates as follows. Fix some family $\mathcal{F}$ of functions. Alice and Bob receive a function $f \in \mathcal{F}$ and inputs $x, y$, and they use shared randomness to each send one message to the referee. The referee knows the family $\mathcal{F}$ and the size of the inputs, but doesn’t know $f$ or the shared randomness, and must compute $f(x,y)$ with high probability. By choosing $\mathcal{F} = \{f\}$ to be a singleton family, one sees that this model includes standard SMP. As in the earlier example, we will be studying communication problems on graphs, but this is not a significant restriction: every Boolean-valued communication problem $f$ is equivalent to determining adjacency in some graph (use $f$ as the adjacency matrix), so we will treat $\mathcal{F}$ as a family of graphs.
A surprising but intuitive application of universal SMP is that it connects two apparently disjoint areas of study: communication complexity and graph labeling. For a graph family , the graph labeling problem (introduced by Kannan, Naor, and Rudich [KNR92]) asks how to assign the shortest possible labels to each vertex of a graph , so that the adjacency (or some other function [Pel05]) of vertices can be computed from by a decoder that knows . We observe the following principle (Theorem 1.1):
If there is a (randomized) universal SMP protocol for the graph family with communication cost , then there is a labeling scheme for graphs with labels of size , where is the number of vertices.
Common variants of graph labeling are distance labeling [GPPR04], where the goal is to compute from the labels, and small-distance labeling, where the goal is to compute if it is at most and output “” otherwise [KM01, ABR05]. This is similar to the well-studied -Hamming Distance problem in communication complexity, where the players must decide if their vertices have distance at most in the Boolean hypercube graph. A natural generalization of the Boolean hypercube is the family of distributive lattices (which also include, for example, the hypergrids). We demonstrate that techniques from communication complexity can be used to obtain new graph labelings, by adapting the -Hamming Distance protocol of Huang et al. [HSZZ06] to the universal SMP model, achieving an protocol for computing and the corresponding -distance labeling scheme with label size . It is interesting to note that, in contrast to the standard application of communication complexity as a method for obtaining lower bounds, we are using it to obtain upper bounds.
Generalizing in another direction, we ask: for which graphs other than the Boolean hypercube can we obtain efficient communication protocols for -distance? For constant , -Hamming Distance can be computed with communication cost ; which other graphs admit a constant-cost protocol? To approach this question, we observe that many (but not all) graph families known to have efficient adjacency labeling schemes also admit an universal SMP protocol for adjacency. Commonly studied families in the adjacency and distance labeling literature are trees [KNR92, KM01, ABR05, AGHP16, ADK17] and planar graphs [KNR92, GPPR04, GL07, GU16, AKTZ19]. We study the -distance problem on these families and find that trees admit an protocol, while planar graphs admit an protocol for 2-distance; this implies a new labeling scheme for planar graphs.
Further motivation for the universal SMP model comes from universal graphs. Introduced by Rado [Rad64], an induced-universal graph for a set of graphs is one that contains each graph in the set as an induced subgraph. An efficient adjacency labeling scheme for a set implies a small induced-universal graph for that set [KNR92]. Deterministic universal SMP protocols are equivalent to universal graphs (Theorem 1.7), and we introduce probabilistic universal graphs as the analogous objects for randomized universal SMP protocols. We think probabilistic universal graphs are worthy of study alongside universal graphs, especially since many non-trivial families admit one of constant size.
The universal SMP model is also related to a recent line of work studying communication between parties with imperfect knowledge of each other’s “context”. The most relevant incarnation of this idea is the recent work of Ghazi and Sudan [GS17] and Ghazi et al. [GKKS18], who study the 2-way communication model where Alice and Bob receive functions and respectively, with inputs and , and must compute under the guarantee that and are close in some metric. In other words, one party does not have full knowledge of the function to be computed. The universal SMP model provides a framework for studying a similar problem in the SMP setting, where the players know the function but the referee does not; the similarity is especially clear when we define the family to be all graphs of distance to a reference graph in some metric (we discuss this situation in more detail at the end of the paper). This could model, for example, a situation where the clients of a service operate in a shared environment but the server does not, or a situation in which the clients want to keep their shared environment secret from the server, and their inputs secret from each other. This suggests a possible application to privacy and security. A relevant example is private proximity testing (e.g. [NTL11]), where two clients should be notified by the server when they are at distance at most from each other, without revealing their exact locations to each other or to the server.
The Discussion at the end of the paper highlights some interesting questions and open problems.
A universal SMP protocol decides -distance for a family if for all graphs and vertices , the protocol will correctly decide if , with high probability. A labeling scheme decides -distance if can be decided from the labels of . Below, the variable always refers to the number of vertices in the input graph.
Implicit graph representations.
The main principle connecting communication and graph labeling is:
Any graph family with universal SMP cost has an adjacency labeling scheme with labels of size . In particular, if the universal SMP cost for is then has an adjacency labeling scheme.
Adjacency labeling schemes of size are of special interest because is the minimum number of bits required to label each vertex uniquely, and they correspond to implicit graph representations, as defined by Kannan, Naor, and Rudich [KNR92] (we omit their requirement that the encoding and decoding be computable in polynomial-time). Section 2.3 elaborates further. To obtain implicit representations, we can relax our requirements:
For any constant , any graph family where each has a public-coin 2-way communication protocol computing adjacency with cost has an implicit representation.
Distributive & Modular Lattices.
Distributive and modular lattices are generalizations of the Boolean hypercube and hypergrids (see Section 3 for definitions). We define a weakly-universal SMP protocol as one where the referee shares the randomness of Alice and Bob. For distributive lattices we get the following:
The -distance problem on the family of distributive lattices has: a weakly-universal SMP protocol with cost ; a universal SMP protocol with cost ; and a size labeling scheme.
Modular lattices are a superset of distributive lattices, but they do not admit -distance protocols with a cost independent of ; we show that any universal SMP protocol (and any labeling scheme) deciding 2-distance must have cost (Theorem 3.14). To our knowledge, there are no known labeling schemes for distributive or modular lattices. Our adjacency labeling scheme (i.e. for ) requires space to store the whole lattice; this can be compared to Munro and Sinnamon [MS18], who present a data structure of size for distributive lattices that supports meet and join operations (and therefore distance queries, due to our Lemma 3.5). However, that data structure is not a labeling, so the results are not directly comparable.
Planar graphs and other efficiently-labelable families.
When they introduced graph labeling, Kannan, Naor, and Rudich [KNR92] studied trees, low-arboricity graphs (whose edges can be partitioned into a small number of trees), planar graphs, and interval graphs (whose vertices are intervals in , with an edge if the intervals intersect), among others. These families have adjacency labeling schemes. Trees, low-arboricity graphs, and planar graphs have constant-cost universal SMP protocols for adjacency. Trees admit an efficient -distance protocol:
The family of trees has a universal SMP protocol deciding -distance with cost and a labeling scheme deciding -distance.
Planar graphs admit an efficient 2-distance protocol, which implies a new 2-distance labeling scheme:
The 2-distance problem on the family of planar graphs has a universal SMP protocol with cost and a labeling scheme of size .
On the other hand, a universal SMP protocol deciding 2-distance on the family of graphs with arboricity 2 has cost at least (Proposition 4.4), and a universal SMP protocol deciding adjacency in interval graphs has cost (Proposition 4.5).
Gavoille et al. [GPPR04] showed that trees have an labeling allowing to be computed exactly from labels of , and gave a matching lower bound; Kaplan and Milo [KM01] and Alstrup et al. [ABR05] studied -distance for trees, with the latter achieving a labeling scheme. For planar graphs, [GPPR04] gives a lower bound of for computing distances exactly, and an upper bound of , which was later improved to in [GU16].
Our lower bounds are achieved by reduction from the family of all graphs, which has complexity , in contrast to the upper bound of for the standard SMP cost of computing adjacency in any graph (since Alice and Bob can send bits to identify their vertices).
For the family of all graphs, the universal SMP cost of computing adjacency in is .
The basic relationships between universal SMP, standard SMP, and universal graphs are as follows. Below, we use and for the deterministic and randomized (standard) SMP cost of computing adjacency on , and for the deterministic and randomized universal SMP cost for computing adjacency in the family . We use the term “-universal graph” as opposed to “induced-universal” to denote a slightly different object that allows non-injective embeddings (see Section 2 for definitions).
For a set , the following relationships hold. Let range over the set of all -universal graphs:
with equality on the left iff such that , can be embedded in . For ranging over the set of all probabilistic universal graphs:
Randomized and deterministic universal SMP satisfy
The above results on graph labeling are proved through the relationship between randomized and deterministic universal SMP. We obtain this relationship by adapting Newman’s Theorem [New91], a standard derandomization result in communication complexity. Finally, we note the interesting fact that universal SMP characterizes the gap between standard SMP models where the referee does or does not share the randomness with Alice and Bob:
Proposition 1.8 (Informal).
Let be a family of graphs and let be a weakly-universal SMP protocol for , which defines a distribution over the referee’s decision functions , which we interpret as the adjacency matrices of graphs. Let be the family on which this distribution is supported. Then, taking the minimum over all such protocols ,
1.2 Other Related Work
Randomized labeling schemes for trees have been studied by Fraigniaud and Korman [FK09], who give a randomized adjacency labeling scheme of bits per label that has one-sided error (i.e. it can erroneously report that are adjacent when they are not), and they show that achieving one-sided error in the opposite direction requires a randomized labeling with bits. They also give randomized schemes for determining if is an ancestor of , but they do not address distance problems. Spinrad’s book [Spi03] has a chapter on implicit graphs; see Alstrup et al. [AKTZ19] for a recent survey on adjacency labeling schemes and induced-universal graphs. We know of no labeling schemes for lattices, but Fraigniaud and Korman [FK16] recently studied adjacency labeling schemes for posets of low “tree-dimension”.
labeling studies an opposite problem to -distance labeling, where distances must be accurately reported when they are above some threshold . Recent work includes Alstrup et al. [ADKP16].
To our knowledge, -distance or even 2-distance has not been studied for planar graphs, but there are many results on other types of planar graph labelings with restrictions at distance 2. An example is the frequency assignment problem or -labeling problem, which asks how to construct a labeling assigning integers to vertices of a planar graph so that and , with various optimization goals. See [Cal11] for a survey.
There are several works studying communication problems where the parties do not agree on the function to be computed, starting with Goldreich, Juba, and Sudan [GJS12], who studied communication where parties have different “goals”. Canonne et al. [CGMS17] study communication in the shared randomness setting where the randomness is shared imperfectly. Haramaty and Sudan [HS16] study compression (à la Shannon) in situations where the parties do not agree on a common distribution. As mentioned earlier, Ghazi et al. [GKKS18] and Ghazi and Sudan [GS17] study 2-way communication where the parties do not agree on the function to be computed.
means . The letter always denotes the number of vertices in a graph. We use the notation iff the statement holds, and otherwise. For a graph , is the set of vertices and is the set of edges. For vertices , we write for the entry in the adjacency matrix of . For an undirected, unweighted graph and vertices is the length of the shortest path from to .
For any graph and integer , we denote by the -closure of , where two vertices are adjacent iff in ; it is convenient to require that each vertex is adjacent to itself in . For a set of graphs , .
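The closure operation can be computed with a breadth-first search truncated at depth $k$. The sketch below is illustrative only; the name `k_closure` and the dictionary-of-neighbors representation are assumptions, not notation from the paper.

```python
from collections import deque


def k_closure(adj, k):
    """Return the k-closure as a dict mapping each vertex to its neighbor set.

    Two vertices are adjacent in the result iff their distance in the input
    graph is at most k. Every vertex is adjacent to itself (distance 0).
    """
    closure = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == k:
                continue  # vertices beyond distance k are not needed
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        closure[source] = set(dist)
    return closure
```

For example, on the path 0, 1, 2, 3 the 2-closure makes vertex 0 adjacent to 0, 1, and 2 but not to 3.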
is the deterministic SMP cost of the function and is the randomized SMP cost of the function , in the model where Alice and Bob share randomness but the deterministic referee does not.
2 Universal Communication and Universal Graphs
In this paper we focus on deciding adjacency. Every Boolean communication problem on finite domains is equivalent to the adjacency problem on the graph with vertex set and . We may either allow self-loops in if or take to be bipartite. We will generally permit graphs to have self-loops.
A family of graphs is a sequence of sets indexed by integers , along with a strictly increasing size function , so that is a set of graphs with vertex set . If has size then we write .
Definition 2.2 (Universal SMP and Variations).
Let be a family of graphs with size function and let be an operation taking size graphs to size graphs. Let and let be a constant. An -error, cost sequence of universal SMP communication protocols for is as follows. For any , a protocol for is a triple where:
Alice and Bob receive respectively, where and ;
Alice and Bob share a random string and compute messages , respectively;
For each , the (deterministic) referee has a function , called the decision function. must satisfy:
If are adjacent in then ; and
If are not adjacent in then .
A universal SMP protocol is symmetric when the functions computed by Alice and Bob are identical and the function satisfies for all messages . We write for the communication complexity in the universal SMP model of computing adjacency in graphs , where is the allowed probability of error. We write for . If no operation is specified, it is assumed to be the identity.
It is also convenient to define a weakly-universal SMP protocol as a universal SMP protocol where the referee can see the shared randomness, so the decision function is of the form for random seed , graph , and . We denote the -error complexity in this model with .
Finally, we write for the deterministic universal SMP complexity.
We include the operator in the definition to emphasize that the players are given the original graph , not the graph ; for example, the players are not given (from which it may be difficult to compute ), but are instead given .
2.1 Deterministic Universal Communication and Universal Graphs
We will show that a deterministic universal SMP protocol is equivalent to an embedding into a -universal graph, which we define using the following notion of embedding (following the terminology of Rado [Rad64]):
For graphs , a mapping is an embedding iff , . If such a mapping exists we write .
For a set of graphs , a graph is -universal if ; i.e. there exists an embedding . For a family of graphs , a sequence is a -universal graph sequence if for each , is -universal for .
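The embedding condition can be checked mechanically. The sketch below assumes the reading that a mapping is an embedding exactly when it preserves both edges and non-edges, and that it need not be injective; the name `is_embedding` and the dictionary-of-sets representation are hypothetical choices for illustration.

```python
from itertools import product


def is_embedding(G, H, phi):
    """Check that phi maps edges of G to edges of H and non-edges to non-edges.

    G and H map each vertex to its set of neighbors (self-loops allowed);
    phi maps vertices of G to vertices of H and may be non-injective.
    """
    return all(
        (v in G[u]) == (phi[v] in H[phi[u]])
        for u, v in product(G, repeat=2)
    )
```

For instance, the path a, b, c embeds into a single edge by sending both endpoints to the same vertex, while two isolated vertices cannot be sent to the two ends of an edge.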
Define an equivalence relation on by iff , i.e. have identical rows in the adjacency matrix. For a graph , define the -reduction as a graph on the equivalence classes of with adjacent iff such that are adjacent.
An embedding is not the same as a homomorphism since we must map non-edges to non-edges, and is not the same as being an induced subgraph of since the mapping is not necessarily injective. Therefore a universal graph by our definition is not the same as an induced-universal graph, where must exist as an induced subgraph. We could for example map the path — — — — . This difference between definitions is captured by the relation between vertices. It is necessary to allow self-loops, otherwise the relation is not transitive. The important properties of , and -reductions are stated in the next proposition; the proofs are routine and for completeness are included in the appendix. The relation is the isomorphism relation on graphs.
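The reduction that merges vertices with identical adjacency rows can be sketched as follows. This is an illustration under assumptions: the function name `twin_reduction` and the choice of the first vertex encountered as class representative are not from the paper.

```python
def twin_reduction(G):
    """Merge vertices whose adjacency rows (neighbor sets) are identical.

    G maps each vertex to its set of neighbors. Returns (reduced, rep), where
    reduced is the graph on class representatives and rep maps every vertex
    of G to the representative of its class.
    """
    classes = {}
    for v in G:
        classes.setdefault(frozenset(G[v]), []).append(v)
    rep = {v: members[0] for members in classes.values() for v in members}
    # All members of a class share one row, so the representative's row
    # determines which classes the whole class is adjacent to.
    reduced = {members[0]: {rep[w] for w in row}
               for row, members in classes.items()}
    return reduced, rep
```

On the path a, b, c the two endpoints have the same row, so the reduction is a single edge, and the map `rep` is itself a non-injective mapping that preserves edges and non-edges.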
The following properties are satisfied by the relation, the relation, and -reductions:
For any graph and , iff there exists and an embedding such that .
For any graph .
For any graph and .
For any graphs , iff .
For any graphs , iff is an induced subgraph of .
These properties allow us to prove relationships between the standard SMP model, deterministic universal SMP, and -universal graphs. First we show that deterministic universal SMP protocols can always be made symmetric.
Footnote 1: Note that this does not imply that every deterministic SMP protocol is symmetric, since in this paper we are only concerned with adjacency on an undirected graph, for which the communication matrix is symmetric. This proposition shows that for symmetric communication matrices, the deterministic SMP protocol is symmetric.
If is a deterministic universal SMP protocol for the set , then there exists a deterministic universal SMP protocol that is symmetric and has the same cost as .
Let and let be the encoding functions for and the decision function for graphs of size . The restriction of to the domain is injective so it has an inverse that satisfies ; the same holds for . Define the encoding function as and define the decision function . Then for any so this is a valid protocol. Since we can write for every so , thus so the protocol is symmetric. ∎
The standard deterministic SMP complexity measure can be expressed in terms of -reductions:
For all graphs , .
It is well-known that for any function , where is the number of distinct columns in the communication matrix of , and is the number of distinct rows [Yao79]. The communication matrix of the function is the adjacency matrix of , which is symmetric, and two rows (or columns) indexed by are distinct iff ; so the number of distinct rows is the size of . ∎
The analogous fact for universal SMP is that the deterministic universal SMP cost is determined by the size of the smallest universal graph.
For any graph family ,
Let be any graph such that for all and for each let be the embedding . Consider the protocol where on inputs , Alice and Bob send using bits and the referee outputs . This is correct by definition so .
Now suppose there is a protocol for with cost and decision function , and let . By Proposition 2.6 we may assume that on inputs Alice and Bob share the encoding function . Let be the graph with vertices and . Then so (by transitivity). Now so . ∎
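The first direction of this proof, turning a universal graph into a deterministic protocol, can be sketched directly. The names `message_bits`, `encode`, and `referee`, and the dictionary representations of the universal graph `U` and the embedding `phi`, are assumptions made for illustration.

```python
from math import ceil, log2


def message_bits(U):
    """Bits needed to name one vertex of the universal graph U."""
    return ceil(log2(len(U)))


def encode(phi, vertex):
    """Alice and Bob both send the image of their vertex under the embedding."""
    return phi[vertex]


def referee(U, a, b):
    """The referee knows only U, never the input graph or the embedding."""
    return b in U[a]
```

With `U` a single edge and `phi` collapsing the endpoints of a three-vertex path, each player sends one bit and the referee answers every adjacency query on the path correctly.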
It is easy to see that can be used as a lower bound on but such lower bounds are tight only when the family is essentially a “trivial” family of equivalent graphs.
For any family , let be the smallest -universal graph sequence for . Then
with equality holding on the left iff such that .
The equality on the right holds by the two prior propositions. The lower bound follows from the fact that any protocol for in the universal model can be used as a protocol in the SMP model. Now we must show the equality condition. Let be a graph maximizing over all graphs in , and suppose , so . Then there exists such that and . Since is an induced subgraph of and we must have so . ∎
2.2 Randomized Universal Communication
Just as deterministic universal communication is equivalent to embedding a family into a universal graph, we will define probabilistic universal graphs and show that they are tightly related to universal communication with shared randomness.
For graphs , a random mapping (i.e. a distribution over such mappings) is an -error embedding iff ,
We will write if there exists an -error embedding . A graph is -error universal for a set of graphs if . is an -error universal graph sequence for the family if for each , is -error universal for .
In the randomized setting we obtain equivalence (up to a constant factor) between universal SMP protocols and probabilistic universal graphs.
For any graph family and any , if there exists a -error universal SMP protocol for with cost , then there exists a -error symmetric universal SMP protocol for with cost at most .
On input , and random string , Alice and Bob send the concatenations and . Then the referee computes
It is clear that is symmetric. If are adjacent then
and if are not adjacent then, by the union bound,
Applying this symmetrization, we get a relationship between universal SMP protocols and probabilistic universal graphs.
Let be a graph family and . Then
There is an -error universal graph sequence of size at most ; and
If there is an -error universal graph sequence of size then .
If is an -error symmetric universal protocol for then there exists a function such that for every there is a random such that . Using as an adjacency matrix, we get a graph of size at most , where is the cost of , such that for all . Then is an -error probabilistic universal graph sequence. By Lemma 2.11 we obtain an -error symmetric protocol with cost , so we have proved the first conclusion. The second conclusion follows by definition. ∎
The basic relationships to standard SMP models follow essentially by definition and from the above lemma.
Let be any graph family and let . Let be an -universal graph sequence for , and an -error universal graph sequence. Then
The inequalities on the left follow from the definitions and from the above lemma. On the right, we can obtain a universal SMP protocol by choosing for each a (deterministic) embedding and then using the randomized SMP protocol for . ∎
Universal graphs describe an interesting relationship between weakly-universal and universal SMP protocols (and therefore between standard SMP protocols where the referee does and does not share the randomness); namely, the optimal universal protocol is obtained by finding the smallest universal graph for the family of protocol graphs (decision functions) defined by a weakly-universal protocol.
Let be a family of graphs, let , and let be the set of all -error weakly-universal SMP protocols for . For each let be the family of graphs where is the decision function of . Then
Let ; we will construct a universal SMP protocol as follows. On input , Alice and Bob use shared randomness to simulate and obtain vertices in some graph with . They now simulate the deterministic universal SMP protocol, i.e. an embedding for some graph that is -universal for , and send to the referee who computes .
Now let be an -error universal SMP protocol. Then and for each , , where is the graph of the decision function. , which is the cost of , so . ∎
Newman’s Theorem for public-coin randomized (2-way) protocols is a classic result that gives a bound on the number of uniform random bits required to compute a function in terms of the size of the input domain [New91]. In the universal model, the input size can be very large since the graph (function) itself is part of the input. However, the shared part of the input does not contribute to the number of random bits required in the universal SMP model.
Lemma 2.14 (Newman’s Theorem for universal SMP).
Let and suppose there is an -error universal SMP protocol for the family . Then there is an -error universal SMP protocol for the family that uses at most bits of randomness and has the same communication cost.
Fix , let be the deterministic decision function for , and let be Alice and Bob’s encoding functions for the random seed . For and we will say a seed is bad for if , and we will call this event .
Let be independent random seeds, and let be uniformly random, where . Then for every , the expected number of vertex pairs for which the strings fail is
The sum has mean , so by the Chernoff bound, the probability is at most
Since the expected number of pairs where choosing fails with probability more than is less than 1, there must be some values of with no bad pairs for . So for every we may choose so that choosing uniformly at random is the only random step; since this requires at most random bits. ∎
With this result, we can conclude the proof of Theorem 1.7 in the next lemma.
For any family with size function ,
The upper bound is clear, so we prove the lower bound. Let be a sequence of randomized universal SMP protocols for . By Newman’s theorem, we may assume that uses at most random bits for some constant and has error probability . Let be the decision function of , let be the cost of , and let . To obtain a deterministic protocol, we can define the decision function on messages of bits as . Alice and Bob iterate over all random strings and send for each. Since the probability of error is at most when is uniform, at least of the functions will give the correct answer. This proves that . ∎
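The derandomization step in this proof can be sketched as follows, assuming that the new decision function takes a majority vote over the answers obtained for each random string (our reading of the argument, since most seeds give the correct answer). The helper `derandomize` and the toy protocol used to exercise it are hypothetical and only for illustration.

```python
def derandomize(encode_a, encode_b, decide, seeds):
    """Build a deterministic protocol by running every seed and taking a majority.

    encode_a(G, x, r) and encode_b(G, y, r) are the randomized encoders and
    decide(a, b) is the referee's decision function for a single seed.
    """
    seeds = list(seeds)

    def alice(G, x):
        return tuple(encode_a(G, x, r) for r in seeds)

    def bob(G, y):
        return tuple(encode_b(G, y, r) for r in seeds)

    def referee(a_msgs, b_msgs):
        yes_votes = sum(bool(decide(a, b)) for a, b in zip(a_msgs, b_msgs))
        return 2 * yes_votes > len(seeds)

    return alice, bob, referee


# Hypothetical toy protocol: labels are exact unless the seed is a multiple
# of 4, in which case every label collapses to 0 and the referee may err.
def toy_label(v, r):
    return 0 if r % 4 == 0 else v + 1


def toy_alice(G, x, r):
    return frozenset(toy_label(u, r) for u in G[x])


def toy_bob(G, y, r):
    return toy_label(y, r)


def toy_decide(a, b):
    return b in a
```

Each message grows by a factor equal to the number of seeds, which is why bounding the number of random bits first is what keeps the deterministic cost small.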
In this paper we show lower bounds for a family by giving embeddings of an arbitrary graph into , so we need to know the complexity of the family of all graphs with vertices. For our purposes, it is convenient to require that each graph has for all (i.e. all self-loops are present). However, since equality can be checked with cost , the presence or absence of self-loops does not affect the complexity.
For the upper bound, consider the (deterministic) protocol where on input , Alice and Bob send and and the respective rows of the adjacency matrix of . This has cost and the referee can determine by finding in the row sent by Alice.
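This trivial protocol is easy to write down. The sketch below assumes vertices are numbered from 0 to n-1 and that a message is a pair of the vertex name and its adjacency row; the function names are hypothetical.

```python
def alice_message(G, x):
    """Send the name of x together with its row of the adjacency matrix."""
    return x, tuple(int(v in G[x]) for v in range(len(G)))


def bob_message(G, y):
    """Bob's message has the same form as Alice's."""
    return y, tuple(int(v in G[y]) for v in range(len(G)))


def referee(a_msg, b_msg):
    """Look up Bob's vertex in the row sent by Alice; the graph is not needed."""
    (_, row_x), (y, _) = a_msg, b_msg
    return bool(row_x[y])
```

Under this encoding each message consists of a vertex name plus one bit per vertex of the graph, and the referee only reads Bob's vertex name and Alice's row.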
Let be any protocol for with cost . By Lemma 2.11, we may assume that is symmetric. Let be the decision function for graphs on vertices and let with vertex set . defines a distribution over functions so that for all . Therefore, for drawn uniformly from , . Therefore, for every graph there is a function such that for uniformly at random, . Write . There are at most functions and there are simple graphs on so there is some function where the number of graphs such that is at least . Let be any two such graphs. Then
So differ on at most pairs. However, the largest number of graphs that differ from any graph on at most pairs of vertices is at most
Therefore we must have
so . ∎
Recall the example in the first paragraph of the introduction, for which we observed that a single decision function would work for many problems. We now make a note about this phenomenon. A communication protocol for a graph family is really a sequence of protocols, one for each set of graphs with vertices. Our next proposition addresses the uniformity of the sequence of protocols, that is, the question of how the protocols are related to one another as the size of the input grows. In general, we ask the question: If the family has some relationship between and , what does this imply about the relationship between the protocols for and ? The families of graphs we study in this paper have constant-cost protocols and they are also upwards families, which we define next. These families have enough structure so that there exists a single, one-size-fits-all probabilistic universal graph, into which all graphs can be embedded regardless of their size; in other words, the referee can be ignorant not only of the graph and vertices , but also of the size of the graph, without increasing the cost of the protocol.
Footnote 2: Any family with a constant-cost protocol can be turned into a protocol ignorant of the size by requiring that Alice and Bob tell the referee which of the