Reachability and Shortest Paths in the Broadcast CONGEST Model

10/12/2019 ∙ by Shiri Chechik, et al.

In this paper we study the time complexity of the single-source reachability problem and the single-source shortest path problem for directed unweighted graphs in the Broadcast CONGEST model. We focus on the case where the diameter $D$ of the underlying network is constant. We show that for the case where $D = 1$ there is, quite surprisingly, a very simple algorithm that solves the reachability problem in 1 (!) round. In contrast, for networks with $D = 2$, we show that any distributed algorithm (possibly randomized) for this problem requires $\Omega(\sqrt{n/\log n})$ rounds. Our results therefore completely resolve (up to a small polylogarithmic factor) the complexity of the single-source reachability problem for a wide range of diameters. Furthermore, we show that when $D = 1$, it is even possible to get a 3-approximation for the all-pairs shortest path problem (for directed unweighted graphs) in just 2 rounds. We also prove a stronger lower bound of $\Omega(\sqrt{n})$ for the single-source shortest path problem for unweighted directed graphs that holds even when the diameter of the underlying network is 2. As far as we know, this is the first lower bound that achieves $\Omega(\sqrt{n})$ for this problem.

1 Introduction

Reachability and shortest path are two of the most fundamental problems in graph algorithms. In this paper, we study the single-source reachability (SSR) problem and the single-source shortest path (SSSP) problem in the Broadcast CONGEST model of distributed computing.

The CONGEST model [18] is one of the most studied message-passing models in the field of distributed computing. In this model, a synchronized $n$-vertex communication network is modeled by an undirected graph $G$ whose vertices correspond to the processors in this network and whose edges correspond to the communication links between them. Each vertex has a unique $O(\log n)$-bit identifier initially known only to itself and its neighbors in $G$. The vertices communicate in discrete rounds, where in each round each vertex receives the messages that were previously sent to it, performs some unbounded local computation and then sends messages of $O(\log n)$ bits to all or some of its neighbors. The vertices work together on some common task (such as computing distances in the network) and the complexity is measured by the number of communication rounds needed to complete this task. The Broadcast CONGEST model is a more restrictive variant of the CONGEST model where every vertex has to send (broadcast) the same message to all of its neighbors in each round.

In this paper we focus on directed and unweighted graphs. In the SSR problem, we are asked to identify all the vertices in a given graph $G$ for which there is a directed path from some designated vertex $s$ called the source. In the SSSP problem, we are further asked to compute for each such vertex its distance (the number of edges in a shortest path) from the source $s$. In the CONGEST model as well as in other similar message-passing models, we assume that the communication network is identical to the underlying graph of $G$ (where $G$ is the input graph for the SSR/SSSP problem). We also assume that the communication between the vertices is bi-directional (regardless of the directions of the edges in $G$). Initially, each vertex in the network knows whether it is the source or not, and it also knows its set of incoming and outgoing edges in $G$. In the distributed SSR problem, each vertex has to determine whether it is reachable from the source or not, and in the distributed SSSP problem, each vertex has to determine its distance from the source.
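To make the two problem statements concrete, here is a minimal centralized reference in Python. It is not a distributed algorithm and is not taken from the paper; it only pins down what every vertex is expected to output. The function name `sssp_unweighted`, the vertex labels `0..n-1` and the use of `None` for unreachable vertices are illustrative assumptions.

```python
from collections import deque


def sssp_unweighted(n, edges, source):
    """Centralized ground truth for SSR/SSSP on a directed unweighted graph.

    Assumption (not from the paper): vertices are labelled 0..n-1 and
    `edges` is a list of directed pairs (u, v).
    Returns a list `dist` where dist[v] is the number of edges on a
    shortest directed path from `source` to v, or None if v is unreachable.
    """
    out_nbrs = [[] for _ in range(n)]
    for u, v in edges:
        out_nbrs[u].append(v)

    dist = [None] * n
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in out_nbrs[u]:
            if dist[v] is None:  # first visit is along a shortest path
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def ssr(n, edges, source):
    """SSR output: for every vertex, whether it is reachable from `source`."""
    return [d is not None for d in sssp_unweighted(n, edges, source)]
```

In the distributed setting each vertex must produce only its own entry of these lists, and the cost is the number of communication rounds rather than the running time of the search.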

Related Work

Distance computation problems (such as the SSSP problem) have been widely studied in many models of distributed computing. It is not hard to see that in many synchronous message-passing models, problems such as SSR and SSSP require $\Omega(D)$ rounds (where $D$ is the diameter of the underlying network). While this lower bound can be easily matched when messages of unbounded size are allowed, the situation for models that require the messages to be of bounded size is far more involved.

In the CONGEST model, the directed SSR problem has been studied in several papers [17, 10, 14]. The latest [14] shows that it is possible to solve this problem in rounds with high probability. Many variants of the SSSP problem (directed/undirected, exact/approximate, etc.) were studied over the years (see, e.g., [17, 11, 2, 7, 9, 8]). In particular, for directed and weighted graphs, there is a randomized algorithm that solves the SSSP problem in rounds [8]. We note that many of the above mentioned algorithms (such as [10, 8]) actually work in the more restrictive Broadcast CONGEST model. Regarding lower bounds, Das Sarma et al. [6] showed that in the CONGEST model the time complexity of any (possibly randomized) algorithm for the directed single-source reachability problem is . However, this lower bound was shown only for graphs of underlying diameter for some . For smaller diameters, similar but weaker lower bounds were shown e.g. for graphs of underlying diameter . The smallest constant diameter for which a non-trivial lower bound is known is 3 where it was shown to require rounds [6].

For the related all-pairs shortest path (APSP) problem, many algorithms with near-optimal complexities for the approximate version of this problem and for the case of unweighted graphs were developed over the years (e.g., [12, 15, 16, 17]). Recently, many algorithms with improved complexities for the case of weighted graphs were devised [7, 13, 1] culminating with the -time randomized algorithm of [3].

Our Results

Many typical real-world networks have a relatively small diameter (as argued in e.g. [17]) and, in many cases, a simple topology as well. It is thus of particular interest to understand the complexity of many optimization problems in such settings. Motivated by this, we study in this paper the time complexity of the SSR problem and the SSSP problem (for directed unweighted graphs) in the Broadcast CONGEST model for networks of constant diameter. Specifically, we show that even for networks of diameter $2$ with a very simple topology, any distributed algorithm (possibly randomized) for the SSR problem requires $\Omega(\sqrt{n/\log n})$ rounds. In contrast, we show that, quite surprisingly, for networks of diameter $1$ this problem (or even the more general all-pairs reachability problem) can be solved deterministically in $1$ round. Moreover, we show that for networks of diameter $1$ one can compute in $2$ rounds a $3$-approximation for the SSSP/APSP problem.

The algorithm for the approximate APSP problem (resp. for the all-pairs reachability problem) allows each vertex to compute a $3$-approximation for the distance between every pair of vertices in the graph (resp. determine reachability for every such pair) and not only for the pairs to which it belongs. We note that if one can compute a $(2-\epsilon)$-approximation for the APSP problem (for some $\epsilon > 0$) such that there is some vertex $v$ that knows the computed estimation for every pair of vertices, then this vertex can recover the whole graph. This means that $v$ must receive in this case $\Omega(n^2)$ bits of information from its neighbors (simply because there are $2^{\Omega(n^2)}$ possible graphs on these vertices), but in each round, $v$ can get at most $O(n \log n)$ bits from its neighbors and so $\Omega(n/\log n)$ rounds are required for solving this problem.
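The counting argument above can be spelled out as follows. This is only a sketch under assumptions that are not stated explicitly in the text: the network has underlying diameter 1 (so $v$ has $n-1$ neighbors), every message has at most $B = O(\log n)$ bits, and $T$ denotes the number of rounds. Each of the $\binom{n-1}{2}$ vertex pairs not containing $v$ can carry an edge in one direction, in the other, or in both.

```latex
\begin{align*}
\#\{\text{graphs consistent with the local view of } v\}
  &\ge 3^{\binom{n-1}{2}}, \\
\text{bits that } v \text{ must learn}
  &\ge \log_2 3^{\binom{n-1}{2}} = \Omega(n^2), \\
\text{bits that } v \text{ receives in } T \text{ rounds}
  &\le T\,(n-1)\,B = O(T\, n \log n), \\
\Longrightarrow \quad T &= \Omega\!\left(\frac{n}{\log n}\right).
\end{align*}
```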

Our results show a large gap between networks of diameter $1$ and $2$. As upper bounds of $\tilde{O}(\sqrt{n})$ are already known for the SSR problem when the underlying network has a constant or polylogarithmic diameter (e.g., [10, 8]), we completely resolve (up to polylogarithmic factors) the complexity of the SSR problem when the diameter of the underlying network is in that range. Our algorithms are very simple (we see this as a plus and not a minus). In addition, we show a stronger lower bound of $\Omega(\sqrt{n})$ for the SSSP problem for unweighted directed graphs in the Broadcast CONGEST model that holds even when the diameter of the underlying network is 2. As far as we know, this is the first lower bound that achieves $\Omega(\sqrt{n})$ for this problem.

Further Related Work

A model closely related to the CONGEST model when the underlying communication network has diameter 1 is the Congested Clique model. The Congested Clique model is a synchronous message-passing model in which the underlying communication network is the complete graph on $n$ vertices but the graph $G$ on which the solution needs to be obtained can be an arbitrary graph on $n$ vertices (that is, each vertex initially knows its neighbors in $G$ and can exchange messages of size $O(\log n)$ with any vertex in the graph even if they are not adjacent in $G$).

Censor-Hillel et al. [5] adapted parallel matrix multiplication algorithms to this model. Using these algorithms, they obtained better algorithms for subgraph detection and distance computation. In particular, they showed a -round algorithm for solving the APSP problem for weighted directed graphs, and even more efficient algorithms for unweighted undirected graphs or distance approximation. Recently, it was shown [4] that the SSSP problem for weighted undirected graphs can be solved in rounds.

We note that for problems such as SSSP or APSP (for weighted graphs) the Congested Clique model is actually a special case of the CONGEST model when the diameter of the underlying network is 1. To see this, note that one can always transform the input graph into a complete graph by adding edges of very large weight. Therefore, either one can show a constant upper bound for the weighted SSSP problem in the Congested Clique model which will be quite a breakthrough or our upper bound shows a separation between the SSR problem and the SSSP problem for directed weighted graphs of underlying diameter 1 (and even between the all-pairs reachability problem and the SSSP problem).

2 Preliminaries

In the following, we assume that all directed graphs are simple (i.e., they do not contain self-loops or multiple edges, but they may contain anti-parallel edges). For a graph $G$, we respectively denote by $V(G)$ and $E(G)$ its vertex set and edge set. The out-degree and in-degree of a vertex $v$ in a directed graph $G$ are denoted by $d^+_G(v)$ and $d^-_G(v)$, respectively. For a directed graph $G$ and a vertex $v$ in $G$, we denote by $N^+_G(v)$ its set of outgoing neighbors, and by $N^-_G(v)$ its set of incoming neighbors. Given a directed graph $G$ and a set $S \subseteq V(G)$, we denote by $E^+_G(S)$ the set $\{(u,v) \in E(G) : u \in S,\ v \notin S\}$ of edges leaving $S$. The underlying diameter of a directed graph is defined to be the diameter of its underlying graph. For a graph $G$ and two vertices $u$ and $v$ in $G$, we denote by $\mathrm{dist}_G(u,v)$ the distance from $u$ to $v$ in $G$. All logarithms in this paper are of base $2$.

The rest of the paper is organized as follows. In section 3, we show an algorithm that solves the all-pairs reachability problem in one round for networks of diameter 1. In section 4, we show that in two rounds one can compute an approximation for the APSP problem (also for networks of diameter 1). In section 5, we prove lower bounds for computing reachability and distances in networks of diameter 2.

3 All-Pairs Reachability for Networks of Diameter 1

In this section we show that when the diameter of the underlying network is 1, the directed single-source reachability problem can be solved in $O(1)$ rounds in the Broadcast CONGEST model. In fact, we show that it can be solved in a single round. Furthermore, our algorithm can solve the much more general problem of all-pairs reachability (again in a single round). The algorithm is extremely simple. Every vertex simply sends its in-degree and out-degree to all of its neighbors in the underlying network, and then, by using this information only, each vertex can determine (by a simple computation) which vertex is reachable from which. This requires messages of at most $2\lceil \log n \rceil$ bits where $n$ is the number of vertices in the network (moreover, if there are no anti-parallel edges then, as the underlying diameter is 1, the in-degree plus out-degree of every vertex is exactly $n-1$ and therefore it is enough to send only the in-degree and so $\lceil \log n \rceil$ bits are enough).

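The next lemma shows that when the underlying diameter of some directed graph $G$ is 1, we can determine if $E^+_G(S) = \emptyset$ (that is, if no edge of $G$ leaves $S$) by using the in and out degrees of the vertices in $S$, for every subset of vertices $S \subseteq V(G)$.

Lemma 3.1.

For every directed graph $G$ with underlying diameter $1$ and every set $S \subseteq V(G)$, we have $E^+_G(S) = \emptyset$ if and only if $\sum_{v \in S} \big(d^-_G(v) - d^+_G(v)\big) = |S| \cdot |V(G) \setminus S|$.

Proof.

Let $G$ be a directed graph with underlying diameter $1$ and let $S$ be some subset of $V(G)$. Denote by $G[S]$ the subgraph of $G$ induced by $S$. We have $\sum_{v \in S} d^+_G(v) = |E(G[S])| + |E^+_G(S)|$ and similarly $\sum_{v \in S} d^-_G(v) = |E(G[S])| + |E^+_G(V(G) \setminus S)|$. It follows that

$$\sum_{v \in S} \big(d^-_G(v) - d^+_G(v)\big) = |E^+_G(V(G) \setminus S)| - |E^+_G(S)|. \tag{1}$$

Now, for showing the first direction, assume that $\sum_{v \in S} \big(d^-_G(v) - d^+_G(v)\big) = |S| \cdot |V(G) \setminus S|$. By equation (1), we have $|E^+_G(V(G) \setminus S)| - |E^+_G(S)| = |S| \cdot |V(G) \setminus S|$. As $|E^+_G(V(G) \setminus S)| \le |S| \cdot |V(G) \setminus S|$, we get that $|E^+_G(S)| = 0$ and so $E^+_G(S) = \emptyset$.

For the second direction, assume that $E^+_G(S) = \emptyset$. Since in addition $G$ has underlying diameter $1$, every vertex in $V(G) \setminus S$ must have an outgoing edge to every vertex in $S$ and so $|E^+_G(V(G) \setminus S)| = |S| \cdot |V(G) \setminus S|$. It follows, by equation (1), that $\sum_{v \in S} \big(d^-_G(v) - d^+_G(v)\big) = |S| \cdot |V(G) \setminus S|$.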
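The degree test behind Lemma 3.1 is easy to check by brute force. The sketch below reads the lemma as "no edge leaves $S$ if and only if the sum over $S$ of in-degree minus out-degree equals $|S| \cdot (n - |S|)$", which is our reconstruction of the stripped formula, and compares it with a direct scan of the edges. All function names are hypothetical.

```python
from itertools import combinations, product


def degrees(n, edges):
    """Return (in_deg, out_deg) lists for a digraph on vertices 0..n-1."""
    in_deg, out_deg = [0] * n, [0] * n
    for u, v in edges:
        out_deg[u] += 1
        in_deg[v] += 1
    return in_deg, out_deg


def closed_by_degrees(n, in_deg, out_deg, subset):
    """Degree-only test: assumed form of the condition in Lemma 3.1."""
    balance = sum(in_deg[v] - out_deg[v] for v in subset)
    return balance == len(subset) * (n - len(subset))


def closed_by_edges(edges, subset):
    """Direct test: True iff no edge goes from `subset` to its complement."""
    inside = set(subset)
    return not any(u in inside and v not in inside for u, v in edges)


def all_diameter_one_digraphs(n):
    """Yield every simple digraph on 0..n-1 whose underlying graph is complete.

    Each unordered pair carries the edge u->v, the edge v->u, or both.
    """
    pairs = list(combinations(range(n), 2))
    for choice in product(range(3), repeat=len(pairs)):
        edges = []
        for (u, v), c in zip(pairs, choice):
            if c in (0, 2):
                edges.append((u, v))
            if c in (1, 2):
                edges.append((v, u))
        yield edges
```

Only the degrees of the vertices in the subset enter `closed_by_degrees`, which is what lets a vertex evaluate the test after hearing one broadcast from everybody.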

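The next lemma shows that when the underlying diameter of some directed graph $G$ is 1, the in and out degrees of all the vertices in $G$ are enough to determine which vertices are reachable from any given vertex in $G$. Below, for a set $S \subseteq V(G)$, we write $E^+_G(S)$ for the set of edges of $G$ that leave $S$.

Lemma 3.2.

For every directed graph $G$ with underlying diameter $1$, every ordering $v_1, \ldots, v_n$ of its vertices such that $d^+_G(v_1) \le \cdots \le d^+_G(v_n)$ and every $i \in \{1, \ldots, n\}$, there exists an index $j \in \{i, \ldots, n\}$ such that the set of reachable vertices from $v_i$ in $G$ is equal to $\{v_1, \ldots, v_j\}$. Moreover, $j$ is the minimal index in $\{i, \ldots, n\}$ for which $\sum_{k=1}^{j} \big(d^-_G(v_k) - d^+_G(v_k)\big) = j(n-j)$.

Proof.

Let $G$ be a directed graph with underlying diameter $1$, let $v_1, \ldots, v_n$ be an ordering of its vertices such that $d^+_G(v_1) \le \cdots \le d^+_G(v_n)$ and let $i \in \{1, \ldots, n\}$. Let $R$ be the set of all the reachable vertices from $v_i$ in $G$, and note that we must have $E^+_G(R) = \emptyset$ (as otherwise $v_i$ can reach a vertex from $V(G) \setminus R$, which is of course a contradiction to the definition of $R$).

Let $j$ be the highest index in $\{1, \ldots, n\}$ for which $v_j \in R$ (such an index must exist as $v_i \in R$). Clearly, we have $j \ge i$ and $R \subseteq \{v_1, \ldots, v_j\}$. We claim that we must also have $\{v_1, \ldots, v_j\} \subseteq R$. Since $v_j \in R$ and $E^+_G(R) = \emptyset$, the set $R$ must contain at least $d^+_G(v_j) + 1$ vertices (the vertex $v_j$ and its outgoing neighbors). It also follows that every vertex in $V(G) \setminus R$ must have out-degree at least $d^+_G(v_j) + 1$. To see this, note that every vertex in $V(G) \setminus R$ must have an outgoing edge to every vertex in $R$ (as $E^+_G(R) = \emptyset$ and the underlying diameter of $G$ is $1$). Therefore, it must be that $v_k \in R$ for all $k \le j$ as $d^+_G(v_k) \le d^+_G(v_j)$ for every such $k$. We conclude that $R = \{v_1, \ldots, v_j\}$.

Now, as $E^+_G(R) = \emptyset$ we get from Lemma 3.1 that $\sum_{k=1}^{j} \big(d^-_G(v_k) - d^+_G(v_k)\big) = j(n-j)$. We are left to show that $j$ is the minimal index in $\{i, \ldots, n\}$ with this property. Assume towards a contradiction that there exists $j'$ such that $i \le j' < j$ and $\sum_{k=1}^{j'} \big(d^-_G(v_k) - d^+_G(v_k)\big) = j'(n-j')$. Let $S = \{v_1, \ldots, v_{j'}\}$. By Lemma 3.1 we get that $E^+_G(S) = \emptyset$, and, in particular, that $v_j$ is not reachable from $v_i$ (as $v_i \in S$ and $v_j \notin S$), which is a contradiction.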

Lemma 3.2 can be easily turned into an algorithm that solves the all-pairs reachability problem in one round (when the diameter of the underlying network is 1) as follows. Each vertex $v$ in the graph starts by broadcasting the values of $d^+_G(v)$ and $d^-_G(v)$. After receiving the messages, $v$ sorts the vertices in non-decreasing order of their out-degree. Let $v_1, \ldots, v_n$ be that ordering. It then finds for every $i \in \{1, \ldots, n\}$ the minimal index $j \ge i$ such that $\sum_{k=1}^{j} \big(d^-_G(v_k) - d^+_G(v_k)\big) = j(n-j)$ and deduces by Lemma 3.2 that the set of reachable vertices from $v_i$ is $\{v_1, \ldots, v_j\}$. We conclude the following:

Corollary 3.1.

In the Broadcast CONGEST model, there is a deterministic algorithm that solves the all-pairs reachability problem in one round when the diameter of the underlying network is $1$.

We also note that the time complexity of the internal computation of each vertex is .
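The local computation that each vertex runs after the single broadcast round can be sketched as follows. This is an illustrative reading of Lemma 3.2, not the authors' code: the prefix condition is taken to be "sum of in-degree minus out-degree over the first $j$ vertices equals $j(n-j)$", and the function names, the 0-based vertex labels and the search-based reference are assumptions.

```python
def reachability_from_degrees(in_deg, out_deg):
    """All-pairs reachability from the degree sequences alone.

    Assumes the digraph has underlying diameter 1 (every pair of vertices
    is joined by an edge in at least one direction). Returns a dict that
    maps each vertex to the set of vertices reachable from it.
    """
    n = len(out_deg)
    order = sorted(range(n), key=lambda v: out_deg[v])  # non-decreasing out-degree

    # Prefix lengths j for which no edge leaves {order[0], ..., order[j-1]}.
    closed_lengths = []
    balance = 0
    for j, v in enumerate(order, start=1):
        balance += in_deg[v] - out_deg[v]
        if balance == j * (n - j):
            closed_lengths.append(j)
    # j = n always qualifies, so closed_lengths is never empty.

    reach = {}
    idx = 0
    for i, v in enumerate(order, start=1):
        while closed_lengths[idx] < i:  # minimal closed prefix containing v
            idx += 1
        reach[v] = set(order[:closed_lengths[idx]])
    return reach


def reachable_by_search(n, edges, source):
    """Reference answer by graph search, used only for comparison."""
    out_nbrs = [[] for _ in range(n)]
    for u, v in edges:
        out_nbrs[u].append(v)
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v in out_nbrs[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen
```

For the transitive tournament with edges 0→1, 0→2, 1→2, calling `reachability_from_degrees([0, 1, 2], [2, 1, 0])` yields `{2}` for vertex 2, `{1, 2}` for vertex 1 and all three vertices for vertex 0. Ties in the out-degree order may be broken arbitrarily, since Lemma 3.2 holds for every valid ordering.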

4 APSP Approximation for Networks of Diameter 1

In the previous section, we showed that it is possible to solve the all-pairs reachability problem in one round for networks of diameter 1. Here we show that it is actually possible to compute an approximation to the distance between all pairs of vertices in two rounds using messages of at most bits (where is the number of vertices in the network).

Let be a directed graph on vertices and underlying diameter . For every non-negative integer , we let be the set of all the vertices whose out-degree is greater than and that have some in-going neighbor whose out-degree is at most , that is, . We also set to be if and otherwise.

Next, we define for each vertex a sequence of elements by first setting to be , and then for each , we set to be if and to otherwise. We first prove some basic properties.

Claim 4.1.

For every and , if then and .

Proof.

Let and be such that . By definition, we must have and , that is, must be equal to the maximum out-degree of the vertices in . As, by definition, contains only vertices whose out-degree is greater than , it must be that .

Note that the above claim implies that if holds for some and , then we must have for every and . In particular, we must have for every (as otherwise, we would get that which is impossible as the maximum possible out-degree is ).

In the next two lemmas, we show how the defined sequences can be used to estimate the distances between the vertices in the graph.

Lemma 4.1.

For every two different vertices and in , if is reachable from in , then there exists an index such that and . Moreover, if is the minimal index for which this property holds, then the distance from to in is at least .

Proof.

Let and be two different vertices in such that is reachable from , and let be some shortest path from to in .

Let be the highest index in for which (such an index must exist as always holds). We claim that . Indeed, assume towards a contradiction that this is not the case, that is, assume that . As, in addition, we have , it must be that contains an edge such that and . It follows, by definition, that and so that which implies that . By the maximality of , this is possible only if , that is, we must have which is impossible. We conclude that there must be an index in for which and .

Now, let be the minimal index in with this property. We have to show that contains at least edges. As , there must be a vertex which is adjacent to in . By the definition of , we have and . In other words, contains at least two different vertices whose out-degree is at most . We will show that must also contain at least different vertices whose out-degree is greater than . This will imply that contains at least different vertices, and so at least edges.

First, note that holds for every (as ) and so from the minimality of it must be that holds for every . We claim that for every such there must be a vertex in such that . Indeed, let . As and , it must be that contains an edge such that and . By definition, we have and so , that is, contains a vertex such that . It follows that must contain vertices such that , that is, at least different vertices whose out-degree is greater than .

Lemma 4.2.

For every two vertices and in and every index , if and , then contains a path from to of length at most .

We first prove the following auxiliary claim:

Claim 4.2.

For every two vertices and in , if then contains a path from to of length at most .

Proof.

Let and be two vertices in such that and assume towards a contradiction that the claim does not hold. We must have and as otherwise the distance from to would be at most . Since the underlying diameter of is , we get that , and so that which is a contradiction.
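Since the formulas of Claim 4.2 were lost in extraction, the following brute-force check states the reading we assume: in a digraph of underlying diameter 1, if the out-degree of $u$ is at least the out-degree of $v$, then there is a path from $u$ to $v$ of length at most 2. The function names are hypothetical, and the enumeration covers only small graphs.

```python
from itertools import combinations, product


def diameter_one_digraphs(n):
    """Yield every simple digraph on 0..n-1 with a complete underlying graph."""
    pairs = list(combinations(range(n), 2))
    for choice in product(range(3), repeat=len(pairs)):
        edges = []
        for (u, v), c in zip(pairs, choice):
            if c in (0, 2):
                edges.append((u, v))
            if c in (1, 2):
                edges.append((v, u))
        yield edges


def within_two_steps(out_sets, u, v):
    """True iff there is a directed path from u to v with at most 2 edges."""
    if u == v or v in out_sets[u]:
        return True
    return any(v in out_sets[w] for w in out_sets[u])


def claim_holds(n, edges):
    """Check the assumed statement of Claim 4.2 on one digraph."""
    out_sets = [set() for _ in range(n)]
    for a, b in edges:
        out_sets[a].add(b)
    return all(
        within_two_steps(out_sets, u, v)
        for u in range(n)
        for v in range(n)
        if len(out_sets[u]) >= len(out_sets[v])
    )
```

The diameter assumption matters: on two isolated vertices both out-degrees are 0 and no path exists, so `claim_holds(2, [])` is `False`.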

Now, we prove Lemma 4.2:

Proof.

Let and be two vertices in . Let be such that and . We first show by induction that for every there is a vertex of out-degree whose distance from is at most .

The base case () holds as, by the definition of , there must be a vertex such that , that is, there must be a vertex of out-degree whose distance from is at most . Assume now that the claim holds for some and prove it for . By the induction hypothesis, there must be a vertex whose out-degree is and whose distance from is at most . We must have (as ) and so, by definition, there must be some edge such that and . It follows that and so from Claim 4.2 the distance from to is at most . We get that the distance from to is at most , and so the distance from to is at most .

We conclude from the above that there must be some vertex of out-degree whose distance from is at most . As and , we get that . It follows, by Claim 4.2, that the distance from to is at most , and so the distance from to is at most .

Lemmas 4.1 and 4.2 give us a way to estimate the distance from every vertex to every other vertex by just knowing and the sequence . To see this, note first that these lemmas imply that is reachable from if and only if there exists an index such that and . Furthermore, these lemmas imply that if is the minimal index with this property then . Therefore, to compute an estimation for the distance , all we need to do is to find the minimal index (if any) such that and , and then set to be if no such an index exists or to otherwise.

Our next goal is to show that in two communication rounds each vertex can learn the values of and of each vertex . To this end, we describe a slightly different but equivalent way to compute the sequence .

For every vertex with positive out-degree, we let be the maximum out-degree of its out-going neighbors, that is, . For each , we set to be , and then we set to be if and otherwise.

Claim 4.3.

For every , we have .

Proof.

Let . We consider separately the cases and . For the first case, note that, by definition, we must have , that is, there cannot be any vertex of out-degree at most that has an out-going neighbor of out-degree greater than . In other words, we must have for every such that , and so which implies that .

In the second case, we have and so, by definition, there must be some vertex of out-degree at most that has an out-going neighbor of out-degree . Moreover, is the largest possible out-degree of any vertex in the out-going neighborhood of any vertex of out-degree at most . This means that and for every such that , and so .

Now, given a vertex , it is easy to see that is equal to if and to otherwise. Moreover, for every we have, by the above claim, if and otherwise. This observation gives rise to the following algorithm. Each vertex first broadcasts its out-degree to all of its neighbors in the underlying network. In the next round, each vertex with positive out-degree finds the maximal out-degree of its outgoing neighbors and broadcasts this value. By using the received information, each vertex can compute for every and the value of and so compute a -approximation to the distance between every pair of vertices. We conclude the following:

Corollary 4.1.

In the Broadcast CONGEST model, there is a deterministic algorithm that solves the 3-approximate APSP problem (for unweighted directed graphs) in two rounds when the diameter of the underlying network is 1.
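The two-round procedure can be simulated centrally as below. Because the symbols of this section did not survive extraction, the sketch fixes one concrete reading that is consistent with the lemmas, and every name in it is an assumption: `max_nbr[v]` is the largest out-degree among the outgoing neighbors of `v` (the round-2 message), `step[k]` is the largest such value over vertices of out-degree at most `k` when it exceeds `k` (and `k` otherwise), and the sequence of a vertex `u` starts from the largest out-degree in its closed out-neighborhood. The returned index `i` is meant to satisfy `i <= dist(u, v) <= 3 * i`.

```python
def approx_apsp_diameter_one(n, edges):
    """Distance indices for all pairs, assuming underlying diameter 1.

    Returns est where est[u][v] is None if v is deemed unreachable from u,
    0 if u == v, and otherwise an index i with i <= dist(u, v) <= 3 * i.
    """
    out_nbrs = [[] for _ in range(n)]
    for u, v in edges:
        out_nbrs[u].append(v)

    # Round 1: every vertex broadcasts its out-degree.
    out_deg = [len(nbrs) for nbrs in out_nbrs]
    # Round 2: every vertex with positive out-degree broadcasts the largest
    # out-degree among its outgoing neighbors.
    max_nbr = [max((out_deg[w] for w in out_nbrs[v]), default=None)
               for v in range(n)]

    # Local computation, identical at every vertex.
    best = [-1] * n  # best[d]: largest round-2 value among vertices of out-degree d
    for v in range(n):
        if max_nbr[v] is not None:
            best[out_deg[v]] = max(best[out_deg[v]], max_nbr[v])
    step, running = [], -1
    for k in range(n):
        running = max(running, best[k])
        step.append(running if running > k else k)

    est = [[None] * n for _ in range(n)]
    for u in range(n):
        est[u][u] = 0
        x = out_deg[u] if max_nbr[u] is None else max(out_deg[u], max_nbr[u])
        levels = [x]
        while step[x] != x:  # strictly increasing, so it stops within n steps
            x = step[x]
            levels.append(x)
        for v in range(n):
            if v == u:
                continue
            for i, level in enumerate(levels, start=1):
                if level >= out_deg[v]:
                    est[u][v] = i
                    break
    return est
```

Reporting `3 * i` instead of `i` gives an estimate that never underestimates the true distance and overestimates it by a factor of at most 3. On the directed triangle 0→1→2→0 every ordered pair gets index 1, while the true distances are 1 or 2.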

5 Lower Bounds for Networks of Diameter 2

In this section we prove lower bounds for the single-source reachability problem and the closely related single-source shortest path problem for unweighted directed graphs in the Broadcast CONGEST model that hold even when the underlying network has diameter 2.

5.1 The Single-Source Reachability Problem

We start this section by describing a family (parameterized by two positive integers and ) of directed graphs with underlying diameter at most which we denote by . This family will be used later on to prove the required lower bound.

The Family .

For two positive integers and a -bit string , we define the directed graph to be the graph that consists of:

  • vertex-disjoint directed paths with vertices each (that is, where and for every ).

  • A source vertex that has an outgoing edge to (the first vertex of ) if the -th bit of is , for every .

  • A sink vertex to which and every vertex in has an outgoing edge.

In other words, the vertex set of the graph is and its edge set is (see Figure 1 for an illustration). For two positive integers and , we define the family to be the set .

Figure 1: An illustration of the graph for , and .
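The construction above can be written out as a small generator. The parameter names `num_paths` and `path_len`, the vertex labels `"s"`, `"t"` and `(i, j)`, and the convention that a bit equal to 1 creates the edge from the source are assumptions made for illustration, since the original symbols were lost.

```python
def build_reachability_instance(num_paths, path_len, bits):
    """Edges of the lower-bound graph for one bit string.

    Path i consists of the vertices (i, 0), ..., (i, path_len - 1).
    The source "s" points to (i, 0) exactly when bits[i] == 1 (assumed
    convention). The source and every path vertex point to the sink "t".
    """
    edges = [("s", "t")]
    for i in range(num_paths):
        for j in range(path_len - 1):
            edges.append(((i, j), (i, j + 1)))
        for j in range(path_len):
            edges.append(((i, j), "t"))
        if bits[i] == 1:
            edges.append(("s", (i, 0)))
    return edges


def reachable_from(edges, source):
    """Set of vertices reachable from `source` along directed edges."""
    out_nbrs = {}
    for u, v in edges:
        out_nbrs.setdefault(u, []).append(v)
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v in out_nbrs.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen
```

Every vertex other than the sink is adjacent to the sink in the underlying graph, so the underlying diameter is at most 2. The last vertex of path `i` is reachable from the source exactly when `bits[i] == 1`, so the outputs of the last vertices together spell out the whole bit string.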

Our next goal is to show that any distributed algorithm that solves the single-source reachability problem for all the graphs in requires a significant number of rounds. We start with the following lemma:

Lemma 5.1.

Let and be two positive integers. Let and let be some legal assignment of identifiers to its vertices. Let be some deterministic distributed algorithm (in the Broadcast CONGEST model) that solves the single-source reachability problem on the instance using at most rounds (for some non-negative integer ). For each , the output of the vertex by the end of the last round is just a function of the initial input of the vertices and the sequence of messages that received from .

Proof.

Let . We will show by induction on that by the end of the -th round the state of each vertex is just a function of the initial input of the vertices in its ball of radius (in the underlying graph of ) and the sequence of messages that it received from up to this round.

The base case () clearly holds as the state of each vertex in by the end of round can depend only on its initial input. Assume now that the claim holds for some and prove it for . Let . The state of by the end of the ()-th round is a function of its state at the end of the previous round and the messages that it received from its neighbors in the underlying network (which are , and possibly ).

The messages that has received from its neighbors in are, by the induction hypothesis, functions of the inputs of the vertices in the balls of radius (in ) around these neighbors and the sequence of messages that they received from (up to round ). As broadcasts the same message to all the vertices in each round, we get that these messages are just a function of the inputs of the vertices in the ball of radius around (in ) and the sequence of messages that received from (up to round ). As the previous state of is, by the induction hypothesis, also a function of the initial inputs of the vertices in its ball of radius (in ) and the sequence of message that it received from , the claim follows.

Lemma 5.2.

Let and be two positive integers and let be some legal assignment of identifiers to . For every deterministic algorithm (in the Broadcast CONGEST model), if solves the single-source reachability problem on all the instances in and uses messages of size at most bits (for some ), then requires at least rounds.

Proof.

Let be some deterministic algorithm that satisfies the requirements of the lemma and let be its running time. We can assume that as otherwise there is nothing to show.

For each , we let be the sequence where is the output of when is invoked on , for every . Lemma 5.1 implies that for each the value of is just a function of the initial inputs in of the vertices and the sequence of messages that had broadcast. Since we assumed that , the initial inputs of these vertices is the same in all , and so we can have for two graphs and in only if the sequence of messages that had broadcast in the corresponding invocations was different.

In each round, may send a message that contains at most bits, that is, a message with bits, or with bit and so on. Therefore, there are different messages that may send in each round. It follows that there are at most possible sequences and so we get that . Note also that as for each the output should be different. We conclude that and so .
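The counting step of this proof can be written out as follows. The symbols are our own labels, not necessarily the paper's: $B$ is the maximum message size in bits, $T$ is the number of rounds, and $k$ is the number of paths (so there are $2^k$ instances, one per bit string, each requiring a different output vector).

```latex
\begin{align*}
\#\{\text{messages of at most } B \text{ bits}\}
  &= \sum_{b=0}^{B} 2^{b} = 2^{B+1} - 1, \\
\#\{\text{message sequences over } T \text{ rounds}\}
  &\le \left(2^{B+1} - 1\right)^{T} \le 2^{(B+1)T}, \\
2^{k} \le 2^{(B+1)T}
  \quad &\Longrightarrow \quad T \ge \frac{k}{B+1}.
\end{align*}
```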

Corollary 5.1.

In the Broadcast CONGEST model, there is no deterministic algorithm that solves the single-source reachability problem in $o(\sqrt{n/\log n})$ rounds even when the diameter of the underlying network is always $2$.

Proof.

Assume towards a contradiction that there exists a deterministic algorithm that solves the above problem in rounds. As works in the CONGEST model, there must be some constant such that the number of bits in any message that the algorithm may send (when it is invoked on inputs of size ) is at most .

Since , there must be some integer for which holds for every . Choose an integer such that both and are positive integers. Lemma 5.2 implies that there must be some and some assignment of identifiers to such that invoking the algorithm on requires at least rounds. But, we also have and so the algorithm must take at most rounds on , a contradiction.

In the next lemma, we show that the same lower bound holds for distributed randomized algorithms as well.

Lemma 5.3.

Let and be two positive integers and let be some legal assignment of identifiers to . For every randomized algorithm (in the Broadcast CONGEST model), if correctly solves the SSR problem on each instance in with probability and uses messages of size at most bits (for some ), then requires at least rounds.

Proof.

Clearly, it is sufficient to show that this lower bound holds in a model that generates a public random string first, announces it to every vertex in the graph and then every vertex proceeds deterministically as usual. Let be a randomized algorithm that works in the above model and solves the SSR problem on every instance in with probability . We can assume that its running time is at most . As in the proof of Lemma 5.2, we can show that, for every fixed random string , the algorithm (given that string) can succeed on at most of the instances in .

For each graph in , let be the event that the algorithm fails on . Note that we must have . By assumption, we must also have and so which implies that .

5.2 The Single-Source Shortest Path Problem

The result of the previous section already gives a lower bound of $\Omega(\sqrt{n/\log n})$ for the directed SSSP problem (or even for the approximate version of it) as, by definition, any algorithm that solves this problem must also solve the SSR problem.

In this section we show a slightly stronger lower bound of $\Omega(\sqrt{n})$ for this problem which holds even when the diameter of the underlying network is 2 and even when all the vertices in the input graph are guaranteed to be reachable from the given source. As in the previous section, we start by describing a family of unweighted directed graphs with underlying diameter 2 which will be used to prove the lower bound.

The Family .

For a positive integer and a sequence of numbers from , we define the directed graph to be the graph that consists of:

  • vertex-disjoint directed paths where each contains vertices. For each path , we denote by its first vertex and by its last vertex.

  • A source vertex that has an outgoing edge to the first vertex of every path in .

  • A sink vertex to which and every vertex in has an outgoing edge.

In other words, the vertex set of the graph is and its edge set is (see Figure 2 for an illustration). For a positive integer , we define the family to be the set