Distributed Approximation Algorithms for Steiner Tree in the CONGESTED CLIQUE

07/28/2019
by   Parikshit Saikia, et al.
iit guwahati

The Steiner tree problem is one of the fundamental and classical problems in combinatorial optimization. In this paper, we study this problem in the CONGESTED CLIQUE model of distributed computing and present two deterministic distributed approximation algorithms for it. The first algorithm computes a Steiner tree in $\tilde{O}(n^{1/3})$ rounds and $\tilde{O}(n^{7/3})$ messages for a given connected undirected weighted graph of $n$ nodes. Here the $\tilde{O}(\cdot)$ notation hides polylogarithmic factors in $n$. The second one computes a Steiner tree in $O(S + \log\log n)$ rounds and $O(S(n - t)^2 + n^2)$ messages, where $S$ and $t$ are the shortest path diameter and the number of terminal nodes respectively in the given input graph. Both algorithms admit an approximation factor of $2(1 - 1/\ell)$, where $\ell$ is the number of terminal leaf nodes in the optimal Steiner tree. For graphs with $S = \omega(n^{1/3} \log n)$, the first algorithm exhibits better performance than the second one in terms of the round complexity. On the other hand, for graphs with $S = \tilde{o}(n^{1/3})$, the second algorithm outperforms the first one in terms of the round complexity. In fact, when $S = O(\log\log n)$ the second algorithm admits a round complexity of $O(\log\log n)$ and a message complexity of $\tilde{O}(n^2)$. To the best of our knowledge, this is the first work to study the Steiner tree problem in the CONGESTED CLIQUE model.

1 Introduction

The CONGESTED CLIQUE model (CCM) is one of the fundamental models in distributed computing; it was first introduced by Lotker et al. (2005). In this model nodes communicate with each other via an underlying communication network, which is a clique. Communication happens in synchronous rounds and a pair of nodes can exchange $B$ bits in a round. In this paper we assume that $B = O(\log n)$, where $n$ is the number of nodes in the communication network. In the literature there are two other classic models of distributed computing, namely LOCAL and CONGEST. The LOCAL model mainly focuses on locality (footnote 1) and ignores congestion by allowing messages of unlimited size to be communicated Peleg and Rubinovich (2000); Linial (1992). The CONGEST model on the other hand simultaneously considers congestion (by bounding the transmitted message size) and locality. In contrast, the CCM takes locality out of the picture and focuses solely on congestion. Since the hop diameter in the CCM is one, nodes can communicate with each other directly, and in each round they can together exchange $O(n^2 \log n)$ bits. Note that in all three models of distributed computing, nodes (processors) are considered computationally unbounded.

Footnote 1. In distributed computing, locality means that processors are restricted to collecting data from others that are at a distance of $x$ hops in $x$ time units. The issue of locality is to what extent a global solution to a computational problem can be obtained from locally available data Linial (1992).

The nodes and a subset of the edges of the communication network form the input graph in the CCM. In the input graph each node has a unique identity (ID) and it knows the weights of all the edges incident on it. In this paper we study a classical combinatorial optimization problem, the Steiner tree problem, in the CCM. It is defined as follows.

Definition 1.1 (Steiner tree (ST) problem).

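Given a connected undirected graph $G = (V, E)$ and a weight function $w : E \to \mathbb{R}^+$, and a set of vertices $Z \subseteq V$, known as the set of terminals, the goal of the ST problem is to find a tree $T' = (V', E')$ such that $\sum_{e \in E'} w(e)$ is minimized subject to $Z \subseteq V' \subseteq V$ and $E' \subseteq E$.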

The set $V \setminus Z$ is known as the set of non-terminals or Steiner nodes. Note that if $|Z| = 2$ then the ST problem reduces to the problem of finding a shortest path between two distinct nodes in the network. On the other hand if $|Z| = |V|$ then the ST problem becomes the minimum spanning tree (MST) problem. In other words, the ST problem is a generalization of the MST problem. It is known that both the MST and the shortest path problems can be solved in polynomial time, whereas the ST problem is one of the original 21 problems proved NP-complete by Karp (1972) (in the centralized setting). The best known (polynomial time) approximation ratio for the ST problem in the centralized setting is $\ln 4 + \epsilon < 1.39$, for $\epsilon > 0$, due to Byrka et al. (2010). It is also known that the ST problem cannot be approximated in polynomial time within a factor of $96/95$ unless $P = NP$ Chlebík and Chlebíková (2008).

The ST problem finds applications in numerous areas such as VLSI layout design, communication networks, transportation networks, content distribution (video on demand, streaming multicast) networks, and phylogenetic tree reconstruction in computational biology. Moreover the ST problem appears as a subproblem or as a special case of many other problems in network design, such as Steiner forest and prize-collecting Steiner tree. There are many variations of the ST problem, such as directed Steiner tree, metric Steiner tree, Euclidean Steiner tree, and rectilinear Steiner tree. Hauptmann and Karpinski (2015) provide a website with continuously updated state-of-the-art results for many variants of the problem.

Motivation. The motivation behind the study of the CCM is to understand the role of congestion in distributed computing. There has been a lot of progress in solving various problems in the CCM including minimum spanning tree (MST) Lotker et al. (2005); Hegeman et al. (2014); Ghaffari and Parter (2016); Jurdziński and Nowicki (2018), facility location Berns et al. (2012); Gehweiler et al. (2014), shortest paths and distances Censor-Hillel et al. (2015); Holzer and Pinsker (2016); Nanongkai (2014), subgraph detection Drucker et al. (2012), triangle finding Drucker et al. (2012); Dolev et al. (2012), sorting Patt-Shamir and Teplitsky (2011); Lenzen (2013), routing Lenzen (2013), and ruling sets Hegeman et al. (2014); Berns et al. (2012). Recently Pai and Pemmaraju (2019) studied graph connectivity lower bounds in the CCM. Specifically they showed that the round complexity lower bound for the graph connectivity problem in the BCCM(1) (footnote 2), which is $\Omega(\log n)$, holds for both deterministic and constant-error randomized Monte Carlo algorithms. Despite the fact that the ST problem has been extensively studied in the CONGEST model of distributed computing Chen et al. (1993); Chalermsook and Fakcharoenphol (2005); Khan et al. (2008); Lenzen and Patt-Shamir (2015); Saikia and Karmakar (2019), to the best of our knowledge, such a study has not been carried out in the CCM. The best deterministic round complexity for solving the ST problem in the CONGEST model was recently achieved by Saikia and Karmakar (2019), which is $O(S + \sqrt{n} \log^* n)$ with an approximation factor of $2(1 - 1/\ell)$, where $S$ is the shortest path diameter (footnote 3; the definition is deferred to Section 2) of a graph and $\ell$ is the number of terminal leaf nodes in the optimal ST; this improves the previous best round complexity of the ST problem Lenzen and Patt-Shamir (2015). The MST problem is a special case of the ST problem and has been extensively studied in the CONGEST model as well as in the CCM. The best deterministic round complexity known so far for solving the MST problem in the CCM is due to Lotker et al. (2005), which is $O(\log\log n)$. The algorithm in Lotker et al. (2005) has a message complexity of $\Theta(n^2)$. There are other algorithms for the MST problem in the CCM that are randomized, with round complexities of $O(\log\log\log n)$ Hegeman et al. (2015) and $O(\log^* n)$ Pemmaraju and Sardeshmukh (2016); Ghaffari and Parter (2016). The message complexities of the algorithms in Hegeman et al. (2015), Ghaffari and Parter (2016), and Pemmaraju and Sardeshmukh (2016) are $\Theta(n^2)$, $\Theta(n^2)$, and $o(m)$ respectively. Here $m = |E|$. Recently Jurdziński and Nowicki (2018) gave a randomized algorithm that constructs an MST in $O(1)$ rounds in the CCM with high probability. Therefore an intriguing question is:

Footnote 2. In the literature the CCM is classified into two types, namely the broadcast CONGESTED CLIQUE model (BCCM) and the unicast CONGESTED CLIQUE model (UCCM) Drucker et al. (2014). In BCCM($b$) each node can only broadcast a single $b$-bit message over each of its incident links in each round. On the other hand the $b$-bit UCCM (UCCM($b$)) allows each node to send a possibly different $b$-bit message over each of its incident links in the network in each round.

Footnote 3. The term shortest path diameter was first introduced by Khan and Pandurangan (2008).

“What is the best round complexity that can be achieved in solving the ST problem in the CCM while maintaining an approximation factor of at most 2?”

In the CCM, one can trivially compute an ST in $O(n)$ rounds while maintaining an approximation factor of at most 2, as follows. First, the entire topology of the input graph is collected at a special node $r$, which takes $O(n)$ rounds. Then $r$ computes an ST by applying one of the best known centralized ST algorithms Kou et al. (1981); Wu et al. (1986); Byrka et al. (2010), whose approximation factor is at most 2, and finally informs each of the nodes involved in the resultant ST. Note that the resultant ST has at most $n - 1$ edges, whose information can be decomposed into $n - 1$ messages. Therefore $r$ can perform the final step in $O(1)$ rounds by sending each edge of the resultant ST to a different intermediate node, which then forwards it to the destined node.


Our contribution. In this work we propose two non-trivial deterministic distributed approximation algorithms for the ST problem in the CCM. Both algorithms admit an approximation factor of $2(1 - 1/\ell)$. The first one, denoted by STCCM-A, computes an ST in $\tilde{O}(n^{1/3})$ rounds and $\tilde{O}(n^{7/3})$ messages. We also propose a deterministic distributed shortest path forest (SPF) algorithm in the CCM (the definition of SPF is deferred to Section 3), henceforth denoted by SPF-A, that computes a SPF in $\tilde{O}(n^{1/3})$ rounds and $\tilde{O}(n^{7/3})$ messages and is used as a subroutine in the proposed STCCM-A algorithm. The SPF-A algorithm is based on an all pairs shortest path (APSP) algorithm in the CCM due to Censor-Hillel et al. (2015). The first contribution of this paper is summarized in the following theorem.

Theorem 1.1.

Given a connected undirected weighted graph $G = (V, E, w)$ and a terminal set $Z \subseteq V$, there exists an algorithm that computes an ST in $\tilde{O}(n^{1/3})$ rounds and $\tilde{O}(n^{7/3})$ messages in the CCM with an approximation factor of $2(1 - 1/\ell)$, where $\ell$ is the number of terminal leaf nodes in the optimal ST.

The proposed STCCM-A algorithm is inspired by an algorithm proposed in Saikia and Karmakar (2019). It consists of four steps (each step is a small distributed algorithm). The first step builds a SPF of the input graph $G$ for a given terminal set $Z$, which is essentially a partition of the graph into disjoint trees: each partition contains exactly one terminal and a subset of the non-terminals. A non-terminal $v$ joins the partition containing the terminal $z \in Z$ if and only if $d(z, v) \le d(x, v)$ for all $x \in Z \setminus \{z\}$ (footnote 4). In the second step the weights of the edges of $G$ are suitably changed with respect to the SPF, which produces a modified graph $G_c$. In the third step the MST algorithm proposed by Lotker et al. (2005) is applied to the graph $G_c$ to build an MST $T_M$. Finally, some edges are pruned from $T_M$ in such a way that in the remaining tree (which is the required ST) all leaves are terminal nodes.

Footnote 4. $d(u, v)$ denotes the (weighted) shortest distance between nodes $u$ and $v$ in graph $G$.

The second proposed algorithm for the ST problem in the CCM, denoted by STCCM-B, computes a $2(1 - 1/\ell)$-approximate ST in $O(S + \log\log n)$ rounds and $O(S(n - t)^2 + n^2)$ messages. Like the STCCM-A algorithm, the STCCM-B algorithm consists of four steps. Except for step 1, all steps of the STCCM-B algorithm are the same as those of the STCCM-A algorithm. For step 1 of the STCCM-B algorithm, which is the SPF construction, we propose another SPF algorithm in the CCM (henceforth denoted by SPF-B) that computes a SPF in rounds and messages. The second contribution of this paper is summarized in the following theorem.

Theorem 1.2.

Given a connected undirected weighted graph $G = (V, E, w)$ and a terminal set $Z \subseteq V$, there exists an algorithm that computes an ST in $O(S + \log\log n)$ rounds and $O(S(n - t)^2 + n^2)$ messages in the CCM with an approximation factor of $2(1 - 1/\ell)$, where $t = |Z|$, $S$ is the shortest path diameter of $G$, and $\ell$ is the number of terminal leaf nodes in the optimal ST.

As a by-product of the above theorem, for networks with constant or sufficiently small shortest path diameter (where $S = O(\log\log n)$) the following corollary holds.

Corollary 1.3.

If $S = O(\log\log n)$ then a $2(1 - 1/\ell)$-approximate ST can be deterministically computed in $O(\log\log n)$ rounds and $\tilde{O}(n^2)$ messages in the CCM.

The above round and message complexities of the STCCM-B algorithm almost coincide with the best known deterministic result for MST construction in the CCM due to Lotker et al. (2005), and the weight of the resultant ST is at most twice that of the optimal.

Related work. Chen et al. (1993) proposed the first deterministic distributed algorithm for the ST problem in the CONGEST model and achieved an approximation factor of . The time and message complexities of this algorithm are and respectively. Chalermsook and Fakcharoenphol (2005) presented a deterministic distributed -approximate algorithm for the ST problem in the synchronous CONGEST model (which allows a bounded message size only) with time and message complexities of and respectively. Khan et al. (2008) presented an -approximate randomized distributed algorithm for the ST problem in the CONGEST model with time complexity and message complexity . Lenzen and Patt-Shamir (2015) presented two distributed algorithms for the Steiner forest problem (a generalization of the ST problem) in the CONGEST model: one is deterministic and the other is randomized. The former finds a -approximate Steiner forest in rounds, where $D$ is the unweighted diameter and $k$ is the number of terminal components in the given input graph. The latter finds a -approximate Steiner forest in rounds with high probability. Note that if $k = 1$ then the Steiner forest problem reduces to the ST problem. In this case the round complexities of the deterministic and the randomized algorithms in Lenzen and Patt-Shamir (2015) reduce to and respectively. Saikia and Karmakar (2019) proposed a deterministic distributed $2(1 - 1/\ell)$-factor approximation algorithm for the ST problem in the CONGEST model with round and message complexities of $O(S + \sqrt{n} \log^* n)$ and $O(\Delta (n - t) S + n^{3/2})$ respectively, where $\Delta$ is the maximum degree of a node in the graph. Recently Bachrach et al. (2019) showed that the round complexity lower bound for solving the ST problem exactly in the CONGEST model is . In the approximate sense, on the other hand, no immediate lower bound result exists for solving the ST problem in the CONGEST model. Since the ST problem is a generalization of the MST problem, we believe that in the approximate sense the lower bound results for the MST problem (footnote 5) also hold for the ST problem in the CONGEST model.

Footnote 5. Das Sarma et al. (2011) showed that approximating the MST (for any factor $\alpha \ge 1$) requires $\Omega(D + \sqrt{n / (B \log n)})$ rounds (assuming $B$ bits can be sent through each edge in each round) in the CONGEST model. Kutten et al. (2015) established that $\Omega(m)$ is the message lower bound for leader election in the KT0 model (i.e. Knowledge Till radius 0), which holds for both deterministic and randomized (Monte Carlo) algorithms even if the network parameters $D$, $n$, and $m$ are known and all the nodes wake up simultaneously. Since a distributed MST algorithm can be used to elect a leader, the above message lower bound in the KT0 model also applies to distributed MST construction.

| References | Model | Type | Round complexity | Message complexity | Approximation |
|---|---|---|---|---|---|
| Chen et al. (1993) | CM | DT | | | |
| Chalermsook and Fakcharoenphol (2005) | CM | DT | | | |
| Khan et al. (2008) | CM | RM | | | |
| Lenzen and Patt-Shamir (2015) | CM | DT | | - | |
| | | RM | | - | |
| Saikia and Karmakar (2019) | CM | DT | $O(S + \sqrt{n} \log^* n)$ | $O(\Delta (n - t) S + n^{3/2})$ | $2(1 - 1/\ell)$ |
| this paper | CCM | DT | $\tilde{O}(n^{1/3})$ | $\tilde{O}(n^{7/3})$ | $2(1 - 1/\ell)$ |
| | | | $O(S + \log\log n)$ | $O(S(n - t)^2 + n^2)$ | $2(1 - 1/\ell)$ |

Table 1: Summary of results for the Steiner tree problem in the distributed setting. Here CM = CONGEST model, DT = deterministic, RM = randomized, and $S$ and $D$ are the shortest path diameter and the unweighted diameter respectively of a connected undirected weighted graph $G$.

Performances of some of the distributed algorithms mentioned above, together with that of our work, are summarized in Table 1.


Paper organization. The rest of the paper is organized as follows. In Section 2 we define the system model and notations. The description of the SPF-A algorithm is given in Section 3. The description of the STCCM-A algorithm and an illustrative example of it are given in Section 4. The proof of the STCCM-A algorithm is given in Section 5. The description and proof of the STCCM-B algorithm are given in Section 6. Brief descriptions of the APSP algorithm of Censor-Hillel et al. and the MST algorithm of Lotker et al. are deferred to Appendix A and Appendix B respectively. We conclude the paper in Section 7.

2 Model and notations

System model. We consider the CCM as specified in Lotker et al. (2005). This model consists of a complete network described by a clique of $n$ nodes. Each node represents an independent computing entity (processor). Nodes are connected through a point-to-point network of bidirectional communication links. The bandwidth of each communication link is bounded by $O(\log n)$ bits.

At the beginning of the computation, each node knows its own (unique) identity (ID), which can be represented by $O(\log n)$ bits, and the part of the input assigned to it (footnote 6). In the CCM the input graph $G$ and the underlying communication network, which is a clique, are not the same. The input graph is distributed among the nodes (processors) of the clique via a vertex partition. In this paper we consider that each vertex of $G$ along with its incident edges (with their weights) is assigned to a distinct node (processor) in the clique. If an edge does not exist in $G$ then the weight of such an edge is considered equal to $\infty$ in the clique. Communication happens in synchronous rounds and a pair of nodes can exchange $O(\log n)$ bits in each round. Nodes communicate and coordinate their actions with other nodes only by passing messages (of size $O(\log n)$ bits). In general, a message contains a constant number of edge weights, node IDs, and other arguments. Each of the arguments in a message is polynomially bounded in $n$, and therefore polynomially many sums of arguments can be encoded with $O(\log n)$ bits. We assume that nodes and links do not fail.

Footnote 6. In this paper, we assume that initially each node knows only its own identity and the weights of all the edges incident on it; nodes do not have initial knowledge of the identifiers of their neighbors and other nodes. This is known as the KT0 model, also called the clean network model Peleg (2000). On the other hand, if each node has initial knowledge of itself and the identifiers of its neighbors then such a model is known as the KT1 model (i.e. Knowledge Till radius 1). In the KT1 model only the knowledge of the identifiers of neighbors is assumed, not other information such as the incident edges of the neighbors.

An execution of the system advances in synchronous rounds. In each round: nodes receive messages that were sent to them in the previous round, perform some local computation, and then send (possibly different) messages. The time complexity is measured by the number of rounds required until all the nodes (explicitly) terminate. The message complexity is measured by the number of messages sent until all the nodes (explicitly) terminate.

Notations. We use the following terms and notations.

  • $\delta(v)$ denotes the set of edges incident on a node $v$. Similarly $\delta(C)$ denotes the set of edges having exactly one endpoint in a subgraph $C$.

  • $w(e)$ denotes the weight of an edge $e$.

  • $ES(e)$ denotes the state of an edge $e$.

  • $s(v)$ denotes the source node of a node $v$.

  • $d(v)$ denotes the shortest distance from a node $v$ to its source $s(v)$.

  • $\pi(v)$ denotes the predecessor node of a node $v$.

  • Let $A$ denote a matrix of size $n \times n$, where $n$ is a positive integer. Then $A[i, j]$ denotes the value of the entry located at the $i$-th row and $j$-th column of $A$.

  • $\langle M(a_1, a_2, \dots) \rangle$ denotes the message $M$. Here $a_1, a_2, \dots$ are the arguments of the message $M$. Unless it is necessary, the arguments of $\langle M \rangle$ will not be shown.

3 SPF construction in CCM

The SPF is defined as follows.

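Definition 3.1 (SPF, Chen et al. (1993)).

Let $G = (V, E, w)$ be a connected undirected weighted graph, where $V$ is the vertex set, $E$ is the edge set, and $w : E \to \mathbb{R}^+$ is the non-negative weight function. Given a subset $Z \subseteq V$, a SPF for $Z$ in $G$ is a sub-graph $G_F = (V, E_F, w)$ of $G$ consisting of disjoint trees $T_i = (V_i, E_i, w)$, $i = 1, 2, \dots, |Z|$, such that

  • for all $i$, $V_i$ contains exactly one node $z_i$ of $Z$.

  • if $v \in V_i$ then its source $s(v)$ in $G_F$ is $z_i$.

  • $V_1 \cup V_2 \cup \dots \cup V_{|Z|} = V$ and $V_i \cap V_j = \emptyset$ for all $i \ne j$.

  • $E_1 \cup E_2 \cup \dots \cup E_{|Z|} = E_F \subseteq E$.

  • The shortest path between $v$ and $z_i$ in $T_i$ is the shortest path between $v$ and $z_i$ in $G$.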

3.1 SPF-A algorithm

The SPF-A algorithm is used as a subroutine in the proposed STCCM-A algorithm. Here we give a brief description of it. It consists of two parts: the first part computes APSP on the input graph $G$ and the second part constructs the required SPF from the graph formed by the APSP. Specifically, in the first part we apply the algebraic method due to Censor-Hillel et al. (2015), who showed that in the CCM the iterated squaring of the weight matrix over the min-plus semiring Fischer and Meyer (1971); Munro (1971) computes APSP in $O(n^{1/3} \log n)$ rounds and $\tilde{O}(n^{7/3})$ messages. One of the fundamental applications of APSP is the construction of the routing tables of a network. Specifically a routing table entry, denoted by $R_u[v]$, is a node $x$ such that $(u, x) \in E$ and $x$ lies on a shortest path from $u$ to $v$. Censor-Hillel et al. also showed that in the CCM the iterated squaring algorithm can be used to construct the routing tables of a network as well. For the sake of completeness a brief description of the iterated squaring algorithm and the routing table construction procedure due to Censor-Hillel et al. is provided in Appendix A.
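To make the first part concrete, the following is a minimal centralized sketch of iterated min-plus squaring with next-hop tracking. It only illustrates the algebra; it is not the distributed procedure of Censor-Hillel et al., which spreads each matrix product over the clique. The names `min_plus_square` and `apsp_by_squaring`, and the dense list-of-lists representation, are assumptions made for this example.

```python
import math

INF = math.inf

def min_plus_square(dist, nxt):
    """One squaring step over the min-plus semiring.

    dist[u][v] is the best known distance from u to v; nxt[u][v] is the
    first hop of a path realizing it (None if no path is known yet).
    """
    n = len(dist)
    new_dist = [row[:] for row in dist]
    new_nxt = [row[:] for row in nxt]
    for u in range(n):
        for v in range(n):
            for k in range(n):
                cand = dist[u][k] + dist[k][v]
                if cand < new_dist[u][v]:
                    new_dist[u][v] = cand
                    # The improved path starts like the known path from u to k.
                    new_nxt[u][v] = nxt[u][k]
    return new_dist, new_nxt

def apsp_by_squaring(weights):
    """weights[u][v] is the edge weight, or INF when (u, v) is not an edge."""
    n = len(weights)
    dist = [[0 if u == v else weights[u][v] for v in range(n)] for u in range(n)]
    nxt = [[v if dist[u][v] < INF else None for v in range(n)] for u in range(n)]
    hops = 1  # dist currently covers all paths with at most `hops` edges
    while hops < n - 1:
        dist, nxt = min_plus_square(dist, nxt)
        hops *= 2
    return dist, nxt
```

Each squaring doubles the number of hops covered, so about $\log_2 n$ squarings suffice; this is where the logarithmic factor in the round complexity comes from.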

Now we describe the second part of the SPF-A algorithm and show that it requires $O(1)$ rounds and $O(n)$ messages. From the first part we assume that each node $v \in V$ knows the shortest path distances to all other nodes in $V$ and its routing table entries $R_v$. Therefore, by using the shortest path distance information, each non-terminal $v$ can locally choose the closest terminal as its source node $s(v)$. Note that there may be more than one terminal at the same shortest distance from a given non-terminal. In this case, the non-terminal chooses the one with the smallest ID among all such terminals. Once a non-terminal $v$ has chosen the closest terminal as its source node $s(v)$, it can also choose its parent node $\pi(v)$ with respect to $s(v)$ by using its own routing table $R_v$. Whenever a non-terminal $v$ sets its $\pi(v)$, it also informs $\pi(v)$ that it has chosen $\pi(v)$ as its parent. Establishing the parent-child relationship between a pair of nodes $(u, v)$, where $\pi(v) = u$, requires $O(1)$ rounds and $O(1)$ messages. In this way each node $v \in V \setminus Z$ is connected to exactly one tree rooted at some source node $s(v)$. Here each source node is a terminal. Therefore exactly $|Z|$ shortest path trees are constructed by the above procedure, which together form the required SPF. Since the procedure of choosing a parent can be started in parallel by all nodes in $V \setminus Z$, for all such pairs of nodes together it requires $O(1)$ rounds and $O(n)$ messages. The overall round and message complexities of the SPF-A algorithm are thus dominated by the first part of the algorithm. Therefore the following theorem holds.

Theorem 3.1.

A SPF can be deterministically computed in $\tilde{O}(n^{1/3})$ rounds and $\tilde{O}(n^{7/3})$ messages in the CCM.

The correctness of the SPF-A algorithm directly follows from the correctness of the algorithm proposed by Censor-Hillel et al. Censor-Hillel et al. (2015).
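The second part of SPF-A is a purely local choice at each node, so it can be sketched in a few lines. The sketch below assumes the APSP distances and next-hop routing entries are already available as matrices `dist` and `nxt` (hypothetical names), that nodes are numbered `0..n-1`, and that the node number plays the role of the ID used for tie-breaking.

```python
def build_spf(dist, nxt, terminals):
    """Return (source, parent) lists describing a shortest path forest.

    dist[v][z] : shortest distance from v to z (from the APSP part)
    nxt[v][z]  : first hop on a shortest path from v to z (routing table)
    Terminals are their own source and have no parent.
    """
    n = len(dist)
    terminal_set = set(terminals)
    source = [None] * n
    parent = [None] * n
    for v in range(n):
        if v in terminal_set:
            source[v] = v
            continue
        # Closest terminal; ties are broken by the smallest ID.
        source[v] = min(terminal_set, key=lambda z: (dist[v][z], z))
        # The parent is the next hop toward the chosen source.
        parent[v] = nxt[v][source[v]]
    return source, parent
```

In the distributed setting the only communication in this part is the single message each non-terminal sends to its chosen parent.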

4 STCCM-A algorithm

4.1 Preliminaries

Definition 4.1.

(Complete distance graph) A graph $K_Z$ is called a complete distance graph on the node set $Z$ of a connected undirected weighted graph $G$ only if for each pair of nodes $u, v \in Z$ there is an edge $(u, v)$ in $K_Z$ and the weight of the edge $(u, v)$ is the length of the shortest path between $u$ and $v$ in $G$.

The approximation factor of the proposed STCCM-A algorithm directly follows from the correctness of a centralized algorithm due to Kou et al. (1981) (Algorithm H). For a given connected undirected weighted graph $G$ and a set of terminals $Z \subseteq V$, Algorithm H computes an ST as follows.

  1. Construct a complete distance graph $K_Z$.

  2. Find an MST $T_A$ of $K_Z$.

  3. Construct a sub-graph $G_A$ of $G$ by replacing each edge of $T_A$ with its corresponding shortest path in $G$.

  4. Find an MST $T_B$ of $G_A$.

  5. Construct an ST $T_Z$ from $T_B$ by deleting edges of $T_B$ so that all leaves of $T_Z$ are terminals.
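The five steps of Algorithm H can be sketched as a small centralized program. This is an illustrative sketch only: the helper names (`dijkstra`, `kruskal`, `algorithm_h`), the adjacency-dictionary input format, and the use of Dijkstra and Kruskal as building blocks are assumptions of this example, not choices made by Kou et al. Node labels are assumed to be comparable (for example integers) and the graph connected.

```python
import heapq
import math

def dijkstra(adj, src):
    """Distances and predecessors from src; adj[u] = {v: weight}."""
    dist = {u: math.inf for u in adj}
    pred = {src: None}
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist[v]:
                dist[v], pred[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return dist, pred

def kruskal(nodes, edges):
    """MST of an undirected graph given as (weight, u, v) triples."""
    root = {u: u for u in nodes}
    def find(u):
        while root[u] != u:
            root[u] = root[root[u]]
            u = root[u]
        return u
    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            root[ru] = rv
            tree.append((w, u, v))
    return tree

def algorithm_h(adj, terminals):
    """Return the Steiner tree as a set of edges (u, v) with u < v."""
    terminals = sorted(terminals)
    sp = {z: dijkstra(adj, z) for z in terminals}
    # Step 1: complete distance graph on the terminals.
    k_z = [(sp[a][0][b], a, b)
           for i, a in enumerate(terminals) for b in terminals[i + 1:]]
    # Step 2: MST of the complete distance graph.
    t_a = kruskal(terminals, k_z)
    # Step 3: replace each MST edge by a shortest path in the input graph.
    g_a = set()
    for _, a, b in t_a:
        v = b
        while sp[a][1][v] is not None:
            u = sp[a][1][v]
            g_a.add((min(u, v), max(u, v)))
            v = u
    # Step 4: MST of the expanded sub-graph.
    g_a_nodes = {x for e in g_a for x in e}
    t_b = kruskal(g_a_nodes, [(adj[u][v], u, v) for u, v in g_a])
    # Step 5: repeatedly delete leaves that are not terminals.
    tree = {(min(u, v), max(u, v)) for _, u, v in t_b}
    terminal_set = set(terminals)
    while True:
        degree = {}
        for u, v in tree:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        bad = {x for x, d in degree.items() if d == 1 and x not in terminal_set}
        if not bad:
            return tree
        tree = {(u, v) for u, v in tree if u not in bad and v not in bad}
```

For instance, with `adj = {0: {1: 1, 2: 1, 4: 5}, 1: {0: 1, 2: 3}, 2: {0: 1, 1: 3}, 4: {0: 5}}` and terminals `[1, 2]`, the sketch connects the two terminals through the Steiner node `0` and leaves out the dangling node `4`.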

The running time of Algorithm H is $O(t n^2)$. Following the principles of both Prim’s and Kruskal’s algorithms, Wu et al. (1986) proposed a faster centralized algorithm (Algorithm M) which improves the time complexity to $O(|E| \log n)$ while achieving the same approximation ratio as Algorithm H. Algorithm M computes a sub-graph called a generalized MST for $Z$ of $G$, which is defined as follows.

Definition 4.2 (Generalized MST Wu et al. (1986)).

Let $G = (V, E, w)$ be a connected undirected weighted graph and $Z$ be a subset of $V$. Then a generalized MST $T_Z$ is a sub-graph of $G$ such that

  • there exists an MST $T_A$ of the complete distance graph $K_Z$ such that for each edge $(u, v)$ in $T_A$, the length of the unique path between $u$ and $v$ in $T_Z$ is equal to the weight of the edge $(u, v)$ in $T_A$.

  • all leaves of $T_Z$ are in $Z$.

It is clear that $T_Z$ is an ST for $Z$ in $G$ and is the actual realization of $T_A$ (recall that $T_A$ is the MST of $K_Z$). In summary, Algorithm M constructs a generalized MST $T_Z$ for $Z$ of $G$ as follows. Initially, the nodes in $Z$ are treated as a forest of $|Z|$ separate trees, which are successively merged until all of them are in a single tree $T_Z$. A priority queue is used to store the frontier vertices of the paths extended from the trees. Each tree gradually extends its branches into the node set $V \setminus Z$. When two branches belonging to two different trees meet at some node, they form a path through that node that merges the two trees. The algorithm guarantees that only such paths of minimum length are used for merging trees.

The proposed STCCM-A algorithm also computes a generalized MST for $Z$ of $G$ (which is essentially the required ST with the approximation factor of $2(1 - 1/\ell)$) in the CCM. The round and message complexities of the STCCM-A algorithm are $\tilde{O}(n^{1/3})$ and $\tilde{O}(n^{7/3})$ respectively. Saikia and Karmakar (2019) proposed a $2(1 - 1/\ell)$-factor deterministic distributed approximation algorithm which also computes a generalized MST for $Z$ of $G$, in $O(S + \sqrt{n} \log^* n)$ rounds and $O(\Delta (n - t) S + n^{3/2})$ messages. However, the algorithm in Saikia and Karmakar (2019) was proposed for the CONGEST model, whereas the STCCM-A algorithm is proposed for the CCM.

Like the algorithm proposed in Saikia and Karmakar (2019), the STCCM-A algorithm involves four small distributed algorithms (step 1 to step 4). However, except for step 2, all steps of the STCCM-A algorithm differ from those of the algorithm proposed in Saikia and Karmakar (2019); only step 2 is similar.

4.2 Outline of the STCCM-A algorithm

Input specification. We assume that there is a special node $r$ available at the start of the algorithm. For correctness we assume that $r$ is the node with the smallest ID in the system. Initially each node $v \in V$ knows its unique ID, whether it is a terminal or a non-terminal, and the weight of each edge $e \in \delta(v)$. Each node $v \in V$ maintains a boolean variable $steiner\_flag$ whose value can be either $true$ or $false$. Initially $steiner\_flag$ is set to $false$ for each $v \in V \setminus Z$, whereas throughout the execution of the algorithm the value of $steiner\_flag$ is $true$ for each $v \in Z$. Also each node $v$ initially sets its local variable $ES(e)$ to $basic$ for each $e \in \delta(v)$. Recall that $ES(e)$ denotes the state of an edge $e$.

Output specification. Whenever the algorithm terminates, the pair at each node defines the distributed output of the algorithm. Here . ensures that does not belong to the final ST; in this case . On the other hand ensures that is a part of the final ST; in this case and for each , is set to .

The special node $r$ initiates the algorithm. An ordered execution of the steps is necessary for the correct working of the STCCM-A algorithm. We assume that $r$ ensures the ordered execution of the steps (step 1 to step 4) and initiates step $i + 1$ after step $i$ has terminated. The outline of the proposed STCCM-A algorithm is as follows.

  1. SPF construction. Construct a SPF $G_F = (V, E_F, w)$ for $Z$ in $G$ by applying the SPF-A algorithm described in Section 3.1. Theorem 3.1 ensures that the round and message complexities of this step are $\tilde{O}(n^{1/3})$ and $\tilde{O}(n^{7/3})$ respectively.

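  2. Edge weight modification. With respect to the SPF $G_F$, each edge $e \in E$ of the graph $G$ is classified as one of the following three types.

    1. tree edge: if $e \in E_F$.

    2. inter-tree edge: if $e \notin E_F$ and the end points of $e$ are in two different trees of $G_F$.

    3. intra-tree edge: if $e \notin E_F$ and the end points of $e$ are in the same tree of $G_F$.

    Now transform $G = (V, E, w)$ into $G_c = (V, E, w_c)$. The cost $w_c(e)$ of each edge $e = (u, v)$ in $G_c$ is computed as follows.

    1. $w_c(e) = 0$ if $e$ is a tree edge.

    2. $w_c(e) = \infty$ if $e$ is an intra-tree edge.

    3. $w_c(e) = d(u, s(u)) + w(e) + d(v, s(v))$ if $e$ is an inter-tree edge. In this case $w_c(e)$ realizes the weight of a path from the source node $s(u)$ to the source node $s(v)$ in $G$ that contains the edge $e$. Recall that $d(u, v)$ denotes the (weighted) shortest distance between nodes $u$ and $v$ in $G$.

    The classification of the edges of $G$ and the transformation to $G_c$ can be done as follows. Each node $v$ of $G$ sends a message (say $M_v$) containing $s(v)$, $d(v)$, and $\pi(v)$ on all of its incident edges with respect to the input graph $G$. Let a node $v$ receive $M_u$ on an incident edge $e = (v, u)$. If $s(v) \ne s(u)$ then $v$ sets $e$ as an inter-tree edge and $w_c(e)$ to $d(v) + w(e) + d(u)$. On the other hand if $s(v) = s(u)$ then $e$ can be either a tree edge or an intra-tree edge: if $\pi(v) = u$ or $\pi(u) = v$ then $v$ sets $e$ as a tree edge and $w_c(e)$ to $0$. Otherwise, $v$ sets $e$ as an intra-tree edge and $w_c(e)$ to $\infty$.

    It is clear that step 2 can be performed in $O(1)$ rounds. Also, on each edge of $G$ the message is sent exactly twice (once from each end). Therefore, the message complexity of step 2 is $O(|E|)$.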

  3. MST construction. Construct an MST $T_M$ of $G_c$. In this step, we apply the deterministic MST algorithm proposed by Lotker et al. (2005) (a brief description of the algorithm is deferred to Appendix B). To the best of our knowledge, it is the only deterministic MST algorithm proposed in the CCM to date. All other existing MST algorithms in the CCM Hegeman et al. (2015); Pemmaraju and Sardeshmukh (2016); Ghaffari and Parter (2016); Jurdziński and Nowicki (2018) are randomized. Note that the round and message complexities of the algorithm proposed by Lotker et al. are $O(\log\log n)$ and $\Theta(n^2)$ respectively.

    We assume that each node $v$ of $T_M$ ($T_M$ contains all the nodes of $G$) knows which of the edges in $\delta(v)$ are in $T_M$, and for each such edge $e$ it sets $ES(e)$ to $branch$. On the other hand, for each $e \in \delta(v)$ which is not in $T_M$, $v$ sets $ES(e)$ to $basic$.

  4. Pruning. Construct a generalized MST by performing a pruning operation on the MST . For correctness we assume that a node in with the smallest ID is the root of . The pruning operation deletes edges from until all leaves are terminal nodes. It is performed as follows. Each sends its parent information (with respect to the rooted at ) to all other nodes. This requires rounds and messages. Now each locally computes the rooted at from these received parent information. Then each node in can locally decide whether it should prune itself or not from the . From the locally known each finds whether it is an intermediate node in between two or more terminals in the or not. If yes, it does not prune itself from the and sets its to . Otherwise, it prunes itself. Whenever a node prunes itself from the it sets its to and for each such that , it sets its to . On each such pruned edge , asks the other end of to prune the common edge . Now the edge weights of resultant ST are restored to the original edge function . Since each node in can start the pruning operation in parallel and in the worst-case the number of pruned edges in the network can be at most , this step takes rounds and messages.

    The overall round and message complexities of the pruning step are and respectively.
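The local computation in the pruning step can be illustrated with a minimal, centralized sketch. It assumes each node has already collected every node's parent pointer; the function name `prune_to_terminals` and the dictionary-based tree representation are illustrative choices, not taken from the paper, and the message exchange of the CCM is not modelled.

```python
from collections import defaultdict


def prune_to_terminals(parent, terminals):
    """Remove non-terminal leaves until every leaf is a terminal.

    parent: dict mapping each node to its parent (the root maps to None).
    terminals: set of terminal nodes.
    Returns the surviving undirected edges as a set of frozensets.
    """
    adj = defaultdict(set)
    for v, p in parent.items():
        adj[v]  # make sure every node has an entry, even the root
        if p is not None:
            adj[v].add(p)
            adj[p].add(v)

    # Start from the current non-terminal leaves (degree at most 1).
    stack = [v for v in adj if len(adj[v]) <= 1 and v not in terminals]
    while stack:
        v = stack.pop()
        if v not in adj:
            continue  # already pruned through an earlier stack entry
        for u in adj.pop(v):
            adj[u].discard(v)
            # The neighbour may have become a non-terminal leaf itself.
            if len(adj[u]) <= 1 and u not in terminals:
                stack.append(u)

    return {frozenset((u, v)) for u in adj for v in adj[u]}
```

For example, with `parent = {1: None, 2: 1, 3: 2, 4: 2, 5: 4}` and terminals `{1, 3}`, the branch through nodes 4 and 5 is removed and only the edges on the path 1, 2, 3 survive.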

The STCCM-A algorithm terminates with the step .

Figure 1: (a) An arbitrary graph and a terminal set . (b) The input graph is distributed among the processors of a complete network via a vertex partition. (c) A SPF for of . The distances of nodes to their respective sources are shown in the table. (d) The modified graph . (e) An MST of . (f) The final Steiner tree (generalized MST) for of .

4.3 An illustrative example of the STCCM-A algorithm

Consider the application of the STCCM-A algorithm to an arbitrary graph and a terminal set shown in Figure 1(a). The input graph is distributed among the nodes (processors) of a complete network via a vertex partition, as shown in Figure 1(b). Specifically, each vertex and its incident edges (with weights) of are assigned to a distinct processor in the . The weight of an edge in which is not in is considered equal to (not shown in the figure). A SPF for is constructed, as shown in Figure 1(c). In , each non-terminal is connected to the terminal closest to it among all terminals in , as listed in the table of Figure 1(c). The modified graph and the labelling of the edge weights according to the definition of are shown in Figure 1(d). Figure 1(e) shows after the application of Lotker et al.’s MST algorithm on , which constructs an MST of . The final Steiner tree for of , which is a generalized MST for of , is constructed from the by applying the pruning step of the STCCM-A algorithm, as shown in Figure 1(f).
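The SPF of Figure 1(c) attaches every non-terminal to its nearest terminal. A sequential way to obtain such a forest is a multi-source Dijkstra search started from all terminals at once. The sketch below is only an illustration of that output, not the distributed SPF construction used by the STCCM-A algorithm; the function name and the adjacency-dictionary input are assumptions, and ties between equidistant terminals are broken arbitrarily.

```python
import heapq


def shortest_path_forest(adj, terminals):
    """Multi-source Dijkstra from all terminals.

    adj: dict {u: {v: weight}} describing an undirected graph with
         non-negative weights (node labels must be mutually comparable).
    Returns (source, dist, pred): for each node, its nearest terminal,
    the distance to it, and its predecessor in the forest.
    """
    dist = {v: float("inf") for v in adj}
    source = {v: None for v in adj}
    pred = {v: None for v in adj}

    heap = []
    for z in terminals:
        dist[z] = 0
        source[z] = z
        heap.append((0, z))
    heapq.heapify(heap)

    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                source[v] = source[u]
                pred[v] = u
                heapq.heappush(heap, (nd, v))

    return source, dist, pred
```

On the path graph a, b, c, d with edge weights 1, 5, 2 and terminals a and d, node b joins the tree of a at distance 1 and node c joins the tree of d at distance 2.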

5 Proof of the STCCM-A algorithm

Theorem 5.1.

The round complexity of the STCCM-A algorithm is Õ(n^{1/3}).

Proof.

It is clear that the overall round complexity of the STCCM-A algorithm is dominated by the step . Therefore the round complexity of the STCCM-A algorithm is Õ(n^{1/3}). The polylogarithmic factor involved in this round complexity is . ∎

Theorem 5.2.

The message complexity of the STCCM-A algorithm is Õ(n^{7/3}).

Proof.

By Theorem 3.1, the message complexity of step 1 of the STCCM-A algorithm is . Each of step 3 and step 4 requires messages. Step requires messages. We know that . Together, these ensure that the overall message complexity of the STCCM-A algorithm is dominated by step . Therefore the message complexity of the STCCM-A algorithm is Õ(n^{7/3}). The polylogarithmic factor involved in this message complexity is . ∎

Definition 5.1 (Straight path).

Given that is a connected undirected weighted graph and . Let . Then a path between and is called straight only if all the intermediate nodes in (may contain no intermediate node) are in .
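As a small illustration of the definition, the following check assumes that a straight path joins two terminals and that all of its intermediate nodes are non-terminals. The function name and the list-of-nodes path representation are hypothetical.

```python
def is_straight(path, terminals):
    """Return True if `path` (a list of nodes) is straight.

    Assumption: both endpoints are terminals and every intermediate
    node is a non-terminal; a path with no intermediate node qualifies.
    """
    if len(path) < 2:
        return False
    if path[0] not in terminals or path[-1] not in terminals:
        return False
    return all(v not in terminals for v in path[1:-1])
```

Under this reading, a direct edge between two terminals is a straight path, while a path that passes through a third terminal is not.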

Lemma 5.3.

Given that is a connected undirected weighted graph and . Let and there exists a straight path between and in , where is a resultant MST constructed after the consecutive applications of step , , and of the STCCM-A algorithm on the given graph . Then in is the shortest straight path between and in .

Proof.

For the sake of contradiction, assume that is not the shortest straight path in . Suppose there exists a straight path between and such that . We show that does not hold. Since and are two different terminals, they were in different trees of the SPF, say and , before they were included in . Let such that and contain and respectively, as shown in Figure 2. Note that the correctness of the SPF algorithm described in Section 3.1 ensures that , , , and are the shortest paths between the respective nodes in , as shown in Figure 2. During the execution of the step of the STCCM-A algorithm, which constructs the MST , finds as its minimum weight outgoing edge (MWOE) and finds as its MWOE. Similarly, and find and as their MWOEs respectively. During the construction of the MST, since and are merged through , it follows that is the MWOE of . Similarly, is the MWOE of . Since and are merged along the path to form , this ensures that , contradicting the fact that . Therefore is the shortest straight path between and in . ∎

Figure 2: A state before merging of two shortest path trees and along a shortest straight path . The edge categories in and are shown according to the graph constructed in step of the STCCM-A algorithm. Note that both the trees and are subgraphs of a complete graph .

Consider a graph whose vertex set and edge set are defined from as follows. For each straight path of , let be an edge of . Then the following lemma holds.

Lemma 5.4.

is an MST of for of graph .

The correctness of the above lemma follows directly from Lemma  in Saikia and Karmakar (2019).

It is obvious that if we unfold the tree then it transforms to a resultant graph , which satisfies the following properties.

  • For each straight path between and in , there exists an edge , where and the length of the straight path between and in is the shortest one between and in .

  • All leaves of are in .

Therefore the following theorem holds.

Theorem 5.5.

The tree computed by the STCCM-A algorithm is a generalized MST for of .

Let denote the sum of weights of all the edges in a graph . Let denote the optimal ST. Then the following theorem holds.

Theorem 5.6.

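The weight of the tree computed by the STCCM-A algorithm is at most 2(1 − 1/ℓ) times the weight of the optimal ST, where ℓ is the number of terminal leaf nodes in the optimal ST.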

The correctness of the above theorem essentially follows from Theorem 1 of Kou et al. (1981). For the sake of completeness, we outline the argument here. Let consist of edges. Then there exists a loop in such that

  • every edge in appears exactly twice in . This implies that .

  • every leaf in appears exactly once in and if and are two consecutive leaves in then the sub-path connecting and is a simple path.

Note that the loop can be decomposed into simple sub-paths (recall that is the number of leaf nodes in the ), each of which connects two consecutive leaf nodes in . After deleting the longest such simple sub-path from , the remaining path (say ) in satisfies the following.

  • every edge in appears at least once in .

  • .

Now assume that is a generalized MST for of . We know that realizes the MST say of the complete distance graph for of . In other words . Note that each edge in , where , is a shortest straight path between and in . This ensures that the weight of an edge in is at most the weight of the corresponding simple path between nodes and in . If we consider all the edges of then . This concludes that .
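The chain of inequalities behind this outline can be written out as follows. The symbols are assumed here for illustration, since the original notation is not reproduced in this text: $W(\cdot)$ is the total edge weight, $T_{opt}$ the optimal ST with $\ell$ leaves, $L$ the loop, $P$ the path left after deleting the longest sub-path, $T_1$ the MST of the complete distance graph, and $T_Z$ the generalized MST.

```latex
\begin{align*}
W(L) &= 2\,W(T_{opt})
  && \text{(every edge of $T_{opt}$ appears exactly twice in $L$)} \\
W(P) &\le \Bigl(1 - \frac{1}{\ell}\Bigr) W(L)
  && \text{(the longest of the $\ell$ sub-paths weighs at least $W(L)/\ell$)} \\
W(T_Z) &\le W(T_1) \le W(P)
  && \text{(each edge of $T_1$ is a shortest path between two terminals)} \\
\Longrightarrow\quad W(T_Z) &\le 2\Bigl(1 - \frac{1}{\ell}\Bigr) W(T_{opt}).
\end{align*}
```

The second line uses only an averaging argument: if every one of the $\ell$ sub-paths weighed less than $W(L)/\ell$, their total would be less than $W(L)$.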


Deadlock issue. The STCCM-A algorithm is free from deadlock. Deadlock occurs only if a set of nodes in the system enters a circular wait state. In step (SPF construction) of the STCCM-A algorithm, a node uniformly distributes its input (a brief description of the input distribution is given in Appendix A), namely the weights of its incident edges, among a subset of nodes (which are of course within one-hop distance) and never sends a resource request message to any other node. This ensures that nodes never create a circular wait (or a waiting request) for any resource during the execution of the algorithm. In step , each node independently sends a message (containing its own state) to all of its neighbors only once. Upon receiving messages from all of its neighbors, a node performs some local computation and then terminates. Therefore, in step , nodes are free from any possible circular waiting. The deadlock freeness of step 3 essentially follows from the work of Lotker et al. (2005). In step , the pruning operation is performed on a tree structure () and a node never requests or waits for resources held by any other node in the system. This implies that during the pruning operation, nodes are free from any possible circular waiting. Therefore, the deadlock freeness of all four steps together ensures the deadlock freeness of the STCCM-A algorithm.

6 STCCM-B algorithm

The STCCM-B algorithm is a modified version of the STCCM-A algorithm that computes an ST in rounds and messages in the CCM while maintaining the same approximation factor as that of the STCCM-A algorithm described in Section 4. Like the STCCM-A algorithm, the STCCM-B algorithm has four steps (small distributed algorithms). In the STCCM-B algorithm, all steps except step are the same as those of the STCCM-A algorithm. Recall that step of the STCCM-A algorithm is the construction of a SPF of a given input graph with a terminal set . In contrast, for step of the STCCM-B algorithm, which is also the SPF construction, we adapt the SPF algorithm proposed in Saikia and Karmakar (2019) to the CCM. This new SPF algorithm, denoted by SPF-B, helps achieve different round and message complexities for -approximate ST construction in the CCM. Specifically, for networks with constant or small shortest path diameter (where ), the STCCM-B algorithm outperforms the STCCM-A algorithm in terms of round complexity in the CCM.

The SPF algorithm in Saikia and Karmakar (2019) vs. the SPF-B algorithm. The SPF algorithm in Saikia and Karmakar (2019) was proposed for the CONGEST model, whereas our SPF-B algorithm is devised for the CCM. In the SPF algorithm in Saikia and Karmakar (2019), the terminal nodes need to participate in forwarding messages until the termination of the algorithm. In the SPF-B algorithm, however, the terminal nodes participate in forwarding messages only once, after which they exchange no further messages in the system. The SPF-B algorithm terminates in rounds, whereas the SPF algorithm in Saikia and Karmakar (2019) terminates in rounds, where is the height of a breadth first search tree of the given input graph . Note here that . Although both algorithms have the same asymptotic round complexity, which is , they incur different message complexities. The SPF-B algorithm incurs a message complexity of , where , whereas the SPF algorithm in Saikia and Karmakar (2019) has a message complexity of , where is the maximum degree of a vertex in the input graph .


6.1 SPF-B algorithm

In addition to the notations defined in Section 2 the following notations are used in the description of the SPF-B algorithm.

  • denotes the tentative source ID of a node .

  • denotes tentative shortest distance from a node to its .

  • denotes the terminal flag of a node .

  • denotes the tentative predecessor of a node .

  • Let . Then at node , denotes the tentative shortest distance of a node incident on the other end of . Similarly at node , , and denote ID, tentative source ID and terminal flag value respectively of a node incident on the other end of .

Input specification.

  • If then , , , and .

  • If then , , , and .

  • The value of can be either or . A node sets the state of an incident edge as if . Otherwise, at node , the edge has the state value as . It is possible for the edge states at the two nodes adjacent to the edge to be temporarily different.

Output specification. When the algorithm terminates then , , and for each node .

Let be the set of terminals, be the set of messages received by node in a round.

1:upon receiving no message
2:if  then ▷ spontaneous awakening of the special node
3:     for all  do ▷ here contains edges
4:         send on
5:     end for
6:end if
7:upon receiving
8:for all  do
9:     
10:end for
11:if  then
12:     
13:     for all  do
14:         send on
15:     end for
16:else
17:     
18:end if
19:upon receiving a set of messages
20:for all  such that  do
21:     if  then
22:         
23:     end if
24:     if  then ▷ update the tentative source, distance, and predecessor
25:          ▷ is a temporary boolean variable
26:         
27:     end if
28:end for
29:if  then
30:     for all  such that  do
31:         send on
32:     end for
33:     
34:end if
SPF-B algorithm. Pseudocode at node upon receiving a set of messages or no message.
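The general flavour of such a round-based SPF computation can be sketched as a synchronous simulation. The sketch below is not a transcription of the SPF-B pseudocode: it omits the special initiator node and the edge states, and keeps only two behaviours stated in the text, namely that terminals send once and that a node forwards its tentative source and distance when they improve. All names are hypothetical.

```python
def spf_rounds(adj, terminals):
    """Synchronous relaxation from the terminals.

    adj: dict {u: {v: weight}} for an undirected graph with
         non-negative weights.
    Returns (source, dist, pred, rounds), where `rounds` counts the
    rounds in which at least one message was delivered.
    """
    inf = float("inf")
    dist = {v: (0 if v in terminals else inf) for v in adj}
    source = {v: (v if v in terminals else None) for v in adj}
    pred = {v: None for v in adj}

    # Terminals announce themselves once and then stay silent.
    inbox = {v: [] for v in adj}
    for z in terminals:
        for v in adj[z]:
            if v not in terminals:
                inbox[v].append((z, z, 0))  # (sender, source, distance)

    rounds = 0
    while any(inbox.values()):
        rounds += 1
        outbox = {v: [] for v in adj}
        for v, msgs in inbox.items():
            improved = False
            for sender, src, d in msgs:
                nd = d + adj[v][sender]
                if nd < dist[v]:
                    dist[v], source[v], pred[v] = nd, src, sender
                    improved = True
            if improved:
                # Forward the improved state to non-terminal neighbours.
                for u in adj[v]:
                    if u not in terminals and u != pred[v]:
                        outbox[u].append((v, source[v], dist[v]))
        inbox = outbox

    return source, dist, pred, rounds
```

Because a node forwards only after a strict improvement of its tentative distance, the simulation stops once no message is in flight.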

Outline of the algorithm. We assume that the SPF-B algorithm is initiated by a special node .