Congested Clique Algorithms for Graph Spanners

05/14/2018
by   Merav Parter, et al.
Weizmann Institute of Science

Graph spanners are sparse subgraphs that faithfully preserve the distances in the original graph up to a small stretch. Spanners have been studied extensively, as they have a wide range of applications ranging from distance oracles, labeling schemes and routing to solving linear systems and spectral sparsification. A k-spanner maintains pairwise distances up to a multiplicative factor of k. It is folklore that for every n-vertex graph G, one can construct a (2k-1)-spanner with O(n^{1+1/k}) edges. In a distributed setting, such spanners can be constructed in the standard CONGEST model using O(k^2) rounds, when randomization is allowed. In this work, we consider spanner constructions in the congested clique model, and show: (1) a randomized construction of a (2k-1)-spanner with O(n^{1+1/k}) edges in O(log k) rounds; the previous best algorithm runs in O(k) rounds. (2) A deterministic construction of a (2k-1)-spanner with O(n^{1+1/k}) edges in O(log k + (log log n)^3) rounds; the previous best algorithm runs in O(k log n) rounds. This improvement is achieved by a new derandomization theorem for hitting sets, which might be of independent interest. (3) A deterministic construction of an O(k)-spanner with O(k · n^{1+1/k}) edges in O(log k) rounds.


1 Introduction & Related Work

Graph spanners, introduced by Peleg and Schäffer [24], are fundamental graph structures; more precisely, they are subgraphs of an input graph G that faithfully preserve the distances in G up to a small multiplicative stretch. Spanners have a wide range of distributed applications [23], e.g., for routing [28], broadcasting, synchronizers [25], and shortest-path computations [5].

The common objective in distributed computation of spanners is to achieve the best-known existential size-stretch trade-off within a small number of rounds. It is folklore that for every graph G, there exists a (2k-1)-spanner with O(n^{1+1/k}) edges. Moreover, this size-stretch trade-off is believed to be optimal, by the girth conjecture of Erdős.

There is a plethora of distributed constructions of spanners for both the LOCAL and the CONGEST models of distributed computing [10, 2, 11, 12, 13, 26, 14, 18]. The standard setting is a synchronous message-passing model where, in each round, each node can send one message to each of its neighbors. In the LOCAL model the message size is unbounded, while in the CONGEST model it is limited to O(log n) bits. One of the most notable distributed randomized constructions of spanners is that of Baswana & Sen [3], which can be implemented in O(k^2) rounds in the CONGEST model.

Currently, there is an interesting gap between deterministic and randomized constructions in the CONGEST model, or alternatively, between the deterministic constructions of spanners in the LOCAL vs. the CONGEST model. Whereas the deterministic round complexity of spanners in the LOCAL model is O(k) due to [12], the best deterministic algorithm in the CONGEST model requires substantially more rounds [15].

We consider the congested clique model, introduced by Lotker et al. [22]. In this model, in every round each vertex can send O(log n) bits to each of the other vertices in the graph. The congested clique model has been receiving a lot of attention recently due to its relevance to overlay networks and large-scale distributed computation [19, 16, 6].

Deterministic local computation in the congested clique model.

Censor-Hillel et al. [9] initiated the study of deterministic local algorithms in the congested clique model by means of derandomizing randomized LOCAL algorithms. The approach of [9] can be summarized as follows. The randomized complexity of the classical local problems is O(log n) rounds (in both the LOCAL and the CONGEST models). For these randomized algorithms, it is usually sufficient that the random choices made by vertices are sampled from distributions with bounded independence. Hence, any round of such a randomized algorithm can be simulated by giving all nodes a shared random seed of polylog(n) bits.

To completely derandomize such a round, nodes should compute (deterministically) a seed which is at least as "good" as a random seed would be (the random seed is usually shown to provide large progress in expectation; the deterministically computed seed should provide progress at least as large as the expected progress of a random seed). This is achieved by estimating their "local progress" when simulating the random choices using that seed. Combining the techniques of conditional expectations, pessimistic estimators and bounded independence leads to a simple "voting"-like algorithm in which the bits of the seed are computed bit by bit. The power of the congested clique here lies in providing a global leader that collects all votes in a single round and broadcasts the winning bit value. This approach led to deterministic MIS in O(log Δ log n) rounds and to deterministic (2k-1)-spanners with Õ(n^{1+1/k}) edges in O(k log n) rounds, which also works for weighted graphs. Barenboim and Khazanov [1] presented deterministic local algorithms as a function of the graph's arboricity.

Deterministic spanners via derandomization of hitting sets.

As observed by [27, 7, 15], the derandomization of the Baswana-Sen algorithm boils down to a derandomization of dominating sets, or hitting sets. It is a well-known fact that given a collection of sets A_1, …, A_m, each containing at least Δ elements coming from a universe of size n, one can construct a hitting set of size O((n/Δ) log m). A randomized construction of such a set is immediate: pick each element into the hitting set with probability Θ(log m / Δ) and apply a Chernoff bound. A centralized deterministic construction is also well known via the greedy approach (e.g., Lemma 2.7 of [7]).
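For concreteness, here is a minimal Python sketch of the two centralized constructions just mentioned (names and the sampling constant are ours):

```python
import math
import random

def random_hitting_set(universe, delta, c=3):
    """Pick each element with probability ~ c*ln(n)/delta; a Chernoff/union
    bound shows this hits every set of size >= delta w.h.p."""
    n = max(len(universe), 2)
    p = min(1.0, c * math.log(n) / delta)
    return {x for x in universe if random.random() < p}

def greedy_hitting_set(sets):
    """Classical greedy: repeatedly add the element that hits the most
    not-yet-hit sets; yields a hitting set of size O((n/delta) * log m)."""
    remaining = [set(s) for s in sets]
    hitting = set()
    while remaining:
        counts = {}
        for s in remaining:
            for x in s:
                counts[x] = counts.get(x, 0) + 1
        best = max(counts, key=counts.get)   # element covering the most sets
        hitting.add(best)
        remaining = [s for s in remaining if best not in s]
    return hitting
```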

In our setting we are interested in deterministic constructions of hitting sets in the congested clique model. Here, each vertex v knows a subset of size at least Δ that consists of vertices in the k-neighborhood of v, and it is required to compute a small set that hits (i.e., intersects) all subsets. Censor-Hillel et al. [9] showed that the above-mentioned randomized construction of hitting sets still holds with O(log n)-wise independence, and presented an O(log n)-round algorithm that computes a hitting set deterministically by finding a good seed of O(log^2 n) bits. Applying this hitting-set algorithm to compute each of the k levels of clustering of the Baswana-Sen algorithm results in a deterministic spanner construction with O(k log n) rounds.

Our Results and Approach in a Nutshell

We provide improved randomized and deterministic constructions of graph spanners in the congested clique model. Our randomized solution is based on an O(log k)-round algorithm that computes the nearest n^{1/2} vertices within radius k/2 for every vertex (to be more precise, the algorithm computes the nearest n^{1/2} vertices at distance at most k/2). This induces a partitioning of the graph into sparse and dense regions. The sparse region is solved "locally", and the dense region simulates only two phases of Baswana-Sen, leading to a total round complexity of O(log k). We show the following for n-vertex unweighted graphs.

There exists a randomized algorithm in the congested clique model that constructs a (2k-1)-spanner with O(n^{1+1/k}) edges within O(log k) rounds w.h.p.

Our deterministic algorithms are based on constructions of hitting sets with short seeds. Using the pseudorandom generator of Gopalan et al. [17], we construct a hitting set with a seed of length Õ(log n), which yields the following for n-vertex unweighted graphs.

There exists a deterministic algorithm in the congested clique model that constructs a (2k-1)-spanner with O(n^{1+1/k}) edges within O(log k + (log log n)^3) rounds.

In addition, we show that if one settles for a stretch of O(k), then a hitting-set seed of O(log n) bits is sufficient for this purpose, yielding the following construction:

There exists a deterministic algorithm in the congested clique model that constructs an O(k)-spanner with O(k · n^{1+1/k}) edges within O(log k) rounds.

A summary of our results is given in Table 1 (note that [4] does not mention the congested clique model, but the best previous randomized solution in the congested clique is obtained by simulating [4]). All results in the table are with respect to spanners with O(n^{1+1/k}) edges for an unweighted n-vertex graph G.

Stretch | #Rounds | Type | Reference
2k-1 | O(k) | Randomized | Baswana & Sen [4]
2k-1 | O(log k) | Randomized | This Work
2k-1 | O(k log n) | Deterministic | Censor-Hillel et al. [9]
2k-1 | O(log k + (log log n)^3) | Deterministic | This Work
O(k) | O(log k) | Deterministic | This Work

Table 1: Spanner constructions in the congested clique model.

In what follows, we provide some technical background and then present the high-level ideas of these constructions.

A brief exposition of Baswana-Sen [3].

The algorithm is based on constructing k levels of clustering, where a clustering consists of vertex-disjoint subsets which we call clusters. Every cluster has a special node that we call its cluster center. For each cluster of the level-i clustering, the spanner contains a depth-i tree rooted at its center and spanning all cluster vertices. Starting with the trivial clustering in which every vertex forms its own cluster, in each phase i the algorithm is given the clustering of the previous phase and computes the next clustering by sampling the cluster center of each cluster with probability n^{-1/k}. Vertices that are adjacent to the sampled clusters join them, and the remaining vertices become unclustered. For the latter, the algorithm adds some of their edges to the spanner. This construction yields a spanner with O(k · n^{1+1/k}) edges in expectation.
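To make the phase structure concrete, the following is a rough, centralized Python sketch of a single sampling phase for unweighted graphs (names are ours; the maintenance of the cluster trees and several other details of Baswana-Sen are omitted):

```python
import random

def baswana_sen_phase(graph, cluster_of, centers, p, spanner):
    """One clustering phase: sample each old center with probability p
    (p = n^(-1/k)); vertices adjacent to a sampled cluster join it, the rest
    become unclustered and add one edge per adjacent old cluster."""
    sampled = {c for c in centers if random.random() < p}
    new_cluster_of = {}
    for v, nbrs in graph.items():
        if cluster_of.get(v) in sampled:          # v's own cluster survived
            new_cluster_of[v] = cluster_of[v]
            continue
        joined = next((u for u in nbrs if cluster_of.get(u) in sampled), None)
        if joined is not None:                    # join an adjacent sampled cluster
            new_cluster_of[v] = cluster_of[joined]
            spanner.add((v, joined))
        else:                                     # unclustered: one edge per adjacent cluster
            seen = set()
            for u in nbrs:
                c = cluster_of.get(u)
                if c is not None and c not in seen:
                    seen.add(c)
                    spanner.add((v, u))
    return new_cluster_of, sampled
```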

It is easy to see that this algorithm can be simulated in the congested clique model using O(k) rounds. As observed in [27, 18], the only randomized step in Baswana-Sen is picking the cluster centers of the next clustering. That is, given the cluster centers of the current clustering, it is required to compute a subsample of the clusters without having to add too many edges to the spanner (due to unclustered vertices). This is exactly the hitting-set problem, where the neighboring clusters of each vertex are the sets to cover and the universe is the set of current centers (ideas along these lines also appear in [27, 15]).

Our Approach.

In the following, we provide a high-level description of our construction while omitting many careful details and technicalities. We note that some of these technicalities stem from the fact that we insist on achieving (nearly) optimal spanners, as commonly done in this area. Settling for an O(k)-spanner with O(k · n^{1+1/k}) edges would considerably simplify the algorithm and its analysis. The high-level idea is simple and is based on dividing the graph into sparse edges and dense edges, and constructing a spanner for each of these subgraphs using two different techniques. This is based on the following intuition, inspired by the Baswana-Sen algorithm.

In Baswana-Sen, the vertices that are clustered in level i of the clustering are, morally, vertices whose i-neighborhoods are sufficiently dense, i.e., contain at least n^{i/k} vertices. We then divide the vertices into dense vertices V_dense and sparse vertices V_sparse, where V_dense consists of vertices that have at least n^{1/2} vertices in their (k/2)-ball, and V_sparse consists of the remaining vertices. This induces a partition of the edges into E_sparse, the edges incident to sparse vertices, and E_dense, which contains the remaining edges, i.e., edges both of whose endpoints are dense.

Collecting Topology of Closed Neighborhood.

One of the key building blocks of our construction is an O(log k)-round algorithm that computes for each vertex the subgraph induced on its closest n^{1/2} vertices within distance at most k/2 in G. Hence the algorithm computes the entire (k/2)-neighborhoods of the sparse vertices. For the sake of the following discussion, assume that the maximum degree in G is at most n^{1/2}; our algorithm handles the general case as well. Intuitively, collecting the (k/2)-neighborhood can be done in O(log k) rounds if the graph is sufficiently sparse, by employing the graph exponentiation idea of [21]. In this approach, in each phase the radius of the collected neighborhood is doubled. Employing this technique in our setting gives rise to several issues. First, the input graph is not entirely sparse but rather consists of interleaving sparse and dense regions, i.e., the (k/2)-neighborhood of a sparse vertex might contain dense vertices. For that purpose, in each phase of our algorithm, every vertex (either sparse or dense) obtains only a subset of its closest n^{1/2} vertices in its collected neighborhood. Limiting the amount of collected information is important for being able to route this information via Lenzen's algorithm [20] in O(1) rounds in each phase.

Another technicality concerns the fact that the relation "u is among the nearest n^{1/2} vertices of v" is not necessarily symmetric. This entails a problem where a given vertex is "close" (i.e., among the nearest n^{1/2} vertices) to many vertices, while not being close to any of them. In case these vertices need to receive from that vertex the information regarding its closest neighbors (i.e., where some of their close vertices are close to it), it ends up sending too many messages in a single phase. To overcome this, we carefully set the growth of the radius of the collected neighborhood in the graph exponentiation algorithm. We let only vertices that are close to each other exchange their topology information, and show that this is sufficient for computing the desired subgraphs. This procedure is the basis for our constructions, as explained next.

Handling the Sparse Region.

The idea is to let every sparse vertex locally simulate a LOCAL spanner algorithm on its collected subgraph. For that purpose, we show that the deterministic spanner algorithm of [12], which takes O(k) rounds in general, in fact requires only k/2 rounds when run by a sparse vertex. This implies that the collected subgraph contains all the information needed for a sparse vertex to locally simulate the spanner algorithm. This seemingly harmless approach has a subtle defect: letting only the sparse vertices locally simulate a spanner algorithm might lead to a case where a certain edge is not added by a sparse vertex due to a decision made by a dense vertex in the local simulation. Since the dense vertex did not run the algorithm locally, it is not aware that it should add these edges. To overcome this, the sparse vertices notify the dense vertices about the edges added in their local simulations. We show how to do this in O(1) rounds.

Handling the Dense Region.

In the following, we settle for a stretch of O(k) for ease of description. By applying the topology-collecting procedure, every dense vertex obtains a set consisting of its closest n^{1/2} vertices within distance k/2. The main benefit of computing these sets is that they allow the dense vertices to "skip" over the first k/2 phases of Baswana-Sen, ready to apply the next phase.

As described earlier, picking the centers of the clusters can be done by computing a hitting set for the collection of these sets. It is easy to construct a random subset of cardinality Õ(n^{1/2}) that hits all these sets and to cluster all the dense vertices around this set. This creates clusters of strong diameter O(k) (in the spanner) that cover all the dense vertices. The final step connects each pair of adjacent clusters by adding to the spanner a single edge between each such pair; this adds Õ(n) edges to the spanner.
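A rough, centralized Python sketch of this O(k)-stretch step, under our reconstruction of the parameters above (ball[v] is assumed to map each of v's nearest vertices within distance k/2 to its distance, and next_hop[v][u] to the next vertex on a shortest path toward u; for simplicity, only dense vertices are clustered here, whereas the paper also lets intermediate sparse vertices join clusters so that the cluster trees stay connected):

```python
def cluster_dense_region(dense, ball, next_hop, hitting_set, graph, spanner):
    """Cluster every dense vertex around its closest center from the hitting
    set (adding the next-hop edge toward it), then connect each pair of
    adjacent clusters by a single edge."""
    center_of = {}
    for v in dense:
        centers = [c for c in ball[v] if c in hitting_set]   # nonempty: Z hits ball[v]
        c = min(centers, key=lambda x: (ball[v][x], x))      # closest center, ties by ID
        center_of[v] = c
        if v != c:
            spanner.add((v, next_hop[v][c]))                 # one hop toward the center
    connected = set()
    for v in dense:
        for u in graph[v]:
            if u in center_of and center_of[u] != center_of[v]:
                pair = tuple(sorted((center_of[v], center_of[u])))
                if pair not in connected:
                    connected.add(pair)
                    spanner.add((v, u))                      # one edge per adjacent cluster pair
    return center_of
```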

Hitting Sets with Short Seed.

The description above used a randomized solution to the following hitting-set problem: given n subsets of vertices A_1, …, A_n, each of size at least Δ, find a small set Z that intersects all of them. A simple randomized solution is to choose each node to be in Z with probability Θ(log n / Δ). The standard approach for derandomization is to use distributions with limited independence. Indeed, for the randomized solution to hold, it is sufficient to sample the elements from an O(log n)-wise independent distribution. However, sampling an element with probability Θ(log n / Δ) requires roughly log Δ random bits, leading to a total seed length of O(log n · log Δ), which is too large for our purposes.

Our key observation is that for any set A_i, the event that Z ∩ A_i ≠ ∅ can be expressed by a read-once DNF formula. Thus, in order to get a short seed it suffices to have a pseudorandom generator (PRG) that can "fool" read-once DNFs. A PRG is a function that gets a short random seed and expands it to a long one which is indistinguishable from a truly random string of the same length by such a formula. Luckily, such PRGs with seed length Õ(log n) exist due to Gopalan et al. [17], leading to a deterministic hitting-set algorithm with O((log log n)^3) rounds.
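In our own notation (x_u is the indicator that element u is chosen into Z, D is the output distribution of the PRG, U the uniform distribution, and ε the fooling error), the observation and the requirement from the PRG read:

```latex
\[
  \text{``}Z \cap A_i \neq \emptyset\text{''}
  \;\equiv\;
  \mathrm{HIT}_i(x) := \bigvee_{u \in A_i} x_u ,
  \qquad
  \Bigl|\,\Pr_{x \sim \mathcal{D}}[\mathrm{HIT}_i(x)=1]
        - \Pr_{x \sim U}[\mathrm{HIT}_i(x)=1]\,\Bigr| \;\le\; \varepsilon .
\]
```

Since each variable x_u appears exactly once, HIT_i is a read-once DNF (an OR of width-one terms), which is precisely the class of formulas fooled by the PRG of [17].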

Graph Notations.

For a vertex v, a subgraph G' and an integer r ≥ 1, let Γ_r(v, G') = {u : dist_{G'}(u, v) ≤ r}. When r = 1, we omit it and simply write Γ(v, G'); when the subgraph G' is clear from the context, we omit it and write Γ_r(v). For a subset V' ⊆ V, let G[V'] be the subgraph of G induced on V'. Given disjoint subsets of vertices C, C' ⊆ V, let E(C, C') = {(u, v) ∈ E : u ∈ C, v ∈ C'}; we say that C and C' are adjacent if E(C, C') ≠ ∅. Also, for v ∈ V, E(v, C) = E({v}, C). A vertex v is incident to a subset C if E(v, C) ≠ ∅.

Road-Map.

Section 2 presents an algorithm for collecting the topology of nearby vertices. At the end of that section, using this collected topology, the graph is partitioned into sparse and dense subgraphs. Section 3 describes the spanner construction for the sparse regime. Section 4 considers the dense regime and is organized as follows. First, Section 4.1 describes a deterministic construction of the spanner given a hitting-set algorithm as a black box. Then, Section 5 fills in this missing piece and shows deterministic constructions of small hitting sets via derandomization. Finally, Section 5.3 provides an alternative deterministic construction, with improved runtime but larger stretch.

2 Collecting Topology of Nearby Neighborhood

For simplicity of presentation, assume that k is even; for odd k, we replace the term k/2 with ⌈k/2⌉. In addition, we assume that k is non-constant: randomized constructions with O(k) rounds are known, and hence one benefits from an O(log k)-round algorithm only for non-constant k. In the full version, we show the improved deterministic constructions for constant k as well.

2.1 Computing Nearest Vertices in the Neighborhoods

In this subsection, we present an algorithm that computes for every vertex its nearest n^{1/2} vertices within distance k/2. This provides the basis for the subsequent procedures presented later on. Unfortunately, computing the nearest vertices of each vertex might require many rounds when the maximum degree is large. In particular, using Lenzen's routing [20] (which can be viewed as an O(1)-round algorithm applied when each vertex is a target and a sender of at most n messages), the vertices can learn their 2-neighborhoods in O(1) rounds when the maximum degree is bounded by n^{1/2}. Consider a vertex that is incident to a heavy vertex of degree at least n^{1/2}. Clearly, this vertex has at least n^{1/2} vertices at distance 2, but it is not clear how it can learn their identities. Although each vertex is capable of receiving n messages, the heavy neighbor might need to send n^{1/2} messages to each of its at least n^{1/2} neighbors, thus more than n messages in total. To avoid this, we compute the nearest vertices in a lighter subgraph of G with maximum degree at most n^{1/2}. The neighbors of heavy vertices might not learn their (k/2)-neighborhoods and will be handled slightly differently in Section 4. A vertex is heavy if its degree is at least n^{1/2}; the set of heavy vertices is denoted by V_heavy. Let G_light = G[V ∖ V_heavy].

For each vertex v, define its target set to be the set of its n^{1/2} closest vertices at distance at most k/2 from v in G_light (breaking ties based on IDs). Define the truncated BFS tree of v to be the tree rooted at v consisting of a shortest path in G_light from v to each vertex in its target set.

There exists a deterministic algorithm that, within O(log k) rounds, computes the truncated BFS tree of every vertex of G_light; that is, after running the algorithm, each such vertex knows its entire truncated BFS tree.

The Algorithm.

For every integer i, let r_i denote the radius collected by the end of phase i (the radii grow roughly geometrically, starting from r_0 = 2 and capped at k/2). For a radius r, we say that a vertex is r-sparse if its r-ball in G_light contains fewer than n^{1/2} vertices; otherwise it is r-dense. The algorithm starts by having each non-heavy vertex compute its 2-neighborhood in G_light in O(1) rounds using Lenzen's algorithm. In each phase i, a vertex collects information on vertices in its r_i-ball in G_light.

At the beginning of phase i, the algorithm maintains the invariant that every vertex v holds a partial BFS tree in G_light consisting of a set of already-discovered vertices, such that:

(I1) For an r_{i-1}-sparse vertex v, the discovered set is its entire r_{i-1}-ball in G_light.

(I2) For an r_{i-1}-dense vertex v, the discovered set consists of the n^{1/2} closest vertices to v in G_light.

Note that in order to maintain the invariant in phase i, it is only required that the r_{i-1}-sparse vertices collect the relevant information; for the r_{i-1}-dense vertices the invariant already holds. In phase i, each vertex v (regardless of being sparse or dense) sends its partial BFS tree to a vertex u only if (1) u belongs to v's discovered set and (2) v belongs to u's discovered set. This condition can easily be checked in a single round, as every vertex can send a message to all the vertices in its discovered set. Each vertex then uses its own distances, together with the received distances, to compute the shortest-path distance to every vertex appearing in the received sets. As a result it computes its new partial BFS tree, whose discovered set consists of the (at most n^{1/2}) vertices within distance r_i. This completes the description of phase i. We next analyze the algorithm and show that each phase can be implemented in O(1) rounds and that the invariant on the trees is maintained.
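A rough, centralized Python sketch of one such phase, under our reconstruction above (known[v] maps every vertex currently discovered by v to its distance from v; max_size plays the role of n^{1/2}; only distances are tracked and the actual tree edges are omitted):

```python
def exponentiation_phase(known, max_size, new_radius):
    """One phase of the neighborhood-collection algorithm: vertices that are
    mutually discovered exchange their partial BFS information, and each vertex
    keeps only its max_size closest vertices within the new radius."""
    updated = {}
    for v, table in known.items():
        merged = dict(table)
        for u, d_vu in table.items():
            # Exchange only between mutually close (mutually discovered) vertices.
            if u == v or u not in known or v not in known[u]:
                continue
            for w, d_uw in known[u].items():      # relax distances through u
                d = d_vu + d_uw
                if d <= new_radius and d < merged.get(w, float("inf")):
                    merged[w] = d
        # Keep only the max_size closest vertices (ties broken by ID).
        closest = sorted(merged.items(), key=lambda t: (t[1], t[0]))[:max_size]
        updated[v] = dict(closest)
    return updated
```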

Analysis.

We first show that each phase can be implemented in O(1) rounds. Note that by definition, each vertex sends its partial BFS tree (containing at most n^{1/2} vertices) only to vertices in its discovered set, and only if it belongs to their discovered sets as well. Hence, by the condition of the phase, each vertex sends at most n messages and receives at most n messages, which can be done in O(1) rounds using Lenzen's routing algorithm [20].

We show that the invariant holds by induction on i. Since all vertices first collect their 2-neighborhoods, the invariant holds for the first phase (this is also the reason for the assumption on k above, as otherwise the initial 2-neighborhood already covers the required radius and no progress would be made). Assume the invariant holds up to the beginning of phase i; we now show that it holds at the beginning of phase i+1. If v is r_i-dense, then v should not collect any further information in phase i and the assertion holds trivially.

Consider an r_i-sparse vertex v and let u be a vertex in v's target set of closest vertices within distance r_i. We fix u and show that after phase i, u belongs to v's discovered set and, in addition, v has computed a shortest path to u in G_light. Let P be a v–u shortest path in G_light. If all vertices on the r_{i-1}-length prefix of P are r_{i-1}-sparse, then the claim holds: by the induction assumption on their discovered sets, the last vertex w on this prefix knows its entire r_{i-1}-ball, and hence v can compute in phase i its shortest path to u via w.

We next consider the remaining case, where not all the vertices on this prefix are sparse. Let w be the first r_{i-1}-dense vertex (closest to v) on the r_{i-1}-length prefix of P. Observe that w must belong to v's discovered set; otherwise, v's discovered set contains vertices that are closer to v than w, which implies that these vertices are also closer to v than u, and hence u should not be in v's target set (as it is not among the closest vertices to v), leading to a contradiction. Thus, if v also belongs to w's discovered set, then w sends to v in phase i its shortest path to u; by the induction assumption on the discovered sets, w indeed knows the entire shortest path to u. It remains to consider the case where the first dense vertex w does not contain v in its discovered set, and hence did not send its information on u to v in phase i. In this case, since w belongs to v's discovered set but not vice versa, the vertex w' preceding w on P is r_{i-1}-sparse and close enough to v so that, by the induction assumption, v can compute its v–u shortest path using the shortest path it has received from w'. For an illustration, see Figure 1.

Figure 1: A v–u shortest path P, where w is the first dense vertex on its prefix and w' is the vertex preceding w.

2.2 Dividing into Sparse and Dense Regions

Thanks to the algorithm above, every non-heavy vertex computes its target set and the corresponding truncated BFS tree. The vertices are next divided into dense vertices V_dense and sparse vertices V_sparse. Morally, the dense vertices are those that have at least n^{1/2} vertices within distance at most k/2 in G. Since the subsets of nearest neighbors are computed in G_light rather than in G, this vertex division is more delicate. A vertex is dense if either (1) it is heavy, (2) it is a neighbor of a heavy vertex, or (3) its target set contains n^{1/2} vertices. Otherwise, a vertex is sparse. Let V_dense (resp., V_sparse) be the set of dense (resp., sparse) vertices in G.

Observation. For k ≥ 4, every dense vertex v satisfies |Γ_{k/2}(v, G)| ≥ n^{1/2}.

Proof. If a vertex is incident to a heavy vertex, then it has at least n^{1/2} vertices at distance 2 ≤ k/2. Otherwise, a dense vertex has n^{1/2} vertices within distance k/2 already in G_light, and hence also in G. ∎

The edges of G are partitioned into E_sparse, the set of edges incident to sparse vertices, and E_dense, the set of remaining edges (both of whose endpoints are dense). Since all the neighbors of heavy vertices are dense, all edges incident to heavy vertices belong to E_dense.

Overview of the Spanner Constructions.

The algorithm contains two subprocedures: the first takes care of the sparse edge set by constructing a spanner H_sparse, and the second takes care of the dense edge set by constructing a spanner H_dense. Specifically, these spanners will satisfy that every edge of E_sparse (resp., E_dense) has stretch at most 2k-1 in H_sparse (resp., H_dense). We note that the spanner H_dense is contained in G rather than in the subgraph induced by the dense vertices; the reason is that H_dense might contain edges incident to sparse vertices, as will be shown later. The computation of the spanner for the sparse edges, H_sparse, is done by letting each sparse vertex locally simulate a local spanner algorithm. The computation of H_dense is based on applying two levels of clustering, as in Baswana-Sen. The selection of the cluster centers is made by applying a hitting-set algorithm.

3 Handling the Sparse Subgraph

In this section, we construct the spanner H_sparse that provides a bounded stretch for the sparse edges. As we will see, the topology collected by the algorithm of Section 2 allows every sparse vertex to locally simulate a deterministic spanner algorithm on its collected subgraph, and to decide which of its edges to add to the spanner based on this local view.

Recall that every sparse vertex is non-heavy and has fewer than n^{1/2} vertices within distance k/2 in G_light. For a sparse vertex v, let G(v) denote the subgraph induced on v's collected neighborhood by the sparse edges E_sparse. By applying the topology-collection algorithm, and letting sparse vertices send their edges to the sparse vertices in their collected neighborhoods, we have:

Claim .

There exists an O(log k)-round deterministic algorithm that computes for each sparse vertex v its subgraph G(v).

Proof.

By running the topology-collection algorithm, every sparse vertex v computes all the vertices in its collected neighborhood. Note that all the neighbors of a sparse vertex are non-heavy and thus belong to G_light. Next, we let every sparse vertex broadcast the fact that it is sparse. Every sparse vertex then sends its incident edges to every sparse vertex in its collected neighborhood. Since every sparse vertex sends at most n messages and receives at most n messages, this can be done in O(1) rounds using Lenzen's routing algorithm. Consider an edge (u, w) of G(v) for a sparse vertex v. By definition, at least one endpoint, say u, is sparse; by symmetry, v belongs to u's collected neighborhood, and thus v has received all the edges incident to u. The claim follows. ∎

Our algorithm is based on an adaptation of the local algorithm of [12], which is shown to satisfy the following in our context. There exists a deterministic algorithm that constructs a (2k-1)-spanner in the LOCAL model, such that every sparse vertex v decides about its spanner edges within k/2 rounds; in particular, v can simulate this algorithm locally on G(v), and for every edge not added to the spanner, there is a path of length at most 2k-1 in the spanner between its endpoints. A useful property of the algorithm of Derbel et al. (Algorithm 1 in [12]) is that if a vertex did not terminate after i rounds, then its i-ball must contain at least n^{i/k} vertices (this algorithm works only for unweighted graphs, and hence our deterministic algorithms are for unweighted graphs; currently, there are no local deterministic algorithms for weighted graphs). Thus, in our context, every sparse vertex terminates after at most k/2 rounds, since by definition it has fewer than n^{1/2} = n^{(k/2)/k} vertices within distance k/2. We also show that in order to simulate these k/2 rounds of the algorithm at v, it is sufficient for v to know all the neighbors of the vertices in its neighborhood, and these edges are contained in G(v). The analysis of Lemma 3 appears in Appendix C.

We next describe the algorithm that computes H_sparse. Every sparse vertex v computes G(v) in O(log k) rounds and simulates the local spanner algorithm in that subgraph. Let H(v) be the set of edges added to the spanner in the local simulation at v. A sparse vertex v sends to each sparse vertex u in its neighborhood the set of all edges of H(v) incident to u. Hence, each sparse vertex sends at most n messages (at most n^{1/2} edges to each of its at most n^{1/2} collected vertices). In a symmetric manner, every vertex receives at most n messages, and this step can be done in O(1) rounds using Lenzen's algorithm. The final spanner H_sparse is the union of the received edge sets. The stretch argument is immediate by the correctness of the local algorithm and the fact that all the edges added to the spanner in the local simulations are indeed added to H_sparse. The size argument is also immediate, since we only add edges that the local algorithm would have added when run on the entire graph.

Algorithm for the sparse regime (code for a sparse vertex v):

  1. Apply the topology-collection algorithm of Section 2 to compute G(v).

  2. Locally simulate the local spanner algorithm of [12] in G(v), and let H(v) be the set of edges added to the spanner in this simulation.

  3. Send the edges of H(v) to their corresponding sparse endpoints.

  4. Add the received edges to the spanner H_sparse.

4 Handling the Dense Subgraph

In this section, we present the construction of the spanner H_dense, which provides stretch 2k-1 for every dense edge. Here we exploit the fact that the neighborhood of each dense vertex is large, and hence there exists a small hitting set that covers all these neighborhoods. The structure of our arguments is as follows. First, we describe a deterministic construction of H_dense using a hitting-set algorithm as a black box; this immediately implies a randomized spanner construction in O(log k) rounds. Then, in Section 5, we fill in this last missing piece and show deterministic constructions of hitting sets.

Constructing a spanner for the dense subgraph via hitting sets.

Our goal is to cluster all dense vertices into a small number of low-depth clusters. This translates into the following hitting-set problem, defined in [7, 30, 15]: given a collection A_1, …, A_n of subsets of V, each of size at least Δ, compute a subset Z ⊆ V of cardinality O((n/Δ) log n) that intersects (i.e., hits) each subset A_i. A hitting set of this size is referred to as a small hitting set.

We prove the next lemma by describing the construction of the spanner H_dense given an algorithm that computes small hitting sets. In Section 5, we complement this lemma by describing several constructions of hitting sets. Let G be an n-vertex graph and let A_1, …, A_n be a collection of subsets, each consisting of at least Δ vertices from the (k/2)-neighborhood of the corresponding node, such that each node v knows its set A_v. Given a hitting-set algorithm that constructs a small hitting set for such a collection within r rounds, there exists a deterministic algorithm for constructing H_dense within O(r + log k) rounds. The next definition is useful in our context.

d-depth Clustering.

A cluster is a subset of vertices, and a clustering consists of vertex-disjoint clusters. For a positive integer d, a clustering is a d-depth clustering if for each cluster C, the graph contains a tree of depth at most d rooted at the cluster center of C and spanning all its vertices.

4.1 Description of the Algorithm

The algorithm is based on clustering the dense vertices in two levels, in a Baswana-Sen-like manner. The first clustering is a (k/2)-depth clustering covering all the dense vertices. The second clustering is a low-depth clustering that covers only a subset of the dense vertices. (For odd k, the term k/2 is replaced by ⌈k/2⌉.)

Defining the first level of clustering.

Recall that by employing the topology-collection algorithm, every non-heavy vertex knows its target set containing its nearest n^{1/2} vertices in G_light. For every heavy vertex, let its set simply be its set of neighbors. Let W be the set of all non-heavy vertices that are neighbors of heavy vertices. By definition, W ⊆ V_dense. Note that for every dense vertex not in W, its set contains n^{1/2} vertices. The vertices of W are in G_light and hence have computed their target sets; however, there is no guarantee on the size of these sets.

To define the clustering of the dense vertices, the algorithm applies the hitting-set algorithm on the sets of the dense vertices not in W and of the heavy vertices. Since every such set has size at least n^{1/2}, the output of the hitting-set algorithm is a subset Z of cardinality Õ(n^{1/2}) that hits all these sets.

We will now construct the clusters of the first clustering with the vertices of Z as the cluster centers. To make sure that the clusters are vertex-disjoint and connected, we first compute the clustering in the subgraph G_light, and then cluster the remaining dense vertices that are not yet clustered. For every vertex v (either dense or sparse), we say that v is clustered if its target set contains a vertex of Z. In particular, every dense vertex not in W is clustered (the neighbors of heavy vertices are either clustered or not). For every clustered vertex v (even a sparse one), let the cluster center of v be the closest vertex of Z to v, breaking shortest-path ties based on IDs. Since v knows its entire truncated BFS tree, it knows its distance to all the vertices in its target set, and in addition it can compute its next hop on the shortest path to its cluster center. Each clustered vertex adds the edge to this next hop to the spanner H_dense. It is easy to see that this defines a (k/2)-depth clustering in G_light that covers all the clustered dense vertices; in particular, each cluster has in the spanner a tree of depth at most k/2 that spans all its vertices. Note that in order for the clusters to be connected in the spanner, it was crucial that all vertices compute their cluster centers (if such exist), and not only the dense vertices. We next turn to cluster the remaining dense vertices. Every heavy vertex picks a closest vertex of Z among its neighbors as its cluster center (such a neighbor exists since Z hits its neighbor set), adds the corresponding edge to the spanner, and broadcasts its cluster center to all its neighbors. Every neighbor of a heavy vertex that is not yet clustered joins the cluster of that center and adds the corresponding edge to the spanner. Overall, the clusters centered at the vertices of Z cover all the dense vertices, and all the vertices in a cluster are connected in the spanner by a tree of depth O(k). Formally, the first clustering is the partition of the clustered vertices according to their cluster centers in Z.
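A rough, centralized Python sketch of this first level of clustering, under our reconstruction above (target_set, dist, next_hop, Z, heavy and nbrs are hypothetical data structures holding, respectively, the collected target sets of the non-heavy vertices, the corresponding distances and next hops, the hitting set, the heavy vertices, and the adjacency lists):

```python
def first_level_clustering(target_set, dist, next_hop, Z, heavy, nbrs, spanner):
    """First-level clustering: every (non-heavy) vertex whose target set is hit
    by Z joins the closest center and adds its next-hop edge; heavy vertices
    join a neighboring center and their unclustered neighbors follow."""
    center_of = {}
    # Step 1: cluster every vertex (dense or sparse) whose target set meets Z.
    for v, targets in target_set.items():
        hits = [c for c in targets if c in Z]
        if hits:
            c = min(hits, key=lambda x: (dist[v][x], x))   # closest center, ties by ID
            center_of[v] = c
            if v != c:
                spanner.add((v, next_hop[v][c]))           # one hop toward the center
    # Step 2: every heavy vertex joins a center among its neighbors and
    # broadcasts it; its still-unclustered neighbors join the same cluster.
    for h in heavy:
        c = min(u for u in nbrs[h] if u in Z)              # exists since Z hits h's neighbors
        center_of[h] = c
        spanner.add((h, c))
        for u in nbrs[h]:
            if u not in center_of:
                center_of[u] = c
                spanner.add((u, h))
    return center_of
```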

Defining the second level of clustering.

Every vertex that is clustered in the first clustering broadcasts its cluster center to all its neighbors. This allows every dense vertex to compute the subset of centers of its adjacent clusters in the first clustering. Consider two cases depending on the cardinality of this subset. Every vertex adjacent to at most n^{1/k} clusters adds to the spanner an arbitrary edge to each of its adjacent clusters. It remains to handle the remaining vertices, namely those adjacent to more than n^{1/k} clusters; these vertices will be clustered in the second level of clustering. To compute the centers of the second-level clusters, the algorithm applies the hitting-set algorithm on the collection of subsets with