
In this paper, we bring the main tools of the Laplacian paradigm to the Broadcast Congested Clique. We introduce an algorithm to compute spectral sparsifiers in a polylogarithmic number of rounds, which directly leads to an efficient Laplacian solver. Based on this primitive, we consider the linear program solver of Lee and Sidford (FOCS 2014). We show how to solve certain linear programs up to additive error ϵ with n constraints on an n-vertex Broadcast Congested Clique network in Õ(√(n)log(1/ϵ)) rounds. Using this, we show how to find an exact solution to the minimum cost flow problem in Õ(√(n)) rounds.


## 1 Introduction

In this paper, we study algorithms for the Broadcast Congested Clique (BCC) model [DKO12]. In this model, the (problem-specific) input is distributed among several processors and the goal is that at the end of the computation each processor knows the output or at least the share of the output relevant to it. The computation proceeds in rounds and in each round each processor can send one message to all other processors. We can also view the communication as happening via a shared blackboard to which each processor may write (in the sense of appending) at most one message per round. The main metric in designing and analyzing algorithms for the Broadcast Congested Clique is the number of rounds performed by the algorithm.

A typical way of distributing, for example, an input matrix among the processors is that initially processor $i$ only knows row $i$ of the matrix. In many graph problems, this input matrix is the adjacency matrix of the graph. If communication with other processors is only possible along the edges of this graph, then the resulting model is often called the Broadcast CONGEST model [Lynch96]. Note that the unicast versions of these models, in which each processor may send a different message to each (neighboring) processor, are known as the Congested Clique [LPSPP05] and the CONGEST model [Peleg00], respectively.

In this paper, we bring the main tools of the so-called Laplacian paradigm to the BCC model. In a seminal paper, Spielman and Teng developed an algorithm for approximately solving linear systems of equations with a Laplacian coefficient matrix in a near-linear number of operations [ST14]. The Laplacian paradigm [Teng10] refers to exploring the applications of this fast primitive in algorithm design. In a broader sense, this paradigm is also understood as the more general idea of employing linear algebra methods from continuous optimization outside of their traditional domains. Using such methods is very natural in distributed models because a matrix-vector multiplication can be carried out in a single round if each processor stores one coordinate of the vector. In recent years, this methodology has been successfully employed in the CONGEST model [GKK+15, BeckerFKL21] and in particular, solvers for Laplacian systems with near-optimal round complexity have been developed for the CONGEST model – in networks with arbitrary topology [FGLP+20] and in bounded-treewidth graphs [AGL21] – and for the HYBRID model [AGL21]. In this paper, we switch the focus to the BCC model and show that it allows a faster implementation of the basic Laplacian primitive.
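The one-round matrix-vector multiplication mentioned above can be illustrated with a small simulation. This is only a sketch under the assumption that processor $i$ holds row $i$ of the matrix and coordinate $i$ of the vector; the function name `bcc_matvec` and the blackboard list are illustrative and not taken from the paper.

```python
def bcc_matvec(rows, x):
    """Simulate one Broadcast Congested Clique round computing y = A x.

    rows[i] is the row of A known to processor i; x[i] is the coordinate
    of the vector stored at processor i (an assumed, illustrative layout).
    """
    n = len(rows)
    # The single round: every processor i writes its value x[i] to the blackboard.
    blackboard = [x[i] for i in range(n)]
    # Local computation: processor i combines its own row with the blackboard.
    return [sum(rows[i][j] * blackboard[j] for j in range(n)) for i in range(n)]


# Laplacian of a star with center 0 and leaves 1 and 2 (unit weights).
A = [[2, -1, -1], [-1, 1, 0], [-1, 0, 1]]
y = bcc_matvec(A, [1, 2, 3])
print(y)  # [-3, 1, 2]
```

After the round, processor $i$ knows coordinate $i$ of the product, which is the same distributed layout the input vector had.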

What further makes the BCC model intriguing is that – in contrast to the Congested Clique – for several problems no tailored BCC algorithms are known that are significantly faster than low-diameter versions of (Broadcast) CONGEST model algorithms. Consider, for example, the single-source shortest path problem. In the (Broadcast) CONGEST model, the fastest known algorithm takes $\tilde O(\sqrt{n} D^{1/4} + D)$ rounds [ChechikM20], where $D$ is the diameter of the underlying (unweighted) communication network. (Throughout the introductory part of this paper we often assume that all weights of graphs and entries of matrices are polynomially bounded to simplify some statements of running time bounds.) In the BCC model, the state of the art for this problem is $\tilde O(\sqrt{n})$ rounds [Nanongkai14], which essentially is not more efficient than the special case of the Broadcast CONGEST model. In the Congested Clique model, however, $\sqrt{n}$ is not a barrier for this problem as it can be solved in $\tilde O(n^{1/6})$ rounds [CDKL21] on undirected graphs. A similar classification can be made for directed graphs [ForsterN18, Censor-HillelKK19]. This naturally leads to the question whether BCC algorithms can be developed that are faster than their CONGEST model counterparts, since it is not clear which one dominates the other in strength.

It has recently been shown that in the CONGEST model, the maximum flow problem as well as the unit-capacity minimum cost flow problem can be solved in rounds [FGLP+20], where denotes the number of edges of the input graph; note that this round complexity can only be sublinear in for sparse graphs.

#### Our contributions.

Our main result is an algorithm that solves the minimum cost flow problem (which generalizes both the single-source shortest path problem and the maximum flow problem) in $\tilde O(\sqrt{n})$ rounds in the BCC model, which in particular is sublinear for any graph density and matches the currently known upper bounds for the single-source shortest paths problem. Note that in contrast to the algorithm of Forster et al. [FGLP+20], we do not need to assume unit capacities.

###### Theorem 1.1.

There exists a Broadcast Congested Clique algorithm that, given a directed graph with integral costs and capacities with and , computes a minimum cost maximum $s$-$t$ flow with high probability in rounds.

In obtaining this result, we develop machinery of the Laplacian paradigm that might be of independent interest. The first such tool is an algorithm for computing a spectral sparsifier in the Broadcast CONGEST model.

###### Theorem 1.2.

There exists an algorithm that, given a graph with positive real weights satisfying and an error parameter , with high probability outputs a -spectral sparsifier of , where . Moreover, we obtain an orientation on such that with high probability each vertex has out-degree . The algorithm runs in rounds in the Broadcast CONGEST model.

At a high level, our sparsifier algorithm is a modification of the CONGEST-model algorithm of Koutis and Xu [KX16]; essentially, uniform edge sampling is trivial in the CONGEST model, but challenging in the Broadcast CONGEST model. Note that the sparsifier algorithm of Koutis and Xu being restricted to the CONGEST model is a major obstacle for implementing the CONGEST-model Laplacian solver of Forster et al. [FGLP+20] also in the Broadcast CONGEST model.

Making the sparsifier known to every processor leads to a simple residual-correction algorithm for solving systems of linear equations with a Laplacian coefficient matrix up to high precision in the BCC model. Note that there is a reduction [Gremban96] from solving linear equations with symmetric diagonally dominant (SDD) coefficient matrices to solving linear equations with Laplacian coefficient matrices, which also applies in the Broadcast Congested Clique.

###### Theorem 1.3.

There exists an algorithm in the Broadcast Congested Clique model that, given a graph , with positive real weights satisfying and Laplacian matrix , a parameter , and a vector , outputs a vector such that , for some satisfying . The algorithm needs preprocessing rounds and takes rounds for each instance of .

Finally, we show how to implement the algorithm of Lee and Sidford [LS14] for solving linear programs up to small additive error in $\tilde O(\sqrt{\mathrm{rank}})$ iterations in the BCC model. (In the more technical parts of our paper we explicitly refer to the arXiv preprints [LS13] and [LS19] instead of the conference version [LS14].) Here, the rank refers to the constraint matrix of the LP and in each iteration a linear system needs to be solved. If the constraint matrix has a special structure – which is the case for the LP formulation of the minimum cost flow problem – then a high-precision Laplacian solver can be employed for this task.

###### Theorem 1.4.

Let be a constraint matrix with , let be a demand vector, and let be a cost vector. Moreover, let be a given initial point in the feasible region . Suppose a Broadcast Congested Clique network consists of vertices, where each vertex knows both every entire -th row of for which and knows if . Moreover, suppose that for every and positive diagonal we can compute up to precision in rounds. Let . Then with high probability the Broadcast Congested Clique algorithm LPSolve outputs a vector with in rounds.

While this approach of solving LPs is inherently parallelizable (as the PRAM depth analysis of Lee and Sidford indicates), several steps pose a challenge for the BCC model and require more than a mere “translation” between models. In particular, we need to use a different version of the Johnson-Lindenstrauss lemma to approximate leverage scores. Furthermore, we give a BCC algorithm for projecting vectors onto a mixed norm ball.

As in the approach of Lee and Sidford, our main result on minimum cost maximum flow then follows from plugging a suitable linear programming formulation of the problem into the LP solver.

#### Overview.

We provide a visual overview of the results in this paper and how they are interconnected in Figure 1.

In Section 4, we present our LP solver. Following Lee and Sidford, we write the constraint as $A^T x = b$ instead of the more common $Ax = b$, since this means that $n$ corresponds with the number of vertices and $m$ with the number of edges in LP formulations of flow problems. Given a linear program of the form

$$\min_{x \in R \,:\, A^T x = b} c^T x,$$

for some constraint matrix and some convex region , Lee and Sidford [LS14, LS19] show how to find an -approximate solution in time. An implementation of this algorithm in the Broadcast Congested Clique is rather technical and needs new subroutines, the main one being our Laplacian solver.

The algorithm is an interior point method that uses weighted path finding to make progress. The weights used are the Lewis weights, which can be approximated up to sufficient precision using the computation of leverage scores, which are defined as , where in our case , for some diagonal matrix . Computing leverage scores exactly is expensive, hence these too are approximated. This can be done using the observation that and the Johnson-Lindenstrauss lemma [JL84], which states that there exists a map such that , for polylogarithmic . Nowadays, several different (randomized) constructions for exist. A common choice in the realm of graph algorithms [SS11, LS19] is to use Achlioptas’ method [Achlioptas03], which samples each entry of with a binary coin flip. However, this is not feasible in the Broadcast Congested Clique: we would need a coin flip for every edge, which can be performed by one of the endpoints, but cannot be communicated to the other endpoint due to the broadcast constraint. Instead we use the result of Kane and Nelson [KN12], which states that we need only a polylogarithmic number of random bits in total. These can simply be sampled by one vertex and broadcast to all the others, which then internally construct . Now if we can multiply both and by a vector, and solve linear systems involving , for diagonal , then we can compute these leverage scores efficiently. These demands on are not unreasonable when we consider graph problems, because in such cases the constraint matrix will adhere to the structure of the graph Laplacian, and hence our Laplacian solver can be applied.
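The idea of sharing a few random bits and letting every vertex expand them locally into the same projection matrix can be sketched as follows. This is not the Kane and Nelson construction: as a stand-in, the sketch expands a broadcast seed with Python's pseudorandom generator into a sign matrix, and makes no claim about the resulting distortion guarantee. The names `sign_matrix`, `project`, and the seed value are hypothetical.

```python
import math
import random


def sign_matrix(seed, k, m):
    """Expand a shared seed into a k x m matrix with entries +-1/sqrt(k)."""
    rng = random.Random(seed)  # deterministic given the seed
    scale = 1.0 / math.sqrt(k)
    return [[scale if rng.random() < 0.5 else -scale for _ in range(m)]
            for _ in range(k)]


def project(Q, x):
    """Compute Q x for a matrix Q given as a list of rows."""
    return [sum(q * xi for q, xi in zip(row, x)) for row in Q]


# One vertex samples the seed and broadcasts it; every vertex then builds
# the matrix internally, without any further communication.
seed = 12345
Q_at_u = sign_matrix(seed, 64, 10)
Q_at_v = sign_matrix(seed, 64, 10)
assert Q_at_u == Q_at_v

x = [1.0] * 10
ratio = math.sqrt(sum(c * c for c in project(Q_at_u, x))) / math.sqrt(10.0)
print(ratio)  # close to 1 for most seeds, but not guaranteed by this sketch
```

The point of the sketch is only the communication pattern: the broadcast message is a short seed, while the matrix itself is never transmitted.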

A second challenge in implementing Lee and Sidford’s LP solver is a subroutine that computes projections onto a mixed norm ball. To be precise: for vectors $a$ and $l$ distributed over the network, the goal is to find

$$\arg\max_{\|x\|_2 + \|l^{-1}x\|_\infty \le 1} a^T x.$$

We show that we can solve this maximization problem when we know the sums , , and for all . Computing such a sum for fixed is feasible in a polylogarithmic number of rounds. Moreover, we show that we do not need to inspect these sums for all , but that we can do a binary search, which reduces the total time complexity to polylogarithmic.

Following Lee and Sidford [LS14], we apply the LP solver to an LP formulation of the minimum cost maximum flow problem in Section 5. The corresponding constraint matrix has rows and thus rank . Furthermore, (for any diagonal matrix ) is symmetric diagonally dominant and thus can be approximated to high precision in a polylogarithmic number of rounds with our Laplacian solver. We only need to solve the LP up to precision , since we can round the approximate solution to an exact solution. Hence, the minimum cost maximum flow LP can be solved in rounds.

## 2 Preliminaries

First we detail the models we will be working with. Next, we review spanners and sparsifiers, and how to construct the latter from the former. Then we show how spectral sparsifiers can be used for solving Laplacian systems. Finally, we introduce flow problems on weighted graphs.

### 2.1 Models

In this paper, we consider multiple variants of message passing models with bandwidth constraints on the communication. Let us start by defining the CONGEST model. The CONGEST model [Peleg00] consists of a network of processors, which communicate in synchronous rounds. In each round, a processor can send information to its neighbors over a non-faulty link with limited bandwidth. We model the network of processors by a graph , where we identify the processors with the vertices and the communication links with the edges. We write and . Each vertex has a unique identifier of size , initially only known by the vertex itself and its neighbors. Computation in this model is done in rounds. At the start of each round, each vertex can send one message to each of its neighbors, and receives messages from them. The messages are of size at most . Before the next round, each vertex can perform (unlimited) internal computation. We measure the efficiency of an algorithm by the number of rounds.

In the CONGEST model, each vertex can send distinct messages to each of its neighbors. A stricter assumption on message passing is that each vertex sends the same message to each of its neighbors, essentially broadcasting it to its neighbors. The CONGEST model together with this assumption is called the Broadcast CONGEST model [Lynch96].

Alternatively, we can let the communication network be independent of the graph being studied. More precisely, we allow communication between each pair of vertices. Together with the bandwidth constraint, this is called the Congested Clique [LPSPP05]. If we also impose the broadcast constraint, we have the Broadcast Congested Clique [DKO12].

### 2.2 Spanners and Spectral Sparsification

The Laplacian matrix of a weighted graph , or the graph Laplacian, is a matrix defined by

$$L_{uv} = \begin{cases} -w(u,v) & \text{if } u \text{ is adjacent to } v; \\ \sum_{x \in V} w(u,x) & \text{if } u = v; \\ 0 & \text{else.} \end{cases}$$

Alternatively, we can define the Laplacian matrix in terms of the edge-vertex incidence matrix , defined by

The Laplacian then becomes $L = B^T W B$, where $W$ is the diagonal matrix defined by the weights: $W_{ee} = w(e)$.
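As a small sanity check, the entrywise definition of the Laplacian and the factorization $B^T W B$ can be compared on a toy graph. The sketch below assumes that row $e$ of the edge-vertex incidence matrix has $+1$ at one endpoint of $e$ and $-1$ at the other; which endpoint gets which sign is an arbitrary choice here and does not affect the product. All function names are illustrative.

```python
def laplacian_from_edges(n, edges):
    """Entrywise definition: weighted degree on the diagonal, -w off it."""
    L = [[0.0] * n for _ in range(n)]
    for u, v, w in edges:
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return L


def laplacian_from_incidence(n, edges):
    """Build L as B^T W B, with one row of B per edge."""
    # Assumed convention: +1 at the first listed endpoint, -1 at the second.
    B = []
    for u, v, _ in edges:
        row = [0.0] * n
        row[u], row[v] = 1.0, -1.0
        B.append(row)
    W = [w for _, _, w in edges]  # diagonal of the weight matrix
    return [[sum(B[e][i] * W[e] * B[e][j] for e in range(len(edges)))
             for j in range(n)] for i in range(n)]


edges = [(0, 1, 2.0), (1, 2, 3.0), (0, 2, 1.0)]
assert laplacian_from_edges(3, edges) == laplacian_from_incidence(3, edges)
print(laplacian_from_incidence(3, edges)[0])  # [3.0, -2.0, -1.0]
```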

Spectral sparsifiers were first introduced by Spielman and Teng [ST11]. A spectral sparsifier is a (reweighted) subgraph that has approximately the same Laplacian matrix as the original graph.

###### Definition 2.1.

Let be a graph with weights , and . We say that a subgraph with weights is a -spectral sparsifier for if we have for all :

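$$(1-\epsilon)\, x^T L_H x \le x^T L_G x \le (1+\epsilon)\, x^T L_H x, \qquad (1)$$

where $L_G$ and $L_H$ are the Laplacians of $G$ and $H$ respectively.

We introduce the short-hand notation $A \preceq B$ when $B - A$ is positive semi-definite. This reduces equation (1) to $(1-\epsilon) L_H \preceq L_G \preceq (1+\epsilon) L_H$.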

Koutis and Xu [KX16] showed how to compute a spectral sparsifier by repeatedly computing spanners. This technique was later slightly improved by Kyng et al. [KPPS17]. Spanners are a special type of spanning subgraphs, where we demand that distances are preserved up to a constant factor. Trivially, any graph is a spanner of itself. In practice, the goal will be to find sparse subgraphs that are still spanners for the input graph.

###### Definition 2.2.

Let be a graph with weights . We say that a subgraph with weights is a spanner of stretch for if for each we have

$$d_S(u,v) \le \alpha\, d_G(u,v),$$

where we write for the distance from to in . A -bundle spanner of stretch is a union , where each is a spanner of stretch in .
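The stretch condition can be checked directly on small instances by comparing all-pairs distances in the graph and in the candidate subgraph. The following sketch is purely sequential and is not part of the distributed algorithm; the helper names `distances` and `stretch` are illustrative.

```python
import heapq


def distances(n, edges, source):
    """Dijkstra on an undirected weighted graph given as (u, v, w) triples."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist = [float("inf")] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def stretch(n, graph_edges, spanner_edges):
    """Largest ratio d_S(u, v) / d_G(u, v) over all connected pairs u != v."""
    worst = 1.0
    for s in range(n):
        dg = distances(n, graph_edges, s)
        ds = distances(n, spanner_edges, s)
        for t in range(n):
            if t != s and dg[t] < float("inf"):
                worst = max(worst, ds[t] / dg[t])
    return worst


# A unit-weight 4-cycle; dropping one edge forces a detour of length 3.
cycle = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 1.0)]
path = cycle[:3]
print(stretch(4, cycle, path))  # 3.0
```

In this example the path is a spanner of stretch 3 for the cycle, and the cycle is trivially a spanner of stretch 1 for itself.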

The algorithm of Koutis and Xu is relatively simple: compute a -bundle spanner of stretch , sample the remaining edges with probability , repeat for iterations on the computed bundle spanner and sampled edges. The sparsifier then consists of the last bundle spanner, together with the set of edges left after the iterations, where edges are reweighted in a certain manner. In the original algorithm, the stretch was fixed, but the number of spanners in each bundle grew in each iteration. Kyng et al. [KPPS17] showed that can be kept constant throughout the algorithm, leading to a sparser result.

### 2.3 Laplacian Solving

We consider the following problem. Let $L_G$ be the Laplacian matrix for some graph $G$ on $n$ vertices. Given $b \in \mathbb{R}^n$, we want to solve $L_G x = b$. Solving Laplacian equations exactly can be computationally demanding. Therefore, we consider an approximation to this problem: we want to find $y \in \mathbb{R}^n$ such that $\|x - y\|_{L_G} \le \epsilon \|x\|_{L_G}$, where we write $\|x\|_M := \sqrt{x^T M x}$ for any $M \in \mathbb{R}^{n \times n}$. One way to approach this is by using a spectral sparsifier of $G$. To this end, we use preconditioned Chebyshev iteration, a well-known technique from numerical analysis [Axelsson96, Saad03]. The statement below most closely resembles the formulation of Peng [peng13].

###### Theorem 2.3.

Suppose we have symmetric positive semi-definite matrices $A, B \in \mathbb{R}^{n \times n}$, and a parameter $\kappa \ge 1$ satisfying

$$A \preceq B \preceq \kappa A.$$

Then there exists an algorithm that, given a vector $b$ and parameter $\epsilon$, returns a vector $y$ such that

$$\|x - y\|_A \le \epsilon \|x\|_A,$$

for some $x$ satisfying $Ax = b$. The algorithm takes $O(\sqrt{\kappa} \log(1/\epsilon))$ iterations, each consisting of multiplying $A$ by a vector, solving a linear system involving $B$, and a constant number of vector operations.

This yields the following corollary for Laplacian solving using spectral sparsifiers.

###### Corollary 2.4.

Let $G$ be a weighted graph on $n$ vertices, let $\epsilon$ be a parameter, and let $b$ be a vector. Suppose $H$ is a $(1 \pm \frac{1}{2})$-spectral sparsifier for $G$. Then there exists an algorithm that outputs a vector $y$ such that $\|x - y\|_{L_G} \le \epsilon \|x\|_{L_G}$, for some $x$ satisfying $L_G x = b$. The algorithm takes $O(\log(1/\epsilon))$ iterations, each consisting of multiplying $L_G$ by a vector, solving a Laplacian system involving $L_H$, and a constant number of vector operations.

###### Proof.

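As $H$ is a sparsifier for $G$, we have: $(1 - \frac{1}{2}) L_H \preceq L_G \preceq (1 + \frac{1}{2}) L_H$, which we can rewrite to

$$L_G \preceq \left(1 + \tfrac{1}{2}\right) L_H \preceq \frac{1 + \frac{1}{2}}{1 - \frac{1}{2}} L_G.$$

We set $A = L_G$ and $B = (1 + \frac{1}{2}) L_H$, which are clearly both symmetric positive semi-definite. Furthermore, we set $\kappa = \frac{1 + 1/2}{1 - 1/2} = 3$. We apply Theorem 2.3 with these settings to obtain the result. ∎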
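To make the role of the preconditioner concrete, here is a minimal sketch of a residual-correction loop. It deliberately uses plain preconditioned Richardson iteration rather than Chebyshev iteration, so it needs on the order of $\kappa \log(1/\epsilon)$ iterations instead of $\sqrt{\kappa} \log(1/\epsilon)$, and it uses small positive definite matrices instead of singular Laplacians. The matrices and function names are illustrative assumptions.

```python
def solve(M, r):
    """Solve M z = r by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [row[:] + [r[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[i][j] -= f * aug[c][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def preconditioned_richardson(A, B, b, iterations):
    """Iterate x <- x + B^{-1}(b - A x), assuming A <= B <= kappa * A."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        # One multiplication by A ...
        residual = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        # ... and one linear solve in the preconditioner B.
        correction = solve(B, residual)
        x = [x[i] + correction[i] for i in range(n)]
    return x


A = [[2.0, -1.0], [-1.0, 2.0]]  # eigenvalues 1 and 3
B = [[3.0, 0.0], [0.0, 3.0]]    # A <= B <= 3 A, so kappa = 3
x = preconditioned_richardson(A, B, [1.0, 0.0], 60)
print(x)  # approaches the exact solution (2/3, 1/3)
```

Each iteration has the same shape as in the corollary: one multiplication by $A$, one solve in $B$, and a constant number of vector operations. Here the error shrinks by a factor of at most $1 - 1/\kappa = 2/3$ per step.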

### 2.4 Flow Problems

In this section we formally define the maximum flow and the minimum cost maximum flow problems. Let be a directed graph, with capacities , and designated source and target vertices . We say is an - flow if

1. for each vertex we have ;

2. for each edge we have .

The value of the flow is defined as . The maximum flow problem is to find a flow of maximum value. Additionally, we can have costs on the edges: . The cost of the flow is defined as . The minimum cost maximum flow problem is to find a flow of minimum cost among all flows of maximum value.

Both problems allow for a natural linear program formulation. We present one for the minimum cost maximum flow problem, as this is the more general problem. Denote by $B$ the edge-vertex incidence matrix (see Section 2.2). Then we can write this as:

$$\min_{0 \le x \le c} q^T x \quad \text{such that} \quad Bx = F e_t - F e_s,$$

for the value of the maximum flow, and and the vectors defined by . The answer to the minimum cost maximum flow problem is then found by a binary search over .
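The constraints of this LP can be checked mechanically for a candidate flow. The sketch below assumes the sign convention that flow on an edge $(u,v)$ counts negatively at the tail $u$ and positively at the head $v$, so that the net inflow at $t$ equals $F$; the graph, costs, and function names are made up for illustration.

```python
def incidence_times_flow(n, edges, x):
    """Compute B x: the net inflow at each vertex for flow x on directed edges."""
    net = [0] * n
    for (u, v), xe in zip(edges, x):
        net[u] -= xe  # flow leaves the tail u (assumed sign convention)
        net[v] += xe  # flow enters the head v
    return net


def is_feasible(n, edges, capacities, x, s, t, F):
    """Check the box constraint 0 <= x <= c and B x = F e_t - F e_s."""
    demand = [0] * n
    demand[t] += F
    demand[s] -= F
    within_caps = all(0 <= xe <= ce for xe, ce in zip(x, capacities))
    return within_caps and incidence_times_flow(n, edges, x) == demand


def cost(q, x):
    """Objective value q^T x."""
    return sum(qe * xe for qe, xe in zip(q, x))


edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
c = [2, 1, 2, 1]  # capacities
q = [1, 5, 1, 1]  # costs
x = [2, 1, 2, 1]  # a flow of value 3 from vertex 0 to vertex 3
print(is_feasible(4, edges, c, x, 0, 3, 3), cost(q, x))  # True 10
```

For a guessed value of $F$ that exceeds the maximum flow, no $x$ passes this check, which is what makes the binary search over $F$ possible.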

## 3 Spectral Sparsifiers and Laplacian Solving

In this section, we show how to construct spectral sparsifiers in the Broadcast CONGEST model, so in particular also for the Broadcast Congested Clique. We do this following the method of Koutis and Xu [KX16], which consists of repeatedly computing spanners and sampling the remaining edges, see Section 2.2. While sampling edges is easy in the CONGEST model, it is highly non-trivial in the Broadcast CONGEST model. The reason for this is that in the CONGEST model the sampling of an edge can be done by one endpoint, and communicated to the other endpoint. In the Broadcast CONGEST model, the sampling can be done by one endpoint, but the result cannot be communicated efficiently to the other endpoint due to the broadcast constraint. To circumvent this, we show that the sampling needed for spectral sparsification can be done on the fly, rather than a priori in each iteration. Moreover, we show the result can be communicated implicitly. In Section 3.1, we show how to compute spanners when edges exist only with given probabilities: whether an edge exists is evaluated on the fly and (implicitly) communicated to the other endpoint. In Section 3.2 we show how to use this spanner construction to compute spectral sparsifiers in the Broadcast CONGEST model.

### 3.1 Spanners with Probabilistic Edges

Our goal is to compute a -spanner for a given probabilistic graph. More precisely, let be an undirected, weighted graph on vertices, with a probability function on the edges, and the parameter for the stretch of the spanner. We will give an algorithm Spanner(,,,,) that computes a subset , and divides this into two sets , such that each edge is part of independently with probability . This results in a -spanner for all graphs , where . Since this is a distributed algorithm, the output comes in a local form. At the end, each vertex has identified and , where .

When $p \equiv 1$, our algorithm essentially reduces to the algorithm of Baswana and Sen [BS07]. All computational steps coincide, but a difference in communication remains. The reason is that in our algorithm the weights of edges are included in the communication. Depending on the magnitude of the weights, this can result in multiple rounds for each message, and consequently more rounds in total.

For the presentation of Baswana and Sen’s algorithm, we follow the equivalent formulation of Becker et al. [BeckerFKL21], which can be found in Appendix A. The general idea is that clusters are formed and revised through a number of phases. In each phase, a few of the existing clusters are sampled. These clusters move on to the next phase. Vertices from an unsampled cluster try to connect to a sampled cluster and to some neighboring clusters. As edges only exist with a certain probability, they need to be sampled before they can be used. We will make sure that the two endpoints of an edge never try to use it at the same time. When a vertex has tried to use an edge, the edge will always be broadcast if it exists. If not, it turns out that the other endpoint of this edge will be able to deduce this, without it being communicated explicitly.

Whenever we speak of the neighbors of a vertex , denoted by , we mean all neighbors that do not lie in the set of ‘deleted neighbors’: . Note that this set of neighbors will be subject to change throughout the process, as the number of elements in grows.

Step 1: Cluster marking
Initially, each vertex is a singleton cluster: . The main part of the algorithm will be ‘phases’, indexed . In phase , the center of each cluster (the first vertex in the cluster) marks the cluster with probability and broadcasts this result to the cluster. These clusters will move on to the next phase: is defined to be the set of clusters marked in phase . We define the identifier of a cluster to equal the ID of the center of the cluster. Each phase consists of cluster marking, followed by steps 2 and 3.

Step 2: Connecting to marked clusters
Let be a vertex in an unmarked cluster . The first thing does, is trying to connect to one of the marked clusters. It does this using the procedure Connect. Hereto we define to be the set of all neighbors of which lie in a marked cluster: . Now we let . Note that if , Connect returns . If , we broadcast . If it returns , we add to , joins the cluster of (it stores this decision by saving ), and we broadcast . In both cases, we add to .

After this step, all vertices in unmarked clusters may have joined marked clusters, and they have updated their sets by adding , and by adding . We also want to propagate these updates in to the neighbors of . This is easy for , since we can broadcast . However, we do not want to broadcast the set , as it can be large. Instead we make use of the choices in Connect to communicate changes in implicitly.

Let be a neighbor of in a marked cluster. If has broadcasted , then adds to . There are three situations where adds to :

3. If broadcasted with and .

In any other case, does nothing. This step ensures that gets added to if and only if . In total, this results in for all vertices .

As a final note: each vertex has broadcast the ID of the cluster it joins. Its neighbors keep track of these changes, as they will need the new cluster IDs when they try to connect to a marked cluster in the next phase. For the remainder of this phase (step 3), the ‘old’ cluster IDs are still valid.

Step 3: Connections between unmarked clusters
In this step, we create connections between the unmarked clusters. In the previous part, the situation was asymmetric: vertices of unmarked clusters connected to vertices in marked clusters. To make sure that at most one vertex decides upon the existence of an edge, we create two substeps. In the first substep a vertex in cluster can only connect to a neighboring cluster if . In the second substep, a vertex can only connect to neighboring clusters with higher ID. This way all necessary connections can be made, while no two vertices will simultaneously try to decide on the existence of the edge between them.

Step 3.1: Connecting to a cluster with a smaller ID
Let be a vertex in an unmarked cluster . We will try to connect to each neighboring cluster with . Fix such a cluster . Let be the neighbors of in this cluster, with , i.e. . Similar as before, we run Connect to decide which neighbor to connect to: . If it returns , we add to and we broadcast . If Connect returns , we simply broadcast . In both cases we add to . Again we wish to propagate these updates to ’s neighbors. As before, we communicate this implicitly.

Let be a vertex in neighboring cluster with and . If has broadcasted , then adds to . Again, there are three situations where adds to :

3. If broadcasted with and .

In any other case, does nothing. As before, note that this step ensures that for all vertices .

Step 3.2: Connecting to a cluster with a bigger ID
Vertices in an unmarked cluster have now connected to neighboring unmarked clusters with and the sets have been updated accordingly. However, we need to connect to all unmarked neighboring clusters, just as in the original algorithm (as depicted in Appendix A). Therefore we move on to the neighboring clusters with . The process for these clusters is completely analogous to substep 3.1, and thus will not be given here.

Step 4: After the phases
In the last part of the algorithm, we want to connect each vertex to all its neighboring clusters in . This is again done in three steps, similar to the steps 2, 3.1, and 3.2 in the phases above.

• All vertices that are not part of any remaining cluster connect, using Connect, to each neighboring remaining cluster . As before, they broadcast how they connect such that vertices in remaining clusters can add edges to accordingly.

• Vertices connect, using Connect, to each neighboring remaining cluster with . As before, they broadcast the result, such that neighbors can add edges to accordingly.

• Vertices connect, using Connect, to each neighboring remaining cluster with . As before, they broadcast the result, such that neighbors can add edges to accordingly.

In the following lemma we show that this algorithm indeed gives a spanner of stretch .

###### Lemma 3.1.

The spanner has stretch at most for all graphs , where . For any choice of , it has at most edges in expectation. Moreover, we obtain an orientation on such that each vertex has out-degree in expectation.

###### Proof.

First of all, note that setting reduces this more involved algorithm to the original algorithm, given in Appendix A, which we know to correctly create a spanner. We claim Spanner() also outputs a spanner, under the following assumption on the marking of clusters. In step 1, each cluster marks itself with probability . We can imagine that it does this by drawing from some source of random bits. Our assumption is that these random bits are the same for both algorithms. This assumption can be made, since these bits are independent of the probability on the edges. From now on, we call Spanner() algorithm and Spanner() algorithm . We claim that if algorithm outputs , and , that using as its input, algorithm will output . Since we already know that the output of algorithm gives a spanner for , this proves the lemma.

We will not only show that the output of the two algorithms is the same. We will even show that all intermediate steps (creating clusters and selecting spanner edges) are the same. We will prove this claim by induction. It is clear that the initialization of both algorithms is the same. We need to show that if both algorithms have produced the same situation up to a certain point, the next decision will also be the same. These decisions take place whenever a vertex tries to connect to some cluster. This happens in steps 2, 3.1, 3.2, 4.1, 4.2, and 4.3. Every time, the same principle is applied. We will give the proof of the induction procedure at step 2.

We assume so far the created clusters are exactly the same. Suppose is part of some unmarked cluster . We investigate what the Connect procedure results in for the two different algorithms. Suppose Connect outputs in algorithm . That means all neighbors of end up in . Hence has no neighbors in , as . Therefore algorithm will output .

Now suppose Connect outputs in algorithm . For contradiction, suppose that algorithm outputs . When algorithm calls the procedure Connect , we know , as it is a neighbor. We note that Connect sorts ascendingly according to weights, and in case of equal weights the smallest ID comes first. Since , the first option is accepted. So , must come before . Meaning that , or and . In both cases, also comes before when algorithm runs Connect. Since algorithm did not accept , this implies that . That means , thus is not a neighbor of in ; a contradiction.

Similar arguments hold for all other indicated steps. We conclude that both algorithms output the same graph. Baswana and Sen [BS07, Theorem 4.3] show that this is a -spanner for and that it has at most edges in expectation.

For the orientation, we simply orient edges within a cluster from child to parent. We orient edges between clusters from the vertex that added it to the other vertex. If both endpoints of an edge want to add the edge, we orient it arbitrarily. According to Baswana and Sen, each vertex adds edges in expectation, giving the result. ∎
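The behavior of Connect that the proof relies on (candidates are tried in ascending order of weight, ties are broken by the smallest ID, and the first edge that turns out to exist is accepted) can be sketched as follows. This is a hedged reading of the description above, not the paper's pseudocode: the signature, the dictionaries `weight` and `prob`, and the use of `None` for a failed connection are assumptions made for illustration.

```python
import random


def connect(candidates, weight, prob, rng):
    """Try candidate neighbors in order of (weight, ID); sample each edge once.

    Returns (accepted, rejected): the first neighbor whose edge turns out to
    exist (None if there is none), and the neighbors whose edges were sampled
    as absent and would therefore join the set of deleted neighbors.
    """
    rejected = []
    for u in sorted(candidates, key=lambda u: (weight[u], u)):
        if rng.random() < prob[u]:  # the edge exists with probability prob[u]
            return u, rejected
        rejected.append(u)
    return None, rejected


weight = {7: 2.0, 3: 1.0, 5: 1.0}
certain = {7: 1.0, 3: 1.0, 5: 1.0}
absent = {7: 0.0, 3: 0.0, 5: 0.0}
print(connect([7, 3, 5], weight, certain, random.Random(0)))  # (3, [])
print(connect([7, 3, 5], weight, absent, random.Random(0)))   # (None, [3, 5, 7])
```

With all probabilities equal to 1 the procedure always accepts the lightest edge with the smallest ID, which is the case in which the algorithm coincides with the original one. The rejected list is exactly the information that the other endpoints have to reconstruct implicitly from the broadcast result.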

Next, we analyze the running time of the algorithm.

###### Lemma 3.2.

The algorithm Spanner() takes rounds.

###### Proof.

The algorithm consists of phases, each consisting of steps 1, 2, and 3, and a final step 4. In step 1, the center needs to broadcast the result of the marking to all vertices in its cluster. This takes at most rounds, as the cluster is a tree of depth at most . In step 2 there is only one message: vertices in unmarked clusters announce which marked cluster they join (if any), by broadcasting the ID of the vertex they are connecting to and the weight of the corresponding edge. This takes rounds. In step 3, each vertex broadcasts the edges added to the spanner and the corresponding weights, taking rounds per edge. Clearly the number of edges added in each phase is bounded by the total number of added edges. The latter is in expectation and with high probability. Step 4 adheres to the same upper bound as step 3.

Adding all of this together, we obtain phases, each consisting of at most rounds, and a final step of at most rounds. This results in a total of at most rounds. ∎

We end this section with the following straightforward algorithm to compute a bundle of spanners.

By Lemma 3.1, this algorithm produces a -bundle of -spanners, where . By Lemma 3.2, it takes a total of rounds.
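The bundle construction can be sketched as a loop that repeatedly computes a spanner of the edges not yet used and removes them from consideration. The sketch below is centralized and omits the ad hoc sampling; `bundle_spanner`, `spanner_fn`, and `toy_forest` are hypothetical names, and `toy_forest` is only a stand-in (a spanning forest built with union-find), not the Baswana-Sen construction used in the paper.

```python
from typing import Callable, Dict, List, Set, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight)
SpannerFn = Callable[[Set[Edge]], Set[Edge]]


def bundle_spanner(edges: Set[Edge], spanner_fn: SpannerFn, t: int) -> List[Set[Edge]]:
    """Compute t layers; layer i is a spanner of the edges unused by layers < i."""
    remaining = set(edges)
    bundle: List[Set[Edge]] = []
    for _ in range(t):
        layer = spanner_fn(remaining)
        bundle.append(layer)
        remaining -= layer  # later layers must avoid these edges
    return bundle


def toy_forest(edges: Set[Edge]) -> Set[Edge]:
    """Stand-in for a spanner routine: a spanning forest, lightest edges first."""
    parent: Dict[int, int] = {}

    def find(x: int) -> int:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    forest: Set[Edge] = set()
    for u, v, w in sorted(edges, key=lambda e: (e[2], e[0], e[1])):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.add((u, v, w))
    return forest


triangle = {(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)}
layers = bundle_spanner(triangle, toy_forest, t=2)
```

By construction the layers are edge-disjoint, so the bundle has as many edges as its layers combined.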

### 3.2 Sparsification

The algorithm we give for spectral sparsification is based upon Algorithm 1, as given in Section 2.2. Below, in Algorithm 4, we give a more concrete version of this algorithm, specifying how to compute the bundle spanner. This algorithm repeatedly calculates a -bundle spanner, and adds the remaining edges with probability . We amend this algorithm to be able to apply it in the Broadcast CONGEST model. The key difference is that whenever we need to keep edges with probability we do this ad hoc and ‘locally’, rather than a priori and ‘centrally’.
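The idea of ad hoc sampling can be illustrated by a sampler that flips the coin for an edge only when that edge is first examined, and then remembers the outcome. This is a minimal centralized sketch, not the distributed procedure: the class name `LazyEdgeSampler`, its parameters, and the use of a single seeded generator are assumptions made for illustration.

```python
import random
from typing import Dict, Tuple

Edge = Tuple[int, int]


class LazyEdgeSampler:
    """Decide each edge's survival on first query and remember the outcome."""

    def __init__(self, p: float, seed: int = 0) -> None:
        self.p = p  # keep probability (illustrative parameter)
        self._rng = random.Random(seed)
        self._decided: Dict[Edge, bool] = {}

    def survives(self, u: int, v: int) -> bool:
        edge = (min(u, v), max(u, v))  # undirected: normalise endpoint order
        if edge not in self._decided:
            # The coin is flipped only now, when the edge is first examined.
            self._decided[edge] = self._rng.random() < self.p
        return self._decided[edge]
```

Each edge's coin is independent of when it is flipped, so every edge still survives with probability `p`; edges that are never examined simply never have their coin flipped.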

Kyng et al. [KPPS17] have shown that the number of spanners in each bundle can be kept the same throughout the algorithm, as opposed to increasing it in each iteration, which is done in the original algorithm of Koutis and Xu [KX16]. This results in a reduction of in the size of the spanner.

We use the spanner construction given in the previous section, which incorporates the ad hoc sampling with the spanner construction.

For correctness, we relate the output of our sparsification algorithm to the output of the sparsification algorithm from Koutis and Xu [KX16], where we use the improved version of Kyng et al. [KPPS17] with fixed .

###### Lemma 3.3.

Given any input graph , and any possible graph , we have that

$$\mathbb{P}\left[\text{SpectralSparsify}(V,E,w,\epsilon)=H\right]=\mathbb{P}\left[\text{SpectralSparsify-apriori}(V,E,w,\epsilon)=H\right].$$

###### Proof.

Throughout this proof, we will use superscripts for the setting with a priori sampling and for the setting with ad hoc sampling; when both are equal, we omit the superscript.

We will show that at every step, the probability that a certain edge gets added to the spanner is the same in both algorithms. We will prove this by induction, under the assumption that the algorithms have led to the same result up to a given point. The base case is easy: here all probabilities are 1, thus both algorithms behave the same.

Now for the induction step, we assume:

• the first -bundle spanners are created exactly the same for ,

• the first spanners of the -th -bundle spanner are created the same ,

• the first phases of computing the -th spanner have been the same.

Moreover, we assume that both algorithms for computing the -th spanner use the same random bits for marking clusters.

There are in fact multiple induction steps, occurring whenever an edge is chosen to be part of the spanner. These decisions take place in steps 2, 3.1, 3.2, 4.1, 4.2, and 4.3. In each of these steps, the same principle is applied. We will give the proof of the induction procedure at step 2.

Let be a vertex in an unmarked cluster. Suppose that Connect is considering connecting to some neighbor in an unmarked cluster . We have to show that the probability that is accepted by Connect with ad hoc sampling is the same as the probability that it exists in the algorithm with a priori sampling.

First, suppose that . Let be the last -bundle that was part of. Then in the ad hoc setting it is accepted by Connect with probability . In the a priori setting, the edge exists with times the probability it existed in , resulting in the total probability .

Now suppose for some . We will show . We proceed by contradiction, so assume . Hence also . Now we look at the -th spanner of the -th -bundle spanner. Since , we know that two things can be the case.

• When the algorithm with ad hoc sampling called Connect, this has accepted with or and . This means that when the algorithm with a priori sampling calls Connect, it will try before and thus adds to . This implies , a contradiction.

• When the algorithm with ad hoc sampling called Connect, it returned . Since is an option for the algorithm with a priori sampling, that algorithm has at least one option and so will choose some (perhaps equal to ), resulting in , a contradiction.

Similar arguments hold for all other indicated steps. Hence, by induction, the probabilities that a certain graph is equal to the constructed -bundle spanners occurring in the construction of the algorithms are the same. It remains to show that for the remaining edges the probability of being added to is the same in both algorithms.

Suppose . Let be the index of the last bundle spanner was part of (possibly zero).

• In the a priori algorithm, the probability of being added to the next phase is each time. Thus the probability of it lasting until the end is .

• In the ad hoc algorithm, the probability of existing gets lowered by a factor each phase, and reset to if is part of the bundle spanner. This results in in the last phase.
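The two bullet points above describe the same quantity computed in two ways, which can be checked on small examples. The sketch below is illustrative only: the function names, the boolean list `in_bundle` (entry j says whether the edge is part of the bundle spanner of phase j+1), and the value `p = 1/4` are assumptions introduced here, and the per-phase keep probability is treated as a generic parameter `p`.

```python
from fractions import Fraction
from typing import Sequence


def adhoc_probability(in_bundle: Sequence[bool], p: Fraction) -> Fraction:
    """Track the existence probability phase by phase, as in the ad hoc setting."""
    prob = Fraction(1)
    for included in in_bundle:
        # Reset to 1 when the edge is in the bundle, otherwise lower by factor p.
        prob = Fraction(1) if included else prob * p
    return prob


def apriori_probability(in_bundle: Sequence[bool], p: Fraction) -> Fraction:
    """Multiply the keep probability over every phase after the last bundle."""
    # 1-based index of the last bundle containing the edge (0 if none).
    last = max((j + 1 for j, inc in enumerate(in_bundle) if inc), default=0)
    return p ** (len(in_bundle) - last)


p = Fraction(1, 4)  # illustrative value, an assumption of this sketch
history = [False, True, False, False]  # edge last appears in the second bundle
```

For this history both functions give `p` squared, since the edge has to survive exactly the two phases after its last bundle.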

Now suppose . This means the ad hoc algorithm will not try to add it to , since it was part of for some . This means in creating the -th bundle spanner, it was considered, but not accepted. As in the a priori sub procedure of computing the -th bundle spanner, and we know that , we can deduce that