1 Introduction
In this paper, we study algorithms for the Broadcast Congested Clique (BCC) model [DKO12]. In this model, the (problem-specific) input is distributed among several processors, and the goal is that at the end of the computation each processor knows the output, or at least the share of the output relevant to it. The computation proceeds in rounds, and in each round each processor can send one message to all other processors. We can also view the communication as happening via a shared blackboard to which each processor may write (in the sense of appending) at most one message per round. The main metric in designing and analyzing algorithms for the Broadcast Congested Clique is the number of rounds performed by the algorithm.
A typical way of distributing an input matrix among processors would be, for example, that initially each processor only knows one row of the matrix. In many graph problems, this input matrix is the adjacency matrix of the graph. If communication with other processors is only possible along the edges of this graph, then the resulting model is often called the Broadcast CONGEST model [Lynch96]. Note that the unicast versions of these models, in which each processor may send a different message to each (neighboring) processor, are known as the Congested Clique [LPSPP05] and the CONGEST model [Peleg00], respectively.
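The blackboard view of a BCC round can be made concrete with a small simulation. This is only an illustration of the communication pattern, not of any algorithm from the paper: the names `bcc_round`, `send`, and `receive` are ours, and we ignore the bandwidth limit on message size that the model imposes.

```python
def bcc_round(states, send, receive):
    """Simulate one synchronous BCC round.

    Every processor i appends one message send(i, state_i) to the shared
    blackboard; then every processor reads the entire blackboard and
    updates its local state via receive(i, state_i, blackboard).
    """
    blackboard = [send(i, s) for i, s in enumerate(states)]
    return [receive(i, s, blackboard) for i, s in enumerate(states)]

# Example: every processor learns the maximum of all inputs in one round.
states = [3, 1, 4, 1, 5]
states = bcc_round(states,
                   send=lambda i, s: s,
                   receive=lambda i, s, board: max(board))
```

After this round, every processor holds the value 5: broadcasting makes each local input globally visible, which is exactly why the round count is the interesting cost measure.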
In this paper, we bring the main tools of the so-called Laplacian paradigm to the BCC model. In a seminal paper, Spielman and Teng developed an algorithm for approximately solving linear systems of equations with a Laplacian coefficient matrix in a near-linear number of operations [ST14]. The Laplacian paradigm [Teng10] refers to exploring the applications of this fast primitive in algorithm design. In a broader sense, this paradigm is also understood as the more general idea of employing linear algebra methods from continuous optimization outside of their traditional domains. Using such methods is very natural in distributed models because a matrix-vector multiplication can be carried out in a single round if each processor stores one coordinate of the vector. In recent years, this methodology has been successfully employed in the CONGEST model
[GKK+15, BeckerFKL21], and in particular, solvers for Laplacian systems with near-optimal round complexity have been developed for the CONGEST model – in networks with arbitrary topology [FGLP+20] and in bounded-treewidth graphs [AGL21] – and for the HYBRID model [AGL21]. In this paper, we switch the focus to the BCC model and show that it allows a faster implementation of the basic Laplacian primitive. What further makes the BCC model intriguing is that – in contrast to the Congested Clique – for several problems no tailored BCC algorithms are known that are significantly faster than low-diameter versions of (Broadcast) CONGEST model algorithms. Consider, for example, the single-source shortest path problem. In the (Broadcast) CONGEST model, the fastest known algorithm takes rounds [ChechikM20], where is the diameter of the underlying (unweighted) communication network. (Throughout the introductory part of this paper, we often assume that all weights of graphs and entries of matrices are polynomially bounded to simplify some statements of running time bounds.) In the BCC model, the state of the art for this problem is rounds [Nanongkai14], which essentially is not more efficient than the special case of the Broadcast CONGEST model. In the Congested Clique model, however, this is not a barrier for this problem, as it can be solved in rounds [CDKL21] on undirected graphs. A similar classification can be made for directed graphs [ForsterN18, CensorHillelKK19]. This naturally leads to the question of whether BCC algorithms can be developed that are faster than their CONGEST model counterparts, since it is not clear which one dominates the other in strength.
It has recently been shown that in the CONGEST model, the maximum flow problem as well as the unit-capacity minimum cost flow problem can be solved in rounds [FGLP+20], where denotes the number of edges of the input graph; note that this round complexity can only be sublinear in for sparse graphs.
Our contributions.
Our main result is an algorithm that solves the minimum cost flow problem (note that, in contrast to the algorithm of Forster et al. [FGLP+20], we do not need to assume unit capacities), which generalizes both the single-source shortest path problem and the maximum flow problem, in rounds in the BCC model. This round complexity in particular is sublinear for any graph density and matches the currently known upper bounds for the single-source shortest paths problem.
Theorem 1.1.
There exists a Broadcast Congested Clique algorithm that, given a directed graph with integral costs and capacities, computes a minimum cost maximum flow with high probability in rounds.

In obtaining this result, we develop machinery of the Laplacian paradigm that might be of independent interest. The first such tool is an algorithm for computing a spectral sparsifier in the Broadcast CONGEST model.
Theorem 1.2.
There exists an algorithm that, given a graph with positive real weights satisfying and an error parameter , with high probability outputs a spectral sparsifier of , where . Moreover, we obtain an orientation on such that with high probability each vertex has outdegree . The algorithm runs in rounds in the Broadcast CONGEST model.
At a high level, our sparsifier algorithm is a modification of the CONGEST-model algorithm of Koutis and Xu [KX16]; essentially, uniform edge sampling is trivial in the CONGEST model, but challenging in the Broadcast CONGEST model. Note that the restriction of the Koutis-Xu sparsifier algorithm to the CONGEST model is a major obstacle to implementing the CONGEST-model Laplacian solver of Forster et al. [FGLP+20] in the Broadcast CONGEST model as well.
Making the sparsifier known to every processor leads to a simple residual-correction algorithm for solving systems of linear equations with a Laplacian coefficient matrix up to high precision in the BCC model. Note that there is a reduction [Gremban96] from solving linear equations with symmetric diagonally dominant (SDD) coefficient matrices to solving linear equations with Laplacian coefficient matrices, which also applies in the Broadcast Congested Clique.
Theorem 1.3.
There exists an algorithm in the Broadcast Congested Clique model that, given a graph , with positive real weights satisfying and Laplacian matrix , a parameter , and a vector , outputs a vector such that , for some satisfying . The algorithm needs preprocessing rounds and takes rounds for each instance of .
Finally, we show how to implement the algorithm of Lee and Sidford [LS14] (in the more technical parts of our paper we explicitly refer to the arXiv preprints [LS13] and [LS19] instead of the conference version [LS14]) for solving linear programs up to small additive error in iterations in the BCC model. Here, the rank refers to the constraint matrix of the LP, and in each iteration a linear system needs to be solved. If the constraint matrix has a special structure – which is the case for the LP formulation of the minimum cost flow problem – then a high-precision Laplacian solver can be employed for this task.
Theorem 1.4.
Let be a constraint matrix with , let be a demand vector, and let be a cost vector. Moreover, let be a given initial point in the feasible region . Suppose a Broadcast Congested Clique network consists of vertices, where each vertex knows both every entire th row of for which and knows if . Moreover, suppose that for every and positive diagonal we can compute up to precision in rounds. Let . Then with high probability the Broadcast Congested Clique algorithm LPSolve outputs a vector with in rounds.
While this approach to solving LPs is inherently parallelizable (as the PRAM depth analysis of Lee and Sidford indicates), several steps pose a challenge for the BCC model and require more than a mere “translation” between models. In particular, we need to use a different version of the Johnson-Lindenstrauss lemma to approximate leverage scores. Further, we give a BCC algorithm for projecting vectors onto a mixed norm ball.
As in the approach of Lee and Sidford, our main result on minimum cost maximum flow then follows from plugging a suitable linear programming formulation of the problem into the LP solver.
Overview.
We provide a visual overview of the results in this paper and how they are interconnected in Figure 1.
To compute spectral sparsifiers in the Broadcast CONGEST model, we follow the setup of Koutis and Xu [KX16]. Roughly speaking, this consists of repeatedly computing spanners and retaining each edge that is not part of a spanner with probability . While this easily allows for an implementation in the CONGEST model (as pointed out by Koutis and Xu), it is not clear how to do this in a broadcast model – neither the Broadcast CONGEST model, nor the more powerful Broadcast Congested Clique. (We believe that it would be interesting to explore whether the bounded-independence sampling technique of Doron et al. [DMVZ20] could also be applied to the algorithmic framework of Koutis and Xu [KX16]. Such a sampling method based on a random seed of polylogarithmic size would also significantly simplify an argument in the quantum sparsifier algorithm of Apers and de Wolf [ApersW20]. Note that in the Broadcast Congested Clique model, a designated vertex could initially sample such a small random seed and communicate it to all other vertices with only a polylogarithmic overhead in round complexity. In the Broadcast CONGEST model, such an approach would however lead to an overhead of rounds (the diameter of the underlying communication network), which, as we show, is avoidable.) A straightforward way to sample an edge would be that one of its endpoints (say, the one with the lower ID) decides if it should further exist. The problem with this approach is that a vertex might be responsible for performing the sampling of a polynomial number of incident edges, and the broadcast constraint prevents this vertex from sharing the result with each of the corresponding neighbors. We overcome this obstacle as follows. We explicitly maintain the probability that an edge still exists in the current iteration of the sparsifier algorithm of Koutis and Xu.
Every time an edge should be added to the current iteration’s spanner according to the spanner algorithm, one of the endpoints samples whether the edge exists, using the maintained probability. Due to the vertex’s subsequent action in the spanner algorithm, the corresponding neighbor can deduce the result of the sampling. We show that this idea of implicitly learning the result of the sampling can be implemented by modifying the spanner algorithm of Baswana and Sen [BS07]. We present our modification to compute a spanner of a “probabilistic” graph (in the sense described above) in Section 3.1. In Section 3.2, we prove that this can be plugged into the framework of Koutis and Xu to compute a spectral sparsifier in the Broadcast CONGEST model. Subsequently, we show in Section 3.3 that the spectral sparsifier can be used for Laplacian solving with standard techniques.
In Section 4, we present our LP solver. Given a linear program of the form (following Lee and Sidford, we write instead of the more common form for the linear program, since this means that corresponds with the number of vertices and with the number of edges in LP formulations of flow problems)
for some constraint matrix and some convex region , Lee and Sidford [LS14, LS19] show how to find an approximate solution in time. An implementation of this algorithm in the Broadcast Congested Clique is rather technical and needs new subroutines, the main one being our Laplacian solver.
The algorithm is an interior point method that uses weighted path finding to make progress. The weights used are the Lewis weights, which can be approximated up to sufficient precision using the computation of leverage scores, which are defined as , where in our case , for some diagonal matrix . Computing leverage scores exactly is expensive, hence these too are approximated. This can be done using the observation that and the Johnson-Lindenstrauss lemma [JL84], which states that there exists a map such that , for polylogarithmic . Nowadays, several different (randomized) constructions for exist. A common choice in the realm of graph algorithms [SS11, LS19] is to use Achlioptas’ method [Achlioptas03], which samples each entry of with a binary coin flip. However, this is not feasible in the Broadcast Congested Clique: we would need a coin flip for every edge, which can be performed by one of the endpoints, but cannot be communicated to the other endpoint due to the broadcast constraint. Instead, we use the result of Kane and Nelson [KN12], which states that we need only a polylogarithmic number of random bits in total. These can simply be sampled by one vertex and broadcast to all the others, who then internally construct . Now if we can multiply both and by a vector, and solve linear systems involving , for diagonal , then we can compute these leverage scores efficiently. These demands on are not unreasonable when we consider graph problems, because in such cases the constraint matrix will adhere to the structure of the graph Laplacian, and hence our Laplacian solver can be applied.
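The sketching identity behind this leverage-score approximation can be illustrated numerically. For simplicity, the sketch below uses a plain dense random sign matrix rather than the low-randomness construction of Kane and Nelson, and the matrix sizes are illustrative; the point is only that the squared sketched norms approximate the true leverage scores.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 400, 20, 500  # k = number of sketch rows (polylogarithmic in theory)

A = rng.standard_normal((m, n))      # tall constraint matrix
G = np.linalg.inv(A.T @ A)

# Exact leverage scores: sigma_i = a_i^T (A^T A)^{-1} a_i
#                                = || A (A^T A)^{-1} a_i ||^2.
exact = np.einsum('ij,ij->i', A @ G, A)   # diagonal of A (A^T A)^{-1} A^T

# JL approximation: sketch the tall vectors A (A^T A)^{-1} a_i with a
# random sign matrix S, so sigma_i ~ || S A (A^T A)^{-1} a_i ||^2.
S = rng.choice([-1.0, 1.0], size=(k, m)) / np.sqrt(k)
M = (S @ A) @ G @ A.T                     # column i is S A (A^T A)^{-1} a_i
approx = (M ** 2).sum(axis=0)
```

Note that only the small matrix `S @ A` and solves against `A.T @ A` are needed, never the full projection matrix; this is what makes the approach compatible with a Laplacian solver when the constraint matrix has graph structure.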
A second challenge in implementing Lee and Sidford’s LP solver is a subroutine that computes projections on a mixed norm ball. To be precise: for distributed over the network, the goal is to find
We show that we can solve this maximization problem when we know the sums , , and for all . Computing such a sum for a fixed is feasible in a polylogarithmic number of rounds. Moreover, we show that we do not need to inspect these sums for all , but that we can perform a binary search, which reduces the total complexity to polylogarithmic.
Following Lee and Sidford [LS14], we apply the LP solver to an LP formulation of the minimum cost maximum flow problem in Section 5. The corresponding constraint matrix has rows and thus rank . Furthermore, (for any diagonal matrix ) is symmetric diagonally dominant and thus can be approximated to high precision in a polylogarithmic number of rounds with our Laplacian solver. We only need to solve the LP up to precision , since we can round the approximate solution to an exact solution. Hence, the minimum cost maximum flow LP can be solved in rounds.
2 Preliminaries
First we detail the models we will be working with. Next, we review spanners and sparsifiers, and how to construct the latter from the former. Then we show how spectral sparsifiers can be used for solving Laplacian systems. Finally, we introduce flow problems on weighted graphs.
2.1 Models
In this paper, we consider multiple variants of message passing models with bandwidth constraints on the communication. Let us start by defining the CONGEST model [Peleg00]. It consists of a network of processors, which communicate in synchronous rounds. In each round, a processor can send information to its neighbors over a non-faulty link with limited bandwidth. We model the network of processors by a graph , where we identify the processors with the vertices and the communication links with the edges. We write and . Each vertex has a unique identifier of size , initially known only by the vertex itself and its neighbors. At the start of each round, each vertex can send one message to each of its neighbors, and receives messages from them. The messages are of size at most . Before the next round, each vertex can perform (unlimited) internal computation. We measure the efficiency of an algorithm by the number of rounds.
In the CONGEST model, each vertex can send distinct messages to each of its neighbors. A stricter assumption on message passing is that each vertex sends the same message to all of its neighbors, essentially broadcasting it. The CONGEST model together with this assumption is called the Broadcast CONGEST model [Lynch96].
Alternatively, we can let the communication network be independent of the graph being studied. More precisely, we allow communication between each pair of vertices. Together with the bandwidth constraint, this is called the Congested Clique [LPSPP05]. If we also impose the broadcast constraint, we have the Broadcast Congested Clique [DKO12].
2.2 Spanners and Spectral Sparsification
The Laplacian matrix of a weighted graph , or the graph Laplacian, is a matrix defined by
Alternatively, we can define the Laplacian matrix in terms of the edgevertex incidence matrix , defined by
The Laplacian then becomes , where is the diagonal matrix defined by the weights: .
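As a quick sanity check of the two definitions, the following sketch (the edge list and weights are illustrative) builds the Laplacian from the edge-vertex incidence matrix and compares it with the degree-minus-adjacency form:

```python
import numpy as np

edges = [(0, 1), (1, 2), (0, 2)]
w = np.array([2.0, 3.0, 1.0])
n = 3

# Edge-vertex incidence matrix: row e has +1 at one endpoint of edge e
# and -1 at the other (the sign choice cancels in B^T W B).
B = np.zeros((len(edges), n))
for e, (u, v) in enumerate(edges):
    B[e, u], B[e, v] = 1.0, -1.0

L = B.T @ np.diag(w) @ B

# Equivalent definition: weighted degree on the diagonal, minus the
# weighted adjacency matrix off the diagonal.
A = np.zeros((n, n))
for (u, v), wt in zip(edges, w):
    A[u, v] = A[v, u] = wt
L_alt = np.diag(A.sum(axis=1)) - A
```

Both constructions give the same matrix, and its rows sum to zero, so the all-ones vector always lies in the kernel of a graph Laplacian.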
Spectral sparsifiers were first introduced by Spielman and Teng [ST11]. A spectral sparsifier is a (reweighted) subgraph that has approximately the same Laplacian matrix as the original graph.
Definition 2.1.
Let be a graph with weights , and . We say that a subgraph with weights is a spectral sparsifier for if we have for all :
(1) 
where and are the Laplacians of and respectively.
We introduce the shorthand notation when is positive semidefinite. This reduces Equation (1) to .
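Definition 2.1 can be checked numerically on a toy example. The “sparsifier” below is not produced by our construction: it is simply the same graph with each weight perturbed by at most a (1 ± ε) factor, which satisfies the definition edge by edge since both quadratic forms are weighted sums of the same terms.

```python
import numpy as np

rng = np.random.default_rng(0)

def laplacian(edges, w, n):
    """Graph Laplacian: weighted degree on the diagonal, -w(u,v) off it."""
    L = np.zeros((n, n))
    for (u, v), wt in zip(edges, w):
        L[u, u] += wt; L[v, v] += wt
        L[u, v] -= wt; L[v, u] -= wt
    return L

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
w = np.ones(len(edges))
eps = 0.1
# Toy "sparsifier": same edges, each weight rescaled by a (1 +/- eps) factor.
w_h = w * rng.uniform(1 - eps, 1 + eps, size=len(w))

L_G = laplacian(edges, w, 4)
L_H = laplacian(edges, w_h, 4)

# Check (1-eps) x^T L_G x <= x^T L_H x <= (1+eps) x^T L_G x on random x.
checks = []
for _ in range(100):
    x = rng.standard_normal(4)
    qg, qh = x @ L_G @ x, x @ L_H @ x
    checks.append((1 - eps) * qg <= qh <= (1 + eps) * qg)
```

The interesting case, of course, is when the sparsifier drops most edges; the quadratic-form test is the same, but the guarantee then requires a careful sampling scheme such as the one of Section 3.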
Koutis and Xu [KX16] showed how to compute a spectral sparsifier by repeatedly computing spanners. This technique was later slightly improved by Kyng et al. [KPPS17]. Spanners are spanning subgraphs in which distances are preserved up to a bounded factor. Trivially, any graph is a spanner of itself. In practice, the goal is to find sparse subgraphs that are still spanners for the input graph.
Definition 2.2.
Let be a graph with weights . We say that a subgraph with weights is a spanner of stretch for if for each we have
where we write for the distance from to in . A bundle spanner of stretch is a union , where each is a spanner of stretch in .
The algorithm of Koutis and Xu is relatively simple: compute a bundle spanner of stretch , sample the remaining edges with probability , repeat for iterations on the computed bundle spanner and sampled edges. The sparsifier then consists of the last bundle spanner, together with the set of edges left after the iterations, where edges are reweighted in a certain manner. In the original algorithm, the stretch was fixed, but the number of spanners in each bundle grew in each iteration. Kyng et al. [KPPS17] showed that can be kept constant throughout the algorithm, leading to a sparser result.
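The control flow of this keep/sample/reweight loop can be sketched as follows. We substitute a plain spanning forest for the bundle spanner (a real implementation uses several Baswana-Sen spanners per bundle), so the sketch only illustrates the iteration structure, not the spectral guarantee; all names are ours.

```python
import random

random.seed(0)

def spanning_forest(n, edges):
    # Placeholder for the bundle spanner: a spanning forest via union-find.
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    forest = []
    for (u, v, w) in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v, w))
    return forest

def kx_skeleton(n, edges, iterations=3, p=0.25):
    """Each iteration keeps the 'spanner', keeps every other edge with
    probability p, and scales a kept edge's weight by 1/p so that the
    Laplacian is preserved in expectation."""
    for _ in range(iterations):
        bundle = spanning_forest(n, edges)
        kept = set(bundle)
        nxt = list(bundle)
        for (u, v, w) in edges:
            if (u, v, w) in kept:
                continue
            if random.random() < p:
                nxt.append((u, v, w / p))
        edges = nxt
    return edges

# Complete graph on 10 vertices with unit weights.
K10 = [(u, v, 1.0) for u in range(10) for v in range(u + 1, 10)]
H = kx_skeleton(10, K10)
```

Because the spanner is always retained, connectivity survives every iteration while the non-spanner edges are thinned geometrically, which is the source of the final sparsity bound.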
2.3 Laplacian Solving
We consider the following problem. Let be the Laplacian matrix of some graph on vertices. Given , we want to solve . Solving Laplacian equations exactly can be computationally demanding. Therefore, we consider an approximation to this problem: we want to find such that , where we write for any . One way to approach this is by using a spectral sparsifier of . To this end, we use preconditioned Chebyshev iteration, a well-known technique from numerical analysis [Axelsson96, Saad03]. The statement below most closely resembles the formulation of Peng [peng13].
Theorem 2.3.
Suppose we have symmetric positive semidefinite matrices , and a parameter satisfying
Then there exists an algorithm that, given a vector and parameter , returns a vector such that
for some satisfying . The algorithm takes iterations, each consisting of multiplying by a vector, solving a linear system involving , and a constant number of vector operations.
This yields the following corollary for Laplacian solving using spectral sparsifiers.
Corollary 2.4.
Let be a weighted graph on vertices, let be a parameter, and let a vector be given. Suppose is a spectral sparsifier for . Then there exists an algorithm that outputs a vector such that , for some satisfying . The algorithm takes iterations, each consisting of multiplying by a vector, solving a Laplacian system involving , and a constant number of vector operations.
Proof.
As is a sparsifier for , we have: , which we can rewrite to
We set and , which are clearly both symmetric positive semidefinite. Furthermore, we set . We apply Theorem 2.3 with these settings to obtain the result. ∎
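The residual-correction idea behind Corollary 2.4 can be sketched numerically. For brevity we use plain Richardson iteration (Theorem 2.3 uses Chebyshev acceleration, which needs fewer iterations), and the “sparsifier” is just the input graph with weights perturbed by at most 10 percent; the data is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def laplacian(edges, w, n):
    L = np.zeros((n, n))
    for (u, v), wt in zip(edges, w):
        L[u, u] += wt; L[v, v] += wt
        L[u, v] -= wt; L[v, u] -= wt
    return L

# A small connected graph and a crude approximation L_H of L_G.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2), (1, 3)]
n, eps = 5, 0.1
w = rng.uniform(1.0, 2.0, size=len(edges))
L_G = laplacian(edges, w, n)
L_H = laplacian(edges, w * rng.uniform(1 - eps, 1 + eps, len(w)), n)

# Right-hand side in the range of L_G (orthogonal to the all-ones vector).
b = L_G @ rng.standard_normal(n)

# Residual correction: x <- x + L_H^+ (b - L_G x). Since L_H spectrally
# approximates L_G, the error contracts by roughly eps per iteration.
x = np.zeros(n)
H_pinv = np.linalg.pinv(L_H)
for _ in range(40):
    x = x + H_pinv @ (b - L_G @ x)

residual = np.linalg.norm(b - L_G @ x) / np.linalg.norm(b)
```

In the distributed setting, each iteration costs one multiplication by the (distributed) Laplacian plus one solve against the globally known sparsifier, which is exactly the structure Theorem 1.3 exploits.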
2.4 Flow Problems
In this section we formally define the maximum flow and the minimum cost maximum flow problems. Let be a directed graph, with capacities , and designated source and target vertices . We say is an  flow if

for each vertex we have ;

for each edge we have .
The value of the flow is defined as . The maximum flow problem is to find a flow of maximum value. Additionally, we can have costs on the edges: . The cost of the flow is defined as . The minimum cost maximum flow problem is to find a flow of minimum cost among all flows of maximum value.
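The two flow conditions and the value and cost functionals translate directly into code. The following sketch (graph data is illustrative) checks conservation and capacity, and computes the flow value at the source:

```python
def is_st_flow(edges, cap, flow, s, t, n):
    """Check the s-t flow conditions: capacity on every edge and
    conservation (inflow = outflow) at every vertex except s and t."""
    if any(not (0.0 <= flow[e] <= cap[e]) for e in range(len(edges))):
        return False
    net = [0.0] * n
    for e, (u, v) in enumerate(edges):
        net[u] -= flow[e]   # flow leaves u ...
        net[v] += flow[e]   # ... and enters v
    return all(abs(net[v]) < 1e-9 for v in range(n) if v not in (s, t))

def flow_value(edges, flow, s):
    """Net flow out of the source."""
    out_f = sum(f for (u, _), f in zip(edges, flow) if u == s)
    in_f = sum(f for (_, v), f in zip(edges, flow) if v == s)
    return out_f - in_f

edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
cap = [2.0, 1.0, 2.0, 2.0]
flow = [2.0, 1.0, 2.0, 1.0]
```

On this instance the flow saturates both edges leaving the source, so its value 3 is maximum; the cost would simply be the inner product of the flow vector with an edge cost vector.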
Both problems allow for a natural linear program formulation. We present one for the minimum cost maximum flow problem, as this is the more general problem. Denote for the edgevertex incidence matrix (see Section 2.2). Then we can write this as:
for the value of the maximum flow, and and the vectors defined by . The answer to the minimum cost maximum flow problem is then found by a binary search over .
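This LP route can be sketched end to end on a toy instance. We use `scipy.optimize.linprog` as a stand-in for the interior point method of the paper: for a candidate flow value F we solve the cost-minimization LP with conservation constraints and capacity bounds, and binary-search over integral F for the largest feasible value. All data is illustrative.

```python
import numpy as np
from scipy.optimize import linprog

edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
cap = [2.0, 1.0, 2.0, 2.0]
cost = [1.0, 2.0, 1.0, 1.0]
n, s, t = 4, 0, 3

# Vertex-edge incidence: column e has +1 at the tail and -1 at the head.
A = np.zeros((n, len(edges)))
for e, (u, v) in enumerate(edges):
    A[u, e], A[v, e] = 1.0, -1.0

def solve(F):
    """Min-cost LP routing F units from s to t, respecting capacities."""
    b = np.zeros(n)
    b[s], b[t] = F, -F
    return linprog(cost, A_eq=A, b_eq=b,
                   bounds=list(zip([0.0] * len(edges), cap)))

# Binary search for the maximum integral flow value (capacities are integral).
lo, hi = 0, int(sum(cap))
while lo < hi:
    mid = (lo + hi + 1) // 2
    if solve(mid).success:
        lo = mid
    else:
        hi = mid - 1
res = solve(lo)   # lo = max flow value; res.fun = min cost at that value
```

On this instance the maximum flow value is 3 and the minimum cost among maximum flows is 7; each feasibility test is one LP solve, matching the binary-search-over-F structure described above.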
3 Spectral Sparsifiers and Laplacian Solving
In this section, we show how to construct spectral sparsifiers in the Broadcast CONGEST model, and thus in particular also in the Broadcast Congested Clique. We do this following the method of Koutis and Xu [KX16], which consists of repeatedly computing spanners and sampling the remaining edges, see Section 2.2. While sampling edges is easy in the CONGEST model, it is highly nontrivial in the Broadcast CONGEST model. The reason for this is that in the CONGEST model the sampling of an edge can be done by one endpoint and communicated to the other endpoint. In the Broadcast CONGEST model, the sampling can still be done by one endpoint, but the result cannot be communicated efficiently to the other endpoint due to the broadcast constraint. To circumvent this, we show that the sampling needed for spectral sparsification can be done on the fly, rather than a priori in each iteration. Moreover, we show that the result can be communicated implicitly. In Section 3.1, we show how to compute spanners when edges exist only with given probabilities; whether an edge exists is evaluated on the fly and (implicitly) communicated to the other endpoint. In Section 3.2, we show how to use this spanner construction to compute spectral sparsifiers in the Broadcast CONGEST model.
3.1 Spanners with Probabilistic Edges
Our goal is to compute a spanner for a given probabilistic graph. More precisely, let be an undirected, weighted graph on vertices, with a probability function on the edges, and let the parameter for the stretch of the spanner be given. We will give an algorithm Spanner that computes a subset , and divides this into two sets , such that each edge is part of independently with probability . This results in a spanner for all graphs , where .
Since this is a distributed algorithm, the output comes in a local form. At the end, each vertex has identified and , where .
When , our algorithm essentially reduces to the algorithm of Baswana and Sen from [BS07]. All computational steps coincide, but a difference in communication remains. The reason for this is that in our algorithm the weights of edges are included in the communication. Depending on the magnitude of the weights, this can result in multiple rounds for each message, and consequently more rounds in total.
For the presentation of Baswana and Sen’s algorithm, we follow the equivalent formulation of Becker et al. [BeckerFKL21], which can be found in Appendix A. The general idea is that clusters are formed and revised through a number of phases. In each phase, a few of the existing clusters are sampled; these clusters move on to the next phase. Vertices from an unsampled cluster try to connect to a sampled cluster and to some neighboring clusters. As edges only exist with a certain probability, they need to be sampled before they can be used. We make sure that the two vertices adjacent to an edge never try to use it at the same time. When a vertex has tried to use an edge, the edge will always be broadcast if it exists. If not, it turns out that the other vertex adjacent to this edge will be able to deduce this, without it being communicated explicitly.
Whenever we speak of the neighbors of a vertex , denoted by , we mean all neighbors that do not lie in the set of ‘deleted neighbors’: . Note that this set of neighbors will be subject to change throughout the process, as the number of elements in grows.
Step 1: Cluster marking
Initially, each vertex is a singleton cluster: . The main part of the algorithm will be ‘phases’, indexed . In phase , the center of each cluster (the first vertex in the cluster) marks the cluster with probability and broadcasts this result to the cluster. The marked clusters will move on to the next phase: is defined to be the set of clusters marked in phase . We define the identifier of a cluster to equal the ID of the center of the cluster.
Each phase consists of cluster marking, followed by steps 2 and 3.
Step 2: Connecting to marked clusters
Let be a vertex in an unmarked cluster . The first thing it does is try to connect to one of the marked clusters. It does this using the procedure Connect. To this end, we define to be the set of all neighbors of which lie in a marked cluster: . Now we let . Note that if , Connect returns . If , we broadcast . If it returns , we add to , joins the cluster of (it stores this decision by saving ), and we broadcast . In both cases, we add to .
After this step, all vertices in unmarked clusters may have joined marked clusters, and they have updated their sets by adding , and by adding . We also want to propagate these updates in to the neighbors of . This is easy for , since we can broadcast . However, we do not want to broadcast the set , as it can be large. Instead we make use of the choices in Connect to communicate changes in implicitly.
Let be a neighbor of in a marked cluster. If has broadcasted , then adds to . There are three situations where adds to :

If broadcasted ;

If broadcasted with ;

If broadcasted with and .
In any other case, does nothing. This step ensures that gets added to if and only if . In total, this results in for all vertices .
As a final note: each vertex has broadcast the ID of the cluster it joins, and its neighbors keep track of these changes, as they will need the new cluster IDs when they try to connect to a marked cluster in the next phase. For the remainder of this phase (step 3), the ‘old’ cluster IDs are still valid.
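Our reading of the Connect procedure can be sketched as follows. The function name, signature, and data layout are ours, and the real procedure also drives the broadcasts described above; the sketch only captures the sampling order that the implicit-communication argument relies on.

```python
import random

def connect(candidates, p, deleted, rng):
    """Sketch of Connect.

    candidates: list of (weight, neighbor_id) pairs; they are processed
    in ascending order of weight, breaking ties by smaller ID. Each edge
    is sampled in turn with retention probability p; the first surviving
    edge is chosen. Edges that fail the sample are appended to `deleted`
    and never considered again. Returns None if no edge survives, which
    corresponds to broadcasting that no connection was made.
    """
    for w, u in sorted(candidates):
        if rng.random() < p:      # edge exists: connect via this edge
            return (w, u)
        deleted.append(u)         # edge sampled away
    return None

# With p = 1 every edge exists, so Connect deterministically returns the
# (weight, ID)-minimal candidate, exactly as in the Baswana-Sen setting.
choice = connect([(2.0, 9), (1.0, 4)], p=1.0, deleted=[], rng=random.Random(0))
```

Because a neighbor that observes the broadcast choice knows the same sorted candidate order, it can deduce which of its own edges must have been sampled away, which is the implicit communication used in steps 2 and 3.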
Step 3: Connections between unmarked clusters
In this step, we create connections between the unmarked clusters. In the previous part, the situation was asymmetric: vertices of unmarked clusters connected to vertices in marked clusters. To make sure that at most one vertex decides upon the existence of an edge, we create two substeps. In the first substep a vertex in cluster can only connect to a neighboring cluster if . In the second substep, a vertex can only connect to neighboring clusters with higher ID. This way all necessary connections can be made, while no two vertices will simultaneously try to decide on the existence of the edge between them.
Step 3.1: Connecting to a cluster with a smaller ID
Let be a vertex in an unmarked cluster . We will try to connect to each neighboring cluster with . Fix such a cluster . Let be the neighbors of in this cluster, with , i.e. . As before, we run Connect to decide which neighbor to connect to: . If it returns , we add to and broadcast . If Connect returns , we simply broadcast . In both cases we add to . Again, we wish to propagate these updates to ’s neighbors. As before, we communicate this implicitly.
Let be a vertex in neighboring cluster with and . If has broadcasted , then adds to . Again, there are three situations where adds to :

If broadcasted ;

If broadcasted with ;

If broadcasted with and .
In any other case, does nothing. As before, note that this step ensures that for all vertices .
Step 3.2: Connecting to a cluster with a bigger ID
Vertices in an unmarked cluster have now connected to neighboring unmarked clusters with and the sets have been updated accordingly. However, we need to connect to all unmarked neighboring clusters, just as in the original algorithm (as depicted in Appendix A). Therefore we move on to the neighboring clusters with . The process for these clusters is completely analogous to substep 3.1, and thus will not be given here.
Step 4: After the phases
In the last part of the algorithm, we want to connect each vertex to all its neighboring clusters in . This is again done in three steps, similar to the steps 2, 3.1, and 3.2 in the phases above.

All vertices that are not part of any remaining cluster connect, using Connect , to each neighboring remaining cluster . As before, they broadcast how they connect, such that vertices in remaining clusters can add edges to accordingly.

Vertices connect, using Connect , to each neighboring remaining cluster with . As before, they broadcast the result, such that neighbors can add edges to accordingly.

Vertices connect, using Connect , to each neighboring remaining cluster with . As before, they broadcast the result, such that neighbors can add edges to accordingly.
In the following lemma we show that this algorithm indeed gives a spanner of stretch .
Lemma 3.1.
The spanner has stretch at most for all graphs , where . For any choice of , it has at most edges in expectation. Moreover, we obtain an orientation on such that each edge has outdegree in expectation.
Proof.
First of all, note that setting reduces this more involved algorithm to the original algorithm, given in Appendix A, which we know to correctly create a spanner. We claim that Spanner() also outputs a spanner, under the following assumption on the marking of clusters. In step 1, each cluster marks itself with probability . We can imagine that it does this by drawing from some source of random bits. Our assumption is that these random bits are the same for both algorithms. This assumption can be made, since these bits are independent of the probabilities on the edges. From now on, we call Spanner() algorithm and Spanner() algorithm . We claim that if algorithm outputs , and , then using as its input, algorithm will output . Since we already know that the output of algorithm gives a spanner for , this proves the lemma.
We will not only show that the output of the two algorithms is the same; we will even show that all intermediate steps (creating clusters and selecting spanner edges) are the same. We prove this claim by induction. It is clear that the initialization of both algorithms is the same. We need to show that if both algorithms have produced the same situation up to a certain point, the next decision will also be the same. These decisions take place whenever a vertex tries to connect to some cluster. This happens in steps 2, 3.1, 3.2, 4.1, 4.2, and 4.3. Every time, the same principle is applied. We give the proof of the induction step for step 2.
We assume that so far the created clusters are exactly the same. Suppose is part of some unmarked cluster . We investigate what the Connect procedure results in for the two algorithms. Suppose Connect outputs in algorithm . That means all neighbors of end up in . Hence has no neighbors in , as . Therefore algorithm will output .
Now suppose Connect outputs in algorithm . For contradiction, suppose that algorithm outputs . When algorithm calls the procedure Connect , we know , as it is a neighbor. We note that Connect sorts ascendingly according to weights, and in case of equal weights the smallest ID comes first. Since , the first option is accepted. So , must come before . Meaning that , or and . In both cases, also comes before when algorithm runs Connect. Since algorithm did not accept , this implies that . That means , thus is not a neighbor of in ; a contradiction.
Similar arguments hold for all other indicated steps. We conclude that both algorithms output the same graph. Baswana and Sen [BS07, Theorem 4.3] show that this is a spanner for and that it has at most edges in expectation.
For the orientation, we orient edges within a cluster from child to parent, and edges between clusters from the vertex that added the edge to the other endpoint. If both endpoints of an edge want to add it, we orient it arbitrarily. According to Baswana and Sen, each vertex adds edges in expectation, giving the result. ∎
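The orientation rule can be sketched as follows. This is a hypothetical encoding of our own: `parent` maps each vertex to its cluster parent, and `added_by` records which endpoint(s) added an inter-cluster edge.

```python
# Illustrative sketch of the orientation rule: intra-cluster edges are
# oriented from child to parent; inter-cluster edges are oriented away
# from the vertex that added them; if both endpoints added the edge,
# the orientation is arbitrary. Data structures are our own encoding.

def orient(edge, parent, added_by):
    u, v = edge
    if parent.get(u) == v:       # intra-cluster: u is a child of v
        return (u, v)
    if parent.get(v) == u:       # intra-cluster: v is a child of u
        return (v, u)
    adders = added_by[edge]      # inter-cluster edge
    if len(adders) == 1:         # exactly one endpoint added it
        (w,) = adders
        return (w, u if w == v else v)
    return (u, v)                # both endpoints added it: arbitrary
```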
Next, we analyze the running time of the algorithm.
Lemma 3.2.
The algorithm Spanner() takes rounds.
Proof.
The algorithm consists of phases, each consisting of steps 1, 2, and 3, followed by a final step 4. In step 1, the center needs to broadcast the result of the marking to all vertices in its cluster. This takes at most rounds, as the cluster is a tree of depth at most . In step 2 there is only one message: vertices in unmarked clusters announce which marked cluster they join (if any), by broadcasting the ID of the vertex they are connecting to and the weight of the corresponding edge. This takes rounds. In step 3, each vertex broadcasts the edges added to the spanner and the corresponding weights, taking rounds per edge. Clearly the number of edges added in each phase is bounded by the total number of added edges, which is in expectation and with high probability. Step 4 adheres to the same upper bound as step 3.
Adding all of this together, we obtain phases, each consisting of at most rounds, and a final step of at most rounds. This results in a total of at most rounds. ∎
We end this section with the following straightforward algorithm to compute a bundle of spanners.
3.2 Sparsification
The algorithm we give for spectral sparsification is based on Algorithm 1, as given in Section 2.2. Below, in Algorithm 4, we give a more concrete version of this algorithm, specifying how to compute the bundle spanner. This algorithm repeatedly computes a bundle spanner and adds the remaining edges with probability . We amend this algorithm so that it can be applied in the Broadcast CONGEST model. The key difference is that whenever we need to keep edges with probability , we do this ad hoc and ‘locally’, rather than a priori and ‘centrally’.
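The distinction between a priori and ad hoc sampling can be illustrated with a small sketch. The function names and the per-phase retention probability `q` are our own illustrative choices; the point is only that deciding eagerly, one coin per phase, and deciding lazily with a single coin give the same survival distribution.

```python
import random

# Contrast of the two sampling styles. Names and the retention
# probability q are illustrative, not taken from the paper.

def a_priori_survives(q, phases, rng):
    """Decide up front, one independent coin per phase."""
    return all(rng.random() < q for _ in range(phases))

def ad_hoc_survives(q, phases, rng):
    """Decide lazily with a single coin at probability q**phases."""
    return rng.random() < q ** phases

# Both succeed with probability q**phases, so an edge can postpone its
# coin flips until the moment the decision is actually needed.
```

The ad hoc variant is what makes a distributed implementation convenient: each endpoint can flip the accumulated coin locally, at the moment an edge is considered, without any central coordination.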
Kyng et al. [KPPS17] have shown that the number of spanners in each bundle can be kept the same throughout the algorithm, as opposed to increasing it in each iteration, as is done in the original algorithm of Koutis and Xu [KX16]. This results in a reduction of in the size of the spanner.
We use the spanner construction given in the previous section, which integrates the ad hoc sampling into the spanner construction itself.
For correctness, we relate the output of our sparsification algorithm to the output of the sparsification algorithm of Koutis and Xu [KX16], where we use the improved version of Kyng et al. [KPPS17] with fixed .
Lemma 3.3.
Given any input graph , and any possible graph , we have that
Proof.
Throughout this proof, we use superscripts for the setting with a priori sampling and for the setting with ad hoc sampling; when both are equal, we omit the superscript.
We will show that at every step, the probability that a certain edge is added to the spanner is the same in both algorithms. We prove this by induction, under the assumption that the algorithms have led to the same result up to a given point. The base case is easy: there all probabilities equal 1, so both algorithms behave identically.
Now for the induction step, we assume:
- the first bundle spanners have been created exactly the same for ,
- the first spanners of the th bundle spanner have been created the same,
- the first phases of computing the th spanner have been the same.
Moreover, we assume that both algorithms for computing the th spanner use the same random bits for marking clusters.
There are in fact multiple induction steps, occurring whenever an edge is chosen to be part of the spanner. These decisions take place in steps 2, 3.1, 3.2, 4.1, 4.2, and 4.3. In each of these steps, the same principle is applied. We will give the proof of the induction procedure at step 2.
Let be a vertex in an unmarked cluster. Suppose that Connect is considering connecting to some neighbor in an unmarked cluster . We have to show that the probability that is accepted by Connect with ad hoc sampling is the same as the probability that it exists in the algorithm with a priori sampling.
First, suppose that . Let be the last bundle that was part of. Then in the ad hoc setting it is accepted by Connect with probability . In the a priori setting, the edge exists with times the probability it existed in , resulting in the total probability .
Now suppose for some . We will show . We proceed by contradiction, so assume . Hence also . Now we look at the th spanner of the th bundle spanner. Since , one of two things must be the case.
- When the algorithm with ad hoc sampling called Connect, it accepted with , or with and . This means that when the algorithm with a priori sampling calls Connect, it will try before , and thus adds to . This implies , a contradiction.
- When the algorithm with ad hoc sampling called Connect, it returned . Since is an option for the algorithm with a priori sampling, it has at least one option, so it will choose some (perhaps equal to ), resulting in , a contradiction.
Similar arguments hold for all other indicated steps. Hence, by induction, the probability that a given graph equals the bundle spanner constructed by the algorithm is the same for both algorithms. It remains to show that for the remaining edges the probability of being added to is the same in both algorithms.
Suppose . Let be the index of the last bundle spanner that was part of (possibly zero).
- In the a priori algorithm, the probability of being added to the next phase is each time. Thus the probability of it lasting until the end is .
- In the ad hoc algorithm, the probability of existing is lowered by a factor each phase, and reset to whenever is part of the bundle spanner. This results in in the last phase.
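The agreement between the two bookkeeping styles can be checked numerically. As an illustration we assume a per-phase retention factor of 1/4 (the paper's factor is specified elsewhere); `last_bundle` denotes the index of the last bundle spanner the edge belonged to, after which it must survive the remaining phases. All names here are our own.

```python
from fractions import Fraction

# Check that a priori and ad hoc probability bookkeeping agree,
# assuming an illustrative per-phase retention factor q = 1/4.

def a_priori_prob(q, total_phases, last_bundle):
    # One independent coin of probability q per remaining phase.
    p = Fraction(1)
    for _ in range(last_bundle, total_phases):
        p *= q
    return p

def ad_hoc_prob(q, total_phases, last_bundle):
    # Probability lowered by a factor q each phase, and reset to 1
    # while the edge is still protected inside a bundle spanner.
    p = Fraction(1)
    for phase in range(total_phases):
        p = Fraction(1) if phase < last_bundle else p * q
    return p

q = Fraction(1, 4)
assert a_priori_prob(q, 5, 2) == ad_hoc_prob(q, 5, 2) == q ** 3
```

Exact rational arithmetic (rather than floats) makes the equality check exact, which is why the sketch uses `Fraction`.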
Now suppose . This means the ad hoc algorithm will not try to add it to , since it was part of for some . This means that, when creating the th bundle spanner, it was considered but not accepted. As in the a priori subprocedure of computing the th bundle spanner, and we know that , we can deduce that