# A Faster Distributed Single-Source Shortest Paths Algorithm

We devise new algorithms for the single-source shortest paths problem in the CONGEST model of distributed computing. While close-to-optimal solutions, in terms of the number of rounds spent by the algorithm, have recently been developed for computing single-source shortest paths approximately, the fastest known exact algorithms are still far away from matching the lower bound of $\tilde{\Omega}(\sqrt{n} + D)$ rounds by Peleg and Rubinovich [SICOMP'00], where $n$ is the number of nodes in the network and $D$ is its diameter. The state of the art is Elkin's randomized algorithm [STOC'17] that performs $\tilde{O}(n^{2/3} D^{1/3} + n^{5/6})$ rounds. We significantly improve upon this upper bound with our two new randomized algorithms, the first performing $\tilde{O}(\sqrt{nD})$ rounds and the second performing $\tilde{O}(\sqrt{n} D^{1/4} + n^{3/5} + D)$ rounds.

04/15/2018


## 1 Introduction

In this paper, we consider the fundamental problem of computing single-source shortest paths (SSSP) in the CONGEST model [Peleg00] of distributed computing. The CONGEST model is one of the major message-passing models studied in the literature and is characterized by synchronized communication in a network via non-faulty bounded-bandwidth links. Such a communication network is modeled by a graph in which the nodes correspond to the processors in the network and the edges correspond to the communication links between the processors.

In unweighted graphs, the SSSP problem can be solved in $O(D)$ communication rounds by performing breadth-first search, where $D$ is the diameter of the graph. In weighted graphs, where we assume that the edge weights do not represent the communication speed, a straightforward distributed variant of the Bellman-Ford algorithm [Ford56, Bellman58, Moore59] computes SSSP in $O(n)$ rounds, where $n$ is the number of nodes in the network. Moreover, Peleg and Rubinovich [PelegR00] showed a lower bound of $\tilde{\Omega}(\sqrt{n} + D)$ for this problem, where $D$ is the unweighted diameter of the graph, when viewed as the underlying communication network. ($D$ is often called the hop diameter. Throughout this paper, tilde notation such as $\tilde{O}$ and $\tilde{\Omega}$ suppresses factors that are polylogarithmic in $n$.) The past few years have witnessed many improved upper bounds when approximate solutions are allowed (e.g. [LenzenPS13, Nanongkai14, HenzingerKN16, ElkinN16, BeckerKKL17, HaeuplerL18, HaeuplerHW18]), culminating in the close-to-optimal [DasSarmaHKKNPPW12, PelegR00, Elkin06, KorKP13, ElkinKNP14] approximation scheme of Becker et al. [BeckerKKL17]. For exact algorithms, the $O(n)$ bound from the Bellman-Ford algorithm was the state of the art for a long time until, in a recent breakthrough, Elkin [Elkin17] proved an upper bound that is sublinear in $n$ for undirected graphs with non-negative edge weights; he obtained a randomized algorithm that performs $\tilde{O}(n^{5/6})$ rounds for $D = O(\sqrt{n})$, and $\tilde{O}(n^{2/3} D^{1/3})$ rounds for larger $D$.

### 1.1 Our Results

Our main result is a significant improvement upon Elkin’s upper bound for polynomially bounded integer edge weights and . (In general, our running times scale with the number of bits needed to represent edge weights.) We devise two randomized (Las Vegas) algorithms, the first performing $\tilde{O}(\sqrt{nD})$ rounds and the second performing $\tilde{O}(\sqrt{n} D^{1/4} + n^{3/5} + D)$ rounds; both bounds hold with high probability and in expectation. Our first bound matches the $\tilde{\Omega}(\sqrt{n} + D)$ lower bound of Peleg and Rubinovich [PelegR00] up to polylogarithmic factors when $D$ is polylogarithmic in $n$.

Our algorithms in fact work in a more restricted model called Broadcast CONGEST, where a node must send the same message to every neighbor in each round. The bounds hold when edge weights are polynomially bounded non-negative integers, as typically assumed in the literature. Note that our bounds hold even when edges are directed. In the directed case, edges have directions, but the communication is always bidirectional. A question about this case was raised in [Nanongkai14]. The previous bound by Elkin does not apply to this case. Note that in general our bounds have to be multiplied by the number of bits needed to represent edge weights (which is $O(\log n)$ under the typical assumption), while Elkin’s bound does not.

##### Independent Work by Ghaffari and Li [GhaffariL18].

Recently in STOC 2018, and independently from our results, Ghaffari and Li presented a distributed algorithm for exact SSSP with time complexity and another algorithm with complexity . Our bounds compare favorably with theirs in the entire range of parameters. Like our bounds, their bounds hold when edges are directed. Their bounds also depend on the number of bits needed to represent edge weights.

Using techniques developed in this paper, we obtain two additional results, which might be interesting for putting our main result and our techniques into context. The first is a -approximation -round algorithm for directed SSSP. This algorithm is Monte Carlo, meaning that its approximation guarantee holds with high probability, and works in the Broadcast CONGEST model. Previously such a bound was known only for the special case of single-source reachability [GhaffariU15], where we want to know whether there is a directed path from the source node to each node. None of the previous approximation algorithms for SSSP (e.g. [LenzenPS13, Nanongkai14, HenzingerKN16, ElkinN16, BeckerKKL17]) can handle this case.

Our second result is a new work/depth trade-off for exact SSSP on directed graphs in the PRAM model. We provide an algorithm with work and depth (for any fixed ). The parallel SSSP problem has received considerable attention in the literature [Spencer97, KleinS97, Cohen97, BrodalTZ98, ShiS99, Cohen00, MeyerS03, MillerPVX15, Blelloch0ST16], but we are aware of only two relevant results for exact SSSP in directed graphs. First, the algorithm of Spencer [Spencer97] has work and depth (for any ). Second, the algorithm of Klein and Subramanian [KleinS97] has work and depth, but does not give any trade-offs. Thus, our algorithm gives a better trade-off than the state of the art when and in this case it has work and depth. In this range, the trade-off is (up to polylogarithmic factors) the same as the one of Shi and Spencer [ShiS99] for undirected graphs. (The algorithm of Shi and Spencer uses a hop set construction that is tailored to undirected graphs; the same hop set construction has later also been used in the distributed setting [Nanongkai14, Elkin17].)

### 1.2 Further Related Work

Closely related are All-Pairs Shortest Paths (APSP) and -Source Shortest Paths (-SSP) problems. The current best upper bound for APSP is [HuangNS17]. The situation is more complicated for -SSP. We refer to [FrischknechtHW12, PelegRT12, HolzerW12, LenzenP13, Elkin17, HuangNS17, GhaffariL18, AgarwalRKP18] and references therein for details.

The exact shortest paths problem has also received attention in the Congested Clique model, which is the special case of the CONGEST model where the network is a clique. Nanongkai obtained an SSSP algorithm that performs rounds in this model and Censor-Hillel et al. [CensorHillelKKLPS15] obtained an algorithm that performs rounds, solving the more general APSP problem.

There are many other graph problems studied in the CONGEST model, such as minimum spanning tree, minimum cut, and maximum flow; see, e.g., [Gafni85, ChinT85, Awerbuch87, GarayKP98, KuttenP98, GhaffariK13, Ghaffari14, NanongkaiS14_disc, GhaffariKKLP15, PanduranganRS-STOC17, Elkin-podc17-mst, Bar-YehudaCGS17].

### 1.3 Overview of Techniques

The starting point of our algorithms is the classic scaling framework which was heavily used in the sequential setting since the 80s (e.g. [Gabow85, GabowT89, Goldberg95-soda93]). In the context of distributed shortest paths, this framework was used first in the algorithm of Huang et al. [HuangNS17] for APSP. This algorithm follows Gabow’s bit-wise scaling technique [Gabow85], where edge weights are considered one bit at a time so that distances between nodes can be bounded from above. This approach was later taken by Ghaffari and Li [GhaffariL18] for computing SSSP. Our starting point is rather different from Huang et al. [HuangNS17] and Ghaffari and Li [GhaffariL18] in that we will follow recursive scaling, Gabow’s second scaling technique [Gabow85], and its extension by Klein and Subramanian [KleinS97], who applied it for SSSP in the PRAM model. At a high level, this approach uses a very general reduction to extend certain approximate SSSP algorithms to exact SSSP algorithms. Intuitively, this allows us to borrow some existing tools developed for approximation algorithms.

The challenge here lies in two special aspects of this reduction. First, it is an inherent necessity of the reduction that the approximation algorithm works on directed graphs and thus our exact SSSP algorithm also works on directed graphs – assuming that the underlying communication network is undirected. Because of this, it is not a surprise that we achieve a new approximation algorithm that can handle the directed case as an additional result.

Second, and more importantly, the reduction will not work for an arbitrary approximate SSSP algorithm, as it requires the distance estimates returned by the algorithm to satisfy an additional condition about ‘dominating’ the edge weights of the input graph. Klein and Subramanian can ensure this property by (1) computing a directed hop set to effectively reduce the approximate shortest path diameter of the graph, (2) adding the edges of the hop set to the input graph, and (3) obtaining the distance estimate from performing a “small” number of iterations of the Bellman-Ford algorithm, exploiting the reduced approximate shortest path diameter. This approach cannot directly be transferred to the CONGEST model because the additional edges of the hop set can only be “simulated” by the nodes in the network, and the standard way of doing so will not yield a small enough guarantee on the number of rounds spent in the final Bellman-Ford step. We circumvent this problem by using a different algorithm design instead, where we (1) compute a skeleton graph consisting of a few “important” nodes, including the source node, and edges between the skeleton nodes with weights equal to approximate pairwise distances, (2) solve SSSP recursively on the skeleton graph, and (3) run a “small” number of iterations of the Bellman-Ford algorithm incorporating the distances from the source computed in step (2) to obtain distance estimates with the domination property. This may sound a bit circular at first because we again face the problem of having to compute exact SSSP on the skeleton graph. However, computation on the smaller skeleton graph is now simulated by performing global broadcasts in the whole network. We can thus benefit from the fact that we can use slightly different algorithm design techniques, such as “adding” edges to the graph without immediately increasing the number of rounds, just like in the PRAM model.

We note that although our algorithm follows the same general framework as that by Klein and Subramanian [KleinS97], it is not a mere “translation” of their PRAM algorithm to the CONGEST model. In fact, it leads to a new work/depth trade-off in the PRAM model that could not be achieved with the algorithm of Klein and Subramanian.

### 1.4 Organization

The rest of this paper is organized as follows: In Section 2 we introduce all definitions necessary for our main results and review existing tools from the literature. In Section 3 we schematically develop the central “auxiliary” algorithm that we need to obtain our results. We prove its correctness, but do not yet analyze its model-specific complexity. In Sections 4 and 5 we obtain our two main results by providing two slightly different implementations of the auxiliary algorithm together with their complexity analyses. Finally, in Section 6 we sketch how our techniques imply additional results for approximate SSSP in the Broadcast CONGEST model and exact SSSP in the PRAM model.

## 2 Preliminaries

### 2.1 CONGEST Model and Problem Formulation

The CONGEST model [Peleg00] is a synchronous message-passing model with non-faulty bounded-bandwidth links. More formally, consider a communication network of processors modeled by an undirected graph in which each node models a processor and each edge indicates a bidirectional communication link between the processors corresponding to its two endpoints. In the remainder of this paper, we identify nodes and processors. In the CONGEST model, the initial knowledge about the network is distributed among the nodes: every node has a unique identifier of size $O(\log n)$ (where $n$ is the number of nodes) initially known only to itself and its neighbors, i.e., the nodes to which it has direct communication links. The nodes communicate in synchronous rounds: at the beginning of each round, every node may send to each of its neighbors a message of size $O(\log n)$ and subsequently receive the messages sent by its neighbors. Before the next round begins, each node may perform internal computation based on all messages it has received so far and its local knowledge of the network. The complexity of an algorithm is usually measured in the number of rounds and the total number of messages sent until the algorithm terminates, whereas internal computation is considered free. Typically, the asymptotic complexity is expressed in terms of $n$ and the diameter $D$ of the unweighted communication network, i.e., the maximum distance between any pair of nodes in the network. The restriction of the bandwidth to $O(\log n)$ distinguishes the CONGEST model from the LOCAL model [Linial92]. The Broadcast CONGEST model is a specialization of the CONGEST model with the additional constraint that, whenever a node sends a message in some round, it has to send (broadcast) the same message to all of its neighbors. In general, we call a round in which, for every node, the message sent to the neighbors is the same, a broadcast round.

In the single-source shortest paths (SSSP) problem, we are given a weighted directed graph $G$ and a distinguished source node $s$ and the task is to compute the distance from $s$ to $v$ in $G$ for every node $v$. In the CONGEST model, we require that the nodes of $G$ are equal to the nodes of the underlying communication network and that for every directed edge $(u, v)$ of $G$ there is an undirected communication link $\{u, v\}$. The input is distributed among the nodes of the network as follows: initially, every node knows (a) whether it is the source or not and (b) its set of incoming and outgoing edges together with their weight. The SSSP problem is solved as soon as every node knows its distance from $s$ in $G$.

In this paper, we consider the restriction of the SSSP problem to non-negative integer edge weights, where we let $W$ denote the maximum edge weight. We work under the typical assumption that each edge weight can be encoded in a constant number of messages, i.e. $\log W = O(\log n)$, which implies that $W$ is polynomial in $n$. Nevertheless, we do work out the dependence on $W$ in our algorithms for the interested reader. Observe that in the CONGEST model an implicit shortest path tree rooted at $s$ can be constructed in additional rounds: first every node sends its distance from $s$ to all of its neighbors and then every node $v$ determines one incoming neighbor $u$ such that $d_G(s, v) = d_G(s, u) + w_G(u, v)$, which will serve as the parent in the shortest path tree.
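The parent selection described above can be sketched sequentially as follows. This is a minimal illustration, not the paper's implementation: the function name `shortest_path_parents` and the dictionary-based input format are assumptions made for the example, and the exchange of distances between neighbors is simulated by direct lookups.

```python
import math

def shortest_path_parents(dist, in_edges, source):
    """Pick, for every reachable node v, one incoming neighbor u with
    dist[v] == dist[u] + w(u, v); u becomes v's parent in the tree.

    dist:     maps node -> exact distance from the source
    in_edges: maps node -> list of (u, w) pairs for incoming edges (u, v)
    (both input formats are hypothetical, chosen for this sketch)
    """
    parents = {}
    for v, incoming in in_edges.items():
        if v == source or dist[v] == math.inf:
            continue  # the root and unreachable nodes have no parent
        for u, w in incoming:
            # u's distance is what u would have sent to v in one round
            if dist[u] + w == dist[v]:
                parents[v] = u
                break
    return parents

demo_dist = {0: 0, 1: 3, 2: 4, 3: 8}
demo_in_edges = {0: [], 1: [(0, 3)], 2: [(0, 5), (1, 1)], 3: [(2, 4)]}
demo_parents = shortest_path_parents(demo_dist, demo_in_edges, source=0)
print(demo_parents)
```

Each node decides locally once it has heard its neighbors' distances, which is why the tree is only implicit: no node knows the whole tree, only its own parent.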

### 2.2 Notation and Terminology

Consider a weighted directed graph $G = (V, E, w_G)$ with source node $s \in V$, where $V$ of size $n$ is the set of nodes, $E$ of size $m$ is the set of edges, and $w_G$ assigns a non-negative integer weight to each edge. For every path $\pi$ in $G$, denote the weight of $\pi$ in $G$ by $w_G(\pi)$. For every pair of nodes $u, v \in V$ denote the distance from $u$ to $v$ in $G$, i.e., the weight of the shortest path from $u$ to $v$ in $G$, by $d_G(u, v)$.

In defining multiplicative distance approximation, one can consider both overestimation and underestimation of the true distance. For technical reasons, we will refer to both types of distance estimates in our algorithms and thus use the following convention: If $\alpha \geq 1$, then $\tilde{d}(u, v)$ is an $\alpha$-approximation of $d_G(u, v)$ for a pair of nodes $u, v \in V$ if

$$d_G(u, v) \leq \tilde{d}(u, v) \leq \alpha \cdot d_G(u, v)$$

and if $\alpha \leq 1$, then $\hat{d}(u, v)$ is an $\alpha$-approximation of $d_G(u, v)$ if

$$\alpha \cdot d_G(u, v) \leq \hat{d}(u, v) \leq d_G(u, v).$$

Since the edge weights, and thus all pairwise distances, are integer, we assume without loss of generality that the distance estimates $\tilde{d}(u, v)$ and $\hat{d}(u, v)$ are integer (otherwise $\lfloor \tilde{d}(u, v) \rfloor$ and $\lceil \hat{d}(u, v) \rceil$, respectively, will serve as the desired $\alpha$-approximations).

For every pair of nodes $u, v \in V$ and every integer $h \geq 1$, we define the shortest $h$-hop path from $u$ to $v$ to be the path of minimum weight among all paths from $u$ to $v$ with at most $h$ edges. The $h$-hop distance from $u$ to $v$ in $G$, denoted by $d_G^h(u, v)$, is the weight of the shortest $h$-hop path from $u$ to $v$. (Although the $h$-hop distances of all pairs of nodes do not necessarily induce a metric, the term ‘$h$-hop distance’ is somewhat established in the literature.)

### 2.3 Toolkit

In the following, we review known tools that we use to design our algorithm.

#### 2.3.1 Reduction from Approximate SSSP to Exact SSSP

We will apply the following reduction by Klein and Subramanian [KleinS97] for extending certain approximate SSSP algorithms to exact SSSP algorithms. The reduction is based on Gabow’s recursive scaling algorithm [Gabow85]. Initially this reduction was formulated for the PRAM model, but it can be adapted to the CONGEST model in a straightforward way.

###### Theorem 2.1 (Implicit in [KleinS97]).

Assume there is an auxiliary algorithm in the CONGEST model that, given a directed graph $G$ with positive integer edge weights and a source node $s$, computes a distance estimate $\hat{d}(s, \cdot)$ such that

$$\tfrac{1}{2} \cdot d_G(s, v) \leq \hat{d}(s, v) \leq d_G(s, v) \tag{1}$$

for every node $v$ and

$$\hat{d}(s, v) \leq \hat{d}(s, u) + w_G(u, v) \tag{2}$$

for every edge . Then there is an exact SSSP algorithm in the CONGEST model that, given a directed graph with non-negative integer weights in the range for every edge and a source node , makes calls to algorithm  (on graphs with positive integer weights in the range for every edge ) and has an additional overhead of broadcast rounds. If is a randomized Monte Carlo algorithm that is correct with high probability, then is a Las Vegas algorithm whose bounds on the number of calls to  and the additional overhead of broadcast rounds, respectively, hold with high probability and in expectation.

We give a proof of Theorem 2.1 for completeness in Appendix A.

Our task of designing an exact SSSP algorithm thus reduces to designing an auxiliary approximate SSSP algorithm satisfying conditions (1) and (2). Note that (2) is a special form of the triangle inequality expressing that the distance estimates dominate the edge weights of . The reduction will fail if this additional constraint is not met, and thus we cannot use arbitrary distance estimates, which poses a challenge in designing the auxiliary algorithm. As Klein and Subramanian [KleinS97] observed, one can for example ensure (2) by obtaining as the exact distances from in a suitable supergraph of where and for every edge .
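To see why condition (2) is a genuine extra requirement, the following small sketch checks both conditions for hand-picked estimates. The helper names and the three-node example are illustrative assumptions, not part of the reduction itself.

```python
from fractions import Fraction

def satisfies_condition_1(exact, est):
    """(1): d/2 <= est <= d for every node (lists indexed by node)."""
    return all(Fraction(d, 2) <= e <= d for d, e in zip(exact, est))

def satisfies_condition_2(est, edges):
    """(2): est[v] <= est[u] + w for every edge (u, v, w)."""
    return all(est[v] <= est[u] + w for u, v, w in edges)

# Path 0 -> 1 -> 2 with weights 4 and 1; exact distances from node 0.
demo_edges = [(0, 1, 4), (1, 2, 1)]
demo_exact = [0, 4, 5]

# A valid 1/2-approximation that nevertheless violates (2) on edge (1, 2).
bad_estimate = [0, 2, 5]
# An estimate that satisfies both conditions.
good_estimate = [0, 2, 3]

print(satisfies_condition_1(demo_exact, bad_estimate),
      satisfies_condition_2(bad_estimate, demo_edges))
print(satisfies_condition_1(demo_exact, good_estimate),
      satisfies_condition_2(good_estimate, demo_edges))
```

The first estimate is within a factor of two everywhere, yet it jumps by more than the edge weight between nodes 1 and 2, which is exactly what (2) forbids.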

#### 2.3.2 Computing Bounded Hop Distances

We will use several primitives for computing (approximate) bounded-hop distances. The well-known “cornerstone” for this task is the Bellman-Ford algorithm.

###### Lemma 2.2 (Folklore).

In the Broadcast CONGEST model, one can, given a weighted directed graph $G$ with source node $s$ and an integer hop parameter $h$, compute, for every node $v$, the value $d_G^h(s, v)$ and make it known to $v$ in rounds such that each node in total broadcasts at most messages to its neighbors by using a synchronized version of the Bellman-Ford algorithm.
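A sequential simulation of a synchronized Bellman-Ford computation may help to make the lemma concrete. This is a sketch under assumptions: the function name, the edge-list format, and the rule that a node only re-broadcasts when its estimate changed are choices made for the example, and the message accounting applies to this sketch only.

```python
import math

def broadcast_bellman_ford(n, edges, source, h):
    """Simulate h synchronous broadcast rounds of Bellman-Ford.

    edges is a list of directed (u, v, w) triples over nodes 0..n-1.
    Returns (dist, sent): dist[v] is the h-hop distance from the source,
    sent[v] counts how many rounds v broadcast in (at most h here).
    """
    in_edges = [[] for _ in range(n)]
    for u, v, w in edges:
        in_edges[v].append((u, w))
    dist = [math.inf] * n
    dist[source] = 0
    sent = [0] * n
    changed = {source}
    for _ in range(h):
        # Snapshot of what is broadcast this round; every sender sends
        # the same single value to all neighbors (a broadcast round).
        announced = {u: dist[u] for u in changed}
        for u in announced:
            sent[u] += 1
        changed = set()
        for v in range(n):
            for u, w in in_edges[v]:
                if u in announced and announced[u] + w < dist[v]:
                    dist[v] = announced[u] + w
                    changed.add(v)
    return dist, sent

demo_edges = [(0, 1, 3), (1, 2, 1), (2, 3, 4), (3, 4, 1), (4, 5, 2), (0, 2, 5)]
print(broadcast_bellman_ford(6, demo_edges, source=0, h=2))
print(broadcast_bellman_ford(6, demo_edges, source=0, h=5))
```

With `h=2` node 3 is only reached via the two-edge path through the direct edge (0, 2), so its estimate is 9 rather than the true distance 8; after five rounds all estimates are exact.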

While the dependence on the number of rounds in the Bellman-Ford algorithm is optimal, there is some room for improvement in the number of messages. Using a well-known weight-rounding technique [KleinS97, Cohen98, Zwick02, Bernstein09, Madry10, Bernstein13, Nanongkai14], the number of messages can be reduced at the cost of introducing some approximation error. Algorithmically, this technique amounts to performing a single-source shortest path computation up to bounded distance in an integer-weighted graph. For this task, one can use a shortest path algorithm similar to breadth-first search that charges each node (at most) once for sending distance information to its neighbors.

###### Lemma 2.3 (Implicit in [Nanongkai14]).

In the Broadcast CONGEST model, there is an algorithm that, given a directed graph with positive integer edge weights, a fixed source node , and an integer hop parameter , computes, for every node , a distance estimate ultimately known to such that

$$d_G(s, v) \leq \tilde{d}(s, v) \leq (1 + \epsilon) \cdot d_G^h(s, v)$$

in rounds where each node in total broadcasts at most messages to its neighbors.
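The weight-rounding idea behind this lemma can be sketched sequentially. The concrete choices below are assumptions made for illustration and are not taken from the lemma: the rounding scale $\epsilon 2^i / (2h)$ for the $i$-th distance guess, the search depth $\lceil (1 + 2/\epsilon) h \rceil$, and the use of a bounded Dijkstra search in place of the breadth-first-search-like distributed procedure.

```python
import heapq
import math
from fractions import Fraction

def bounded_dijkstra(n, adj, source, limit):
    """Shortest distances from source, ignoring anything beyond `limit`."""
    dist = [math.inf] * n
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd <= limit and nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def approx_hop_distances(n, edges, source, h, eps):
    """Estimates with d_G(s,v) <= est[v] <= (1+eps) * d_G^h(s,v),
    assuming positive integer edge weights."""
    eps = Fraction(eps)
    max_w = max(w for _, _, w in edges)
    limit = math.ceil((1 + 2 / eps) * h)       # assumed search depth
    best = [math.inf] * n
    best[source] = 0
    # One rounded graph per distance guess 2^i (a few extra guesses are harmless).
    for i in range((h * max_w).bit_length() + 1):
        scale = eps * 2**i / (2 * h)           # assumed rounding unit
        adj = [[] for _ in range(n)]
        for u, v, w in edges:
            adj[u].append((v, math.ceil(w / scale)))  # round weights up
        rounded = bounded_dijkstra(n, adj, source, limit)
        for v in range(n):
            if rounded[v] < math.inf:
                best[v] = min(best[v], scale * rounded[v])
    return best

demo_edges = [(0, 1, 3), (1, 2, 1), (2, 3, 4), (3, 4, 1), (4, 5, 2), (0, 2, 5)]
demo_est = approx_hop_distances(6, demo_edges, source=0, h=2, eps=Fraction(1, 2))
print(demo_est)
```

Because weights are only ever rounded up, every estimate is the scaled weight of a real path and therefore never undershoots the true distance; the depth limit is what keeps each search short.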

This approximation algorithm can be leveraged to compute approximate $h$-hop distances from a set of sources. Note, however, that if we run instances of the algorithm of Lemma 2.3, then in the worst case a single node might have to send different messages to its neighbors in all instances of the algorithm simultaneously. The naive approach to circumventing this type of congestion will blow up the number of rounds by a factor of . Nanongkai [Nanongkai14] showed that the congestion can be avoided using a random delay technique.

###### Lemma 2.4 ([Nanongkai14]).

In the Broadcast CONGEST model, there is a randomized algorithm that, given a directed graph with positive integer edge weights, a fixed set of source nodes of size , and an integer hop parameter , computes, for every source node and every node , a distance estimate ultimately known to such that, with high probability,

$$d_G(s, v) \leq \tilde{d}(s, v) \leq (1 + \epsilon) \cdot d_G^h(s, v)$$

in rounds.

Note that, in combination with the aforementioned rounding technique, one can alternatively use the deterministic source detection algorithm of Lenzen and Peleg [LenzenP13, LenzenP14] for this task.

#### 2.3.3 Randomized Hitting Set Construction

Similar to previous randomized shortest path algorithms, we will use the fact that, for a specified parameter , one can by a simple randomized process obtain a set of size such that every shortest path consisting of nodes contains a node of with high probability, i.e., is a hitting set for the system of sets defined by the shortest paths with nodes. This technique was introduced to the design of graph algorithms by Ullman and Yannakakis [UllmanY91]. A general lemma can be formulated as follows.

###### Lemma 2.5.

Let , let be a set of size , and let be a collection of sets over the universe of size at least . Let  be a subset of that was obtained by choosing each element of  independently with probability where . Then, with high probability (whp), i.e., probability at least , the following two properties hold:

1. For every , the set contains an element of , i.e., .

2. .

To apply the lemma for hitting shortest paths consisting of nodes in a graph , set , , and for every pair of nodes and such that there is a shortest path from  to  with exactly  nodes define a corresponding set containing all the nodes on one of these shortest paths, resulting in .
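The sampling process can be sketched as follows. The sampling probability used here, $\min(1, c \ln n / q)$ for paths with $q$ nodes, is an assumed textbook form with a hypothetical constant `c`; the exact probability of Lemma 2.5 is not reproduced. The helper `hits_all` merely checks the hitting property for an explicitly given list of paths.

```python
import math
import random

def sample_skeleton(nodes, source, path_len, c=3.0, rng=None):
    """Sample each node independently; always include the source.

    path_len is the number of nodes on the paths to be hit; the
    probability below is an assumption made for this sketch.
    """
    rng = rng or random.Random()
    p = min(1.0, c * math.log(len(nodes)) / path_len)
    chosen = {v for v in nodes if rng.random() < p}
    chosen.add(source)
    return chosen

def hits_all(paths, chosen):
    """True if every path (a list of nodes) contains a chosen node."""
    return all(any(v in chosen for v in path) for path in paths)

demo_nodes = list(range(10))
# With very short paths the probability is capped at 1: everything is chosen.
print(sample_skeleton(demo_nodes, source=0, path_len=2))
print(hits_all([[1, 2], [3, 4]], {2, 3}), hits_all([[1, 2]], {3}))
```

Since the guarantee is probabilistic, an implementation would typically also bound the size of the sampled set and retry or abort if it comes out too large.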

## 3 Schematic Auxiliary Algorithm

In the following, we present an auxiliary algorithm that computes distance estimates satisfying the preconditions (1) and (2) of Theorem 2.1 and can thus be extended to an exact SSSP algorithm using the recursive-scaling approach. We formulate the auxiliary algorithm in a schematic manner and defer model-specific implementation details and the complexity analysis to later sections. The auxiliary algorithm is parameterized by an integer parameter whose relevance will only become clear in the complexity analysis. Formally, the guarantees obtained in this section can be summarized as follows.

###### Lemma 3.1 (Main Lemma).

For any directed input graph with fixed source node and any integer , the auxiliary algorithm below consisting of Steps 1–6 computes, for every node , a distance estimate satisfying conditions (1) and (2) of Theorem 2.1.

The auxiliary algorithm proceeds as follows:

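1. Construct a set $C \subseteq V$ of skeleton nodes containing (a) the source node $s$ and (b) additionally, for every pair of nodes $u$ and $v$ such that the shortest path from $u$ to $v$ in $G$ consists of exactly nodes, at least one node on one of these shortest paths.

2. For each skeleton node $x \in C$, compute $2$-approximate $h$-hop distances from $x$, i.e., distance estimates $\tilde{d}(x, \cdot)$ such that

$$d_G(x, v) \leq \tilde{d}(x, v) \leq 2 \cdot d_G^h(x, v) \tag{3}$$

for every node $v \in V$.

3. Construct the skeleton graph $H$ on the skeleton nodes $C$ with edge weight $w_H(x, y) = \tilde{d}(x, y)$ for every pair of skeleton nodes $x, y \in C$.

4. Compute distances from $s$ on the skeleton graph $H$, i.e., $d_H(s, x)$ for every skeleton node $x \in C$.

5. Construct the augmented graph $G' = (V, E \cup (\{s\} \times C), w_{G'})$ with the weight function $w_{G'}$ given by

$$w_{G'}(u, v) = \begin{cases} d_H(u, v) & \text{for every } (u, v) \in (\{s\} \times C) \setminus E \\ 2 \cdot w_G(u, v) & \text{for every } (u, v) \in E \setminus (\{s\} \times C) \\ \min(d_H(u, v), 2 \cdot w_G(u, v)) & \text{for every } (u, v) \in E \cap (\{s\} \times C). \end{cases}$$

6. Compute the $h$-hop distances from $s$ in the augmented graph $G'$, i.e., $d_{G'}^h(s, v)$ for every node $v \in V$, and return $\hat{d}(s, v) = \tfrac{1}{2} \cdot d_{G'}^h(s, v)$ for every node $v \in V$.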
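The steps above can be sketched sequentially as follows. This is a simplified illustration under stated assumptions: the skeleton set is passed in and assumed to satisfy the hitting property of the first step, the approximate hop-bounded distances of the second step are replaced by exact hop-bounded distances (which trivially satisfy (3)), and all names are hypothetical. Nothing distributed happens here; the sketch only mirrors the construction.

```python
import math
from fractions import Fraction

def hop_distances(n, edges, source, hops):
    """Weight of the lightest path with at most `hops` edges to each node."""
    dist = [math.inf] * n
    dist[source] = 0
    for _ in range(hops):
        new = dist[:]  # relax against the previous round only
        for u, v, w in edges:
            if dist[u] + w < new[v]:
                new[v] = dist[u] + w
        dist = new
    return dist

def auxiliary_estimates(n, edges, s, skeleton, h):
    # Hop-bounded distances from every skeleton node (exact in this sketch).
    d_tilde = {x: hop_distances(n, edges, x, h) for x in skeleton}
    # Skeleton graph H with weights w_H(x, y) = d_tilde(x, y).
    h_edges = [(x, y, d_tilde[x][y]) for x in skeleton for y in skeleton
               if x != y and d_tilde[x][y] < math.inf]
    # Exact distances from s in H.
    d_h = hop_distances(n, h_edges, s, len(skeleton))
    # Augmented graph G': doubled original weights plus edges (s, x).
    weight = {}
    for u, v, w in edges:
        weight[(u, v)] = min(weight.get((u, v), math.inf), 2 * w)
    for x in skeleton:
        if x != s and d_h[x] < math.inf:
            weight[(s, x)] = min(weight.get((s, x), math.inf), d_h[x])
    aug_edges = [(u, v, w) for (u, v), w in weight.items()]
    # h-hop distances in G', halved to obtain the returned estimates.
    d_aug = hop_distances(n, aug_edges, s, h)
    return [Fraction(d, 2) if d < math.inf else math.inf for d in d_aug]

demo_edges = [(0, 1, 3), (1, 2, 1), (2, 3, 4), (3, 4, 1), (4, 5, 2), (0, 2, 5)]
demo_est = auxiliary_estimates(6, demo_edges, s=0, skeleton=[0, 2, 4], h=4)
print(demo_est)
```

In this example the exact distances from node 0 are 0, 3, 4, 8, 9, 11, so every returned value lies between half the true distance and the true distance, and no estimate exceeds a predecessor's estimate by more than the connecting edge weight.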

The correctness proof has two main parts. We first argue (see Lemma 3.2) that the distances in $G'$ are a $2$-approximation of the distances in $G$. Then we show (see Lemma 3.3) that the distance estimates computed by our algorithm, namely $\hat{d}(s, v) = \tfrac{1}{2} \cdot d_{G'}^h(s, v)$ for every node $v$, are proportional to the distances from $s$ in $G'$ (independent of any hop bound), i.e., $d_{G'}^h(s, v) = d_{G'}(s, v)$. By combining these two facts (see Lemma 3.4) it follows that the distance estimates returned by the auxiliary algorithm satisfy conditions (1) and (2) of Theorem 2.1.

###### Lemma 3.2.

For every node $v \in V$, $d_G(s, v) \leq d_{G'}(s, v) \leq 2 \cdot d_G(s, v)$.

###### Proof.

First, observe that by the definition of $G'$ we clearly have $d_{G'}(s, v) \leq 2 \cdot d_G(s, v)$ for every node $v$ as every path of $G$ is also contained in $G'$ and the weight of such a path in $G'$ is at most twice its weight in $G$ by the definition of the edge weights in $G'$.

We now show that $d_G(s, v) \leq d_{G'}(s, v)$ for every node $v$. Observe that it is sufficient to show that $w_{G'}(u, v) \geq d_G(u, v)$ for every edge $(u, v)$ of $G'$ as then every path from $s$ to $v$ in $G'$ has weight at least $d_G(s, v)$, which in particular also applies to the shortest path from $s$ to $v$ in $G'$. Note that by the definition of the edge weights in $G'$ we now only have to show that $d_H(s, x) \geq d_G(s, x)$ for every skeleton node $x \in C$. Now observe that for every pair of skeleton nodes $x, y \in C$ we have $\tilde{d}(x, y) \geq d_G(x, y)$ by (3) and thus $w_H(x, y) \geq d_G(x, y)$. This in turn implies that $d_H(s, x) \geq d_G(s, x)$ for every skeleton node $x \in C$, as desired. ∎

###### Lemma 3.3.

For every node $v \in V$, $d_{G'}^h(s, v) = d_{G'}(s, v)$.

###### Proof.

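The inequality $d_{G'}(s, v) \leq d_{G'}^h(s, v)$ is obvious by the definition of the $h$-hop distance from $s$ to $v$. In the remainder of this proof we argue that $d_{G'}^h(s, v) \leq d_{G'}(s, v)$ by constructing a path in $G'$ with at most $h$ edges and of weight at most $d_{G'}(s, v)$.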

Consider a shortest path from to in . We can assume that is a simple path and thus contains at most one edge for any , and if so, this edge must be the first edge of . We assume in the following that starts with some edge , as for the case that all edges of are contained in a simpler version of the argument applies.

Now let be the shortest path from to in . Subdivide into consecutive subpaths such that the subpaths consist of exactly nodes and the subpath consists of at most nodes. By the properties of guaranteed in 1, we can assume that each of the subpaths contains a skeleton node of . Set and for every let be a skeleton node on . Note that between any pair of consecutive skeleton nodes and (for ) there are at most edges on and thus

$$d_G^h(y_i, y_{i+1}) = d_G(y_i, y_{i+1}). \tag{4}$$

We now give an upper bound on the weight of the edge in , mainly applying the triangle inequality for the distance metric induced by the skeleton graph :

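$$\begin{aligned}
w_{G'}(s, y_k) &\leq d_H(s, y_k) && \text{(definition of } w_{G'}(s, y_k)\text{)} \\
&\leq d_H(s, y_1) + d_H(y_1, y_2) + \dots + d_H(y_{k-1}, y_k) && \text{(triangle inequality)} \\
&\leq w_H(s, y_1) + w_H(y_1, y_2) + \dots + w_H(y_{k-1}, y_k) && \text{(} d_H(y_i, y_{i+1}) \leq w_H(y_i, y_{i+1}) \text{)} \\
&= w_H(s, y_1) + \tilde{d}(y_1, y_2) + \dots + \tilde{d}(y_{k-1}, y_k) && \text{(definition of } w_H(y_i, y_{i+1})\text{)} \\
&\leq w_H(s, y_1) + 2 \cdot d_G^h(y_1, y_2) + \dots + 2 \cdot d_G^h(y_{k-1}, y_k) && \text{(by (3))} \\
&= w_H(s, y_1) + 2 \cdot d_G(y_1, y_2) + \dots + 2 \cdot d_G(y_{k-1}, y_k) && \text{(by (4))} \\
&= w_H(s, y_1) + 2 \cdot d_G(y_1, y_k) && \text{(} y_1, \dots, y_k \text{ on shortest path } \pi \text{)} \\
&= w_H(s, x) + 2 \cdot d_G(x, y_k) && \text{(} y_1 = x \text{)}
\end{aligned}$$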

Now consider the path in consisting of first the edge from to of weight and then the subpath of from to . We now compare the weight of with the weight of . Recall that consists of first the edge and then a path from to consisting only of edges contained in . Therefore, has weight at least

$$w_H(s, x) + 2 \cdot d_G(x, v).$$

By the upper bound on above, has weight at most

$$w_H(s, x) + 2 \cdot d_G(x, y_k) + 2 \cdot d_G(y_k, v) = w_H(s, x) + 2 \cdot d_G(x, v).$$

It follows that the weight of is at most the weight of , where was the shortest path from to in . Furthermore, consist of at most edges. Thus, as desired. ∎

###### Lemma 3.4.

For every node , the distance estimate satisfies conditions (1) and (2) of Theorem 2.1.

###### Proof.

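By Lemma 3.2 we have $d_G(s, v) \leq d_{G'}(s, v) \leq 2 \cdot d_G(s, v)$ for every node $v$. Now observe that $\hat{d}(s, v) = \tfrac{1}{2} \cdot d_{G'}^h(s, v) = \tfrac{1}{2} \cdot d_{G'}(s, v)$ by Lemma 3.3 and thus condition (1) is satisfied.

Condition (2) essentially follows from the fact that the distance metric on $G'$ obeys the triangle inequality as together with Lemma 3.3 we have, for every edge $(u, v) \in E$:

$$\begin{aligned}
\hat{d}(s, v) &= \tfrac{1}{2} \cdot d_{G'}^h(s, v) \\
&= \tfrac{1}{2} \cdot d_{G'}(s, v) \\
&\leq \tfrac{1}{2} \cdot (d_{G'}(s, u) + d_{G'}(u, v)) \\
&\leq \tfrac{1}{2} \cdot (d_{G'}(s, u) + w_{G'}(u, v)) \\
&\leq \tfrac{1}{2} \cdot (d_{G'}(s, u) + 2 \cdot w_G(u, v)) \\
&= \tfrac{1}{2} \cdot d_{G'}(s, u) + w_G(u, v) \\
&= \tfrac{1}{2} \cdot d_{G'}^h(s, u) + w_G(u, v) \\
&= \hat{d}(s, u) + w_G(u, v).
\end{aligned}$$

∎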

## 4 First CONGEST Model Implementation

In the following we present our first exact SSSP algorithm for the CONGEST model. Its guarantees can be formalized as follows.

###### Theorem 4.1.

In the Broadcast CONGEST model, there is a randomized SSSP algorithm for directed graphs with non-negative integer edge weights in the range that performs rounds in expectation.

In our algorithm, we use the following implementation of the auxiliary algorithm of Section 3:

1. We implement the sampling process of Lemma 2.5: Simultaneously, the source $s$ adds itself to $C$ and each other node adds itself to $C$ with probability . Afterwards, we spend rounds in a global upcast to determine the size of $C$. If (a low-probability event), then abort the algorithm. Now the set $C$ satisfies the condition demanded in Step 1 of Section 3 with high probability and thus our implementation of the auxiliary algorithm will also be correct with high probability.

2. Using the algorithm of Lemma 2.4 as a subroutine, this step can be implemented in rounds.

3. The skeleton graph $H$ is constructed implicitly (in the sense that each skeleton node only knows its set of incoming edges together with their weight) by globally broadcasting the set of skeleton nodes in the network in rounds.

4. We use a message-passing version of Dijkstra’s algorithm that in each iteration determines the next node to visit, i.e., the one with minimum tentative distance, by performing a global upcast in the network. This step can thus be implemented in rounds.

5. By performing the corresponding edge weight modifications internally at each node, i.e., without any additional communication, the augmented graph $G'$ is constructed implicitly in the sense that each node only knows its set of incoming edges together with their weight.

6. We implement this step with the Bellman-Ford algorithm. The first iteration of the Bellman-Ford algorithm can be performed in rounds as every skeleton node $x$ already knows the weight of the edge $(s, x)$ in $G'$ and thus only needs to be informed about the start of the algorithm. The remaining iterations take rounds in a synchronized implementation of Bellman-Ford by Lemma 2.2, yielding an overall complexity of rounds for this step.
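The Dijkstra variant used for the skeleton graph can be sketched as a sequential simulation. The round accounting is an illustrative assumption only: each global upcast and each global broadcast is charged a flat `diameter` rounds, and the function name and input format are hypothetical.

```python
import math

def skeleton_dijkstra(skeleton, weight, source, diameter):
    """Dijkstra on the skeleton graph, one visited node per iteration.

    weight maps (x, y) -> w_H(x, y); in the network, y would know the
    weights of its incoming skeleton edges.
    Returns (dist, rounds) under the assumed per-iteration cost.
    """
    dist = {x: math.inf for x in skeleton}
    dist[source] = 0
    visited = set()
    rounds = 0
    while len(visited) < len(skeleton):
        # Global upcast: find the unvisited node of minimum tentative distance.
        d, x = min((dist[y], y) for y in skeleton if y not in visited)
        rounds += diameter
        if d == math.inf:
            break  # remaining skeleton nodes are unreachable
        visited.add(x)
        # Global broadcast of (x, d); every y relaxes its incoming edge locally.
        rounds += diameter
        for y in skeleton:
            if y not in visited and (x, y) in weight:
                dist[y] = min(dist[y], d + weight[(x, y)])
    return dist, rounds

demo_weight = {(0, 2): 4, (0, 4): 9, (2, 4): 5}
demo_dist, demo_rounds = skeleton_dijkstra([0, 2, 4], demo_weight, 0, diameter=5)
print(demo_dist, demo_rounds)
```

The number of iterations equals the number of skeleton nodes, so under this accounting the cost grows with the product of the skeleton size and the network diameter.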

Asymptotically, the overall number of rounds is . By setting we obtain an auxiliary algorithm performing rounds. Theorem 4.1 now follows as a corollary from Lemma 3.1 and Theorem 2.1.

## 5 Second CONGEST Model Implementation

In our second distributed algorithm, we obtain better guarantees for certain parameter ranges by using a different approach for computing exact distances from on the skeleton graph; the rest of the algorithm is the same as in Section 4. Instead of running Dijkstra’s algorithm, we implement Step 4 by repeating the algorithmic scheme, effectively constructing a skeleton of the skeleton. Intuitively, this fairly straightforward repetition of the scheme improves efficiency because the skeleton graph is simulated by performing global broadcasts in the network, which permits slightly different algorithmic techniques than computing on the original network itself. This gives us some slack to exploit. We remark that adding more levels of recursion will not boost the efficiency further because computing shortest paths for the skeleton of the skeleton is not a bottleneck in our running time analysis.

In the following, we explicitly separate the two layers – input graph and skeleton graph – mentioned above. We first show how to compute SSSP “on the skeleton graph” and then demonstrate how this can be used for computing SSSP on the input graph.

### 5.1 Implementation in Broadcast LOCAL Clique Model

Consider the following Broadcast LOCAL Clique model which deviates from the Broadcast CONGEST model in the following ways: (1) the communication network is a clique, i.e., every message sent by a node is received by all other nodes of the network and (2) the size of the message sent per round is arbitrary (and in particular may also be ). The complexity of an algorithm for a clique network with  nodes is determined by the number of rounds and the total size of all messages  broadcast by the nodes over the course of the algorithm.

The Broadcast LOCAL Clique model may seem a bit artificial on its own, but it is highly relevant for our CONGEST model implementation of the auxiliary algorithm because of the following straightforward simulation result for computing exact distances on the skeleton graph.

###### Lemma 5.1 (Implicit in [Nanongkai14]).

Assume there is an exact SSSP algorithm for directed graphs with non-negative integer edge weights in the Broadcast LOCAL Clique model spending rounds and messages. Then Step 4 of the auxiliary algorithm can be implemented in rounds in the Broadcast CONGEST model, where is the bandwidth of the network and is its diameter.

###### Proof Sketch.

We simulate a run of the Broadcast LOCAL Clique algorithm on the skeleton graph, which is a clique of  nodes. We do this by making each message sent by global knowledge, which we carry out by globally broadcasting all messages via a breadth-first-search spanning tree of the communication network . Such a spanning tree can be constructed initially in rounds. In the -th of the rounds of we have to send some  messages and we know that . In the Broadcast CONGEST model, the total number of rounds for this simulation therefore is by standard arguments [Peleg00]. ∎
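The accounting behind this simulation can be made concrete with a small cost model. The sketch below is not the bound of the lemma itself; it assumes the standard pipelining fact that broadcasting k messages over a spanning tree of depth D, with B messages forwarded per edge per round, takes on the order of D + k/B rounds. The function name and parameters are hypothetical, and constant factors are ignored.

```python
import math


def simulated_round_count(messages_per_round, tree_depth, bandwidth):
    """Rough round count for simulating a clique algorithm via tree broadcasts.

    messages_per_round[i] is the number of messages the clique algorithm
    sends in its i-th round. Assumption: one pipelined broadcast of k
    messages costs tree_depth + ceil(k / bandwidth) network rounds.
    """
    total = 0
    for k in messages_per_round:
        total += tree_depth + math.ceil(k / bandwidth)
    return total
```

Under this model the total is at most the number of simulated rounds times (D + 1) plus the total message count divided by the bandwidth, so a clique algorithm with few rounds and few messages translates into few network rounds.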

We now show how to obtain an efficient exact SSSP algorithm in the Broadcast LOCAL Clique model by first implementing the auxiliary algorithm of Section 3 and then extending it to an exact SSSP algorithm using the recursive-scaling reduction of Theorem 2.1.

###### Lemma 5.2.

In the Broadcast LOCAL Clique model, there is a randomized SSSP algorithm for directed graphs with non-negative edge weights in the range that in expectation spends rounds and messages for any integer parameter .

###### Proof.

We implement Steps 1, 3, and 5 as in Section 4. We implement Step 2 by simultaneously running an instance of the algorithm of Lemma 2.3 from each skeleton node . In each of the rounds of the algorithm of Lemma 2.3, we aggregate, for every node, all the messages that it would have to broadcast over all instances and broadcast them to all other nodes in a single round in the Broadcast LOCAL Clique model. This results in a total size of for all these messages in the Broadcast LOCAL Clique model. We implement Step 4 in the naive way by having each center node broadcast its outgoing edges in and computing internally at every node. This takes rounds and messages. Finally, Step 6 requires a single run of Bellman-Ford, which by Lemma 2.2 takes rounds and messages. We now apply the reduction of Theorem 2.1 to obtain the desired SSSP algorithm for the Broadcast LOCAL Clique model. ∎

We remark that the essential bottleneck of this implementation of the auxiliary algorithm turns out to be the final Bellman-Ford computation in Step 6 when performed in this algorithm for the Broadcast LOCAL Clique model. Recall that this step is necessary to ensure the domination property of (2).

### 5.2 Faster CONGEST Model Implementation in High-Diameter Networks

###### Theorem 5.3.

In the Broadcast CONGEST model, there is a randomized SSSP algorithm for directed graphs with non-negative edge weights in the range that performs rounds in expectation.

###### Proof.

We implement Steps 1, 2, 3, 5, and 6 of the auxiliary algorithm as in Section 4. Applying the simulation of Lemma 5.1 to the algorithm of Lemma 5.2 (with parameter ), we can implement Step 4 in rounds. Now the overall number of rounds is . We use two variants for balancing these terms. In the first variant, we set and to get an upper bound of rounds, where the first term dominates the second term when . In the second variant, we set and to get an upper bound of rounds, where the first term dominates the second term when . Our overall algorithm first computes a -approximation to the diameter of the communication network in rounds by performing breadth-first search from an arbitrary node in and then chooses and according to the approximate value of to get an upper bound of rounds in this implementation of the auxiliary algorithm. We now apply the reduction of Theorem 2.1 to obtain the desired SSSP algorithm for the Broadcast CONGEST model. ∎
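The diameter estimate used in the proof can be illustrated with a sequential breadth-first search. The sketch below assumes an undirected, connected communication network given as an adjacency dictionary; the function names are hypothetical. It relies on the well-known fact that the eccentricity ecc of any node satisfies ecc ≤ D ≤ 2·ecc, so a single BFS yields a factor-2 estimate of the diameter D.

```python
from collections import deque


def bfs_eccentricity(adj, root):
    """Largest hop distance from root to any node reachable from it."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())


def exact_diameter(adj):
    """Reference value for small graphs: maximum eccentricity over all nodes."""
    return max(bfs_eccentricity(adj, v) for v in adj)
```

For example, on a path with five nodes the eccentricity of the middle node is 2 while the diameter is 4, which shows that the factor 2 can be attained.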

Note that the algorithm of Theorem 5.3 is (asymptotically) faster than the simpler algorithm of Theorem 4.1 when .

In this section, we work out some additional results, first for approximate SSSP in the distributed setting, and then for exact SSSP in the parallel setting.

### 6.1 Directed Approximate SSSP

In the following, we give an algorithm for computing approximate SSSP on directed graphs in the CONGEST model that matches the round complexity of the fastest known algorithm for single-source reachability up to polylogarithmic factors.

The algorithmic scheme followed by our algorithm is quite similar to the one of the auxiliary algorithm:

2. Construct a set of skeleton nodes containing (a) the source node and (b) additionally, for every pair of nodes and such that the shortest path from to in consists of exactly nodes, at least one node on one of these shortest paths.

3. For each skeleton node , compute -approximate -hop distances from , i.e., distance estimates such that

$$d_G(x,v) \le \tilde{d}(x,v) \le (1+\epsilon) \cdot d_G^h(x,v)$$

for every node .

4. Construct the skeleton graph with edge weight for every .

5. Compute -approximate distances from on the skeleton graph , i.e.,

$$d_H(s,x) \le d'(s,x) \le (1+\epsilon) \cdot d_H(s,x)$$

for every skeleton node .

6. Compute

$$\hat{d}(s,v) := \min_{x \in C} \left( d'(s,x) + \tilde{d}(x,v) \right)$$

for every node .

###### Lemma 6.1.

For any directed input graph with fixed source node , any and any integer , the algorithm above consisting of Steps 1–5 computes, for every node , a distance estimate such that .

We omit the proof of this lemma, as the arguments in the correctness proof for the algorithm are just a variation of those given in [Nanongkai14]. The -approximation guarantee follows because .
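The scheme above can be sketched sequentially as follows. This is an illustration under simplifying assumptions rather than the distributed implementation: the skeleton set is passed in explicitly instead of being sampled, hop-bounded distances are computed exactly by a Bellman-Ford run limited to `h` rounds (exact values trivially satisfy the approximation guarantee for any ϵ ≥ 0), and the skeleton graph is solved with Dijkstra. All function and parameter names are hypothetical, and `h` here counts edges.

```python
import heapq
import math


def hop_bounded_distances(n, edges, src, h):
    """Distances from src over paths with at most h edges (Bellman-Ford, h rounds)."""
    dist = [math.inf] * n
    dist[src] = 0
    for _ in range(h):
        new = dist[:]
        for u, v, w in edges:
            if dist[u] + w < new[v]:
                new[v] = dist[u] + w
        dist = new
    return dist


def skeleton_sssp(n, edges, s, h, skeleton):
    """Distance estimates from s via a skeleton graph on the given centers."""
    centers = sorted(set(skeleton) | {s})  # the source is always a skeleton node
    # Hop-bounded distances from every skeleton node.
    d_tilde = {x: hop_bounded_distances(n, edges, x, h) for x in centers}
    # Skeleton graph with weight d_tilde[x][y] on (x, y); Dijkstra from s on it.
    d_prime = {x: math.inf for x in centers}
    d_prime[s] = 0
    heap = [(0, s)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > d_prime[x]:
            continue  # stale heap entry
        for y in centers:
            nd = d + d_tilde[x][y]
            if nd < d_prime[y]:
                d_prime[y] = nd
                heapq.heappush(heap, (nd, y))
    # Combine: best skeleton node x minimizing d'(s, x) + d~(x, v).
    return [min(d_prime[x] + d_tilde[x][v] for x in centers) for v in range(n)]
```

On a directed path with unit weights, a skeleton that contains a node on every stretch of `h` consecutive edges reproduces the exact distances, whereas a skeleton consisting of the source alone only sees paths with at most `h` edges.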

We proceed with giving a CONGEST-model implementation of the algorithm above. Here, we use a two-step process similar to the implementation of the auxiliary algorithm in Section 5: We first provide an implementation in the Broadcast LOCAL Clique model and then use that algorithm as a black box for the implementation in the Broadcast CONGEST model.

###### Lemma 6.2.

In the Broadcast LOCAL Clique model, there is a randomized -approximate SSSP algorithm for directed graphs with non-negative edge weights in the range that spends rounds and messages for any integer parameter . The algorithm is correct with high probability.

###### Proof.

By a standard trick one can transform an instance with non-negative integer edge weights to one with positive integer edge weights: give weight to all -weight edges and scale up all other edge weights by a factor of (see Appendix A.1). This increases the maximum edge weight by a factor of and thus the complexity of the algorithm by a factor of .

We implement Steps 1 and 3 as in Section 4. We implement Step 2 by simultaneously running an instance of the algorithm of Lemma 2.3 from each skeleton node . In each of the rounds of the algorithm of Lemma 2.3, we aggregate, for every node, all the messages that it would have to broadcast over all instances and broadcast them to all other nodes in a single round in the Broadcast LOCAL Clique model. This results in a total size of for all these messages in the Broadcast LOCAL Clique model. We implement Step 4 in the naive way by having each center node broadcast its outgoing edges in and computing (i.e., exact distances) internally at every node. This takes rounds and messages. Finally, Step 5 only requires internal computation. ∎

###### Theorem 6.3.

In the Broadcast CONGEST model, there is a randomized -approximate SSSP algorithm for directed graphs with non-negative edge weights in the range that performs rounds. The algorithm is correct with high probability.

###### Proof.

Again, it is sufficient to provide an algorithm for positive integer edge weights.

We implement Steps 1, 2, and 3 as in Section 4. Applying the simulation of Lemma 5.1, which holds regardless of the approximation ratio, to the algorithm of Lemma 6.2 (with parameters  and ), we can implement Step 4 in rounds. Now the overall number of rounds is . By setting and this becomes as desired. ∎

In the following, we show that our approach also gives a new work/depth trade-off for computing SSSP in the PRAM model. In analogy to the RAM model, the PRAM (parallel RAM) model lets multiple processors compute in parallel with shared memory access. The work of a parallel algorithm is the total number of operations performed over all processors and the depth is the length of the longest series of operations that have to be performed sequentially due to data dependencies. The depth is essentially the number of parallel computation steps needed until the last processor is finished for any schedule of assigning the algorithm’s operations to the processors.
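As a small illustration of the two measures, unrelated to shortest paths, consider summing a list by pairwise additions arranged in a balanced tree. The sketch below simulates the levels sequentially and counts operations; the function name is hypothetical and a non-empty input is assumed. All additions within one level are independent of each other, so each level contributes one unit of depth.

```python
def tree_sum_work_depth(values):
    """Sum values level by level; return (total, work, depth).

    work  = number of additions performed overall
    depth = number of levels, i.e. the longest chain of dependent additions
    """
    level = list(values)
    work = 0
    depth = 0
    while len(level) > 1:
        nxt = []
        # Additions in this loop are independent and could run in parallel.
        for i in range(0, len(level) - 1, 2):
            nxt.append(level[i] + level[i + 1])
            work += 1
        if len(level) % 2 == 1:
            nxt.append(level[-1])  # odd element is carried to the next level
        level = nxt
        depth += 1
    return level[0], work, depth
```

For n input values the work is n − 1 additions while the depth is only ⌈log₂ n⌉, which is the kind of gap between the two measures that a work/depth trade-off quantifies.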

Our contribution in the PRAM model is an exact SSSP algorithm with work and depth for any . For directed graphs, this specific trade-off was not known before. In particular, the algorithm of Klein and Subramanian [KleinS97], which follows the same general framework as ours, has work and depth , i.e., it does not allow for such a trade-off as any choice of different from would be sub-optimal. Thus, the conceptual novelties in our algorithm, which were motivated by the application to the CONGEST model, also carry over to an improvement in the PRAM model.

###### Theorem 6.4.

In the PRAM model, there is a randomized SSSP algorithm for directed graphs with non-negative edge weights in the range that has work with high probability and in expectation and depth for any given .

###### Proof.

To obtain the exact SSSP algorithm, we implement the auxiliary algorithm of Section 3 as follows:

2. We implement the sampling process of Lemma 2.5: Each node is added to independently with probability . Afterwards, we determine the size of and if (a low-probability event), we abort the algorithm. This has work and depth . Now the set satisfies the condition demanded in Step 1 of Section 3 with high probability and thus our implementation of the auxiliary algorithm will also be correct with high probability.

3. For each node in , this step can be implemented by computing a shortest path tree up to distance in a graph with suitably rounded edge weights. Similar to the analysis of bounded breadth-first search, this approach has work for each node in and depth (see Lemma 3.2 in [KleinS97], which is similar to Lemma 2.3 in Section 2). Thus the total work of this step is and the depth is .

4. The straightforward approach for constructing and storing the graph has work