Table 1: Additive spanner size and efficiency results.
|Stretch|Size and runtime|Source|
|+4||[CHE13] and this paper|
|+6||[BKM+10] and [WOO10]|
A graph on $n$ nodes can have on the order of $n^2$ edges. For very large values of $n$, this number of edges can be prohibitively expensive, both to store and to run graph algorithms on. Thus it may be prudent to operate instead on a smaller approximation of the graph. A spanner is a type of subgraph which preserves distances between nodes up to some error, which we call the stretch. Spanners were introduced in [PS89] and additive spanners were first studied in [LS93].
Definition 1 (Additive Spanners).
A $+k$ additive spanner (or “$+k$ spanner”) of a graph $G$ is a subgraph $H$ that satisfies $\text{dist}_H(s,t) \le \text{dist}_G(s,t) + k$ for each pair of nodes $s, t$.
Note that since $H$ is a subgraph of $G$, the lower bound $\text{dist}_G(s,t) \le \text{dist}_H(s,t)$ is immediate (the error is one-sided). Spanners have found applications in distance oracles [BK06], parallel and distributed algorithms for computing almost shortest paths [COH98, EP05], synchronizers [PU87], and more.
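Definition 1 can be checked directly on small graphs by comparing breadth-first-search distances in the graph and in the candidate subgraph. The following is a minimal sketch, assuming unweighted, undirected graphs stored as adjacency dictionaries over the same node set; the names `bfs_distances`, `is_additive_spanner`, `adj_g`, `adj_h`, and `k` are illustrative and do not come from the paper.

```python
from collections import deque

def bfs_distances(adj, source):
    """Unweighted single-source distances by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def is_additive_spanner(adj_g, adj_h, k):
    """Return True if H preserves every finite distance of G up to +k."""
    for s in adj_g:
        dist_g = bfs_distances(adj_g, s)
        dist_h = bfs_distances(adj_h, s)
        for t, d in dist_g.items():
            # A pair disconnected in H counts as infinite distance.
            if dist_h.get(t, float("inf")) > d + k:
                return False
    return True

# A 4-cycle G, and the path H obtained by deleting the edge {0, 3}.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(is_additive_spanner(cycle, path, 2))
print(is_additive_spanner(cycle, path, 1))
```

Deleting the edge {0, 3} stretches the distance between nodes 0 and 3 from 1 to 3, so the path is a +2 spanner of the cycle but not a +1 spanner. The check runs one BFS per node in each graph, which is only practical for small inputs.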
In addition to finding spanner constructions that have the fewest possible edges, it is also in our interest that these construction algorithms be fast. Because spanners are meant to make graphs more compact, they are mainly of interest for very large graphs. Thus, for very large $n$, a polynomial time speedup to an algorithm for producing spanners is highly desirable. There is a long line of work done in the interest of speeding up spanner constructions, including [RZ04, BS07, WOO10, KNU17, ADF+19]. Some additive spanner size and efficiency results are summarized in Table 1 above. For a comprehensive survey, see [ABS+20].
Our focus in this paper is the +4 spanner construction first presented by Shiri Chechik in [CHE13]. In particular, we present a polynomial speedup to Chechik’s algorithm.
For comparison, the bottleneck to Chechik’s original construction is solving the All-Pairs-Shortest-Paths problem; with combinatorial methods, this has an $O(mn)$ runtime, and with matrix multiplication methods, $\tilde{O}(n^{\omega})$ [SEI95]. (Footnote 1: We note that when , the bottleneck in the algebraic case is instead the second stage of the algorithm described in Section 1.2 ().) Currently, $\omega < 2.37286$ [AW21]. See Section 1.2 for a full runtime analysis of the original construction.
With matrix multiplication methods, the worst case runtime becomes , as this problem can be solved with all-pairs shortest-paths. However, we note that there is a range of values where our combinatorial algorithm outperforms algebraic methods; if , then when , the complexity of our algorithm is polynomially faster.
We will use to refer to a canonical shortest path between two nodes . is a variable we use in some of our algorithms that describes some computed path. For a node , denotes the neighborhood of (the set containing and its neighbors) in . When is a set, .
1.2 Current Runtime of the +4 Spanner Construction
In [CHE13], Shiri Chechik presents a spanner construction that produces a +4 spanner on edges on average with probability . The runtime complexity, however, was not analyzed. In this section, we will describe Chechik’s algorithm and then give a runtime analysis. Chechik’s construction of a +4 spanner of an input graph can be split into three stages:
All edges adjacent to “light nodes” (nodes with degree ) are added to .
Nodes are sampled for inclusion into a set with probability . BFS trees for these nodes are computed, and the edges for these trees are added to .
Nodes are sampled for inclusion into a set with probability . For each “heavy” node (nodes with degree in the original graph) that is not in , but is adjacent to some node of , we arbitrarily choose a neighbor and add the edge to . These choices also define the “clusters” of the graph: for each , is the set containing and its adjacent heavy nodes that were paired with in the previous step. We now find, for each pair , the shortest path subject to the constraint that , , and has heavy nodes. We use to refer to the number of heavy nodes on in .
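The three stages above can be sketched in code up to the point where the clusters are formed. This is a minimal sketch, not Chechik's exact procedure: the degree threshold and the two sampling probabilities are left as caller-supplied parameters (`degree_threshold`, `p_trees`, `p_centers`), because their values are not reproduced in this text, and treating "light" as degree at most the threshold is an assumption. All function and variable names are hypothetical.

```python
import random
from collections import deque

def bfs_tree_edges(adj, root):
    """Edges of a BFS tree rooted at `root`, as frozensets {u, v}."""
    seen = {root}
    queue = deque([root])
    edges = set()
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                edges.add(frozenset((u, v)))
                queue.append(v)
    return edges

def first_stages(adj, degree_threshold, p_trees, p_centers, rng):
    """Stages (i), (ii), and the clustering part of (iii)."""
    spanner = set()
    # Assumption: a node is heavy when its degree exceeds the threshold.
    heavy = {v for v in adj if len(adj[v]) > degree_threshold}

    # Stage (i): keep every edge incident to a light node.
    for v in adj:
        if v not in heavy:
            spanner.update(frozenset((v, u)) for u in adj[v])

    # Stage (ii): sample tree roots and keep their BFS trees.
    for root in [v for v in adj if rng.random() < p_trees]:
        spanner |= bfs_tree_edges(adj, root)

    # Stage (iii): sample cluster centers, then attach each remaining
    # heavy node to an arbitrary neighboring center, if it has one.
    centers = {v for v in adj if rng.random() < p_centers}
    center_of = {c: c for c in centers}
    for v in heavy - centers:
        for u in adj[v]:
            if u in centers:
                center_of[v] = u
                spanner.add(frozenset((v, u)))
                break
    return spanner, heavy, center_of
```

The returned `center_of` map encodes the clusters: two nodes are in the same cluster exactly when they map to the same center. The constrained shortest-path search between clusters, which dominates the runtime, is deliberately left out of this sketch.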
Algorithm 1 gives the full details. The computationally dominant step of this algorithm is the task of finding these shortest paths between the clusters in (iii). For worst case inputs, the expected number of clustered nodes (nodes in some cluster) is . Thus, this algorithm’s runtime will be bottlenecked by the all-pairs-shortest-paths problem. We now show that the heavy-node constraint on the paths does not increase the runtime. To see this, we note that it’s enough to search over paths of the form
( denotes path concatenation), where range over and , . Specifically, we want the shortest path of this form for each pair of clusters , where is a constraint-satisfying ( heavy nodes) path that is also a shortest path in .
To find such paths, we first solve APSP ( time combinatorially, time algebraically) to get shortest paths for each . Then for all pairs of clustered nodes , with cluster centers respectively: if , set as the current best path for the cluster pair if one hasn’t yet been selected, otherwise replace the current path iff is shorter. This is an APSP time process for finding the shortest valid canonical shortest path connecting each cluster pair; at the end of the process, we add the edges of these best paths. We note that because we’re not searching over all paths, but only one set of canonical shortest paths, it’s possible we fail to find valid (constraint satisfying) paths between some cluster pairs. This does not impede correctness, as we only require these paths in the cases that they exist.
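The selection step described above can be sketched as follows, under stated assumptions. The sketch takes the BFS tree path from each clustered node as its canonical shortest path, counts heavy nodes along that path while the BFS runs, and keeps the shortest constraint-satisfying candidate for each ordered pair of clusters. The names `center_of`, `is_heavy`, and `max_heavy` are hypothetical; in particular `max_heavy` stands in for the heavy-node bound, whose value is not reproduced in this text.

```python
from collections import deque

def bfs_with_heavy_counts(adj, src, is_heavy):
    """BFS distances from src, plus the number of heavy nodes on each
    BFS tree path (both endpoints included)."""
    dist = {src: 0}
    heavy_count = {src: int(is_heavy[src])}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                heavy_count[v] = heavy_count[u] + int(is_heavy[v])
                queue.append(v)
    return dist, heavy_count

def best_cluster_pairs(adj, center_of, is_heavy, max_heavy):
    """For each ordered pair of distinct clusters, return the best
    candidate found as a tuple (length, start node, end node)."""
    best = {}
    for u, cu in center_of.items():
        dist, heavy_count = bfs_with_heavy_counts(adj, u, is_heavy)
        for v, cv in center_of.items():
            if cu == cv or v not in dist:
                continue
            if heavy_count[v] > max_heavy:
                continue  # canonical path violates the constraint
            key = (cu, cv)
            if key not in best or dist[v] < best[key][0]:
                best[key] = (dist[v], u, v)
    return best
```

Only endpoints and lengths are stored, so the work per clustered node is one BFS plus one pass over the clustered nodes; the winning paths can be rebuilt from BFS parent pointers afterwards. As the text notes, a cluster pair may end up with no entry when none of its canonical paths satisfies the constraint.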
2 Fast Construction of the +4 Spanner
In this section, we present our main result; a modification of Chechik’s +4 spanner construction that has runtime with high probability.
2.1 Constrained Shortest Paths
Chechik’s original algorithm required the computation of shortest paths subject to a constraint on the number of heavy nodes in the paths. Our modification also makes use of constrained shortest paths, but the CSSSP (Constrained Single-Source Shortest Paths) problem is stronger than necessary for our purposes, and we can get away with a better runtime by solving a weaker problem. In this section, we define and give an efficient algorithm for a weaker variation on CSSSP, which we’ll call weak CSSSP. In particular, we will only need to find constrained shortest paths from to in situations where a certain type of constrained path already exists.
Definition 2 (Weak CSSSP).
The weak constrained single-source shortest paths problem is defined by the following algorithm contract:
Input: An (unweighted, undirected) graph , a set of “gray” edges , a source vertex , and a positive integer .
Output: For every , a path on gray edges, satisfying the following:
For any path on gray edges and with , is an path with .
Informally, if there are paths that (i) have gray edges, and (ii) are “short enough” in the sense that they have at most stretch over the true distance, then the outputted path has gray edges and has minimal length among these paths satisfying (i) and (ii). Note that if no such paths exist, then can be anything, satisfying the contract vacuously.
In [AMD19], AliAbdi et al. present a label-setting algorithm “Bi-SPP” (Bi-Colored Shortest Path Problem) for solving the “Gray Vertices Bounded Single Source Shortest Paths Problem” (GB-SPP). This is the problem of finding shortest paths from a vertex to every other vertex subject to the constraint that the paths have gray nodes. Their solution has an worst case runtime. Even though this constraint is on nodes and not edges, weak CSSSP can be solved with GB-SPP, as we will now describe.
Given input to an instance of the weak CSSSP problem: we designate a node to be gray if it is adjacent to a gray edge. It is clear that if a path has gray edges, it must then have gray nodes. Thus if we solve GB-SPP on this graph with parameter , the resulting paths satisfy for any path on gray edges. Furthermore, the paths have gray edges. Therefore these resulting paths satisfy the weak CSSSP requirements.
We now present a new algorithm for solving weak CSSSP with the same runtime as plain SSSP in the weighted setting - using Dijkstra’s algorithm with Fibonacci heaps, a runtime of . We give each non-gray edge weight 1, and each gray edge weight . We run Dijkstra’s algorithm with these weights and report the paths it computes.
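The reweighting idea can be sketched with a standard Dijkstra implementation. This is a minimal sketch under stated assumptions: the extra weight on gray edges is passed in as a parameter `eps`, because its value is not reproduced in this text, and a gray edge is given weight `1 + eps` so that the weight of a path equals its length plus `eps` times its number of gray edges, as the correctness proof describes. The sketch uses Python's binary heap (`heapq`) rather than a Fibonacci heap, which gives an $O(m \log n)$ bound instead of the Fibonacci-heap bound. The names `weighted_sssp` and `path_to` are hypothetical.

```python
import heapq
from fractions import Fraction

def weighted_sssp(adj, gray_edges, source, eps):
    """Dijkstra with weight 1 on non-gray edges and 1 + eps on gray
    edges. Returns parent pointers of the shortest-path tree."""
    gray = {frozenset(e) for e in gray_edges}
    dist = {source: Fraction(0)}
    parent = {source: None}
    heap = [(Fraction(0), source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v in adj[u]:
            w = 1 + eps if frozenset((u, v)) in gray else 1
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(heap, (d + w, v))
    return parent

def path_to(parent, v):
    """Rebuild the source-to-v path from parent pointers."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]

# A 4-cycle 0-1-3-2-0 in which only the edge {0, 1} is gray.
square = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
parent = weighted_sssp(square, [(0, 1)], 0, Fraction(1, 10))
print(path_to(parent, 3))
```

With a small `eps`, the two equal-length routes from node 0 to node 3 are separated by their gray-edge counts, so the route avoiding the gray edge is reported. Exact `Fraction` arithmetic is used so that ties are not decided by floating-point rounding.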
Algorithm 2 solves weak CSSSP in time.
The time complexity follows immediately from the complexity of Dijkstra’s algorithm, which is the dominant stage of the algorithm. We now prove correctness.
Let and let be an arbitrary path satisfying (i) gray edges, and (ii) . We will first show that has gray edges. Suppose to show a contradiction that it has gray edges. Note by construction that the weight of a path is its length plus times the number of gray edges. Thus . Furthermore, . But we also have, by the fact that is the lowest-weight path, that , and thus . This implies that , which is a contradiction.
Thus the computed path has gray edges. We now show to complete the proof. We have that
Thus as required.
2.2 Application to 4-Additive Spanner Construction
We are now ready to state our modification to Chechik’s spanner construction. Two insights allow us to improve the efficiency: (i) instead of finding the constrained shortest paths between the clusters, it is sufficient to only do this for paths between nodes. Furthermore, (ii) it is sufficient to compute the weak CSSSP paths for this task. Besides these two changes, the construction is the same as Chechik’s original construction. We now prove our main result through the following series of lemmas:
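For a shortest path $\pi$ in a graph and for any vertex $v$, $v$ has at most 3 neighbors in $\pi$.
Suppose, for contradiction, that $v$ has four neighbors $x_1, x_2, x_3, x_4$ in $\pi$, and assume WLOG that they appear in this order in $\pi$. Since $\pi$ is a shortest path, so is its subpath from $x_1$ to $x_4$, which implies $\text{dist}(x_1, x_4) \ge 3$. But the path $(x_1, v, x_4)$ has length 2, which is a contradiction. ∎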
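The bound of three neighbors can be observed on a small example. The sketch below is an illustration only, with hypothetical names (`shortest_path`, `max_neighbors_on_path`, `graph`): it computes one shortest path by BFS and then counts, for every vertex, how many of its neighbors lie on that path.

```python
from collections import deque

def shortest_path(adj, s, t):
    """One shortest s-t path in a connected unweighted graph."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    path = [t]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]

def max_neighbors_on_path(adj, path):
    """Largest number of neighbors any vertex has on the path."""
    on_path = set(path)
    return max(sum(1 for u in adj[v] if u in on_path) for v in adj)

# The path 0-1-2-3-4, plus a vertex 5 adjacent to 1, 2 and 3.
graph = {
    0: [1],
    1: [0, 2, 5],
    2: [1, 3, 5],
    3: [2, 4, 5],
    4: [3],
    5: [1, 2, 3],
}
pi = shortest_path(graph, 0, 4)
print(pi, max_neighbors_on_path(graph, pi))
```

Here vertex 5 touches three consecutive vertices of the shortest path, so the bound is attained. Attaching 5 to a fourth path vertex would create a shortcut through 5, and the original path would no longer be shortest.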
We now show that for any two nodes of , we have with very high probability that . Note that it’s sufficient to prove this result when don’t have all of their edges included in . We call such nodes “uncovered”. This is because when are not both uncovered, it’s enough to demonstrate this stretch for the subpath of beginning and ending at the first and last uncovered nodes respectively.
A node is said to be “covered” in if all of its edges are included in .
This allows us to assume that are in , as all other nodes are covered by our algorithm. The proof of the following lemma is identical to the first part of the proof of Lemma 2.2 in [CHE13], but we repeat it here for completeness.
For any two uncovered nodes such that the canonical shortest path has heavy nodes, we have with probability
In this case, we claim that there is a probability that is adjacent to a BFS tree in . has heavy nodes, each of degree . Thus the sum of the degrees of nodes on is . By Lemma 3, this implies there are at least nodes adjacent to . Each node has probability of being included in , and thus having a shortest-path tree rooted at in . Therefore the probability that none of these nodes adjacent to have such a tree rooted at them is
where we used the fact that for . Thus, we have a probability of the existence of a node neighboring some such that a BFS tree rooted at is in . When this is the case, we can simply take the followed by the shortest paths provided by the BFS tree, which has a stretch factor of 2 as shown below:
For any two uncovered nodes such that the canonical shortest path has heavy nodes, we have with probability .
Both are uncovered and thus in . Let such that and . We assume as this case is trivial.
Call an edge “heavy” if both and are heavy nodes. Since is a path on heavy nodes, it has at most heavy edges. Thus the path has at most heavy edges, and , since is a shortest path in . Then when we compute weak CSSSP to get a path in , we have by our algorithm contract that . Let , and note that is a path in . This path witnesses that
Correctness follows by the above two lemmas and the union bound, which gives us that with probability , holds for all .
The subgraph produced by Algorithm 3 has edges with high probability.
We can separate the addition of edges to into 4 types:
The edges incident to light nodes are added. Each light node is incident to edges by definition, so edges are added.
The BFS tree of each node in is added. Each such tree contributes edges. The probability of a node being added to is , so with high probability, and thus edges are added with high probability.
Edges adjacent to heavy nodes that are are added. Nodes are added to with probability , thus the probability of being neither in nor adjacent to a node in is . If , then it is adjacent to a node in with high probability. Thus the number of edges added for is at most with high probability. Unioning over all , this adds edges with high probability.
Edges on paths between nodes with heavy edges are added. with high probability, yielding pairs of . All the light edges (edges adjacent to light nodes) have already been added to , so each path between these pairs adds at most edges. Unioning over the number of pairs, this adds edges with high probability. ∎
On an node edge input graph, Algorithm 3 runs in with high probability.
The only two superlinear stages of the algorithm are (a) the generation of the Breadth-First Search trees, and (b) solving weak CSSSP for each node of . For (a): nodes are sampled to be in with probability , so with high probability. BFS has worst-case runtime . Thus this stage is time. For (b): we showed in section 2.1 an algorithm that solves weak CSSSP in time, which we run for each node of . Multiplying this over the size of (which has size with high probability), we get time with high probability.
Theorem 1 now follows from Lemmas 5-7.
In this paper we have presented a new state-of-the-art complexity result for constructing the +4 spanner. This fills a gap in the literature that has existed between the +2, +6, and +8 spanners, as this is the first paper studying the efficiency of the +4 spanner construction. Finding further polynomial improvements to this construction would require a polynomial reduction in the number of nodes on which shortest path trees are computed. The next bottleneck of the algorithm is the time needed to build the BFS trees rooted at the nodes, which is with high probability. We believe that a better runtime than this would require a novel +4 spanner construction.
Many thanks to Greg Bodwin, without whose supervision this paper would not be possible. I also thank fellow students Eric Chen and Cheng Jiang, with whom I discussed an earlier version of the paper.
- [ABS+20] (2020) Graph spanners: a tutorial review. Computer Science Review 37, pp. 100253.
- [ACI+99] (1999) Fast estimation of diameter and shortest paths (without matrix multiplication). SIAM Journal on Computing 28 (4), pp. 1167–1181.
- [AMD19] (2019-12-03) Constrained shortest path problems in bi-colored graphs: a label-setting approach. GeoInformatica.
- [AW21] (2021) A refined laser method and faster matrix multiplication. In Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 522–539.
- [ADF+19] (2019) Constructing light spanners deterministically in near-linear time. In 27th Annual European Symposium on Algorithms (ESA 2019).
- [BKM+10] (2010-11) Additive spanners and (α, β)-spanners. ACM Transactions on Algorithms 7 (1), pp. 1–26.
- [BK06] (2006) Faster algorithms for approximate distance oracles and all-pairs small stretch paths. In 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS’06), pp. 591–602.
- [BS07] (2007) A simple and linear time randomized algorithm for computing sparse spanners in weighted graphs. Random Structures & Algorithms 30 (4), pp. 532–563.
- [CHE13] (2013) New additive spanners. In Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 498–512.
- [COH98] (1998) Fast algorithms for constructing t-spanners and paths with stretch t. SIAM Journal on Computing 28 (1), pp. 210–236.
- [EP05] (2005) Approximating k-spanner problems for k > 2. Theoretical Computer Science 337 (1-3), pp. 249–277.
- [KNU17] (2017) Additive spanners and distance oracles in quadratic time. arXiv preprint arXiv:1704.04473.
- [LS93] (1993) Additive graph spanners. Networks 23 (4), pp. 343–363.
- [PS89] (1989) Graph spanners. Journal of Graph Theory 13 (1), pp. 99–116.
- [PU87] (1987) An optimal synchronizer for the hypercube. In Proceedings of the Sixth Annual ACM Symposium on Principles of Distributed Computing, pp. 77–85.
- [RZ04] (2004) On dynamic shortest paths problems. In European Symposium on Algorithms, pp. 580–591.
- [SEI95] (1995) On the all-pairs-shortest-path problem in unweighted undirected graphs. Journal of Computer and System Sciences 51 (3), pp. 400–403.
- [WOO10] (2010) Additive spanners in nearly quadratic time. In International Colloquium on Automata, Languages, and Programming, pp. 463–474.