The girth of a graph is the length of the shortest cycle in . It is an important graph quantity that has been studied extensively in both combinatorial settings (see Bollobás’s book [Bol98] for a discussion) and computational settings. In particular, exact algorithms for the girth running in time in weighted directed graphs [OS17] are known. On the other hand, a result of Vassilevska Williams and Williams shows that a truly subcubic algorithm for girth (i.e. running in time for some ) implies a truly subcubic algorithm for the All Pairs Shortest Path (APSP) problem [WW10]. As it is a longstanding open problem whether APSP admits a truly subcubic time algorithm, exact computation of the girth in truly subcubic time would be a major breakthrough.
This has motivated the study of efficient approximation algorithms for the girth. There has been extensive work on approximating the girth in undirected graphs [IR77, LL09, RW12, DKS17]. Many such algorithms use the concept of a -spanner of a graph , a fundamental combinatorial object which was introduced by Chew [Che89]. An -spanner of a graph is a subgraph of which multiplicatively preserves distances up to a factor of . It is well-known that -spanners with edges exist for any undirected weighted graph [ADD93]. There has been much work on the efficient construction of spanners [TZ05, RTZ05, BS03] as well as improved spanner constructions in the case of undirected unweighted graphs [LL09, RW12]. These algorithms immediately imply algorithms for girth approximation in undirected graphs.
Unfortunately, approximately computing all pairs distances in directed graphs is a notoriously difficult problem and while sparse spanners do exist in all undirected graphs, they do not exist in all directed graphs. For example, any directed spanner for the “directed” complete bipartite graph with vertices on the left directed towards vertices on the right clearly requires all edges. This problem seems to arise from the fact that the distance metric in directed graphs is asymmetric. Therefore, if we want to construct sparse spanners, it is natural to work instead with the symmetric roundtrip distance metric, defined as [CW04] and similarly define an -roundtrip spanner of a directed graph to be a subgraph that multiplicatively preserves roundtrip distances up to a factor of .
Interestingly, there do exist roundtrip spanners for directed graphs with comparable sparsity as spanners for undirected graphs. A result of Roditty, Thorup, and Zwick [RTZ08] shows that for any and , every graph has a -roundtrip spanner with edges, where is the maximum edge weight. While this algorithm ran in time , as it requires the computation of all pairs distances in the graph, recent work of Pachocki et al. [PRS18] gave a randomized algorithm running in time which on weighted directed graphs returns a -roundtrip spanner with edges and an approximation to the girth. Up to a logarithmic approximation factor, this matches the sparsity and runtime known for spanners on undirected weighted graphs and girth on sparse graphs.
The result of Pachocki et al. [PRS18] constitutes one of a small, but rapidly growing [CKP17], set of instances where it is possible to obtain robust approximations to fundamental quantities of directed graphs in nearly linear time, overcoming typical running time gaps between solving problems on directed and undirected graphs. However, a fundamental problem left open by this work is whether it is possible to fully close this gap and provide algorithms for girth approximation and roundtrip spanners in directed graphs that fully match the runtime and sparsity of those in undirected graphs. This is the primary problem this paper seeks to address, and this paper provides multiple new roundtrip spanner construction algorithms with improved runtime, approximation quality, and dependency on randomness.
1.1 Our Results
In this paper we provide several results which improve on the multiplicative approximation ratio for the girth approximation algorithms and roundtrip spanner constructions in the work of Pachocki et al. [PRS18]. Here and throughout the remainder of the paper we use notation to hide factors polylogarithmic in , where is the number of vertices in the graph.
First, in Section 3 we show how to multiplicatively approximate the girth of a weighted directed graph with vertices and edges to within a factor of in time. We also show how to construct multiplicative roundtrip spanners with edges for such graphs in time. These algorithms are deterministic and constitute the first deterministic nearly linear time algorithms for multiplicative approximation of the girth and multiplicative roundtrip spanners with edges.
Theorem 1 (Deterministic Multiplicative Girth Approximation).
For any integer and weighted directed graph with vertices, edges, and unknown girth we can compute in time an estimate such that .
Theorem 2 (Deterministic Multiplicative Roundtrip Spanners).
For any integer and any weighted directed graph with vertices and edges, we can compute in time an multiplicative roundtrip spanner with edges.
Setting yields the following corollaries. For these results nearly match the optimal algorithms in undirected graphs for girth approximation and the construction of spanners.
For any weighted directed graph with vertices, edges, and unknown girth we can compute in time an estimate such that .
For any weighted directed graph with vertices and edges, we can compute in time an multiplicative roundtrip spanner with edges.
In Section 4 we then consider obtaining constant approximations to the girth. In particular we provide a randomized algorithm that obtains a -approximation to the girth on graphs with non-negative integer edge weights in time. Up to logarithmic factors this matches the runtime that would be predicted from the fact that -undirected spanners with edges can be constructed in time for . Further, we show that this procedure can be used to with high probability obtain constant multiplicative roundtrip spanners in directed graphs with arbitrary edge weights in time.
Theorem 3 (3-Multiplicative Girth Approximation).
For any directed graph with vertices, edges, integer non-negative edge weights, and unknown girth we can compute in time an estimate such that with high probability in .
Theorem 4 (8-Multiplicative Roundtrip Spanners).
For any directed graph with vertices, edges, integer non-negative edge weights, we can compute in time an -multiplicative roundtrip spanner with edges with high probability in .
Interestingly, we achieve these results by a different approach than our deterministic algorithms. Highlighting this, in Section 5 we show how to combine the techniques of these algorithms to obtain both multiplicative spanners of size and multiplicative approximations to the girth in time with high probability in .
Theorem 5 (Constant Multiplicative Girth Approximation).
For any integer and any weighted directed graph with vertices, edges, and unknown girth we can compute in time an estimate such that with high probability in .
Theorem 6 (Constant Multiplicative Roundtrip Spanners).
For any integer and any weighted directed graph with vertices and edges, we can compute in time an multiplicative roundtrip spanner with edges with high probability in .
1.2 Comparison to Pachocki et al. [PRS18]
Our Theorem 1 and Theorem 2 offer immediate improvements over the analogous results in [PRS18]. Specifically, our algorithms provide a tighter multiplicative girth approximation and multiplicative spanner stretch in the same runtime as the algorithms in [PRS18], which produce a girth approximation and roundtrip spanner with edges in time
Additionally, our algorithm is deterministic and in our opinion, simpler. The algorithm of Pachocki et al. [PRS18] involved the following pieces. First, they resolve the case where there is a vertex whose inball and outball (of some small radius) intersect in a significant fraction of the vertices of the graph by cutting out a ball of randomly chosen radius. They determine whether such vertices exist by using a method of Cohen to estimate ball sizes [Coh97]. In the other case, they use exponential clustering (see [MPX13]) to partition the graph and recurse. Finally, they rerun the algorithm times.
Our algorithm is instead based only on ball growing around vertices to “partition” the graph into possibly overlapping pieces. We simultaneously grow an inball and outball around a vertex until either both balls occupy a majority of the vertices of the graph, or until we can add a piece to our partition. In the case where both balls grow large and intersect in a significant fraction of the vertices of the graph, we use a similar method to that of Pachocki et al., but instead find a deterministic method to make progress and recurse. By using ball growing around any vertices to form a partition, we avoid the need to estimate ball sizes and use exponential clustering.
Our results, Theorem 3, Theorem 4, Theorem 5, and Theorem 6 further improve upon Pachocki et al. [PRS18] obtaining constant multiplicative approximation to the girth and computing constant multiplicative roundtrip spanners. These algorithms provided in Section 4 and Section 5 are randomized and only succeed with high probability, as opposed to those provided in Section 3, but are the first to achieve any constant approximation to the girth in a time polynomially better than the time currently required for APSP. These algorithms leverage new techniques not present in previous roundtrip-spanner algorithms and we believe are of independent interest.
1.3 Overview of Approach
Overview of deterministic results:
Here we summarize at a high level our approach for constructing roundtrip spanners on directed graphs presented in Section 3. For the sake of simplicity, we focus on unweighted directed graphs and for a parameter , construct a subgraph (roundtrip spanner) so that if the roundtrip distance between and is at most in , then their roundtrip distance is at most in .
The key insight guiding our algorithm is the following: instead of partitioning the graph into disjoint pieces and recursing (as is done in [PRS18]), we instead allow the pieces to overlap on the boundaries. This is justified by the following observation. Consider a subgraph of , and let denote the subgraph consisting of all vertices within distance of . Then if we recursively build a roundtrip spanner on , then we are guaranteed that we can delete from our graph. Indeed, if and the roundtrip distance between and is at most , then . This simple observation allows us to overcome the critical challenge in [PRS18], arguing that the graph can be broken apart, while nevertheless preserving roundtrip distance.
This observation also forms the basis of an optimal spanner construction on unweighted undirected graphs, which appears in a book of Peleg (exercise 3 on page 188 in [Pel00]). Specifically, for any integer , we can construct a -spanner with edges in time The construction works as follows. Start at any vertex , let denote the ball of radius centered at , and let denote the number of vertices in . Grow such balls around until we find an index with We can clearly guarantee that . At this point, add a spanning tree on to your spanner and delete all vertices in Now, recurse on the remaining graph. It is easy to check that the resulting spanner is as desired.
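The ball-growing construction for unweighted undirected graphs described above can be sketched as follows. This is a minimal sketch under assumptions not recoverable from the text as it stands: the growth threshold is taken to be n^{1/k}, the tree added is a BFS tree on the larger ball, and all function names are hypothetical. It reruns BFS from scratch for every center, so it illustrates the stretch and size argument rather than the running time.

```python
from collections import deque


def bfs(adj, alive, src):
    """BFS from src using only vertices in `alive`; returns (dist, parent)."""
    dist, parent = {src: 0}, {}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w in alive and w not in dist:
                dist[w] = dist[u] + 1
                parent[w] = u
                queue.append(w)
    return dist, parent


def ball_growing_spanner(adj, k):
    """Return a set of undirected edges (frozensets) forming a spanner."""
    n = len(adj)
    growth = n ** (1.0 / k)  # assumed threshold n^(1/k)
    alive = set(adj)
    spanner = set()
    while alive:
        v = min(alive)  # any remaining vertex works as a center
        dist, parent = bfs(adj, alive, v)
        i = 0
        while True:
            inner = sum(1 for d in dist.values() if d <= i)
            outer = sum(1 for d in dist.values() if d <= i + 1)
            if outer <= growth * inner:  # ball stopped growing fast
                break
            i += 1
        # Keep a BFS tree on the ball of radius i + 1 ...
        spanner.update(
            frozenset((w, p)) for w, p in parent.items() if dist[w] <= i + 1
        )
        # ... and delete the ball of radius i.
        alive -= {w for w, d in dist.items() if d <= i}
    return spanner


def max_edge_stretch(adj, spanner):
    """Largest spanner distance between the endpoints of an original edge."""
    sp_adj = {u: [] for u in adj}
    for edge in spanner:
        a, b = tuple(edge)
        sp_adj[a].append(b)
        sp_adj[b].append(a)
    worst = 0
    for u in adj:
        dist, _ = bfs(sp_adj, set(adj), u)
        for w in adj[u]:
            worst = max(worst, dist.get(w, float("inf")))
    return worst
```

Under the assumed threshold, the index found is at most k - 1, so every original edge is spanned by a tree path of length at most 2k - 1, and each tree is charged to the vertices deleted in the same round.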
Our algorithm for directed graphs is similar. Let be the parameter as defined in the first paragraph. Instead of growing roundtrip balls, we grow inballs and outballs where an inball (resp. outball) of radius around is the set of all points with distance at most to (resp. from) . Fix a vertex , and let denote the inball and outball of radius around , and let denote the number of vertices in the balls and fix We start by growing an inball and outball around . First, if , then we can build a roundtrip ball of radius and delete from our graph. This is safe essentially by our observation above. Otherwise, if we find an index such that isn’t much larger than , we recursively build a roundtrip cover on and then delete . This is safe to do by our observation above. Similarly, if there is an index such that isn’t much larger than , we recursively build a roundtrip cover on and then delete . Through standard ball cutting inequalities we can show that such an index exists (Lemma 3.2). We would like to elaborate on a few points. First, when we compare the sizes of and , we compare both the number of vertices and edges, the former to control the size of the roundtrip spanner constructed, and the latter to control runtime. Second, we grow the inball and outball at the same rate, i.e. we alternately add an edge at a time to the inball and outball to maintain that the work spent on each is the same.
Summary of randomized approach
To obtain constant multiplicative approximations to the girth, in Section 4 we provide a very different approach than that taken for obtaining our deterministic approximations. We think this approach is of independent interest and further demonstrate its utility in Section 5 by showing how to combine the insights that underlie it with the algorithm from Section 3 to achieve arbitrary constant approximations.
Our approach to obtaining a approximation to the girth is rooted in the simple insight that if a vertex is in a cycle of length then every vertex in the ball of radius from is at distance at most from every vertex in the cycle. Consequently, for each vertex if we repeatedly prune vertices from its outball of radius if they do not have the property that they can reach every vertex in this ball by traversing a distance at most , then we will never prune away vertices in a cycle of length from that vertex.
Leveraging these insights, we can show that if we randomly compute distances to and from a random vertices and if a cycle of length is not discovered immediately then we can efficiently implement a pruning procedure so that each vertex only has in expectation vertices that could possibly be in a cycle of length through that vertex. By then checking each of these sets for a cycle and being careful about the degrees of the vertices (and therefore the cost of the algorithm) this approach yields essentially a -approximation to the girth in time with high probability in .
Our -approximation is then obtained by carefully applying this argument to both outballs and inballs and leveraging the simple fact that if a vertex is on a cycle of length then for every either or . Further, our approximations of Section 5 are then achieved by using these techniques to better control the size of the outballs and inballs in an invocation of the deterministic algorithm of Section 3.
For a weighted directed graph , we let and denote the vertex and edge sets of . We assume all edge lengths are nonnegative. For a subgraph (not necessarily vertex induced), let denote the set of vertices of , and let denote the set of edges. For a subset , we define to be the subgraph induced by . When the graph is clear from context, we let and denote and respectively.
For a weighted directed graph with non-negative edge lengths, we let denote the (shortest path) distance from to in . When the graph is clear from context, we simply denote this as If there is no path from to , we let When is a subgraph of , we let denote the (shortest path) distance from to only using the edges in We denote the roundtrip distance between and as and define a roundtrip spanner.
Definition 2.1 (Roundtrip Spanner).
We say that a subgraph is an -roundtrip spanner if for all
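The definition can be checked directly on small instances. The sketch below assumes the standard roundtrip distance, i.e. the distance from u to v plus the distance from v to u, and compares it in a graph and a candidate subgraph; the function names and the edge-list representation are illustrative choices, not taken from the paper.

```python
import heapq
from math import inf


def dijkstra(edges, vertices, src):
    """Shortest-path distances from src; edges is a list of (u, v, weight)."""
    adj = {u: [] for u in vertices}
    for u, v, w in edges:
        adj[u].append((v, w))
    dist = {u: inf for u in vertices}
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def roundtrip_distances(edges, vertices):
    """Map each ordered pair (u, v) to d(u, v) + d(v, u)."""
    d = {s: dijkstra(edges, vertices, s) for s in vertices}
    return {(u, v): d[u][v] + d[v][u] for u in vertices for v in vertices}


def is_roundtrip_spanner(g_edges, h_edges, vertices, alpha):
    """True if H preserves all roundtrip distances of G up to a factor alpha."""
    rt_g = roundtrip_distances(g_edges, vertices)
    rt_h = roundtrip_distances(h_edges, vertices)
    return all(rt_h[pair] <= alpha * rt_g[pair] for pair in rt_g)
```

For example, dropping the chord from a directed triangle with one extra chord stretches one roundtrip distance from 2 to 3, so the triangle alone is a 1.5-roundtrip spanner of that graph but not a 1-roundtrip spanner.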
For a weighted directed graph we define the inball and outball of radius around a vertex as
respectively. In other words, the inball of radius around is the subgraph induced by vertices with The outball is defined similarly. We define the ball of radius around vertex as
In other words, the ball of radius around is the subgraph induced by vertices within roundtrip distance of .
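As an illustration of these three definitions, the following sketch computes the vertex sets of an inball, an outball, and a roundtrip ball with Dijkstra's algorithm on the graph and on its reverse. It returns vertex sets only (the balls themselves are the subgraphs induced on them), and the adjacency-list format and function names are assumptions made for the example.

```python
import heapq
from math import inf


def distances_from(adj, src):
    """Dijkstra on adj: {u: [(v, weight), ...]} with non-negative weights."""
    dist = {u: inf for u in adj}
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


def reverse(adj):
    """Graph with every edge reversed."""
    radj = {u: [] for u in adj}
    for u, edges in adj.items():
        for v, w in edges:
            radj[v].append((u, w))
    return radj


def outball(adj, v, r):
    """Vertices u with d(v, u) <= r."""
    return {u for u, d in distances_from(adj, v).items() if d <= r}


def inball(adj, v, r):
    """Vertices u with d(u, v) <= r."""
    return {u for u, d in distances_from(reverse(adj), v).items() if d <= r}


def roundtrip_ball(adj, v, r):
    """Vertices u with d(v, u) + d(u, v) <= r."""
    out = distances_from(adj, v)
    back = distances_from(reverse(adj), v)
    return {u for u in adj if out[u] + back[u] <= r}
```

A vertex lying in both the inball and the outball of radius r has roundtrip distance at most 2r from the center, so the intersection of the two is always contained in the roundtrip ball of radius 2r.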
3 Deterministic Approximation Algorithms
In this section we present our deterministic algorithms for computing a approximation to the girth and computing multiplicative roundtrip spanners. Our main result will be showing how to compute improved roundtrip covers as defined originally in [RTZ08]. Leveraging this result we will prove Theorem 1 and Theorem 2.
First, leveraging the definitions of balls in Section 2 we define roundtrip covers. Intuitively, roundtrip covers are a union of balls of radius such that if vertices satisfy then are both in some ball in the cover.
Definition 3.1 (Roundtrip Covers).
A collection of balls is a roundtrip cover of a weighted directed graph if and only if every ball in has radius at most , and for any with there is a ball such that .
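A brute-force checker makes the two requirements of the definition concrete. The sketch below is not part of the algorithm: it uses Floyd-Warshall for all-pairs distances, represents each ball as a hypothetical (center, vertex set) pair, and takes the radius bound as an explicit parameter `max_radius` because the bound in the definition is not recoverable from the text as it stands.

```python
from itertools import product
from math import inf


def all_pairs_roundtrip(vertices, edges):
    """Roundtrip distances via Floyd-Warshall; edges are (u, v, weight)."""
    d = {(u, v): (0 if u == v else inf) for u in vertices for v in vertices}
    for u, v, w in edges:
        d[u, v] = min(d[u, v], w)
    # product varies its first component slowest, so k is the outer loop.
    for k, i, j in product(vertices, repeat=3):
        if d[i, k] + d[k, j] < d[i, j]:
            d[i, j] = d[i, k] + d[k, j]
    return {(u, v): d[u, v] + d[v, u] for u in vertices for v in vertices}


def is_roundtrip_cover(vertices, edges, balls, R, max_radius):
    """Check both conditions of a roundtrip cover for parameter R.

    balls: list of (center, set_of_vertices) pairs.
    """
    rt = all_pairs_roundtrip(vertices, edges)
    # Every ball has roundtrip radius at most max_radius around its center.
    radius_ok = all(
        rt[center, u] <= max_radius for center, members in balls for u in members
    )
    # Every pair at roundtrip distance at most R shares some ball.
    covered = all(
        any(u in members and v in members for _, members in balls)
        for u in vertices
        for v in vertices
        if rt[u, v] <= R
    )
    return radius_ok and covered
```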
Specifically, we show the following theorem.
Theorem 7 (Improved Roundtrip Covers).
For an -vertex -edge graph , an execution of returns a collection of balls that forms a roundtrip cover of a weighted directed graph in time where
To show Theorem 1 from Theorem 7, we can compute roundtrip covers for all , and set our girth estimate as the minimum radius of any ball in the cover that has a cycle. To compute a roundtrip spanner, simply take the union of all the balls in the roundtrip covers for all
The rest of the section is organized as follows. In Section 3.1 we state our main algorithm. In Section 3.2 we analyze the algorithm and prove Theorem 7. In Section 3.3 we use Theorem 7 to formally prove Theorem 1 and Theorem 2.
3.1 Main Algorithm
We first give a high-level description of our algorithm for computing Roundtrip Covers, RoundtripCover, which is presented formally as Algorithm 1.
High-level Description of Algorithm:
As discussed in Section 1.3, our algorithm is based on ball growing along with the following observation: if for a radius we compute a roundtrip cover of and add all the balls in the computed roundtrip cover on to our final cover, then we can safely delete all vertices from our graph and recurse on the rest of graph; the deleted vertices are already satisfied in the sense that for every with there is a ball in the cover such that . Indeed, if and then and therefore we are guaranteed that the roundtrip cover on contains a ball such that . Using this observation, we grow inballs and outballs around vertices in our graph to “partition” our graph into pieces that possibly overlap, where the overlap corresponds to the boundary in our example.
We describe our algorithm in more detail now. Consider any vertex . We grow an inball and outball around at the same rate, spending the same time on the inball and outball. First, we consider the case that for some , as was done in Pachocki et al. [PRS18]. Then we know that By our observation above, we can add the ball to our roundtrip cover, delete from , and recurse on the remainder. Otherwise, if we find a radius such that say and satisfy the conditions of GoodCut (Algorithm 2), then we recurse on and delete from our graph and recurse on the remaining graph. This is safe to do by our observation above. We can also do an analogous process on and By a variant of the standard ball-growing inequality (Lemma 3.2) we can show that a good cut always exists.
We now will give some intuition about the condition in GoodCut and the (somewhat strange) appearance of the in our algorithm. First, we remark that the condition in GoodCut must track both the number of vertices and edges in the ball: the former to control recursion depth and roundtrip cover size, and the latter to control runtime. Now we give intuition for why we require an approximation factor in our algorithm. Consider growing inballs from for various radii , and recall that we make a cut depending on the relative sizes of and . Now, note that if for example , we can afford to have , as we can simply run a naive algorithm on now. On the other hand, if for example , we can essentially only afford to have To see the latter, note that the recurrence has solution
Now, interpolating between these two extremes allows us to compute the optimal way to do ball cutting (which is done in GoodCut). This leads to a ball cutting procedure with levels, and thus results in an approximation ratio.
Explanation of Algorithm 1:
We now explain what each piece of Algorithm 1 is doing. Here, and track the radius of the inball and outball that we are growing. We grow the balls at the same rate. If we notice that at any point we are in position to make a good cut (see lines 7, 9) then we do so. Otherwise, we know that both balls will eventually contain many vertices (see line 5). In this case, we add to our roundtrip cover, delete from our graph, and recurse. To grow the inball and outball at the same rate, we run Dijkstra to grow the inball and outball, alternately processing an edge at a time from the inball and outball. We check the condition of GoodCut on a ball when we have certified that we have processed all vertices up to distance or respectively.
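Growing the two balls at the same rate can be illustrated by interleaving two Dijkstra searches one edge at a time. The sketch below shows only this interleaving, not Algorithm 1 itself: the class `BallGrower`, the function `grow_in_and_out`, and the stopping parameter `limit` are hypothetical, and the GoodCut test is omitted.

```python
import heapq


class BallGrower:
    """Dijkstra that can be advanced one scanned edge at a time."""

    def __init__(self, adj, src):
        self.adj = adj
        self.dist = {src: 0}
        self.settled = {}   # vertex -> final distance
        self.heap = [(0, src)]
        self.pending = []   # unscanned edges of the last settled vertex
        self.edges_scanned = 0

    def done(self):
        return not self.heap and not self.pending

    def step(self):
        """Scan exactly one edge, settling the next vertex first if needed."""
        while not self.pending and self.heap:
            d, u = heapq.heappop(self.heap)
            if u in self.settled:
                continue  # stale heap entry
            self.settled[u] = d
            self.pending = [(u, v, w) for v, w in self.adj[u]]
        if not self.pending:
            return
        u, v, w = self.pending.pop()
        self.edges_scanned += 1
        nd = self.settled[u] + w
        if v not in self.settled and nd < self.dist.get(v, float("inf")):
            self.dist[v] = nd
            heapq.heappush(self.heap, (nd, v))


def grow_in_and_out(adj, v, limit):
    """Alternate edge scans between the outball and the inball around v.

    Stops when both searches are exhausted or one ball has `limit` vertices.
    """
    radj = {u: [] for u in adj}
    for u, edges in adj.items():
        for x, w in edges:
            radj[x].append((u, w))
    out, inn = BallGrower(adj, v), BallGrower(radj, v)
    while not (out.done() and inn.done()):
        if max(len(out.settled), len(inn.settled)) >= limit:
            break
        out.step()
        inn.step()
    return out, inn
```

Because each round scans one edge per side, the work spent on the two balls differs by at most one edge while both searches are active, which is the property the runtime analysis relies on.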
3.2 Analysis of RoundtripCover and proof of Theorem 7
In this section we prove Theorem 7, bounding the performance of our roundtrip cover algorithm Algorithm 1. We start by showing that at all points in the algorithm, hence some condition in lines 5, 7, 9 will trigger eventually.
At all points during Algorithm 1, we have that
We show , and the bound on is analogous. To prove this we assume that none of the conditions in the inner loop of the algorithm trigger, and compute the resulting vertex and edge sizes of and . To this end, assume that and By the conditions of lines 5, 7, and 11 we know that each time we increment either
We first show that Eq. 1 can only hold for values of . To this end, define a sequence as and By induction it follows that In particular,
This shows that the condition in Eq. 1 can only hold at most times. Similarly, after Eq. 2 holds for different , we will have that At this point, Eq. 3 can hold at most times. This gives us that in total
as desired. ∎
Now we proceed to proving Theorem 7.
Proof of Theorem 7.
We first show that the algorithm indeed returns a roundtrip cover. Then we bound the total size of balls in the roundtrip cover, as well as the runtime.
Returns a roundtrip cover.
We analyze lines 6, 8, and 10. In line 6, note that by Lemma 3.2, we know that Therefore, we know that Additionally, it is clear that for any vertex , if another vertex satisfies then Therefore, the ball contains both and , so we can safely delete from and recurse. This is exactly what is happening in line 6. In line 8, note that for any vertex , if another vertex satisfies then Therefore, if we construct a roundtrip cover on , then we can safely delete from and recurse. This is exactly what occurs in line 8. The same argument now applies to line 10. Finally, note that all balls we create are of radius
Total sizes of balls is .
We show by induction that the total number of vertices among all balls in the roundtrip cover computed is at most for an input graph with vertices. We show this by analyzing lines 6, 8, and 10. For line 6, note that because , we know that . Therefore, it suffices to verify
which is clear. For line 8, for simplicity let Then by the condition of GoodCut, it suffices to note that
The same argument now applies to line 10.
Can be implemented to run in time .
We can implement the algorithm to grow and at the same rate, i.e., we process a single inedge and outedge at a time, and increment and when we are sure that we’ve processed the whole inball or outball. This can be done with Dijkstra’s algorithm. We stop growing a ball once it contains at least vertices. This way, any time we recurse, the total amount of work we have done to this point is at most twice the number of edges in the piece we are recursing on in lines 6, 8, and 10. To bound the runtime, we imagine lines 8 and 10 as partitioning the graph into pieces of the form or and then recursing on or . This way, the depth of the recursion is at most because we know that when we recurse.
We will now show that the total number of edges in level of the recursion is bounded by , where the top level is level . We proceed by induction on Say that the algorithm partitions into where each is either of the form or For simplicity, let and let or corresponding to what was. We know by the condition of GoodCut that By induction, we know that the total number of edges processed in level is at most
Now, it is clear that the total work done on a graph at some node of the recursion tree is as line 6 only occurs times. Now taking in the above claim completes the proof. ∎
Both theorems follow easily from Theorem 7.
Proof of Theorem 1.
We first show the result for unweighted graphs. To show this, run
Now, set our estimate of the girth to be the smallest radius of any nontrivial ball that we had in a roundtrip cover. By the guarantees of RoundtripCover, it is clear that as desired. It is clear that the algorithm runs in time by Theorem 7. We can extend this to weighted graphs by instead taking , where is the maximum edge weight. This can be improved to by the same method as done in [PRS18], where they give a general reduction by contracting small weight strongly connected components and deleting large weight edges (see Section 5.1 in [PRS18] for more details). ∎
4 Randomized Constant Approximations
To simplify our algorithm and analysis we assume that the maximum degree of is bounded by , i.e. we assume it is only a constant larger than the average degree, which is . We justify this assumption by showing that we can always reduce to this case as is formalized in the following lemma.
Given a directed weighted graph of vertices and edges with non negative edge weights, one can construct a graph in time of vertices and edges with non negative edge weights and of maximum degree such that 1. all roundtrip distances (between pairs of vertices in ) in and in are the same. 2. Moreover, given a cycle in , one can easily find (in time) a cycle in of the same length. 3. Finally, given a subgraph of , one can easily find (in time) a subgraph of such that the number of edges in is at most the number of edges in and the roundtrip distances in and are the same.
The reduction is as follows. Let . Replace all the outgoing edges from by a balanced -tree with all weights of internal edges 0 (a balanced tree with degree where is the root of the tree and all edges of the tree are directed from the root) where each leaf in this tree is “responsible” for of the outgoing edges of , that is, each leaf has outgoing edges to (different) neighbors of . The weights of these edges are the original weights of the corresponding edges of . We set the weight of all edges in the balanced -tree to be 0. A similar process is done for the incoming edges of for every node . It is not hard to verify that the number of new nodes created is proportional to the number of edges divided by , that is, the number of new nodes is . In addition, every two original nodes and that have a directed path in , also have a directed path in the modified graph. It is not hard to verify that all roundtrip distances in and are the same (this implies also that the girth of and is the same). Moreover, given a cycle in one can easily find a cycle of the same length in by simply contracting the -trees of each vertex. Finally, given a subgraph of , one can obtain a subgraph of by simply contracting the -trees of each vertex. It is not hard to verify that all roundtrip distances (for pairs of vertices in ) in and are the same. ∎
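The reduction can be sketched as follows. This is one possible way to build the zero-weight trees, assuming a degree bound d of at least 2 passed as a parameter; the helper names and the labelling of new tree nodes as ("tree", i) are hypothetical. Trees are built bottom-up: groups of at most d edges are handed to a new node, and the zero-weight edges to those nodes are grouped again until at most d remain at the original vertex.

```python
import heapq
from itertools import count
from math import inf


def reverse(adj):
    """Graph with every edge reversed."""
    radj = {u: [] for u in adj}
    for u, edges in adj.items():
        for v, w in edges:
            radj[v].append((u, w))
    return radj


def split_out_edges(adj, d, fresh):
    """Cap every out-degree at d by routing edges through zero-weight trees."""
    assert d >= 2
    new_adj = {}
    for v, edges in adj.items():
        current = list(edges)
        while len(current) > d:
            grouped = []
            for start in range(0, len(current), d):
                node = ("tree", next(fresh))        # new internal or leaf node
                new_adj[node] = current[start:start + d]
                grouped.append((node, 0))            # zero-weight tree edge
            current = grouped
        new_adj[v] = current
    return new_adj


def reduce_degree(adj, d):
    """Cap out-degrees, then in-degrees (by working on the reversed graph)."""
    fresh = count()
    out_ok = split_out_edges(adj, d, fresh)
    return reverse(split_out_edges(reverse(out_ok), d, fresh))


def distances_from(adj, src):
    """Dijkstra; a counter breaks ties so mixed node labels are never compared."""
    tie = count()
    dist = {src: 0}
    heap = [(0, next(tie), src)]
    while heap:
        dd, _, u = heapq.heappop(heap)
        if dd > dist[u]:
            continue
        for v, w in adj[u]:
            if dd + w < dist.get(v, inf):
                dist[v] = dd + w
                heapq.heappush(heap, (dd + w, next(tie), v))
    return dist


def max_degree(adj):
    """Largest in-degree or out-degree of any vertex."""
    radj = reverse(adj)
    return max(max(len(adj[u]), len(radj[u])) for u in adj)
```

Every path between original vertices in the new graph corresponds to a path of the same length in the original graph, since tree edges have weight 0 and each tree leads only to the original neighbors of its root.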
4.1 An Time -approximation to Girth
In this section we show a procedure that given a directed weighted graph and a girth estimate , returns a cycle of length at most if the girth in is at most . The algorithm is given by GirthApprox (See Algorithm 3) which in turn invokes the subroutine SimilarSet (See Algorithm 4).
In order to approximate the girth of , similarly to the previous section, we simply invoke this procedure for every for and stop once the procedure returns a cycle. If is the girth of this incurs an additional factor to the running time (as for the first index such that the algorithm will return a cycle w.h.p.) and an additional factor in the approximation ratio. The additional factor in the approximation ratio can be avoided if the weights are integers by simply using binary search on the range between 1 and (where is the maximum edge weight in ) and finding two consecutive integers and such that the procedure returned a cycle of length at most when invoked on but not a cycle when invoked on . This incurs a factor in the running time that can be improved to by the same method as done in [PRS18] of contracting small weight strongly connected components and deleting large weight edges (see Section 5.1 in [PRS18] for more details).
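The two search strategies over girth estimates can be sketched as follows. The sketch assumes that the sequence of estimates is geometric with ratio 1 + eps, which the text as it stands does not confirm, and it treats the cycle-finding procedure as an opaque callable `find_cycle(R)` that returns a cycle or None; both function names are hypothetical. The binary search only maintains that the procedure failed on the lower end and succeeded on the upper end, so it does not require the procedure to be monotone.

```python
def geometric_search(find_cycle, eps, max_len):
    """Try R = 1, (1 + eps), (1 + eps)^2, ... and return the first success.

    Returns (R, cycle), or None if no estimate up to about max_len succeeds.
    """
    r = 1.0
    while r <= max_len * (1 + eps):
        cycle = find_cycle(r)
        if cycle is not None:
            return r, cycle
        r *= 1 + eps
    return None


def integer_search(find_cycle, hi):
    """Find an integer R where find_cycle succeeds on R but failed on R - 1.

    Uses 0 as a sentinel that is treated as failing. Returns (R, cycle),
    or None if find_cycle(hi) itself fails.
    """
    best = find_cycle(hi)
    if best is None:
        return None
    lo = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        cycle = find_cycle(mid)
        if cycle is not None:
            hi, best = mid, cycle   # success: shrink from above
        else:
            lo = mid                # failure: shrink from below
    return hi, best
```

With a procedure that succeeds exactly when R is at least the girth, the geometric search overshoots the girth by a factor of at most 1 + eps, while the integer search returns the girth itself after logarithmically many calls.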
Let be a directed graph with vertices and edges. We assume the graph is of average degree and that the maximum degree in the graph is also .
The subroutine SimilarSet gets as an input the graph and the target distance and either returns a cycle of length at most or returns a subset of vertices for every . The subset for a vertex consists of vertices at distance at most from with the guarantee that contains all vertices that 1. are at distance at most from and 2. are on a cycle of length with . Procedure GirthApprox invokes the Procedure SimilarSet twice, once on and once on the reversed graph of (the graph obtained by reversing every edge of ). If a cycle of length is returned in one of these calls then procedure GirthApprox returns such a cycle. Otherwise, let be the sets returned from invoking SimilarSet on the graph and on the reversed graph. Next, the procedure for every checks if there is a cycle containing of length at most in the induced graph of . If such a cycle exists then the procedure returns such a cycle.
Procedure SimilarSet works as follows. The algorithm starts by sampling subsets of expected size