Efficient and Simple Algorithms for Fault Tolerant Spanners

by Michael Dinitz, et al.

It was recently shown that a version of the greedy algorithm gives a construction of fault-tolerant spanners that is size-optimal, at least for vertex faults. However, the algorithm to construct this spanner is not polynomial-time, and the best-known polynomial time algorithm is significantly suboptimal. Designing a polynomial-time algorithm to construct (near-)optimal fault-tolerant spanners was given as an explicit open problem in the two most recent papers on fault-tolerant spanners ([Bodwin, Dinitz, Parter, Vassilevska Williams SODA '18] and [Bodwin, Patel PODC '19]). We give a surprisingly simple algorithm which runs in polynomial time and constructs fault-tolerant spanners that are extremely close to optimal (off by only a linear factor in the stretch) by modifying the greedy algorithm to run in polynomial time. To complement this result, we also give simple distributed constructions in both the LOCAL and CONGEST models.




1 Introduction

Let $G = (V, E)$ be a graph, possibly with edge lengths $w : E \to \mathbb{R}_{\geq 0}$. A $t$-spanner of $G$, for $t \geq 1$, is a subgraph $H$ that preserves all pairwise distances within factor $t$, i.e.,

$$d_H(u, v) \leq t \cdot d_G(u, v)$$

for all $u, v \in V$ (where $d_G$ denotes the shortest-path distance in a graph $G$). The distance preservation factor $t$ is called the stretch of the spanner. Less formally, graph spanners are a form of sparsifiers that approximately preserve distances (as opposed to other notions of graph sparsification which approximately preserve cuts [BK15], the spectrum [SS11, BSS14], or other graph properties). When considering spanners through the lens of sparsification, perhaps the most important goal in the study of graph spanners is understanding the tradeoff between the stretch and the sparsity. The main result in this area, which is tight assuming the “Erdős girth conjecture” [Erd64], was given by Althöfer et al.:

Theorem 1 ([ADD93]).

For every positive integer $k$, every weighted graph $G = (V, E)$ has a $(2k-1)$-spanner with at most $O(n^{1+1/k})$ edges.

This notion of graph spanners was first introduced by Peleg and Schäffer [PS89] and Peleg and Ullman [PU89] in the context of distributed computing, and has been studied extensively for the last three decades in the distributed computing community as well as more broadly. Spanners are not only inherently interesting mathematical objects, but they also have an enormous number of applications. A small sampling includes uses in distance oracles [TZ05], property testing [BGJ09, BBG14], synchronizers [PU89], compact routing [TZ01], preprocessing for approximation algorithms [BKM09, DKN17], and many others.

Many of these applications, particularly in distributed computing, arise from modeling computer networks or distributed systems as graphs. But one aspect of distributed systems that is not captured by the above spanner definition is the possibility of failures. We would like our spanner to be robust to failures, so that even if some nodes fail we still have a spanner of what remains. More formally, $H$ is an $f$-(vertex-)fault tolerant $t$-spanner of $G$ if for every set $F \subseteq V$ with $|F| \leq f$ the spanner condition holds for $G \setminus F$, i.e.,

$$d_{H \setminus F}(u, v) \leq t \cdot d_{G \setminus F}(u, v)$$

for all $u, v \in V \setminus F$. If $F$ is instead an edge set then this gives a definition of an $f$-edge-fault-tolerant $t$-spanner.

This notion of fault-tolerant spanners was first introduced by Levcopoulos, Narasimhan, and Smid [LNS98] in the context of geometric spanners (the special case when the vertices are in Euclidean space and the distance between two points is the Euclidean distance), and has since been studied extensively in that setting [LNS98, Luk99, CZ04, NS07]. Note that in the geometric setting, $d_{G \setminus F}(u, v) = d_G(u, v)$ for all $u, v \in V \setminus F$, since faults do not change the underlying geometric distances.

In general graphs, though, $d_{G \setminus F}(u, v)$ may be extremely different from $d_G(u, v)$, making this definition more difficult to work with. The first results on fault-tolerant graph spanners were by Chechik, Langberg, Peleg, and Roditty [CLPR10], who showed how to modify the Thorup-Zwick spanner [TZ05] to be $f$-fault tolerant with an additional cost of approximately $k^f$: the number of edges in the $f$-fault tolerant $(2k-1)$-spanner that they create is approximately $\tilde{O}(k^f n^{1+1/k})$ (where $\tilde{O}$ hides polylogarithmic factors). Since [CLPR10] there has been a significant amount of work on improving the sparsity, particularly as a function of the number of faults $f$ (since we would like to protect against large numbers of faults but usually care most about small stretch values). First, Dinitz and Krauthgamer [DK11] improved the size to $\tilde{O}(f^{2-1/k} n^{1+1/k})$ by giving a black-box reduction to the traditional non-fault tolerant setting. Then Bodwin, Dinitz, Parter, and Vassilevska Williams [BDPW18] decreased this to $O(\exp(k) f^{1-1/k} n^{1+1/k})$, which they also showed was optimal (for vertex faults) as a function of $f$ and $n$ (i.e., the only non-optimal dependence is the $\exp(k)$). Unlike previous fault-tolerant spanner constructions, this optimal construction was based on a natural greedy algorithm (the natural generalization of the greedy algorithm of [ADD93]). An improved analysis of the same greedy algorithm was then given by Bodwin and Patel [BP19], who managed to show the fully optimal bound of $O(f^{1-1/k} n^{1+1/k})$.

Unlike the previous fault-tolerant spanner constructions of [CLPR10, DK11] and the greedy non-fault tolerant algorithm of [ADD93], the greedy algorithm of [BDPW18, BP19] has a significant weakness: it takes exponential time. Obtaining the same (or similar) size bound in polynomial time was explicitly mentioned as an important open question in both [BDPW18] and [BP19].

1.1 Our Results and Techniques

In this paper we design a surprisingly simple algorithm to construct nearly-optimal fault-tolerant spanners in polynomial time, in both unweighted and weighted graphs.

Theorem 2.

There is a polynomial time algorithm which, given integers $k \geq 1$ and $f \geq 1$ and a (weighted) graph $G = (V, E)$, constructs an $f$-fault tolerant $(2k-1)$-spanner with at most $O(k f^{1-1/k} n^{1+1/k})$ edges in time (for vertex fault tolerance) or (for edge fault tolerance).

Note that while we are a factor of $k$ away from complete optimality (for vertex faults), this is truly optimal when the stretch is constant and, for non-constant stretch values, is still significantly sparser than the analysis of the exponential time algorithm by [BDPW18] (which lost an exponential factor in $k$).

The main idea in our algorithm is to replace the exponential-time subroutine used in the greedy algorithm of [BDPW18, BP19] with an appropriate polynomial-time approximation algorithm. More specifically, the main step of the exponential time greedy algorithm is to consider whether a given candidate edge is “already spanned” by the subgraph $H$ that has already been built. This means determining whether, for some candidate edge $(u, v)$, there is a fault set $F$ with $|F| \leq f$ such that $d_{H \setminus F}(u, v) > (2k-1) \cdot w(u, v)$. If such a fault set exists then the algorithm adds $(u, v)$ to $H$, and otherwise does not. (Note that in the fault-free case this just means checking whether there is already a path of stretch at most $2k-1$ between the endpoints, which is precisely the original greedy algorithm of [ADD93].) In both [BDPW18] and [BP19], the only method given to find such a set was to try all possible sets, giving running time that is exponential in $f$ and thus exponential in the size of the input.

Our main approach is to speed this up by designing a polynomial-time algorithm to replace this exponential-time step. Unfortunately, the corresponding problem (known as Length-Bounded Cut) is NP-hard [BEH06], so we cannot hope to actually solve it efficiently. Instead, we design an approximation algorithm for Length-Bounded Cut and use it instead. We end up with a fairly weak approximation (basically an $O(k)$-approximation), and one which only holds in the unweighted case. But this turns out to be enough for the unweighted case: it intuitively allows us to (in polynomial time) build an $O(kf)$-fault-tolerant $(2k-1)$-spanner instead of an $f$-fault tolerant $(2k-1)$-spanner, which changes the extra size cost from $f^{1-1/k}$ to $(kf)^{1-1/k}$. However, this is only intuition. The graph we end up creating is not in fact $O(kf)$-fault tolerant, and is not necessarily even a subgraph of the $O(kf)$-fault-tolerant spanner that the true greedy algorithm would have built. So we cannot simply argue that our algorithm returns something with at most as many edges as the greedy $O(kf)$-fault-tolerant spanner. Instead, we need to analyze the size of our spanner from scratch. Fortunately, we can do this by simply following the proof strategy of [BP19] with only some minor modifications.

A natural approach to the weighted case would be to try to generalize this by creating an $O(k)$-approximation for Length-Bounded Cut in the weighted setting. Such an algorithm would certainly suffice, but unfortunately we do not know how to design any nontrivial approximation algorithm for Length-Bounded Cut in the presence of weights. While this might appear to rule out using a similar technique, we show that special properties of the greedy algorithm allow us to essentially reduce to the unweighted setting! We use the weights to determine the order in which we consider edges, but for the rest of the algorithm we simply “pretend” to be in the unweighted setting. Since the size bound for the unweighted case worked for any ordering, that same size bound will apply to our spanner. And then we can use the fact that we considered edges in order of nondecreasing weights to argue that the subgraph we create is in fact an $f$-fault tolerant $(2k-1)$-spanner even though we ignored the weights.

Distributed Settings

While the focus of this paper is on a centralized polynomial-time algorithm since the existence of such an algorithm was an explicit open question from [BDPW18] and [BP19], we complement this result with some simple algorithms in the standard LOCAL and CONGEST models of distributed computation.

In the LOCAL model, we can use standard network decompositions to find a clustering of the graph where the clusters have low diameter, every edge is in at least one cluster, and the clustering comes from $O(\log n)$ partitions. Since in the LOCAL model we are allowed unbounded message sizes, this means that in $O(\log n)$ time we can send the subgraph induced by each cluster to the cluster center (an arbitrary node in the cluster), who can then locally run the greedy algorithm on that cluster and then inform the nodes in the cluster about the edges that have been chosen. This will take only $O(\log n)$ communication rounds (since clusters have diameter $O(\log n)$) and will incur only an extra $O(\log n)$ factor in the number of edges (since the clustering can be divided into $O(\log n)$ partitions).

In the CONGEST model we cannot apply this approach (even though we could find a similar clustering) because we are not able to gather large induced subgraphs at the cluster centers (due to the bound on message sizes). Instead, we show that the older fault-tolerant spanner construction of [DK11] can be combined with the standard (non-fault-tolerant) spanner algorithm in the CONGEST model due to Baswana and Sen [BS07] to give a fault-tolerant spanner algorithm in CONGEST. This approach means that the size increases to $O(k f^{2-1/k} n^{1+1/k} \log n)$ (so we are a factor of $f \log n$ away from the bounds of the polynomial-time greedy algorithm), but the number of rounds needed is quite small despite the limitation on message sizes ($O(f^2(\log f + \log\log n) + k^2 f \log n)$ rounds).

2 Notation and Preliminaries

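We will be discussing graphs $G = (V, E)$ where $|V| = n$ and $|E| = m$. Sometimes these graphs will also have a weight function $w : E \to \mathbb{R}_{\geq 0}$. We will slightly abuse notation to let $w(u, v) = w(\{u, v\})$ for all $\{u, v\} \in E$. For a (possibly weighted) graph $G$, we will let $d_G(u, v)$ denote the length of the shortest (lowest-weight) path from $u$ to $v$ (if no such path exists then this length is $\infty$). For any $S \subseteq V$, we let $G[S]$ denote the subgraph of $G$ induced by $S$. For $F \subseteq V$ let $G \setminus F$ be $G[V \setminus F]$, and for $F \subseteq E$ let $G \setminus F$ be $(V, E \setminus F)$.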

Definition 1.

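Let $G = (V, E)$ be a (possibly weighted) graph. A subgraph $H$ of $G$ is an $f$-vertex-fault-tolerant ($f$-VFT) $t$-spanner of $G$ if $d_{H \setminus F}(u, v) \leq t \cdot d_{G \setminus F}(u, v)$ for all $F \subseteq V$ with $|F| \leq f$ and $u, v \in V \setminus F$. A subgraph $H$ of $G$ is an $f$-edge-fault-tolerant ($f$-EFT) $t$-spanner of $G$ if $d_{H \setminus F}(u, v) \leq t \cdot d_{G \setminus F}(u, v)$ for all $u, v \in V$ and $F \subseteq E$ with $|F| \leq f$.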
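To make the vertex-fault-tolerant condition concrete, the following is a minimal brute-force sketch that checks it directly on a small graph by enumerating every fault set of size at most f. The function names (`build_adj`, `dist_avoiding`, `is_vft_spanner`) and the edge representation as `(u, v, weight)` triples are assumptions made for this illustration, not part of the paper. It takes exponential time in f and is only meant for tiny examples.

```python
import heapq
from itertools import combinations
from math import inf


def build_adj(edges):
    """Adjacency lists for an undirected graph given as (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


def dist_avoiding(adj, src, dst, removed):
    """Dijkstra distance from src to dst in the graph with `removed` vertices deleted."""
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, x = heapq.heappop(heap)
        if x == dst:
            return d
        if d > best.get(x, inf):
            continue  # stale heap entry
        for y, w in adj.get(x, ()):
            if y in removed:
                continue
            nd = d + w
            if nd < best.get(y, inf):
                best[y] = nd
                heapq.heappush(heap, (nd, y))
    return inf


def is_vft_spanner(vertices, g_edges, h_edges, f, t):
    """Check d_{H\\F}(u,v) <= t * d_{G\\F}(u,v) for every fault set F with |F| <= f."""
    g_adj, h_adj = build_adj(g_edges), build_adj(h_edges)
    for size in range(f + 1):
        for fault in combinations(vertices, size):
            removed = set(fault)
            alive = [x for x in vertices if x not in removed]
            for u, v in combinations(alive, 2):
                d_h = dist_avoiding(h_adj, u, v, removed)
                d_g = dist_avoiding(g_adj, u, v, removed)
                if d_h > t * d_g:
                    return False
    return True
```

For example, on the unit-weight 4-cycle, dropping one edge leaves a 3-spanner with no faults, but not a 1-VFT 3-spanner: deleting a middle vertex of the remaining path disconnects the endpoints of the dropped edge.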

Throughout this paper, for simplicity we will only discuss the vertex fault tolerant case since that is the more difficult one to prove upper bounds for. The proofs for the edge fault tolerant case are essentially identical, with only one step of the algorithm running slightly slower.

We first show an equivalent definition that will let us restrict which pairs of vertices we care about.

Lemma 3.

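Let $G = (V, E)$ be a graph with weight function $w$ and let $H$ be a subgraph of $G$. Then $H$ is an $f$-VFT $t$-spanner of $G$ if and only if for all $F \subseteq V$ with $|F| \leq f$ and $u, v \in V \setminus F$ such that $(u, v) \in E$ and $d_{G \setminus F}(u, v) = w(u, v)$,

$$d_{H \setminus F}(u, v) \leq t \cdot w(u, v).$$

The only if direction is immediately implied by Definition 1, since for any $F \subseteq V$ with $|F| \leq f$ and $u, v \in V \setminus F$ such that $(u, v) \in E$ and $d_{G \setminus F}(u, v) = w(u, v)$, we know from Definition 1 that $d_{H \setminus F}(u, v) \leq t \cdot d_{G \setminus F}(u, v) = t \cdot w(u, v)$.

For the if direction, let $F \subseteq V$ with $|F| \leq f$ and $u, v \in V \setminus F$. Let $P$ be the shortest path in $G \setminus F$ between $u$ and $v$. If $d_{G \setminus F}(u, v) = \infty$ then $t \cdot d_{G \setminus F}(u, v) = \infty$, and thus $d_{H \setminus F}(u, v) \leq t \cdot d_{G \setminus F}(u, v)$. If $d_{G \setminus F}(u, v) < \infty$, then we know that $d_{G \setminus F}(x, y) = w(x, y)$ for all $(x, y) \in P$, and thus

$$d_{H \setminus F}(u, v) \leq \sum_{(x, y) \in P} d_{H \setminus F}(x, y) \leq \sum_{(x, y) \in P} t \cdot w(x, y) = t \cdot d_{G \setminus F}(u, v).$$

Hence $H$ is an $f$-VFT $t$-spanner of $G$. ∎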

The original greedy algorithm for fault-tolerant spanners was introduced and analyzed by [BDPW18], with an improved analysis by [BP19], and is given in Algorithm 1. The part of this algorithm which takes exponential time is the “if” condition, i.e., checking whether there is a fault set which hits all stretch-$(2k-1)$ paths. For edge fault tolerance, the algorithm is the same except that $F$ is an edge set.

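  $H \leftarrow (V, \emptyset)$
  for all $(u, v) \in E$ in nondecreasing weight order do
     if there exists a set $F \subseteq V \setminus \{u, v\}$ of at most $f$ vertices such that $d_{H \setminus F}(u, v) > (2k-1) \cdot w(u, v)$ then
        add $(u, v)$ to H
     end if
  end for
  return  H
Algorithm 1 Greedy $f$-VFT $(2k-1)$-Spanner Algorithm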
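The exponential-time step of the greedy algorithm can be seen directly in a short sketch. The code below is an illustration only: the names `dist_avoiding` and `greedy_vft_spanner` and the `(u, v, weight)` edge triples are assumptions for this example, and the "if" condition is implemented by trying every fault set of at most f vertices, which is exactly the brute-force search the text describes.

```python
import heapq
from itertools import combinations
from math import inf


def dist_avoiding(adj, src, dst, removed):
    """Dijkstra distance from src to dst, skipping vertices in `removed`."""
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, x = heapq.heappop(heap)
        if x == dst:
            return d
        if d > best.get(x, inf):
            continue  # stale heap entry
        for y, w in adj.get(x, ()):
            if y in removed:
                continue
            nd = d + w
            if nd < best.get(y, inf):
                best[y] = nd
                heapq.heappush(heap, (nd, y))
    return inf


def greedy_vft_spanner(vertices, edges, k, f):
    """Brute-force greedy f-VFT (2k-1)-spanner; exponential in f."""
    adj = {x: [] for x in vertices}
    spanner = []
    # Consider edges in nondecreasing weight order (sorted is stable).
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        others = [x for x in vertices if x != u and x != v]
        # Is there a fault set F, |F| <= f, leaving u and v too far apart in H \ F?
        needs_edge = any(
            dist_avoiding(adj, u, v, set(fault)) > (2 * k - 1) * w
            for size in range(f + 1)
            for fault in combinations(others, size)
        )
        if needs_edge:
            adj[u].append((v, w))
            adj[v].append((u, w))
            spanner.append((u, v, w))
    return spanner
```

On the unit-weight complete graph on four vertices with k = 2, this keeps 3 edges when f = 0 and 5 edges when f = 1 (for the edge order used in the test).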

3 Unweighted Graphs

In this section we design a polynomial-time algorithm for the special case of unweighted (or unit-weighted) graphs. We begin by designing a simple approximation algorithm for the Length-Bounded Cut problem, and then show that this algorithm can be plugged into the greedy algorithm with only a small loss.

3.1 Length-Bounded Cut

In order to design a polynomial-time variant of the greedy algorithm, we want to replace the “if” condition by something that can be computed in polynomial time. While there are many possibilities, there are two obvious approaches: we could try to compute the maximum $t$ such that there is a fault set of size $f$ which hits all $t$-hop paths, or we could try to compute the minimum $f'$ such that there is a fault set of size $f'$ which hits all $(2k-1)$-hop paths. It turns out that this second approach is more fruitful.

Consider the following problem, known as the Length-Bounded Cut problem [BEH06]. The input is an unweighted graph $G = (V, E)$ with $|V| = n$ and $|E| = m$, vertices $u, v \in V$ (known as the terminals), and a positive integer $t$. The goal is to return the set $F \subseteq V \setminus \{u, v\}$ of minimum cardinality such that

$$d_{G \setminus F}(u, v) > t.$$

Theorem 4.

There is a $(t-1)$-approximation algorithm for Length-Bounded Cut which runs in $O(mnt)$ time.


Recall that the classical Bellman-Ford algorithm has the property that each iteration takes $O(m)$ time, and after $i$ iterations it has computed the shortest $i$-hop path from the source to the destination. So we can use Bellman-Ford to check whether there is a path with at most $t$ hops from $u$ to $v$ in $O(mt)$ time.

This gives the following natural algorithm, which is essentially the standard “frequency” approximation of Set Cover (or Hitting Set). Initialize $F = \emptyset$. While there exists a path $P$ of length at most $t$ from $u$ to $v$ in $G \setminus F$, add all nodes of $P$ (other than $u$ and $v$) to $F$. Obviously when this algorithm terminates, $F$ is a feasible length-bounded cut. And it clearly runs in at most $O(mnt)$ time, since each iteration of the algorithm takes $O(mt)$ time and there can be at most $n$ iterations (since in each iteration we add at least one more node to $F$).

To bound the approximation ratio, let $F^*$ denote the optimal length-bounded cut. Then for every path $P$ which our algorithm considers (and adds to $F$), it must be the case that $P \cap F^* \neq \emptyset$ since $F^*$ must hit all paths of length at most $t$. Let $\mathcal{P}$ be the set of all paths considered by the algorithm (where we have removed $u, v$ from the paths), and notice that they are all node-disjoint, have at most $t-1$ nodes in them, and $F = \bigcup_{P \in \mathcal{P}} P$. Hence

$$|F| = \sum_{P \in \mathcal{P}} |P| \leq (t-1) |\mathcal{P}| \leq (t-1) \sum_{P \in \mathcal{P}} |P \cap F^*| \leq (t-1) |F^*|,$$

where for the last inequality we used that the paths in $\mathcal{P}$ are node-disjoint. ∎
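The path-removal procedure from this proof can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the names `short_path` and `approx_length_bounded_cut` are hypothetical, the graph is an adjacency dictionary of an unweighted graph, and a hop-limited breadth-first search is swapped in for Bellman-Ford (in an unweighted graph both find a path with the fewest hops). The terminals are assumed not to be adjacent, since a direct edge cannot be cut by removing vertices.

```python
from collections import deque


def short_path(adj, u, v, max_hops, removed):
    """Return a u-v path with at most max_hops edges avoiding `removed`, else None."""
    parent = {u: None}
    depth = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1]
        if depth[x] == max_hops:
            continue  # do not extend paths beyond the hop limit
        for y in adj.get(x, ()):
            if y not in parent and y not in removed:
                parent[y] = x
                depth[y] = depth[x] + 1
                queue.append(y)
    return None


def approx_length_bounded_cut(adj, u, v, max_hops):
    """Repeatedly find a short u-v path and delete all of its internal vertices."""
    if v in adj.get(u, ()):
        raise ValueError("terminals must not be adjacent")
    cut = set()
    while True:
        path = short_path(adj, u, v, max_hops, cut)
        if path is None:
            return cut  # no remaining path with at most max_hops edges
        cut.update(path[1:-1])  # internal vertices only
```

In the small example used in the test, two short paths (0-1-5 and 0-2-3-5) are cut by removing all of their internal vertices, while a five-edge path is left alone because it already exceeds the length bound of 3. The returned cut has size 3, against an optimum of 2.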

To handle edge fault-tolerance, we need to slightly change the definition of Length-Bounded Cut to return an edge set rather than a vertex set. The same algorithm works (where in each iteration we add all edges in the path to $F$ rather than all vertices), but it becomes a $t$-approximation rather than a $(t-1)$-approximation. It also runs in $O(m^2 t)$ time rather than $O(mnt)$, since we might have to run Bellman-Ford $m$ times rather than $n$ times. This is what results in the slightly worse time bound for our EFT spanner construction compared to our VFT spanner construction.

3.2 Modified Greedy

Let $G = (V, E)$ be an undirected unweighted graph. In order to construct a fault tolerant spanner of $G$ we consider the following natural extension of the greedy algorithm for constructing spanners. For an EFT spanner algorithm, we simply use the edge-based version of Length-Bounded Cut and use the threshold $(2k-1)f$ rather than $(2k-2)f$.

  $H \leftarrow (V, \emptyset)$
  for all $(u, v) \in E$ in arbitrary order do
     Let $a$ be the value of the length-bounded cut found by Theorem 4 on input graph $H$ with terminals $u, v$ and length bound $2k-1$.
     if $a \leq (2k-2) f$ then
        add $(u, v)$ to H
     end if
  end for
  return  H
Algorithm 2 Modified Greedy VFT Spanner Algorithm
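A minimal sketch of the modified greedy algorithm for unweighted graphs follows. It is an illustration, not the authors' code: the names `short_path`, `approx_cut_size`, and `modified_greedy_vft` are hypothetical, breadth-first search stands in for Bellman-Ford, and the acceptance threshold `(2k-2)*f` is an assumption derived from the (t-1)-approximation guarantee with t = 2k-1.

```python
from collections import deque


def short_path(adj, u, v, max_hops, removed):
    """Return a u-v path with at most max_hops edges avoiding `removed`, else None."""
    parent, depth, queue = {u: None}, {u: 0}, deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1]
        if depth[x] == max_hops:
            continue
        for y in adj[x]:
            if y not in parent and y not in removed:
                parent[y], depth[y] = x, depth[x] + 1
                queue.append(y)
    return None


def approx_cut_size(adj, u, v, max_hops):
    """Size of the cut found by deleting internal vertices of short paths."""
    cut = set()
    while True:
        path = short_path(adj, u, v, max_hops, cut)
        if path is None:
            return len(cut)
        cut.update(path[1:-1])


def modified_greedy_vft(vertices, edges, k, f):
    """Polynomial-time greedy sketch; `edges` is a list of (u, v) pairs."""
    adj = {x: set() for x in vertices}
    spanner = []
    for u, v in edges:  # arbitrary order
        if v in adj[u]:
            continue  # skip duplicate edges
        # Assumed threshold: keep the edge when the approximate cut is small.
        if approx_cut_size(adj, u, v, 2 * k - 1) <= (2 * k - 2) * f:
            adj[u].add(v)
            adj[v].add(u)
            spanner.append((u, v))
    return spanner
```

On the complete graph on four vertices with k = 2, this sketch keeps 3 edges for f = 0 and all 6 edges for f = 1. The exact greedy algorithm keeps only 5 in the second case, which illustrates the loss the approximation can introduce.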
Theorem 5.

The running time of Algorithm 2 is at most for vertex fault tolerance and for edge fault tolerance.


We run our Length-Bounded Cut approximation algorithm once for each edge in $G$. Thus the running time of Algorithm 2 is for VFT and for EFT. ∎

We now prove that this algorithm does indeed return a valid solution, despite the use of an approximation algorithm to determine whether or not to add an edge (we prove this only for VFT for simplicity, but the proof for EFT is the same).

Theorem 6.

Algorithm 2 returns an $f$-VFT $(2k-1)$-spanner.


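Let $F \subseteq V$ be an arbitrary fault set with $|F| \leq f$ and $(u, v) \in E$ with $u, v \notin F$. By Lemma 3, we just need to show that $d_{H \setminus F}(u, v) \leq 2k-1$ (since $G$ is unweighted) in order to prove the theorem. Clearly this is true if $(u, v) \in E(H)$. If $(u, v) \notin E(H)$, then when the algorithm considered $(u, v)$ it must have been the case that the cut value $a$ it computed satisfied $a > (2k-2) f$. Since $a$ was the value returned by the algorithm of Theorem 4, we know from the approximation guarantee of Theorem 4 that the true minimum length-bounded cut on $H$ (for $u, v$ with length bound $2k-1$) has at least $f+1$ nodes in it. Thus $F$ is not a length-bounded cut in $H$ for $u, v$ with length bound $2k-1$, and so $d_{H \setminus F}(u, v) \leq 2k-1$. ∎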

Now it remains only to prove the size of the returned spanner. To do this, a natural approach would be to argue that the spanner it returns is a subgraph of the greedy $(2k-2)f$-VFT spanner, since it seems like whenever our modified algorithm requires us to add an edge it has found a cut certifying that the greedy $(2k-2)f$-fault tolerant spanner would also have had to add that edge. Unfortunately, this is not true since the modified algorithm might not add some edges that the true greedy algorithm would have added, and thus later on our algorithm might have to actually add some edges that the true greedy algorithm would not have had to add.

The next natural approach would be to try to use the analysis of [BP19] as a black box. Unfortunately we cannot do this either, since the lemmas they use are specific to the true greedy algorithm rather than our modification. However, it is straightforward to modify their analysis so that it continues to hold for our modified algorithm, with only an additional loss of $O(k)$. We do this here for completeness. As in [BP19], we start with the definition of a blocking set, and then give two lemmas using this definition. And also as in [BDPW18, BP19], we only prove this for VFT, as the proof for EFT is essentially identical.

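Definition 2 ([BP19]).

For any graph $G = (V, E)$, we define $B \subseteq V \times E$ to be a $t$-blocking set of $G$ if for all $(v, e) \in B$, we have $v \notin e$ and for any cycle $C$ in $G$ with $|C| \leq t$, there exists $(v, e) \in B$ such that $v, e \in C$.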

Lemma 7.

Any graph $H$ returned by Algorithm 2 with parameters $k, f$ has a $2k$-blocking set of size at most $(2k-2) f |E(H)|$.

It was shown in [BP19] that the graph $H$ returned by the standard VFT greedy algorithm with parameters $k, f$ has a $(2k)$-blocking set of size at most $f |E(H)|$. (Note: in [BP19] the parameter “$k$” is used to denote the stretch, while for us the stretch is $2k-1$, and thus there are slight constant factor differences between the statements as written in [BP19] and our interpretation of their statements. But our statements about [BP19] are correct under this change of variables.) So our modified algorithm satisfies the same lemma up to a factor of $2k-2$. The proof is almost identical in our case; we essentially replace all instances of $f$ in their proof with $(2k-2) f$.

Proof of Lemma 7.

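Let $e = (u, v)$ be some edge in $H$, and let $H_e$ be the subgraph maintained by the algorithm just before $e$ is added to $H$ (so $H_e$ is a subset of the final $H$). Since $e$ was added by Algorithm 2, there is some set $F_e \subseteq V \setminus \{u, v\}$ with $|F_e| \leq (2k-2) f$ such that $d_{H_e \setminus F_e}(u, v) > 2k-1$.

Now we can define the blocking set: let $B = \{(x, e) : e \in E(H), x \in F_e\}$.

Since $|F_e| \leq (2k-2) f$ for all $e \in E(H)$, we immediately get that $|B| \leq (2k-2) f |E(H)|$ as claimed. So we now need to show that $B$ is a $2k$-blocking set. To see this, let $C$ be any cycle with at most $2k$ vertices in $H$, and let $e = (u, v)$ be the last edge of this cycle to be added to $H$. Let $H_e$ be the subgraph of $H$ built by the algorithm just before $e$ is added. Then $C \setminus \{e\}$ is a path from $u$ to $v$ in $H_e$ of length at most $2k-1$, and thus there is some $x \in F_e$ that is in $C \setminus \{e\}$. Thus $(x, e) \in B$ with $x, e \in C$. ∎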

Now we know that the spanner returned by Algorithm 2 has a small blocking set. The next lemma implies that any such graph must have a dense but high-girth subgraph.

Lemma 8.

Let be any graph on nodes and edges (with ) that has a -blocking set of size at most . Then has a subgraph on nodes and edges that has girth greater than .


Let denote the induced subgraph of on a uniformly random subset of exactly nodes. Let , and let denote the graph obtained by removing every edge contained in any pair in . The graph will be the one we analyze.

The easiest property to analyze is the number of nodes in : there are precisely vertices in , which is as claimed.

The next easiest property of to prove is the girth. Let be a cycle in with at most nodes. is either in or it is not. If it is not in then some vertex in is not in , and thus is not in . On the other hand, if is in then by the definition of there is some edge so that , and also , and thus does not exist in .

To analyze , we start with the following observations.

  • Each remains in if

    . This happens with probability

  • Each remains in if . This happens with probability

Now we can use these observations to compute the expected size of :

Note that the bounds on and on the girth of are deterministic. So there is some subgraph which has those bounds and where the number of edges is at least the expectation, proving the lemma. ∎

This lemma allows us to prove the size bound.

Theorem 9.

The subgraph $H$ returned by Algorithm 2 has at most $O(k f^{1-1/k} n^{1+1/k})$ edges.


If then the theorem is trivially true. Otherwise, by Lemmas 7 and 8 we know that has a subgraph of girth larger than on nodes and edges. But it has long been known that any graph with vertices and girth larger than must have at most edges (this is the key fact used in the original non-fault tolerant greedy algorithm analysis [ADD93]), and hence . Therefore we have

Theorems 5, 6, and 9 together imply Theorem 2 in the unweighted case.
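As a hedged worked calculation, the size bound of Theorem 9 can be recovered from the girth fact quoted in its proof. The subgraph parameters used here are assumptions, obtained by substituting $(2k-2)f = O(kf)$ for $f$ in the corresponding lemma of [BP19]: a subgraph $H'$ of $H$ on $n' = O(n/(kf))$ nodes with $m' = \Omega(|E(H)|/(kf)^2)$ edges and girth greater than $2k$. Since any graph on $n'$ nodes with girth greater than $2k$ has $O((n')^{1+1/k})$ edges:

```latex
\begin{align*}
\Omega\!\left(\frac{|E(H)|}{(kf)^2}\right) \;\le\; m'
  &\;\le\; O\!\left(\left(\frac{n}{kf}\right)^{1+1/k}\right) \\
\Longrightarrow\quad |E(H)|
  &\;\le\; O\!\left((kf)^{2} \cdot (kf)^{-1-1/k} \cdot n^{1+1/k}\right)
   \;=\; O\!\left((kf)^{1-1/k}\, n^{1+1/k}\right)
   \;\le\; O\!\left(k\, f^{1-1/k}\, n^{1+1/k}\right).
\end{align*}
```

The last step only uses $k^{1-1/k} \leq k$, which is where the extra factor of $k$ over the optimal bound of [BP19] appears.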

4 Weighted Graphs

We now show that we can use the algorithm we designed for the unweighted setting even in the presence of weights. Our algorithm is very simple: we order the edges in nondecreasing weight order, but then run the unweighted algorithm on the edges in this order. We give this algorithm more formally as Algorithm 3. Again, changing to edge fault tolerance is straightforward: we just use the edge version of Length-Bounded Cut and change $(2k-2)f$ to $(2k-1)f$. So we prove this only for vertex fault tolerance for simplicity.

  $H \leftarrow (V, \emptyset)$
  for all $(u, v) \in E$ in nondecreasing weight order do
     Let $a$ be the value of the length-bounded cut found by Theorem 4 on input graph $H$ (with no weights) with terminals $u, v$ and length bound $2k-1$.
     if $a \leq (2k-2) f$ then
        add $(u, v)$ to H
     end if
  end for
  return  H
Algorithm 3 Modified Greedy VFT Spanner Algorithm (Weighted)
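The weighted variant differs from the unweighted one only in the order in which edges are considered. The sketch below makes that explicit: weights are used once, for sorting, and the cut computation counts hops only. As before, the function names, the `(u, v, weight)` triples, the use of breadth-first search, and the threshold `(2k-2)*f` are assumptions made for this illustration.

```python
from collections import deque


def hop_path(adj, u, v, max_hops, removed):
    """Return a u-v path with at most max_hops edges avoiding `removed`, else None."""
    parent, depth, queue = {u: None}, {u: 0}, deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1]
        if depth[x] == max_hops:
            continue
        for y in adj[x]:
            if y not in parent and y not in removed:
                parent[y], depth[y] = x, depth[x] + 1
                queue.append(y)
    return None


def cut_size(adj, u, v, max_hops):
    """Size of the vertex cut built from internal vertices of short paths."""
    cut = set()
    while (path := hop_path(adj, u, v, max_hops, cut)) is not None:
        cut.update(path[1:-1])
    return len(cut)


def weighted_modified_greedy(vertices, edges, k, f):
    """Sort by weight, then run the unweighted test on the weight-free graph H."""
    adj = {x: set() for x in vertices}
    spanner = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if v in adj[u]:
            continue  # skip duplicate edges
        if cut_size(adj, u, v, 2 * k - 1) <= (2 * k - 2) * f:
            adj[u].add(v)
            adj[v].add(u)
            spanner.append((u, v, w))
    return spanner
```

On a triangle with two unit-weight edges and one edge of weight 5, with k = 2 and f = 0, the heavy edge is considered last and rejected because a two-hop path of lighter edges already connects its endpoints. With f = 1 it is kept, since removing the middle vertex would disconnect them.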
Theorem 10.

Algorithm 3 returns an $f$-VFT ($f$-EFT) $(2k-1)$-spanner with at most $O(k f^{1-1/k} n^{1+1/k})$ edges in time at most ().


The running time is directly from Theorem 5, since the only additional step in the algorithm is sorting the edges by weight, which takes only $O(m \log n)$ additional time. The size also follows directly from Theorem 9, since Algorithm 3 is just a particular instantiation of Algorithm 2 where the ordering (which is unspecified in Algorithm 2) is determined by the weights. In other words, Theorem 9 holds for an arbitrary order, so it certainly holds for the weight ordering.

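The more interesting part of this theorem is correctness: why does this algorithm return an $f$-VFT $(2k-1)$-spanner despite ignoring weights? Let $F \subseteq V$ be an arbitrary fault set with $|F| \leq f$ and $(u, v) \in E$ with $u, v \notin F$ and $d_{G \setminus F}(u, v) = w(u, v)$. By Lemma 3, we just need to show that $d_{H \setminus F}(u, v) \leq (2k-1) \cdot w(u, v)$ in order to prove the theorem. Clearly this is true if $(u, v) \in E(H)$. So suppose that $(u, v) \notin E(H)$. Then when the algorithm considered $(u, v)$, it must have been the case that the cut value $a$ it computed satisfied $a > (2k-2) f$. Then from the approximation bound of Theorem 4 we know that the minimum length-bounded cut in $H$ (unweighted) for $u, v$ with (unweighted) length bound $2k-1$ has size at least $f+1$, and thus $F$ is not such a cut. Thus at the time the algorithm was considering $(u, v)$, there was some path $P$ between $u$ and $v$ in $H \setminus F$ with at most $2k-1$ edges. But since we considered edges in order of nondecreasing weight, every edge in $P$ has weight at most $w(u, v)$. Thus

$$d_{H \setminus F}(u, v) \leq \sum_{e \in P} w(e) \leq (2k-1) \cdot w(u, v)$$

as required. ∎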

5 Distributed Algorithms

In this section we give efficient algorithms in two standard distributed models: the LOCAL model and the CONGEST model [Pel00]. Recall that in both models we assume communication happens in synchronous rounds, and our goal is to minimize the number of rounds needed. In the LOCAL model each node can send an arbitrary message on each incident edge in each round, while in the CONGEST model these messages must have size at most $O(\log n)$ bits (or $O(1)$ words, so we can send a constant number of node IDs and weights in each message). Note that both models allow unlimited computation at each node, and hence the difficulty with applying the greedy algorithm is not the exponential running time, but its inherently sequential nature.

5.1 LOCAL

In the LOCAL model we will be able to implement the greedy algorithm at only a small extra cost in the size of the spanner. Our approach is simple: we use standard network decompositions to decompose the graph into clusters, run the greedy algorithm in each cluster, and then take the union of the spanner for each cluster.

The following theorem is a simple corollary of the construction of “padded decompositions” given explicitly in previous work on fault-tolerant spanners [DK11]. It also appears implicitly in various forms in [LS93, Bar96, MPX13, MPVX15] (among others). In what follows, the hop diameter of a cluster refers to its unweighted diameter.

Theorem 11.

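There is an algorithm in the LOCAL model which runs in $O(\log n)$ rounds and constructs $\mathcal{P}_1, \mathcal{P}_2, \dots, \mathcal{P}_\ell$ such that:

  1. Each $\mathcal{P}_i$ is a partition of $V$, with each part of the partition referred to as a cluster. Let $\mathcal{C} = \bigcup_{i=1}^{\ell} \mathcal{P}_i$ be the collection of all clusters of all partitions.

  2. Each cluster has hop diameter at most $O(\log n)$ and contains some special node known as the cluster center.

  3. $\ell = O(\log n)$ (there are $O(\log n)$ partitions).

  4. With high probability ($1 - 1/n^c$ for any constant $c$) for every edge $(u, v) \in E$ there is a cluster $C \in \mathcal{C}$ such that $u, v \in C$.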

With this tool, it is easy to describe our algorithm. First we use Theorem 11 to construct the partitions. Then in each cluster $C$ we gather at the cluster center the entire subgraph $G[C]$ induced by that cluster. Each cluster center uses the greedy algorithm (Algorithm 1) on $G[C]$ to construct an $f$-VFT $(2k-1)$-spanner $H_C$ of $G[C]$, and then sends out the selected edges to the nodes in $C$. Let $H$ be the final subgraph created (the union of the edges of each $H_C$).

Theorem 12.

With high probability, $H$ is an $f$-VFT $(2k-1)$-spanner of $G$ with at most $O(f^{1-1/k} n^{1+1/k} \log n)$ edges and the algorithm terminates in $O(\log n)$ rounds.


The round complexity is obvious from the round complexity and cluster hop diameter bounds in Theorem 11.

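The total number of edges added is at most

$$\sum_{C \in \mathcal{C}} |E(H_C)| \leq \sum_{i=1}^{\ell} \sum_{C \in \mathcal{P}_i} O\left(f^{1-1/k} |C|^{1+1/k}\right) \leq \sum_{i=1}^{\ell} O\left(f^{1-1/k} n^{1+1/k}\right) = O\left(f^{1-1/k} n^{1+1/k} \log n\right),$$

where $H_C$ is the spanner built for cluster $C$ and $\mathcal{P}_1, \dots, \mathcal{P}_\ell$ are the $\ell = O(\log n)$ partitions of Theorem 11, and where we used the size bound on the greedy algorithm from [BP19] and the fact from Theorem 11 that each $\mathcal{P}_i$ is a partition of $V$.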

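To show correctness, consider some $F \subseteq V$ and $(u, v) \in E$ with $|F| \leq f$ and $u, v \notin F$ so that $d_{G \setminus F}(u, v) = w(u, v)$. By Lemma 3, we just need to prove that $d_{H \setminus F}(u, v) \leq (2k-1) \cdot w(u, v)$. Let $C$ be a cluster which contains both $u$ and $v$, which we know exists (with high probability) from Theorem 11, and let $H_C$ be the spanner built for $G[C]$. Let $F' = F \cap C$. Then

$$d_{H \setminus F}(u, v) \leq d_{H_C \setminus F'}(u, v) \quad \text{(definition of } H\text{)}$$
$$\leq (2k-1) \cdot d_{G[C] \setminus F'}(u, v) \quad \text{(} H_C \text{ is an } f\text{-VFT } (2k-1)\text{-spanner of } G[C]\text{)}$$
$$\leq (2k-1) \cdot w(u, v) \quad \text{(} (u, v) \text{ is an edge of } G[C] \setminus F'\text{)}.$$

Thus $H$ is indeed an $f$-VFT $(2k-1)$-spanner of $G$. ∎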

5.2 CONGEST

We unfortunately cannot use the approach that we used in the LOCAL model in the CONGEST model, since we cannot efficiently gather the entire topology of a cluster at a single node. We will instead use the fault-tolerant spanner of Dinitz and Krauthgamer [DK11], rather than the greedy algorithm, and combine it with the non-fault tolerant spanner of [BS07] which can be efficiently constructed in CONGEST. This approach means that, unlike in the centralized setting or the LOCAL model, we will not be able to get size-optimal fault-tolerant spanners.

The algorithm of [DK11] works as follows (in the traditional centralized model). Suppose that we have some algorithm $A$ which constructs a $t$-spanner with at most $g(n)$ edges on any graph with $n$ nodes. There are $O(f^3 \log n)$ iterations, and in every iteration each node chooses to participate independently with probability $1/f$. For each iteration $i$, let $V_i$ be the vertices who participate and let $G_i$ be the subgraph of $G$ induced by them. We let $H_i$ be the $t$-spanner constructed by $A$ on $G_i$. Then we return the union of all $H_i$.
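The sampling reduction just described can be sketched as a wrapper around any non-fault-tolerant spanner routine. Everything specific in this sketch is an assumption for illustration: the function name `sampled_ft_spanner`, the `base_spanner(vertices, edges)` callback signature, the default iteration count `ceil(f^3 * ln n)` (the constant is a placeholder), and the default participation probability `1 / max(f, 2)`, which is chosen so that a vertex can be left out even when f = 1.

```python
import math
import random


def sampled_ft_spanner(vertices, edges, f, base_spanner,
                       iterations=None, participate_prob=None, seed=0):
    """Union of base spanners built on random induced subgraphs.

    vertices: list of comparable, hashable vertex ids
    edges: list of hashable edge tuples whose first two entries are endpoints
    base_spanner: callable (vertices, edges) -> iterable of chosen edges
    """
    rng = random.Random(seed)
    n = max(len(vertices), 2)
    if iterations is None:
        # Placeholder constant; only the f^3 log n scaling follows the text.
        iterations = math.ceil(f ** 3 * math.log(n))
    if participate_prob is None:
        participate_prob = 1.0 / max(f, 2)
    chosen = set()
    for _ in range(iterations):
        # Each vertex independently decides whether to join this iteration.
        alive = {x for x in vertices if rng.random() < participate_prob}
        induced = [e for e in edges if e[0] in alive and e[1] in alive]
        chosen.update(base_spanner(sorted(alive), induced))
    return chosen
```

The intuition is that for any fixed edge and any fixed fault set of size at most f, some iteration is likely to include both endpoints while excluding every faulty vertex, so the base spanner built in that iteration already handles that fault set.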

The main theorem that [DK11] proved about this is the following.

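Theorem 13 ([DK11]).

This algorithm returns an $f$-VFT $t$-spanner of $G$ with $O(f^3 \log n \cdot g(2n/f))$ edges with high probability.

Note that when $g(n) = O(n^{1+1/k})$, this results in an $f$-VFT $(2k-1)$-spanner with at most $O(f^{2-1/k} n^{1+1/k} \log n)$ edges, which is precisely the bound from [DK11].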

Since the algorithm of [DK11] uses an arbitrary non-fault tolerant spanner algorithm $A$, by using a distributed spanner algorithm for $A$ we naturally end up with a distributed fault-tolerant spanner algorithm. In particular, we will combine the algorithm of [DK11] with the following algorithm due to Baswana and Sen [BS07].

Theorem 14 ([BS07]).

There is an algorithm that computes a $(2k-1)$-spanner with at most $O(k n^{1+1/k})$ edges of any weighted graph in $O(k^2)$ rounds in the CONGEST model.

Combining Theorems 13 and 14 immediately gives an algorithm in CONGEST that returns an $f$-VFT $(2k-1)$-spanner of size at most $O(k f^{2-1/k} n^{1+1/k} \log n)$ that runs in at most $O(k^2 f^3 \log n)$ rounds (with high probability). We can just run each iteration of the Dinitz-Krauthgamer algorithm [DK11] in series, and in each iteration we use the Baswana-Sen algorithm [BS07]. Since there are $O(f^3 \log n)$ iterations, and Baswana-Sen takes $O(k^2)$ rounds, this gives a total round complexity of $O(k^2 f^3 \log n)$.

We can improve on this bound by taking advantage of the fact that each iteration of Dinitz-Krauthgamer runs on a relatively small graph (approximately $n/f$ nodes), so we can run some of these iterations in parallel.

Theorem 15.

There is an algorithm that computes an $f$-VFT $(2k-1)$-spanner with $O(k f^{2-1/k} n^{1+1/k} \log n)$ edges of any weighted graph $G$ and which runs in $O(f^2(\log f + \log\log n) + k^2 f \log n)$ rounds in the CONGEST model (all with high probability).


In the first phase of the algorithm each vertex randomly selects the $O(f^2 \log n)$ iterations, out of the $O(f^3 \log n)$ total, in which to participate. Then each vertex sends its chosen iterations to all of its neighbors. Identifying these iterations takes $O(f^2 \log n (\log f + \log\log n))$ bits, and thus $O(f^2 (\log f + \log\log n))$ rounds in CONGEST.

After this has completed we enter the second phase of the algorithm, and now every node knows which iterations it is participating in and which iterations each of its neighbors is participating in. With high probability, for every edge there are at most $O(f \log n)$ iterations in which both endpoints participate. Thus if we try to run all iterations of Baswana-Sen (Theorem 14) in parallel, we have “congestion” of $O(f \log n)$ on each edge (at each time step) since there could be up to that many iterations in which a message is supposed to be sent along that edge at that time. Thus we can simply use $O(f \log n)$ time steps for each time step of Baswana-Sen and can simulate all iterations of the Dinitz-Krauthgamer algorithm (note that each Baswana-Sen message needs to have a tag added to it with the iteration number, but since that takes at most $O(\log f + \log\log n)$ bits it fits within the required message size). Hence the total running time of this second phase is at most $O(k^2 f \log n)$.

The size and correctness bounds follow directly from Theorems 13 and 14, and the round complexity follows from our analysis of the two phases above. ∎

6 Conclusion and Future Work

In this paper we designed an algorithm to compute nearly-optimal fault-tolerant spanners in polynomial time, answering a question posed by [BDPW18, BP19]. We also gave an optimal construction in the LOCAL model which runs in rounds, and an efficient algorithm in the CONGEST model that constructs fault-tolerant spanners which have the same size as in [DK11] rather than the optimal size.

There are many interesting open questions remaining about efficient algorithms for fault-tolerant spanners, as well as about the extremal properties of these spanners. Most obviously, the size we achieve is a factor of away from the optimal size, due to our use of an -approximation for Length-Bounded Cut. Can this be removed, either by giving a better approximation for Length-Bounded Cut or through some other construction? While is somewhat small since spanners tend to be most useful for constant stretch (and never have stretch larger than ), it would still be nice to get fully optimal size in polynomial time. Similarly, our distributed constructions are extremely simple, and there is no reason to think that we actually need rounds in LOCAL or that we cannot get optimal size fault-tolerant spanners in CONGEST. It would be interesting to design better distributed and parallel algorithms for these objects, particularly since the greedy algorithm (the only size-optimal algorithm we know) tends to be difficult to parallelize.

From a structural point of view, we reiterate one of the main open questions from [BDPW18] and [BP19]: understanding the optimal bounds for edge-fault-tolerant spanners. The best upper bound we have is the same as the one we have for the vertex case, while the best lower bound is (from [BDPW18]). What is the correct bound?

References


  • [ADD93] Ingo Althöfer, Gautam Das, David P. Dobkin, Deborah Joseph, and José Soares. On sparse spanners of weighted graphs. Discrete & Computational Geometry, 9:81–100, 1993.
  • [Bar96] Y. Bartal. Probabilistic approximation of metric spaces and its algorithmic applications. In Proceedings of 37th Conference on Foundations of Computer Science, pages 184–193, Oct 1996.
  • [BBG14] Piotr Berman, Arnab Bhattacharyya, Elena Grigorescu, Sofya Raskhodnikova, David P. Woodruff, and Grigory Yaroslavtsev. Steiner transitive-closure spanners of low-dimensional posets. Combinatorica, 34(3):255–277, 2014.
  • [BDPW18] Greg Bodwin, Michael Dinitz, Merav Parter, and Virginia Vassilevska Williams. Optimal vertex fault tolerant spanners (for fixed stretch). In Artur Czumaj, editor, Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New Orleans, LA, USA, January 7-10, 2018, pages 1884–1900. SIAM, 2018.
  • [BEH06] Georg Baier, Thomas Erlebach, Alexander Hall, Ekkehard Köhler, Heiko Schilling, and Martin Skutella. Length-bounded cuts and flows. In Michele Bugliesi, Bart Preneel, Vladimiro Sassone, and Ingo Wegener, editors, Automata, Languages and Programming, pages 679–690, Berlin, Heidelberg, 2006. Springer Berlin Heidelberg.
  • [BGJ09] Arnab Bhattacharyya, Elena Grigorescu, Kyomin Jung, Sofya Raskhodnikova, and David P. Woodruff. Transitive-closure spanners. In Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’09, pages 932–941, 2009.
  • [BK15] András A. Benczúr and David R. Karger. Randomized approximation schemes for cuts and flows in capacitated graphs. SIAM J. Comput., 44(2):290–319, 2015.
  • [BKM09] Glencora Borradaile, Philip Klein, and Claire Mathieu. An approximation scheme for Steiner tree in planar graphs. ACM Trans. Algorithms, 5(3):31:1–31:31, July 2009.
  • [BP19] Greg Bodwin and Shyamal Patel. A trivial yet optimal solution to vertex fault tolerant spanners. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing, PODC ’19, pages 541–543, New York, NY, USA, 2019. Association for Computing Machinery.
  • [BS07] Surender Baswana and Sandeep Sen. A simple and linear time randomized algorithm for computing sparse spanners in weighted graphs. Random Struct. Algorithms, 30(4):532–563, 2007.
  • [BSS14] Joshua D. Batson, Daniel A. Spielman, and Nikhil Srivastava. Twice-Ramanujan sparsifiers. SIAM Review, 56(2):315–334, 2014.
  • [CLPR10] Shiri Chechik, Michael Langberg, David Peleg, and Liam Roditty. Fault tolerant spanners for general graphs. SIAM J. Comput., 39(7):3403–3423, 2010.
  • [CZ04] Artur Czumaj and Hairong Zhao. Fault-tolerant geometric spanners. Discrete & Computational Geometry, 32(2):207–230, 2004.
  • [DK11] Michael Dinitz and Robert Krauthgamer. Fault-tolerant spanners: better and simpler. In Proceedings of the 30th Annual ACM Symposium on Principles of Distributed Computing, PODC 2011, San Jose, CA, USA, June 6-8, 2011, pages 169–178, 2011.
  • [DKN17] Michael Dinitz, Guy Kortsarz, and Zeev Nutov. Improved approximation algorithm for Steiner k-forest with nearly uniform weights. ACM Trans. Algorithms, 13(3), July 2017.
  • [Erd64] Paul Erdős. Extremal problems in graph theory. In Theory of Graphs and Its Applications, Proc. Sympos. Smolenice. Citeseer, 1964.
  • [LNS98] Christos Levcopoulos, Giri Narasimhan, and Michiel Smid. Efficient algorithms for constructing fault-tolerant geometric spanners. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, pages 186–195. ACM, 1998.
  • [LS93] Nathan Linial and Michael E. Saks. Low diameter graph decompositions. Combinatorica, 13(4):441–454, 1993.
  • [Luk99] Tamas Lukovszki. New results on fault tolerant geometric spanners. Algorithms and Data Structures, pages 774–774, 1999.
  • [MPVX15] Gary L Miller, Richard Peng, Adrian Vladu, and Shen Chen Xu. Improved parallel algorithms for spanners and hopsets. In Proceedings of the Symposium on Parallelism in Algorithms and Architectures. ACM, 2015.
  • [MPX13] Gary L Miller, Richard Peng, and Shen Chen Xu. Parallel graph decompositions using random shifts. In Proceedings of the ACM Symposium on Parallelism in algorithms and architectures. ACM, 2013.
  • [NS07] Giri Narasimhan and Michiel Smid. Geometric Spanner Networks. Cambridge University Press, 2007.
  • [Pel00] David Peleg. Distributed computing: a locality-sensitive approach. SIAM, 2000.
  • [PS89] David Peleg and Alejandro A. Schäffer. Graph spanners. Journal of Graph Theory, 13(1):99–116, 1989.
  • [PU89] David Peleg and Jeffrey D. Ullman. An optimal synchronizer for the hypercube. SIAM J. Comput., 18(4):740–747, 1989.
  • [SS11] Daniel A. Spielman and Nikhil Srivastava. Graph sparsification by effective resistances. SIAM J. Comput., 40(6):1913–1926, 2011.
  • [TZ01] Mikkel Thorup and Uri Zwick. Compact routing schemes. In SPAA, pages 1–10, 2001.
  • [TZ05] Mikkel Thorup and Uri Zwick. Approximate distance oracles. J. ACM, 52(1):1–24, 2005.