Graph spanners are subgraphs which approximately preserve distances. Slightly more formally, given a graph $G = (V, E)$ (possibly with lengths on the edges), a subgraph $H$ of $G$ is a $t$-spanner of $G$ if $d_H(u,v) \le t \cdot d_G(u,v)$ for all $u, v \in V$, where $d_G$ denotes shortest-path distances in $G$ (and $d_H$ in $H$). The value $t$ is called the stretch of the spanner.
Graph spanners were originally introduced in the context of distributed computing [26, 25], but have since proved to be a fundamental building block that is useful in a variety of applications, from property testing to network routing. When building spanners there are many objectives which we could try to optimize, but probably the most popular is the number of edges (the size or the sparsity). Not only is sparsity important in many applications, it also admits a beautiful tradeoff with the stretch, proved by Althöfer et al.:
Theorem 1.1 (Althöfer et al.).
For every integer $k \ge 1$ and every weighted graph $G = (V, E)$ with $|V| = n$, there is a $(2k-1)$-spanner $H$ of $G$ with at most $O(n^{1+1/k})$ edges.
While understanding the tradeoff between the size and the stretch was a seminal achievement, for many applications (particularly in distributed computing) we care not just about the size, but also about the maximum degree. Unfortunately, unlike the size, there is no possible tradeoff between the stretch and the maximum degree. This is trivial to see: if $G$ is a star, then the only spanner of $G$ with non-infinite stretch has maximum degree of $n-1$. In general, if $G$ has maximum degree $\Delta$, then all we can say is the trivial fact that $G$ has a spanner with maximum degree at most $\Delta$. Nevertheless, given the importance of the maximum degree objective, there has been significant work on building spanners that minimize the maximum degree from the perspective of approximation algorithms [22, 10, 9]. From this perspective, we are given a graph $G$ and stretch value $t$ and are asked to find the “best” $t$-spanner of $G$ (where “best” means minimizing the maximum degree).
While this has been an interesting and productive line of research, clearly there are problems with the maximum degree objective as well. For example, if it is unavoidable for there to be some node of large degree $d$, the maximum degree objective allows us to make every other vertex also of degree $d$, with no change in the objective function. But clearly we would prefer to have fewer high-degree nodes if possible!
So we are left with a natural question: can we define a notion of “cost” of a spanner which discourages very high degree nodes, but if there are high degree nodes, still encourages the rest of the nodes to have small degree? There is of course an obvious candidate for such a cost function: the $\ell_p$ norm of the degree vector. That is, given a spanner $H$, we can define $\|H\|_p$ to be the $\ell_p$-norm of the $n$-dimensional vector in which the coordinate corresponding to a node $v$ contains the degree of $v$ in $H$. Then $\|H\|_1$ is just (twice) the total number of edges, and $\|H\|_\infty$ is precisely the maximum degree. Thus the $\ell_p$-norm is an interpolation between these two classical objectives. Moreover, for $1 < p < \infty$, this notion of cost has precisely the properties that we want: it encourages low-degree nodes rather than high-degree nodes, but if high-degree nodes are unavoidable it still encourages the rest of the nodes to be as low-degree as possible. These properties, of interpolating between the average and the maximum, are why the $\ell_p$-norm has appeared as a popular objective for a variety of problems, ranging from clustering (the famous $k$-means problem [21, 23]), to scheduling [4, 3, 1], to covering.
1.1 Our Results and Techniques
In this paper we initiate the study of graph spanners under the $\ell_p$-norm objective. We prove a variety of results, giving upper bounds, lower bounds, and approximation guarantees. Our main result is the analog of Theorem 1.1 for the $\ell_p$-norm objective, but we also characterize universal lower bounds as part of an effort to understand the generic approximation ratio for the related optimization problem. We also show that in some ways the $\ell_p$-norm can behave fundamentally differently than the traditional $\ell_1$ or $\ell_\infty$ norms, by proving that the greedy algorithm can have an approximation ratio that is strictly better than the generic guarantee, unlike the $\ell_1$ or $\ell_\infty$ settings.
1.1.1 Upper Bound
We begin by proving our main result: a universal upper bound (the analog of Theorem 1.1) for $\ell_p$-norm spanners. Recall the classical greedy algorithm for constructing a $(2k-1)$-spanner $H$ of a graph $G$. Consider the edges in nondecreasing order of edge length, and when considering edge $\{u,v\}$, add it to $H$ if currently $d_H(u,v)$ is more than $2k-1$ times the length of $\{u,v\}$. We call $H$ the greedy $(2k-1)$-spanner of $G$. It is trivial to show that the greedy $(2k-1)$-spanner has girth at least $2k+1$. This is the algorithm that was used to prove Theorem 1.1, and it has since received extensive study (see, e.g., [19, 8]) and will form the basis of our upper bound:
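Theorem 1.2.
Let $k \ge 1$ be an integer, let $G = (V, E)$ be a graph (possibly with lengths on the edges), and let $H$ be the greedy $(2k-1)$-spanner of $G$. Then $\|H\|_p \le \max\left(O(n),\, O\left(n^{\frac{k+p}{kp}}\right)\right)$ for all $p \ge 1$.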
In other words, if $p \ge \frac{k}{k-1}$ then our upper bound is $O(n)$, and otherwise it is $O\left(n^{\frac{k+p}{kp}}\right)$. Clearly this interpolates between $\ell_1$ and $\ell_\infty$: when $p = 1$ this is the same bound as Theorem 1.1, while if $p = \infty$ this gives $O(n)$, which is the only possible bound in terms of $n$. It is also straightforward to prove that this bound is tight if we again assume the Erdős girth conjecture; for completeness, we do this in Appendix A.
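The threshold between the two regimes can be checked directly. The short derivation below is a sketch that assumes the bound of Theorem 1.2 has the form $\max(O(n), O(n^{(k+p)/(kp)}))$ and that $k \ge 2$; it only compares the exponent with $1$ and evaluates the two endpoints.

```latex
\[
  \frac{k+p}{kp} \le 1
  \iff k + p \le kp
  \iff k \le p\,(k-1)
  \iff p \ge \frac{k}{k-1}.
\]
\[
  p = 1:\quad n^{\frac{k+1}{k}} = n^{1+1/k},
  \qquad\qquad
  p \to \infty:\quad \frac{k+p}{kp} \to \frac{1}{k} \le 1,
  \ \text{so the maximum is } O(n).
\]
```

For $p = 1$ the exponent matches Theorem 1.1, since $\|H\|_1$ is twice the number of edges.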
The proof of Theorem 1.1 by Althöfer et al. is relatively simple: the greedy $(2k-1)$-spanner has girth at least $2k+1$, and any graph with more than $n^{1+1/k}$ edges must have a cycle of length at most $2k$. Generalizing this to the $\ell_p$-norm is significantly more complicated, since it is not nearly as easy to show a relationship between the girth and the $\ell_p$-norm. But this is precisely what we do.
It turns out to be easiest to prove Theorem 1.2 for stretch $3$: it just takes one more step beyond the proof of Theorem 1.1 to split the vertices of the high-girth graph (the spanner) into “low” and “high” degrees, and show that each vertex set does not contribute too much to the norm. However, for larger stretch values this approach does not work: the main lemma used for stretch $3$ (Lemma 3.2) is simply false when generalized to larger stretch bounds. Instead, we need a much more involved decomposition into “low”, “medium”, and “high”-degree nodes. This decomposition is very subtle, since the categories are not purely about the degree, but rather about how the degree relates to expansion at some particular distances from the node. We also need to further decompose the “high”-degree nodes into sets determined by which distance level we consider the expansion of. We then separately bound the contribution to the $\ell_p$-norm of each class in the decomposition; for “low”-degree nodes this is quite straightforward, but for medium and high-degree nodes this requires some subtle arguments which strongly use the structure of large-girth graphs.
1.1.2 Universal Lower Bounds
To motivate our next set of results, consider the optimization problem of finding the “best” -spanner of a given input graph. When “best” is the smallest -norm this is known as the Basic -Spanner problem [15, 5, 14, 17], and when “best” is the smallest -norm this is the Lowest-Degree -Spanner problem [22, 10, 9]. It is natural to consider this problem for the -norm as well. It is also natural to consider how well the greedy algorithm (used to prove the upper bound of Theorem 1.2) performs as an approximation algorithm.
To see an obvious way of analyzing the greedy algorithm as an approximation algorithm, consider the $\ell_1$-norm. Theorem 1.1 implies that the greedy algorithm always returns a spanner of size at most $O(n^{1+1/k})$, while clearly every spanner must have size at least $n-1$ (assuming that the input graph is connected). Thus we immediately get that the greedy algorithm is an $O(n^{1/k})$-approximation. By dividing a universal upper bound (an upper bound on the size of the greedy spanner that holds for every graph) by a universal lower bound (a lower bound on the size of every spanner in every graph), we can bound the approximation ratio in a way that is generic, i.e., that is essentially independent of the actual graph.
Now consider the -norm. The generic approach seems to break down here: the universal upper bound is only (as shown by the star graph), while the universal lower bound is only (as shown by the path). So it seems like the generic guarantee is just the trivial . But this is just because is the wrong parameter in this setting: the correct parameterization is based on , the maximum degree of (i.e., ). With respect to , the greedy algorithm (or any algorithm) returns a spanner with maximum degree at most , while any -spanner of a graph with maximum degree must have maximum degree at least (assuming the graph is unweighted). So there is still a “generic” guarantee which implies that the greedy algorithm is an -approximation.
This suggests that for , we will need to parameterize by both the number of nodes and the -norm of . We can define both universal upper bounds and universal lower bounds with respect to this dual parameterization:
With this notation, we can define the generic guarantee , and if we want a guarantee purely in terms of we can define the generic guarantee . Our upper bound of Theorem 1.2 can then be restated as the claim that
for all . So in order to understand the generic guarantees or , we need to understand the universal lower bound quantity .
Surprisingly, unlike the and cases, the universal lower bound for other values of is extremely complex. Understanding its value, and understanding the structure of the extremal graphs which match the bound given by
, are the most technically involved results in this paper. However, while the analysis and even the exact formulation of the lower bound is quite complex, it turns out to be easily computable from a simple linear program:
There is an explicit linear program of size which calculates for any . The bound given by the program is tight up to a factor of .
Our linear program and the proof of Theorem 1.3 appear in Section 6.2. In fact, our linear program not only calculates a lower bound on the -norm of any -spanner, it also gives the parameters which define an extremal graph of -norm with a -spanner whose -norm matches this lower bound. While the structure of these extremal graphs is simple, the dependence of the parameters of these graphs on and is quite complex. Nevertheless, we give a complete explicit description of these graphs for every possible value of .
Interestingly, despite the fact that is fundamentally a question of extremal graph theory (although as discussed our motivation is the generic guarantee on approximation algorithms), our techniques are in some ways more related to approximation algorithms. We give a linear program which computes the LB function, and we reason about it by explicitly constructing dual solutions. This is, to the best of our knowledge, the first time that structural bounds on spanners (as opposed to approximation bounds) have been derived using linear programs. Moreover, the structure of the extremal graphs is fundamentally related to a quantity which we call the -log density of the input graph. This is a generalization of the notion of “log-density”, which was introduced as the fundamental parameter when designing approximation algorithms for the Densest -Subgraph (DkS) problem , and has since proved useful in many approximation settings (see, e.g., [10, 11, 13, 12]).
1.1.3 Greedy Can Do Better Than The Generic Bound
As discussed, when or , the approximation ratio of the greedy algorithm can be bounded by the generic guarantee. But it turns out that the connection is actually even closer: when and , for every and the approximation ratio of the greedy algorithm is equal to the generic guarantee . In other words, greedy is no better than generic in the traditional settings (we prove this for completeness, but it is essentially folklore). In fact, for the objective, giving any approximation algorithm which is better than the generic guarantee is a long-standing open problem  which has only been accomplished for stretch  and stretch , while for the objective such an improvement was only shown recently  (and not with the greedy algorithm).
We show that, at least in some regimes of interest, -norm graph spanners exhibit fundamentally different behavior from and : the greedy algorithm has approximation ratio which is better than the generic guarantee, even though the universal upper bound is proved via the greedy algorithm! In particular, we consider the regime of stretch , , and . This is a very natural regime, since is the most obvious and widely-studied norm other than and , and stretch is the smallest value for which nontrivial sparsification can occur.
Our theorems about UB and LB imply that . But we show that in this setting (and in fact for any as long as and the stretch is ) the greedy algorithm is an -approximation. Thus we show that, unlike and , for the greedy algorithm provides an approximation guarantee that is strictly better than the generic bound, both for specific values of and when considering the worst case .
We begin in Section 2 with some basic definitions and preliminaries. In order to illustrate the basic concepts in a simpler and more understandable setting, we then focus in Section 3 on the special case of stretch $3$: we prove the stretch-$3$ version of Theorem 1.1 in Section 3.1, and then show that the greedy algorithm has approximation ratio better than the generic guarantee in Section 3.2. We then prove our upper and lower bounds in full generality: the upper bound (i.e., the proof of Theorem 1.2) in Section 4, and then our universal lower bound in Section 5. Due to space constraints, all missing proofs can be found in the appendices.
2 Definitions and Preliminaries
Let be a graph, possibly with lengths on the edges. For any vertex , we let denote the degree of and let denote the neighbors of . We will also generalize this notation slightly by letting denote the set of vertices that are exactly hops away from (i.e., their distance from if we ignore lengths is exactly ), and we let . Note that by definition, and for all . We will sometimes use to denote the ball around of radius .
We let denote the shortest-path distances in . A subgraph of a graph is a -spanner of if for all . Recall that for any and . To measure the “cost” of a spanner, for any graph , let denote the vector of degrees in and for any , let . For any subset , we let denote the norm of the vector obtained from by removing the coordinate of every node not in (note that we do not remove the nodes from the graph, i.e., is the norm of the degrees in of the nodes in , not in the subgraph induced by ).
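As a concrete illustration of this cost measure, the following minimal sketch computes the $\ell_p$-norm of a degree vector, optionally restricted to a vertex subset. The function names `degree_vector` and `degree_norm` and the edge-list representation are assumptions made for this example, not notation from the paper. As in the definition above, degrees are always taken in the whole graph; the subset only selects which coordinates enter the norm.

```python
import math

def degree_vector(n, edges):
    """Degrees of vertices 0..n-1 in the graph given by an edge list."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def degree_norm(n, edges, p, subset=None):
    """l_p norm of the degree vector, keeping only coordinates in `subset`.

    Degrees are computed in the full graph, not in the subgraph induced
    by `subset`. Use p = math.inf for the maximum degree.
    """
    deg = degree_vector(n, edges)
    nodes = range(n) if subset is None else subset
    vals = [deg[v] for v in nodes]
    if not vals:
        return 0.0
    if math.isinf(p):
        return float(max(vals))
    return sum(d ** p for d in vals) ** (1.0 / p)

# A star on 5 vertices: center 0, leaves 1..4.
star = [(0, i) for i in range(1, 5)]
norm_1 = degree_norm(5, star, 1)           # twice the number of edges
norm_inf = degree_norm(5, star, math.inf)  # the maximum degree
norm_2 = degree_norm(5, star, 2)
leaves_1 = degree_norm(5, star, 1, subset=[1, 2])
```

On the star, the $\ell_1$ value is twice the edge count and the $\ell_\infty$ value is the degree of the center, matching the two classical objectives.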
3 Warmup: Stretch 3
We begin by analyzing the special case of stretch , particularly for the -norm. More specifically, we will focus on bounding . This is one of the simplest cases, but demonstrates (at a very high level) the outlines of our upper bound. Moreover, in this particular case we can prove that the greedy algorithm performs better than the generic guarantee, showing a fundamental difference between the norm and the more traditional and norms.
3.1 Upper Bound
Recall that the greedy spanner is the spanner obtained from the obvious greedy algorithm: starting with an empty graph as the spanner, consider the edges one at a time in nondecreasing length order, and add an edge if the current spanner does not span it (within the given stretch requirement). It is obvious that when run with stretch parameter $t$ this algorithm does indeed return a $t$-spanner, and moreover it will return a $t$-spanner that has girth at least $t+2$ (if there is a $(t+1)$-cycle then the algorithm would not have added the final edge).
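The greedy procedure described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not code from the paper: the names `greedy_spanner` and `bounded_distance`, the `(u, v, length)` edge format, and the use of a distance-bounded Dijkstra search to test whether an edge is already spanned are all choices made for this example.

```python
import heapq

def bounded_distance(adj, src, dst, limit):
    """Shortest-path distance from src to dst, ignoring paths longer than limit.

    Returns float('inf') if dst is not reachable within `limit`.
    """
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == dst:
            return d
        for v, w in adj[u]:
            nd = d + w
            if nd <= limit and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")

def greedy_spanner(n, edges, t):
    """Greedy t-spanner of a graph on vertices 0..n-1.

    `edges` is a list of (u, v, length). Edges are scanned in nondecreasing
    length order and kept only if the current spanner does not already
    connect their endpoints within t times the edge length.
    """
    adj = [[] for _ in range(n)]
    spanner = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if bounded_distance(adj, u, v, t * w) > t * w:
            adj[u].append((v, w))
            adj[v].append((u, w))
            spanner.append((u, v, w))
    return spanner

# Unit-length examples with stretch 3.
k4 = [(u, v, 1.0) for u in range(4) for v in range(u + 1, 4)]
c5 = [(i, (i + 1) % 5, 1.0) for i in range(5)]
spanner_k4 = greedy_spanner(4, k4, 3)
spanner_c5 = greedy_spanner(5, c5, 3)
```

With stretch 3, the complete graph on four vertices is reduced to a spanning star, while the 5-cycle is kept in full: removing any of its edges would leave the endpoints at distance 4, which is consistent with the girth bound stated above.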
Our main goal in this section will be to prove the following theorem
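Theorem 3.1.
Let $G$ be a graph and let $H$ be the greedy $3$-spanner of $G$. Then $\|H\|_p \le \max\left(O(n),\, O\left(n^{\frac{2+p}{2p}}\right)\right)$ for all $p \ge 1$.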
In other words, when $1 \le p \le 2$ the greedy $3$-spanner has $\|H\|_p \le O\left(n^{\frac{2+p}{2p}}\right)$, and when $p \ge 2$ we get that $\|H\|_p \le O(n)$.
To prove this theorem, we will first show that nodes with “large” degree cannot be incident on too many edges in any graph of girth at least $5$ (like the greedy $3$-spanner). This is the most important step, since for $p > 1$ the $\ell_p$-norm of a graph gives greater “weight” to nodes with larger degree.
Let be a graph with girth at least 5. Then
Suppose for the sake of contradiction that these vertices have total degree greater than , and let be a minimal set with this property. That is, all these vertices have degree at least , and furthermore .
Because has girth at least 5, any two vertices in this set have at most one common neighbor. That is, . Thus, for every , the number of “new” neighbors contributed by is .
On the other hand, we have , and so we have . Thus, every contributes at least new neighbors, and so we get , which contradicts our assumption that . ∎
We can now prove Theorem 3.1.
Proof of Theorem 3.1.
Let , and let . Since has girth at least , we can apply Lemma 3.2. So using this lemma and standard algebraic inequalities, we get that
which implies the theorem. ∎
It is easy to show that the above bound is tight: for every there are graphs in which every -spanner has size at least . In fact, we can generalize slightly to also account for different values of . Theorem 1.2 can be interpreted as claiming that . In Appendix A we show (Theorem A.1) that this is tight: for all and .
3.2 Greedy vs Generic
It is not hard to show that in the traditional settings in which spanners have been studied, the and norms, the greedy algorithm does no better than the generic guarantee, for all relevant parameter regimes. In slightly more detail, for it is relatively easy to show that , while . Thus the generic guarantee , and moreover we can build graphs in which the approximation ratio of the greedy algorithm is also . Similarly, for the -norm, classical results on spanners imply that and , so the generic guarantee is and there are graphs for all parameter regimes where this is the approximation ratio achieved by greedy.
We show that the behavior of the greedy spanner in intermediate -norms is fundamentally different: in some parameter regimes of interest, greedy outperforms the generic guarantee!
To demonstrate this, consider the regime of stretch with the norm and with . In this regime, the results of Section 3.1 imply that . On the other hand, our results on the universal lower bound from Section 5 (Corollary 5.2 in particular) directly imply that . Thus the generic guarantee is , and this is the worst case over and thus . However, we show that the greedy algorithm is a strictly better approximation, even without parameterizing by .
The greedy algorithm is an -approximation for the problem of computing -spanner with smallest -norm.
To prove this, let be a graph with , let be the greedy -spanner of , and let be the -spanner of with minimum . Let , so ; note that . We first prove a lemma which uses to bound neighborhoods.
for all and .
We use induction on . For the base case , since we know that has degree at most , and thus .
Now suppose that the claim is true for some integer . Let (by induction). Since , the average degree (in ) of the nodes in is at most . Thus we get that , as claimed. ∎
Using this lemma, we can now prove Theorem 3.3.
Proof of Theorem 3.3.
Lemma 3.4 implies that for all . Since is a -spanner of , every vertex in must be in , and thus . Now we can use this to bound the number of -paths in . Let denote the number of paths of length in . Since is the greedy -spanner of it must have girth at least . This means that every path of length in which starts from must have a different other endpoint: there cannot be two different paths of the form and in , or else would have girth at most . Thus the number of -paths in which start from is bounded by , and thus .
On the other hand, note that instead of counting -paths in by their starting vertex, we could instead count them by their middle vertex. The number of -paths where is the middle node is , and thus . Combining these two inequalities implies that , and hence the greedy spanner has approximation ratio of at most . ∎
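The double counting used in this proof can be checked on a small graph of girth 5. The sketch below uses the Petersen graph purely as a convenient example (it is not a graph from the paper), counts ordered 2-paths once by starting vertex and once by middle vertex, and verifies that all 2-paths leaving a fixed vertex end at distinct vertices. The helper names are assumptions for this illustration.

```python
def petersen_adjacency():
    """Adjacency sets of the Petersen graph (10 vertices, 3-regular, girth 5)."""
    adj = {v: set() for v in range(10)}

    def add(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i in range(5):
        add(i, (i + 1) % 5)          # outer 5-cycle
        add(i, i + 5)                # spokes
        add(i + 5, (i + 2) % 5 + 5)  # inner pentagram
    return adj

def two_path_endpoints(adj, u):
    """Endpoints x of all ordered 2-paths u - w - x with x != u."""
    return [x for w in adj[u] for x in adj[w] if x != u]

adj = petersen_adjacency()

# Count ordered 2-paths by their starting vertex ...
by_start = sum(len(two_path_endpoints(adj, u)) for u in adj)
# ... and by their middle vertex v, which contributes d(v) * (d(v) - 1).
by_middle = sum(len(adj[v]) * (len(adj[v]) - 1) for v in adj)

# With girth at least 5, two different 2-paths from u never share an endpoint.
endpoints_distinct = all(
    len(set(two_path_endpoints(adj, u))) == len(two_path_endpoints(adj, u))
    for u in adj
)
```

Both counts agree, and the distinct-endpoint property is exactly what lets the proof bound the number of 2-paths from a vertex by the size of its 2-hop neighborhood.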
4 Upper Bound: General Stretch
We now want to generalize the bounds from Section 3 to hold for larger stretch ( in particular) in order to prove Theorem 1.2. A natural approach would be an extension of the stretch analysis: if in Lemma 3.2 we replaced the bound of with , then the proof of Theorem 3.1 could easily be extended to prove Theorem 1.2. Unfortunately this is impossible: there are graphs of girth at least where it is not true that the number of edges incident on nodes of degree at least is at most . This can be seen from, e.g.,  for .
So we cannot just break the vertices into “high-degree” and “low-degree” as we did for stretch . Instead, our decomposition is more complicated. We will still have low-degree nodes, which can be analyzed trivially. But our definition of “high” will actually be parameterized by a distance , and we will define a node to be “high-degree” at distance if its degree is large relative to the expansion of its neighborhood at approximately distance . We will also introduce a new type of “medium-degree” node. In Section 4.1 we define this decomposition and prove that it is a full decomposition of , and then in Sections 4.2 and 4.3 we show that no part in this decomposition can contribute too much to the overall cost.
First, though, we make one simple observation that will allow us to simplify notation by only considering one particular value of . While we could analyze general values of as we did for stretch in Section 3.1, it is actually sufficient to prove the bound for the special case of and where the two terms in the maximum are equal, i.e., when . The following is a straightforward application of Hölder’s inequality.
Let be an integer, let be a graph, and let be the greedy -spanner of . If for then for all .
First note that if and only if . So we break into two cases, one for and one for . For the first case, where , the result follows simply because of the monotonicity of -norms: .
For the second case, where , let be the value such that and . Recall that is the degree vector of . Then Hölder’s inequality implies that . Since by assumption we have , this implies that , as claimed. ∎
4.1 Graph Decomposition
Recall that denotes the number of vertices at distance exactly from . This will let us define the following vertex sets.
Let be a graph of girth at least , with . Then define
It is not hard to see that this notion of high still corresponds to a deviation from regularity, as in the stretch setting; the difference is that this deviation is relative to the size of the neighborhood at distance vs the neighborhood at distance .
As we will see in Sections 4.2 and 4.3, analyzing the contribution of to the -norm of the greedy spanner is in some sense the “main” technical step: analyzing is straightforward, and analyzing , while nontrivial, turns out to be easier than the case for . Before we do this, though, we will show that we have a full decomposition of .
Let be a graph of girth at least , with . Then .
We prove the case when
is odd. The other case is similar.
Assume that . Then by the definition of , we know that for all . Then a straightforward induction on implies that
If further we assume that , then , and thus
Finally, assuming that implies that
4.2 Structural Lemmas for High-Girth Graphs
With Theorem 4.3 in hand, it remains to bound the contribution to the -norm of the spanner of these different vertex sets. In order to do this, we start with a few useful lemmas. We first prove a simple lemma: if the girth is large enough, then the neighborhoods around a node can be bounded by the neighborhoods around its neighbors.
Let have girth at least with . Then for all .
Since has girth at least , for every and there is exactly one path of length from to (or else there would be a cycle of length at most ). Thus the -neighborhoods of the neighbors of form a partition of (when intersected with ). More formally, , and for all . Moreover, the part of which is not in is a subset of , since the path from to any such node would go through as its first hop (where we consider ). Thus we get that , as claimed. ∎
With this lemma in hand, we will now prove a more complicated technical lemma which will likewise hold for all high-girth graphs. For a given with , we can consider the fraction of the -neighborhood of which is also contained in the -neighborhood of . Then if we sum this fraction over all neighbors of , we would of course get since the girth constraint would imply that any two neighbors of cannot both be first hops on paths to the same node in . But what if we consider the slightly different ratio of ? This is notably different since it includes in the numerator not just , but also . It will prove useful for us to reason about these values, so we show that “on average” they behave approximately the same: if we sum up the neighbors of any given node then these fractions can add up to something quite large (not ), but overall they only add up to .
Let be an integer, and let have girth at least and minimum degree at least . Then .
For ease of notation, let . We will prove that and that for all . These two statements clearly imply the lemma by a simple induction.
Let us first prove that , which is the base case of the induction. Starting from the definition of , (and noting that by definition), we get that
For , we can begin similarly, using the definition of and now also Lemma 4.4 to get that
So now we need to prove that . Let us first fix some and try to lower bound . Our assumption that every vertex has degree at least implies that , and so . This gives a lower bound on :
Now again using the fact that all vertices have degree at least (in fact, degree at least would be sufficient), and the fact that the girth is at least , we get a different lower bound: for all . Combining this with (4) gives us the bound
Now we can apply Lemma 4.4 to the numerator, giving us
The right hand side of this inequality is clearly (twice) the arithmetic mean of the values
. Since the arithmetic mean is at least the harmonic mean, we get that
This is now finally the lower bound on that we will use to prove that . In particular, we immediately obtain
As shown, this implies the lemma. ∎
While Lemma 4.5 is the main structural result that we will use to bound the “high” degree nodes, the following corollary makes it slightly simpler to use.
Let be an integer, and let have girth at least and minimum degree at least . Then