Revisiting Radius, Diameter, and all Eccentricity Computation in Graphs through Certificates

03/13/2018 ∙ by Feodor Dragan, et al.

We introduce notions of certificates that allow bounding eccentricities in a graph. In particular, we revisit radius (minimum eccentricity) and diameter (maximum eccentricity) computation and explain the efficiency of practical radius and diameter algorithms by the existence of small certificates for radius and diameter plus a few additional properties. We show how such computation is related to covering a graph with certain balls or complements of balls. We introduce several new algorithmic techniques related to eccentricity computation and propose algorithms for radius, diameter and all eccentricities with theoretical guarantees with respect to certain graph parameters. This is complemented by experimental results on various real-world graphs showing that these parameters appear to be low in practice. We also obtain refined results in the case where the input graph has low doubling dimension, has low hyperbolicity, or is chordal.

1 Introduction

The radius and diameter of a graph are among the basic global parameters that help to apprehend the structure of a practical graph. More broadly, the eccentricity of each node, defined as the furthest distance from the node, is also of interest as a classical centrality measure [19]. It is tightly related to the radius, which is the minimum eccentricity, and to the diameter, which is the maximum eccentricity. On the one hand, efficient computation of such parameters is still theoretically challenging, as truly sub-quadratic algorithms would improve the state of the art for other “hard in P” related problems such as finding two orthogonal vectors in a set of vectors or testing if one set in a collection is a hitting set for another collection [2]. A sub-quadratic diameter algorithm would also refute the strong exponential time hypothesis (SETH) [25] and would improve the state of the art of SAT solvers as noted for similar problems in [24]. On the other hand, a line of practical algorithms has been proposed based on performing selected Breadth First Search traversals (BFS) [23, 27, 28, 8, 5], making it possible to compute the diameter of very large graphs [3]. However, such practical efficiency is still not well understood.

What are the structural properties that make practical graphs tractable? This paper answers this question through the lens of certificates, a certificate being a piece of information about a graph that allows computing its radius and diameter in truly sub-quadratic time. We propose a notion of certificate tightly related to the class of algorithms based on one-to-all distance computations from selected nodes. Existing practical algorithms fall into this category, which we call one-to-all distance based algorithms. Based on this approach, we propose algorithms with proven guarantees with respect to several graph properties which appear to be generally met in practice.

Another intriguing question concerns the relationship between diameter and radius computations. The most advanced algorithms [28, 5] compute both parameters at the same time. Would computing one parameter help in computing the other? We answer in the affirmative based on the notion of certificate.

The paper is presented in the context of unweighted undirected graphs but all the notions and algorithms extend to the weighted and/or directed cases.

1.1 Our contribution

We introduce the notion of certificate as a set of nodes such that the distances from these nodes to all nodes (rather than all-to-all pairs) allow one to deduce the value of the radius or the diameter with certainty. Given a graph $G$ with radius $\mathrm{rad}(G)$, we define a radius certificate as a set $L$ of nodes such that any node of $G$ is at distance at least $\mathrm{rad}(G)$ from a node of $L$. Given in addition a node $u$ with eccentricity $\mathrm{rad}(G)$, the set $L$ allows one to certify that the radius of $G$, i.e. the minimum eccentricity, is indeed $\mathrm{rad}(G)$: we can compute the eccentricity $e(u)$ with a BFS from $u$ and certify that all nodes have eccentricity $\mathrm{rad}(G)$ or more by checking that their distance to some node in $L$ is at least $\mathrm{rad}(G)$ using $|L|$ BFS traversals. If $L$ has size $o(n)$, where $n$ is the number of nodes, this opens the possibility of breaking the quadratic barrier for radius computation if one can efficiently find a small certificate when there exists one. This raises the problem of approximating the minimum certificate for radius. Interestingly, the size of the minimum radius certificate gives a lower bound on the complexity of one-to-all distance based algorithms for radius: such an algorithm must perform a number of one-to-all distance computations at least proportional to that size (see Section 4). We also raise similar approximation problems for diameter and all eccentricity computations.

Figure 1: An example of graph (for and ) with small certificates for radius (), diameter () and all eccentricities (). Its diameter is (eccentricity of blue nodes). Its radius is (eccentricity of green nodes). Plain lines correspond to edges while a dashed line with label corresponds to a path of length .

We show that a radius certificate can be formally defined as a covering of the node set with complements of open balls of radius $\mathrm{rad}(G)$ (open balls exclude nodes at distance exactly $\mathrm{rad}(G)$). We also define a diameter certificate as a covering with balls of radius $\mathrm{diam}(G) - e(u)$ where $\mathrm{diam}(G)$ is the diameter of the graph and $e(u)$ is the eccentricity of the center $u$ of the ball. Similarly, an all eccentricity certificate can be defined by combining two coverings as a pair of lower/upper certificates (see definitions in Section 3). Finding a minimum radius (or diameter) certificate is shown to be equivalent to minimum set cover. It is thus NP-hard while $O(\log n)$-approximation (only) is doable in polynomial time. Compared to set cover, it has an additional difficulty: the sets are not directly available and computing all of them would require quadratic time at least. It should be noted that these notions of certificate are independent of any algorithm: it is a graph property to have small or big certificates. As an example, for odd $k$, a square $k \times k$ grid has a one-node diameter certificate (its center) and a radius certificate with four nodes (its corners).

Figure 1 presents an example of bow-tie shaped graph for integral parameters and that has small certificates. The diameter certificate contains three nodes . The central (green) node has eccentricity . Note that its eccentricity is minimal () and is called a center. Any node at distance from has eccentricity at most as it can reach any node by following a path of length to and then a path from to of length at most. In set-cover terms, covers the ball . The rest of the graph is covered by the balls of radius centered at and , implying that is a diameter certificate. The radius certificate has five nodes such that any node is at distance at least from one of them. In other words, the complement of open balls of radius centered on them cover the whole graph.

We propose algorithms for radius, diameter and all-eccentricity certificate computation (as a byproduct, our algorithms also provide radius, diameter and all eccentricities). They follow a primal-dual approach that allows to obtain guarantees on the size of computed certificates and on the number of BFS traversals performed with respect to graph parameters that seem to be low in practical graphs. Our experiments on practical graphs from various sources show that these graphs not only have small certificates but also small coverings with much reduced sets: we can still cover the node set with few complementary of balls with increased radii (resp. decreased radii) compared to radii required for a radius (resp. diameter) certificate. Such properties explain the efficiency of a primal dual approach. Although our algorithms have some similarities with previous algorithms, this primal-dual flavor was not noticed before. They have similar performances in practice but provide significantly smaller certificates. Their proven guarantees also make them more robust. In particular, our radius and diameter algorithms handle the graph of Figure 1 with BFS traversals while previous exact algorithms require BFS traversals.

Our experiments show a striking phenomenon concerning specifically lower-bounding eccentricities (as in radius computation) that we call “antipode sparsity”. Given a ranking of the nodes (e.g., their ID order), we define the antipode of a node as the node at furthest distance from it having highest rank (the ranking is used for breaking ties among nodes at the same distance). We say that a node is an antipode if it is the antipode of some other node. We observe that practical graphs have very few antipodes for several rankings (i.e., large groups of nodes share the same antipode): most of the practical graphs tested (with up to hundreds of thousands of nodes) have fewer than 100 antipodes. Although our notion of antipode is reminiscent of the usage of antipodes on the Earth, we see that it can significantly deviate from it. On the sphere, the antipode of a point is the unique furthest point from it and the antipode of the antipode is the point itself. The same situation can be met in graphs such as a cycle or a grid torus. However, it appears to be much different in practical graphs: the relation is highly asymmetric, and most of the nodes have multiple furthest nodes (i.e., nodes at furthest distance from them) while there are very few antipodes overall. This situation is indeed highly favorable to one-to-all distance based algorithms as shown by the following theorem summarizing our algorithmic results.

Theorem 1

Given a connected graph having edges and antipodes overall (according to a given ranking), it is possible to compute:

  • its radius, a center and a radius certificate of size at most,

  • its diameter, a diametral node and a diameter certificate of size at most where is the maximum packing size for open balls ,

  • all eccentricities, a lower certificate of size at most and a minimum upper certificate ,

using BFS traversals per node of associated certificates (i.e., in , and time respectively).

Concerning diameter (second item), we analyse a minimalist algorithm inspired by previous practical algorithms. A basic primal-dual argument implies that the maximum size of a packing for (closed) balls for is a lower bound of the minimum size of a diameter certificate. The above theorem thus indeed proves that this basic approach approximates minimum diameter certificate within a ratio of . While the value (with radii reduced by a factor ) appears to be generally small in practice, the bound can be much higher than . However, this provides a first answer for the efficiency of practical diameter algorithms that can be complemented with the following observation. A second property often met by practical graphs is a high diameter to radius ratio (over 1.5 in our experiments) so that a large part of the graph is included in any ball centered at a central node . Such a ball corresponds to the nodes covered by adding to a diameter certificate in the associated set cover problem. We confirm this with a refinement of the parameter in the above theorem when combining radius and diameter computation where a center of the graph is used to initialize the basic diameter algorithm. We observe values for that refined parameter that are generally within a small constant factor of . This graph property associated with high diameter to radius ratio and the discovery of a node with small eccentricity as part of diameter computation is thus our second element for explaining the efficiency of practical diameter algorithms.

Concerning radius computation, practical algorithms tend to perform even faster than predicted by the first point of the above theorem (including the algorithm analyzed in the theorem). The large diameter to radius ratio also allows us to give an intuition for this. Our radius algorithm iteratively selects a node with minimal eccentricity lower-bound (according to the radius certificate computed so far) and adds its antipode to the candidate radius certificate. We can show that the selected node is always in the intersection of all balls of radius $\mathrm{rad}(G)$ (the radius of the graph) centered at previously discovered antipodes. As antipodes tend to have a high eccentricity to graph-radius ratio (in the order of the diameter to radius ratio), this intersection quickly shrinks toward the set of graph centers. As selecting a node with minimal lower-bound combined with discovering high eccentricity nodes is a classical approach, this gives a second element for understanding the efficiency of practical radius algorithms. Note that the idea of using antipodes systematically for finding high eccentricity nodes is new.

Although we reuse classical algorithmic tools, our radius and all eccentricity algorithms rely on a new algorithmic technique that we call minimum eccentricity selection which has its own interest. It specifically leverages on antipode sparsity for enabling efficient selection of a node with minimum eccentricity within a set maintained online. Its amortized complexity is low when the number of antipodes is small. Interestingly, this technique also allows to design algorithms based on an oracle giving access to all eccentricities. Such algorithm can then be efficiently implemented using our technique as long as eccentricity values are used to iteratively select a node such that is minimal for a given computable function satisfying some non-decreasing property. It is based on the idea of using antipodes to enhance a (lower) certificate until an adequate node is found. The technique also appears to be useful for optimizing diameter computation. We also introduce a new technique for diameter computation that we call delegate certificate in order to obtain both theoretical guarantees and efficient practical performances.

Finally, a surprising fact concerns the complexity of finding an optimum upper certificate (a certificate with minimum size that provides a tight upper-bound of the eccentricity of each node) as provided by our all eccentricity algorithm. Contrary to radius and diameter certificates (as discussed above), it appears to be tractable in polynomial time. In comparison, finding an optimum lower certificate is shown to be as hard as set-cover. Moreover, our all eccentricity algorithm roughly performs one BFS traversal per node of the optimum upper certificate when the number of antipodes is much smaller than the size of this upper certificate (as observed in our experiments). Note that this is close to the best possible for an algorithm based on one-to-all distance computations.

We additionally refine the performance analysis of our algorithms in particular cases when the input graph 1) has bounded doubling dimension, 2) has small hyperbolicity, or 3) is a chordal graph.

We believe that our certificate approach provides new insight into the efficiency of practical algorithms for radius and diameter, makes it possible to propose more robust practical algorithms with complexity guarantees, and significantly enhances the state of the art for all eccentricity computation. Moreover, the new techniques proposed here could enable new types of radius and diameter algorithms with parameterized complexity.

1.2 Related work

The concept of certificate is somewhat implicit in the method introduced in [27, 28] that consists in maintaining lower and upper bounds on the eccentricity of each node. After each BFS traversal these bounds are improved based on distances from the source of the traversal. The sources used for the BFS traversals performed by the algorithm form what we call a certificate. Contrary to this approach, we distinguish nodes used for improving lower bounds (the lower certificate) from those used for improving upper bounds (the upper certificate). Our definition of lower certificate uses a looser lower-bounding inequality because of this distinction. The main approach proposed for diameter computation [27] consists in alternating nodes with small lower bound and nodes with large upper bound as BFS sources. This can be seen as a mix of our basic diameter algorithm with a heuristic for finding nodes with small eccentricities.

The two-sweeps heuristic [23] performs only 2 traversals to provide a diameter estimate that appears to be tight in practice. The idea is to use the last visited node in the first traversal to start the second traversal. It thus introduces the idea of using what we call antipodes as tentative diametral nodes. The technique was first introduced for trees [20] where it happens to be exact. It was also shown to provide good approximation (up to a small constant) for chordal graphs and various graph classes [14].
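The two-sweeps idea can be sketched in a few lines. This is an illustrative sketch rather than the implementation of [23]: the adjacency-dictionary graph format and the names `bfs_distances` and `two_sweep` are assumptions, and a furthest node is picked with `max` instead of tracking the last node visited by the traversal.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source in an unweighted graph given as {node: neighbors}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def two_sweep(adj, start):
    """Diameter lower bound obtained with two BFS traversals."""
    first = bfs_distances(adj, start)
    far = max(first, key=first.get)       # a node furthest from start
    second = bfs_distances(adj, far)      # second sweep starts from that node
    return max(second.values())           # eccentricity of the far node

# Hypothetical example: a small tree, where the estimate is exact.
tree = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1, 4], 4: [3, 5], 5: [4]}
estimate = two_sweep(tree, 1)
```

On general graphs the returned value is only a lower bound on the diameter, since it is the eccentricity of one particular node.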

A four-sweeps heuristic is proposed in [8] and complemented with an exact diameter algorithm called iFub. The four-sweeps heuristic performs the two-sweeps method twice, using a mid-point of the longest path found in the first round as the starting point of the second one. The idea is that mid-points of longest paths make good candidates for central nodes or at least nodes with small eccentricity. The iFub method additionally inspects furthest nodes from the best candidate center found with the four-sweeps heuristic until the exact value of the diameter can be inferred.

The exact-sum-sweep method computes (exactly) both radius and diameter while performing few BFS traversals in practice [5]. It integrates many techniques proposed in previous practical algorithms plus a heuristic based on sums of distances that boosts the discovery of nodes with large eccentricity in an initial phase. It also handles the directed case in a very general manner.

The structure of random power law graphs is analyzed in [6] and the efficiency of practical diameter and radius algorithms is discussed for that type of graph. It is shown that random power law graphs satisfy properties similar to those we highlight. The main argument proposed for the efficiency of practical algorithms resides in the fact that such graphs have few furthest nodes (that is, nodes that appear to be furthest from some other node). However, we observe far fewer antipodes than furthest nodes in practice, and some graphs do have a fairly high number of furthest nodes. Our work provides a finer parameter and allows extending such an explanation to other types of practical graphs such as road networks and grid-like networks.

Packing and covering of hyperbolic graphs with balls is investigated in [9], although slightly different problems are considered. It would be interesting to derive similar results in hyperbolic graphs for the collections of balls (or complements of balls) we consider here.

1.3 Structure of the paper

We introduce basic graph and set-cover terminology in Section 2. The notions of certificate for radius, diameter, and all eccentricities are given in Section 3. We show how such a notion can be related to one-to-all distance based algorithms in Section 4. Section 5 is devoted to our radius algorithm. We introduce in Section 6 the technique of minimum eccentricity selection which is the core of this radius algorithm. We analyse a basic diameter algorithm and propose an optimization based on radius computation and minimum eccentricity selection in Section 7. Computation of all eccentricities is studied in Section 8. Theorem 1 is a consequence of the theorems proven in Sections 5, 7 and 8. We present some experimental results in Section 9 about the measurement on various practical graphs of the parameters involved in our theorems. Section 10 is devoted to graphs with low doubling dimension: a refined algorithm for diameter computation is proposed and our radius and diameter algorithms are analyzed in terms of radius and diameter approximation respectively. Section 11 refines the analysis of our radius algorithm in the case of graphs with low hyperbolicity. Finally, we study chordal graphs in Section 12: we show that centers form a diameter certificate while diametral nodes form a radius certificate, which allows deriving a linear time algorithm for computing all eccentricities of a bounded degree chordal graph.

2 Preliminaries

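Given an undirected unweighted graph $G$, we denote by $V$ its set of nodes. Let $d(u,v)$ be the distance between two nodes $u$ and $v$ in $G$, that is the length of a shortest path from $u$ to $v$. The eccentricity $e(u)$ of a node $u$ is the maximum length of a shortest path from $u$, that is $e(u) = \max_{v \in V} d(u,v)$. The furthest nodes of $u$ are the nodes at furthest distance from $u$, i.e., $F(u) = \{v \in V : d(u,v) = e(u)\}$. Given a ranking $r$ of the nodes, the antipode of a node $u$ for $r$ is its furthest node with highest rank. Formally, $\mathrm{Antipode}_r(u) = \operatorname{argmax}_{v \in V} (d(u,v), r(v))$ where pairs are ordered lexicographically. A node is called a furthest node (resp. an antipode) if it is a furthest node (resp. an antipode) of some other node. Given a set $W \subseteq V$, we let $\mathrm{Antipode}_r(W) = \{\mathrm{Antipode}_r(u) : u \in W\}$ denote the set of antipodes from nodes in $W$. The diameter $\mathrm{diam}(G) = \max_{u \in V} e(u)$ of $G$ is the maximum eccentricity in $G$ and the radius $\mathrm{rad}(G) = \min_{u \in V} e(u)$ is the minimum eccentricity in $G$. A diametral node $u$ is a node with maximum eccentricity ($e(u) = \mathrm{diam}(G)$). A central node (or simply center) $u$ is a node with minimum eccentricity ($e(u) = \mathrm{rad}(G)$). We let $B[u,\rho] = \{v \in V : d(u,v) \le \rho\}$ (resp. $B(u,\rho) = \{v \in V : d(u,v) < \rho\}$) denote the (closed) ball (resp. open ball) with radius $\rho$ centered at a node $u$. Similarly, we define its coball of radius $\rho$ as $\overline{B}(u,\rho) = V \setminus B(u,\rho)$, that is the complement of $B(u,\rho)$.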

We restrict ourselves to algorithms based on one-to-all distance queries: we suppose that an algorithm for one-to-all distances is given (typically BFS or Dijkstra). It takes a graph $G$ and a node $u$ as input and returns distances from $u$. More precisely, it returns a vector $D_u$ such that $D_u(v) = d(u,v)$ for all $v \in V$. In particular, $e(u)$ can be obtained as the maximum value in the vector and the antipode of $u$ as the index with highest rank where this value appears in $D_u$. We may measure the complexity of an algorithm by the number of one-to-all distance queries it performs when its cost mainly comes from these operations. A one-to-all distance based algorithm accesses the graph only through one-to-all distance queries and relies solely on distances known from queries, triangle inequality, and non-negativeness of distances for bounding unknown distances.
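As an illustration, a one-to-all distance query and the extraction of the eccentricity and antipode from its result might look as follows for an unweighted graph. The adjacency-dictionary format, the `rank` dictionary and the function names are assumptions made for this sketch, and the graph is assumed to be connected.

```python
from collections import deque

def dist_from(adj, u):
    """One-to-all distance query: BFS distances from u in a connected unweighted graph."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def eccentricity_and_antipode(adj, u, rank):
    """Return (e(u), antipode of u): the furthest node, ties broken by highest rank."""
    dist = dist_from(adj, u)
    antipode = max(dist, key=lambda v: (dist[v], rank[v]))  # lexicographic comparison
    return dist[antipode], antipode

# Hypothetical example: a star with center 0 and leaves 1, 2, 3, ranked by node ID.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
rank = {v: v for v in star}
from_leaf = eccentricity_and_antipode(star, 1, rank)   # leaves 2 and 3 tie at distance 2
```

Both values come from a single query, which is why the number of queries is a natural cost measure.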

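Given a collection $\mathcal{S}$ of subsets of $V$ such that $\bigcup_{S \in \mathcal{S}} S = V$, a covering with $\mathcal{S}$ is a sub-collection $\mathcal{C} \subseteq \mathcal{S}$ of sets such that their union covers all of $V$: $\bigcup_{S \in \mathcal{C}} S = V$. (A set $S$ is said to cover elements in $S$.) Recall that the set-cover problem consists in finding a covering of minimum size. We define a packing for $\mathcal{S}$ as a subset $P \subseteq V$ such that any set of $\mathcal{S}$ contains at most one element in $P$. The denomination comes from the fact that elements of $P$ correspond to pairwise disjoint sets of the dual collection $\mathcal{S}^* = \{S^*_u : u \in V\}$, where $S^*_u = \{S \in \mathcal{S} : u \in S\}$. A hitting set for $\mathcal{S}$ is a set $H \subseteq V$ that intersects all sets of $\mathcal{S}$. (Equivalently, a hitting set can be defined as a covering for $\mathcal{S}^*$ but it may be more convenient to consider a collection rather than its dual.) We let $\pi(\mathcal{S})$ denote the maximum size of a packing for $\mathcal{S}$, and $\kappa(\mathcal{S})$ denote the minimum size of a covering with $\mathcal{S}$. As a covering must cover each element of a packing with distinct sets, we obviously have $\pi(\mathcal{S}) \le \kappa(\mathcal{S})$ (weak duality). We say that a collection $\mathcal{S}'$ is restricted compared to $\mathcal{S}$ if there exists a one-to-one mapping $f$ from $\mathcal{S}'$ to $\mathcal{S}$ such that $S' \subseteq f(S')$ for all sets $S' \in \mathcal{S}'$. Note that this mapping then turns any covering with $\mathcal{S}'$ into a covering with $\mathcal{S}$ and we thus have $\kappa(\mathcal{S}) \le \kappa(\mathcal{S}')$. Similarly, a packing for $\mathcal{S}$ is also a packing for $\mathcal{S}'$ and we have $\pi(\mathcal{S}) \le \pi(\mathcal{S}')$. In other words, restricting the sets of a collection to smaller subsets increases maximum packing size and minimum covering size.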
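To make these definitions concrete, the following brute-force sketch computes the maximum packing size and minimum covering size of a tiny collection and of a restricted version of it. It is exponential and meant only for toy instances; the function names and the example collections are hypothetical.

```python
from itertools import combinations

def min_covering_size(universe, collection):
    """Size of a smallest sub-collection whose union is the whole universe."""
    for k in range(1, len(collection) + 1):
        for chosen in combinations(collection, k):
            if set().union(*chosen) == universe:
                return k
    raise ValueError("the collection does not cover the universe")

def max_packing_size(universe, collection):
    """Size of a largest set P such that each set of the collection holds at most one element of P."""
    for k in range(len(universe), 0, -1):
        for chosen in combinations(sorted(universe), k):
            if all(len(s & set(chosen)) <= 1 for s in collection):
                return k
    return 0

universe = {1, 2, 3, 4}
collection = [{1, 2}, {2, 3}, {3, 4}]
restricted = [{1, 2}, {3}, {4}]   # each set is a subset of the set at the same index above

packing, covering = max_packing_size(universe, collection), min_covering_size(universe, collection)
packing_r, covering_r = max_packing_size(universe, restricted), min_covering_size(universe, restricted)
```

On this instance weak duality holds with equality for both collections, and restricting the sets increases both quantities.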

3 Lower and upper certificates for eccentricities

Our notion of certificate is based on the fact that knowing all distances from a given node $x$ allows one to derive some bounds on the eccentricity of any other node $u$:

$d(u,x) \le e(u) \le d(u,x) + e(x)$ (1)

The first inequality derives directly from the eccentricity definition while the second one is a consequence of the triangle inequality. A possibly tighter lower-bound of $\max(d(u,x), e(x) - d(u,x))$ could be used as in [27] but this optimization does not drastically reduce certificate size (see Section 4).

We say that a set $L$ (resp. $U$) of nodes is a lower certificate (resp. an upper certificate) of $G$ when it is used to obtain lower bounds (resp. upper bounds) of eccentricities in $G$. Given the distances from a node $u$ to all nodes in $L \cup U$ and the eccentricities of nodes in $U$, we have the following lower and upper bounds for the eccentricity of $u$ (as a direct consequence of Inequality (1)):

$e_L(u) \le e(u) \le e^U(u)$ where $e_L(u) = \max_{x \in L} d(u,x)$ and $e^U(u) = \min_{x \in U} \left( d(u,x) + e(x) \right)$.

A lower (resp. upper) certificate $L$ (resp. $U$) is said to be tight when $e_L(u) = e(u)$ (resp. $e^U(u) = e(u)$) for all $u \in V$. An all-eccentricity certificate is defined as a pair $(L, U)$ of a tight lower certificate $L$ and a tight upper certificate $U$.

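Given a bound $D$ and a node $x$, we have $d(u,x) + e(x) \le D$ if and only if $u \in B[x, D - e(x)]$. Given an upper certificate $U$ we thus have $e^U(u) \le D$ if and only if $u \in \bigcup_{x \in U} B[x, D - e(x)]$. A diameter certificate is a set $U$ such that $e^U(u) \le \mathrm{diam}(G)$ for all $u \in V$. Equivalently it can be defined as a covering with the collection $\{B[x, \mathrm{diam}(G) - e(x)] : x \in V\}$, using balls whose radius equals the diameter minus the eccentricity of the center (and identifying a ball with its center).

Similarly, given a lower certificate $L$ and a bound $R$, we obviously have $e_L(u) \ge R$ for all nodes $u$ whose coball $\overline{B}(u, R)$ intersects $L$ (i.e., there exists a node in $L$ at distance at least $R$ from $u$). We thus define a radius certificate as a set $L$ such that $e_L(u) \ge \mathrm{rad}(G)$ for all $u \in V$ or equivalently as a hitting set for the collection of coballs of radius $\mathrm{rad}(G)$. As $v \in \overline{B}(u, R)$ if and only if $u \in \overline{B}(v, R)$, the collection of coballs of radius $R$ is its own dual, and a radius certificate can equivalently be defined as a covering for this collection.

Note that a tight lower certificate can equivalently be defined as a hitting set for the collection $\{\overline{B}(u, e(u)) : u \in V\}$. Similarly, a tight upper certificate can equivalently be defined as a covering with the collection $\{\{u \in V : d(u,x) + e(x) = e(u)\} : x \in V\}$.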
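The bounds and the two certificate definitions translate directly into code. The sketch below is one possible way to check a candidate certificate with one BFS per certificate node; the graph format and all function names are assumptions, the graph is assumed connected, and the radius and diameter are passed in as known values.

```python
from collections import deque

def dist_from(adj, u):
    """BFS distances from u in a connected unweighted graph {node: neighbors}."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def lower_bounds(adj, L):
    """e_L(u) = max over x in L of d(u, x), with value 0 when L is empty."""
    bounds = {u: 0 for u in adj}
    for x in L:
        d = dist_from(adj, x)
        for u in adj:
            bounds[u] = max(bounds[u], d[u])
    return bounds

def upper_bounds(adj, U):
    """e^U(u) = min over x in U of d(u, x) + e(x), infinite when U is empty."""
    bounds = {u: float("inf") for u in adj}
    for x in U:
        d = dist_from(adj, x)
        ecc_x = max(d.values())
        for u in adj:
            bounds[u] = min(bounds[u], d[u] + ecc_x)
    return bounds

def is_radius_certificate(adj, L, radius):
    return all(b >= radius for b in lower_bounds(adj, L).values())

def is_diameter_certificate(adj, U, diameter):
    return all(b <= diameter for b in upper_bounds(adj, U).values())

# Hypothetical example: a path on 5 nodes has radius 2 and diameter 4.
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
ends_certify_radius = is_radius_certificate(path5, [0, 4], 2)
middle_certifies_diameter = is_diameter_certificate(path5, [2], 4)
```

A check of this kind costs one traversal per certificate node, which is what makes small certificates attractive.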

Examples.

A path with $2k+1$ nodes has a radius certificate with two nodes (the two extremities) and a diameter certificate with one node (its mid-point). More generally, any graph $G$ such that $\mathrm{diam}(G) = 2\,\mathrm{rad}(G)$ has a one node diameter certificate (a center). It can be shown that any tree has a radius certificate of two nodes (two well chosen leaves) while its centers (at most two nodes) form a diameter certificate. A square grid has a radius certificate with four nodes (the corners) while its centers (at most four nodes) form a diameter certificate. As an extreme example of graph with large certificates, consider a cycle $C_{2k}$ with an even number of nodes. Its only radius and diameter certificates are both the whole set of its nodes. More generally, the whole set of nodes is the only diameter certificate of any graph where all nodes have the same eccentricity (when radius equals diameter).

Hardness of approximation.

Similarly to [9], we note that set cover can easily be encoded with ball cover: given a collection $\mathcal{S}$ of subsets of a ground set $X$ forming an instance of the set-cover problem, consider the split graph where the sets of $\mathcal{S}$ form a clique and the elements of $X$ form a stable set so that the nodes $S \in \mathcal{S}$ and $x \in X$ are adjacent if and only if $x \in S$. Without loss of generality, we may assume that no set of $\mathcal{S}$ equals $X$ (otherwise, the problem is trivial), no set is empty (otherwise, we can remove it) and that there exist two elements such that no set contains both of them (if needed, we add a singleton $\{x'\}$ to $\mathcal{S}$ where $x'$ is a new dummy element added to $X$). In this graph, sets and elements have eccentricity 2 and 3 respectively. Any minimum diameter certificate is a covering with balls of radius 1 or 0 (when centered on a set or an element, respectively). One can easily transform it into a covering with balls of radius 1 centered at nodes corresponding to sets (only). It then corresponds to an optimal solution of the original set-cover problem. Now consider the complementary graph, which is also a split graph, where elements form a clique while sets form a stable set and where $S$ and $x$ are adjacent if and only if $x \notin S$. Similarly, a minimum radius certificate for this complementary graph corresponds to a covering with coballs of radius 2 centered at sets and is also an optimal solution to the original set-cover problem. For that graph, finding a radius certificate is equivalent to finding a tight lower certificate. The hardness of set-cover approximation [16] thus implies that computing a minimum diameter (resp. radius or tight lower) certificate is NP-hard and that no polynomial time algorithm can approximate it within a factor $o(\log n)$ unless $\mathrm{P} = \mathrm{NP}$. Surprisingly, we will see that finding a minimum tight upper certificate can be done in polynomial time.
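The first construction can be checked on a toy instance. The sketch below builds the split graph from a hypothetical set-cover instance and verifies that a sub-collection covers the ground set exactly when the unit balls centered at its set nodes cover the graph. The node encoding (`("S", i)` for sets, `("x", e)` for elements) and the function names are assumptions of this sketch.

```python
from collections import deque

def split_graph(sets):
    """Set nodes ("S", i) form a clique; element node ("x", e) is adjacent to the sets containing e."""
    set_nodes = [("S", i) for i in range(len(sets))]
    adj = {s: {t for t in set_nodes if t != s} for s in set_nodes}
    for i, members in enumerate(sets):
        for e in members:
            adj.setdefault(("x", e), set()).add(("S", i))
            adj[("S", i)].add(("x", e))
    return adj

def dist_from(adj, u):
    """BFS distances from u in a connected unweighted graph."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def covered_by_unit_balls(adj, chosen_sets):
    """True if every node is within distance 1 of one of the chosen set nodes."""
    dists = [dist_from(adj, ("S", i)) for i in chosen_sets]
    return all(any(d[v] <= 1 for d in dists) for v in adj)

# Hypothetical instance: no set is the whole ground set, and elements 1 and 4 share no set.
instance = [{1, 2}, {2, 3}, {3, 4}]
graph = split_graph(instance)
ecc = {v: max(dist_from(graph, v).values()) for v in graph}
good_cover = covered_by_unit_balls(graph, [0, 2])   # {1, 2} and {3, 4} cover the ground set
bad_cover = covered_by_unit_balls(graph, [0, 1])    # element 4 is left uncovered
```

Here the diameter is 3, so a set node (eccentricity 2) contributes a ball of radius 1 to a diameter certificate, as in the reduction.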

4 Lower bound for radius computation

We first show that the notion of radius certificate is related to the minimum number of queries a one-to-all distance based algorithm must perform.

Theorem 2

Given a graph $G$, if a one-to-all distance based algorithm for radius queries a set $Q$ of nodes then $Q \cup \mathrm{Antipode}_r(Q)$ is a radius certificate for any ranking $r$. Such a radius algorithm thus requires at least $\frac{1}{2}|L_{OPT}|$ one-to-all distance queries where $|L_{OPT}|$ is the minimum size of a radius certificate.

Proof. Consider a one-to-all distance based algorithm and let denote the set of nodes it queries for one-to-all distances. A proof of correctness of the algorithm allows to conclude that all nodes have eccentricity at least based on triangle inequality and the distances known to the algorithm. That is for each node , there is a node such that we can prove based on triangle inequality and distances from nodes in . Consider a node such that the proof uses a minimum number of triangle inequalities. If neither nor is in , the proof must use a triangle inequality for some node and a proof of . In the case , we would have a shorter proof in contradiction with the choice of . We thus consider only the case . We then have a proof of . Either or the proof uses a node such that . The choice of again implies (otherwise would provide a shorter proof). By repeating this argument, we deduce that a shortest proof of uses a sequence of nodes such that for and . Consider the antipode . We then have . By triangle inequality, we have and . We thus have . In all cases, must contain a node at distance or more from , it is thus a radius certificate.  

Note that a similar result does not hold for diameter certificate. A one-to-all distance based algorithm for diameter could query a set of node such that for any pair there is satisfying which implies by triangle inequality. Note that checking that a set has this property requires quadratic time in general (under SETH) even if has size (see the reduction from SAT to diameter computation in [25]). Our diameter certificate definition requires that for each node a single node allows to bound all distances for using . The reason for this stronger requirement is to enable sub-quadratic time verification that a certificate is indeed a certificate when it has size.

5 Radius computation and certification

We now propose a radius algorithm with complexity parameterized by the number of antipodes in the input graph. As in previous algorithms [27, 5], it maintains lower bounds on the eccentricities of all nodes and performs one-to-all distance queries from nodes with minimal lower bound as a first ingredient. As in the two-sweeps and four-sweeps heuristics [20, 23, 8], it performs one-to-all distance queries from antipodes of previous query sources as a second ingredient. Contrary to these heuristics, it iterates until an exact solution is obtained (together with a radius certificate).

The idea of the algorithm is to maintain both a set $K$ of nodes with distinct antipodes and a lower certificate $L$ (initially empty). We iteratively select a node $u$ with minimal lower-bound $e_L(u)$ and perform a one-to-all distance query from $u$. As long as this bound is not tight (i.e., $e_L(u) < e(u)$), we add $u$ to $K$ and $\mathrm{Antipode}_r(u)$ to $L$ while eccentricity lower-bounds are improved accordingly. (The fact that the bound is not tight implies that the antipode of $u$ could not previously be in $L$.) As soon as the bound is tight (i.e., $e_L(u) = e(u)$), we then claim that $u$ is a center (i.e., its eccentricity is minimal) and return $e(u)$ as the radius and $L$ as radius certificate. Algorithm 1 formally describes the whole method.

Note the primal-dual flavor of this algorithm as the set $K$ (which has the same size as $L$) is a packing for the collection of sets of nodes sharing the same antipode, which is a restricted collection of the coballs of radius $\mathrm{rad}(G)$, for which the computed certificate $L$ is a covering.

Input: A graph and a ranking of its node set .
Output: The radius of , a center and a radius certificate .
 /* Lower certificate (tentative covering with ) */
Maintain (initially 0) for all .
 /* Packing for . */
Do
       Select such that is minimal.
       /* Distances from . */
       /* Eccentricity of . */
       If then
             return , , and
      else
             /* Antipode of for . */
             /* Distances from . */
            
            
             For do
            
      
while .
Return , and where .
Algorithm 1 Computing the radius, a center and a radius certificate.
Theorem 3

Given a graph and a ranking on its node set , Algorithm 1 computes its radius , a center and a radius certificate with one-to-all distance queries.

Proof. We first prove the termination of Algorithm 1. Consider an iteration where we add the antipode of the selected node to . We cannot have as we would then have which is the termination case. In other words, nodes added to have distinct antipodes and is a packing for . As long as the do-while loop runs, each iteration adds a new node to . If ever we reach the point where , then the lower-bound of each node is tight: . The next iteration must then terminate. The complexity is straightforward: at most one-to-all distance queries are performed and as .

We now prove the correctness of Algorithm 1. Consider an iteration of the do-while loop. By the choice of , we then have for all . If the termination case occurs, we have for all . This ensures that has minimum eccentricity (it is a center). We thus have and is a radius certificate as for all . Finally, if ever the condition for continuing the do-while loop is false, we have . For , we thus have . That is, is a radius certificate and is a center.  

In practice, we observe very fast convergence compared to (see Section 9). We can give the following argument for that. The node selected at each iteration satisfies . We thus have , that is . It appears that the eccentricity of antipodes is generally large compared to radius in practical graphs, and the set tends to quickly shrink toward the set of centers as we add antipodes to .
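The loop described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is unweighted and connected, it is given as an adjacency dictionary, a BFS plays the role of the one-to-all distance query, and ties among antipodes are broken by iteration order in place of the ranking. The names `bfs_distances` and `radius_with_certificate` are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """One-to-all distance query on an unweighted, connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def radius_with_certificate(adj):
    """Return (radius, center, lower certificate, number of queries)."""
    lower = {u: 0 for u in adj}   # lower bounds on eccentricities
    certificate = []              # antipodes queried so far
    queries = 0
    while True:
        # Select a node whose lower bound is minimal.
        u = min(adj, key=lambda v: lower[v])
        dist_u = bfs_distances(adj, u)
        queries += 1
        ecc_u = max(dist_u.values())
        if ecc_u == lower[u]:
            # Tight bound: every other node has eccentricity >= lower[u].
            return ecc_u, u, certificate, queries
        # Bound not tight: query an antipode of u (assumed tie-breaking:
        # first farthest node in iteration order) and improve all bounds.
        antipode = max(adj, key=lambda v: dist_u[v])
        dist_a = bfs_distances(adj, antipode)
        queries += 1
        certificate.append(antipode)
        for v in adj:
            lower[v] = max(lower[v], dist_a[v])
```

On a path with five nodes, this sketch queries both endpoints as antipodes before the middle node is selected with a tight bound. Each unsuccessful iteration makes the lower bound of the selected node tight, which is why the loop cannot run forever.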

6 Minimum eccentricity selection

The core of the above radius algorithm is a general technique depending on a user-defined function that we call minimum eccentricity selection (minES) for . It is a procedure that returns a node with minimum eccentricity with respect to . Its amortized complexity is low in graphs with few antipodes. More precisely, for a given graph and a function that maps a node and an estimation of to a value, it provides a function returning a node such that is minimum as long as is non-decreasing, i.e., for for all . A similar function returns the value of for such node . The challenge here is to avoid the computation of all eccentricities.

/* Lower certificate. */
Maintain (initially 0) for all .
Function
       Repeat
            
             /* Distances from . */
             /* Eccentricity of . */
             If then
                   return
            else
                   /* Antipode of for . */
                   /* Distances from . */
                  
                   For do
                  
            
      
Function
      
       Return
Algorithm 2 Minimum eccentricity selection with respect to function .

We implement such a selection by maintaining lower bounds of all eccentricities as in Algorithm 1 and by using these lower bounds as estimates for true eccentricities. When the selection procedure is called, a node which is minimum according to lower bounds is considered. Such a node is found by evaluating for all where denotes the lower bound stored for a node . A one-to-all distance query from is then performed. If its eccentricity happens to be equal to its lower-bound we claim that is minimum and return that node. Otherwise, the antipode of is used to improve lower bounds before trying again. Algorithm 2 formally describes this.

Proposition 1

Given a graph and a ranking of its node set , we consider a function such that can be evaluated for any and . If is non-decreasing for all , i.e., for , function of Algorithm 2 returns a node such that is minimal and updates the lower certificate such that and for all . Moreover it can perform computations of using one-to-all distance queries and calls to where denotes the set of nodes added to .

Note that calls to represent less than one-to-all distance queries with respect to time cost when can be evaluated in constant time.

Proof. The correctness of the selection comes from the fact that is non-decreasing: if , we then have . The case can only occur if the antipode of was not in and happens at most times in total. In particular, each call to terminates. If an algorithm makes calls to the , the number of successful iterations where is precisely while the number of unsuccessful iterations is at most the number of nodes added to . For each such iteration we perform 2 one-to-all distance queries instead of 1. The total number of queries is thus . In all cases, we perform calls to per iteration.  

As an example of usage, Algorithm 1 for radius is equivalent to the following algorithm using our minimum eccentricity selection for the basic function .

; for all .
Function : return
return

As another example, the function can be used to select a node with minimum eccentricity in a set of nodes when returns for and otherwise. One can easily check that is non-decreasing for all . We use our minimum eccentricity selection as an optimization for diameter computation and as a core tool for computing all eccentricities in the next sections.
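The selection procedure and its two example uses can be sketched as follows. This is an illustrative Python sketch under assumptions that are not taken from the paper: unweighted connected graphs stored as adjacency dictionaries, BFS as the one-to-all distance query, iteration order as tie-breaking rule, and the hypothetical names `MinEccentricitySelector`, `select` and `g`. The user-supplied function `g(u, bound)` must be non-decreasing in its second argument.

```python
from collections import deque


def bfs_distances(adj, source):
    """One-to-all distance query on an unweighted, connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


class MinEccentricitySelector:
    """Keeps eccentricity lower bounds across successive selections."""

    def __init__(self, adj):
        self.adj = adj
        self.lower = {u: 0 for u in adj}  # lower bounds, shared by all calls
        self.certificate = []             # antipodes queried so far
        self.queries = 0

    def select(self, g):
        """Return (u, e(u)) such that g(u, e(u)) is minimal."""
        while True:
            # Candidate: minimal according to the current lower bounds.
            u = min(self.adj, key=lambda v: g(v, self.lower[v]))
            dist_u = bfs_distances(self.adj, u)
            self.queries += 1
            ecc_u = max(dist_u.values())
            if ecc_u == self.lower[u]:
                # g(u, e(u)) <= g(v, lower[v]) <= g(v, e(v)) for all v.
                return u, ecc_u
            # Otherwise use an antipode of u to improve the lower bounds.
            antipode = max(self.adj, key=lambda v: dist_u[v])
            dist_a = bfs_distances(self.adj, antipode)
            self.queries += 1
            self.certificate.append(antipode)
            for v in self.adj:
                self.lower[v] = max(self.lower[v], dist_a[v])


# Usage: a center, then a minimum-eccentricity node inside a set W.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
selector = MinEccentricitySelector(path)
center, radius = selector.select(lambda u, bound: bound)
W = {0, 1}
best_in_w, ecc_in_w = selector.select(
    lambda u, bound: bound if u in W else float("inf"))
```

Because the lower bounds persist between calls, the second selection in this sketch reuses the work of the first one, which reflects the amortized cost argument of Proposition 1.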

7 Diameter computation and certification

We now analyze a simple diameter algorithm. The main ingredient of the algorithm consists in maintaining upper bounds on all eccentricities and performing one-to-all distance queries from nodes with maximum upper bound. It thus follows the main line of previous practical algorithms [27, 5]. However, we present the algorithm with a more general primal-dual approach which was not noticed before. Moreover, we introduce a new technique called delegate certificate: after selecting a node $u$ with maximal upper bound, it consists in performing a one-to-all distance query from any node $x$ such that $e(u) = d(u,x) + e(x)$. We call such a node $x$ a tight upper certificate for $u$, as the upper bound $d(u,x) + e(x)$ it provides on $e(u)$ is tight. A possible choice for $x$ is $u$ itself, in which case the algorithm becomes a very basic version of [27]. However, we observe that choosing a node $x$ with minimal eccentricity offers much better performance in practice (see Section 9). Our complexity analysis is independent of the choice of $x$; we thus present the algorithm in the most general manner.

The algorithm grows both a packing and an upper certificate until the upper bound on the eccentricity of any node is at most the maximum eccentricity of nodes in . As long as this condition is not satisfied, a node with maximal upper bound is selected and added to . We then choose a tight upper certificate for and add it to . Note that we now have and cannot be selected again. This ensures that the termination condition is reached at some point when is a certificate that all nodes have eccentricity at most that of a maximum eccentricity node in which must thus be equal to diameter. See Algorithm 3 for a formal description.

We claim that the set is a packing for the collection of open balls. As it has same size as the certificate returned by the algorithm in the end, this allows to state the following theorem.

Theorem 4

Given a graph , Algorithm 3 computes the diameter of , a diametral node and a diameter certificate of size at most with one-to-all distance queries where is the maximum packing size for the collection of open balls for . It approximates minimum diameter certificate within a factor where is the maximum packing size for the collection .

Input: A graph and a ranking of its node set .
Output: The diameter of , a diametral node and a diameter certificate .
/* Upper certificate (tentative covering with ). */
Maintain (initially ) for all .
/* Packing for . */
Do
       Select such that is maximal.
       /* Distances from . */
       /* Eccentricity of . */
      
      1 Select such that . /* Delegate certificate for . */
      2 /* Distances from . */
       /* Eccentricity of . */
      
       For do
      
while
Return , and where satisfies .
Algorithm 3 Computing diameter and a diameter certificate. The basic version of the algorithm consists in selecting the node itself as its own delegate certificate in Line 1. (The redundant one-to-all distance query in Line 2 can then be omitted.)

Proof. We already argued the termination and the correctness of the algorithm above. We thus show the packing property of . Suppose for the sake of contradiction that is not a packing for . Consider the first iteration where a node is added to while some open ball in contains both and some node added previously. Let be the tight upper certificate for that was added to . By triangle inequality, we have . The choice of implies . Combining the two inequalities, we obtain . As and are in , we have and finally get which implies . However, it is required that has maximal upper bound when it is selected for being added to , in contradiction with . We conclude that must be a packing for . Both sizes of and are bounded by . As any diameter certificate is a covering for and has size at least, this guarantees that the size of is within a factor at most from optimum.  

This analysis can be complemented when we start Algorithm 3 with and initially where is a center of the graph computed with Algorithm 1. We refer to this combination as Algorithm 1+3 in the sequel. A similar proof then shows that is a packing for where for and . We obtain the following corollary from Theorems 3 and 4.

Corollary 1

Given a graph and a ranking of its node set , Algorithm 1+3 computes the diameter of , a diametral node and a diameter certificate of size at most with one-to-all distance queries at most where is a center of returned by Algorithm 1 and is the maximum packing size for the collection of open balls with radii factors for and (for ).

Proof. In addition to the proof of Theorem 4, we just have to consider the case when a node would be added to while having . As , we then have . This would raise a contradiction as the choice of relies on .  

This explains the efficiency of practical algorithms, as we observe that coverings of small size often exist for in practical graphs (see Section 9). As mentioned before, a further optimization consists in selecting a tight upper certificate for with minimal eccentricity. Using a function such that returns when and returns otherwise, it can be obtained through our minimum eccentricity selection procedure by replacing Line 1 with . We refer to this variant as Algorithm 1+3’ in the sequel. This optimization through the delegate certificate technique provides performance similar to previous practical algorithms (see Section 9) while providing the above complexity guarantees (Corollary 1 also applies to this variant).
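The basic version of the diameter algorithm, in which each selected node serves as its own tight upper certificate, can be sketched in Python as follows. This is only an illustration under assumptions not taken from the paper: unweighted connected graphs stored as adjacency dictionaries, BFS as the one-to-all distance query, ties broken by iteration order, and the hypothetical names `bfs_distances` and `diameter_with_certificate`. The delegate optimization with a minimum-eccentricity certificate is not implemented here.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """One-to-all distance query on an unweighted, connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter_with_certificate(adj):
    """Return (diameter, diametral node, upper certificate, queries)."""
    upper = {u: math.inf for u in adj}  # upper bounds on eccentricities
    certificate = []                    # nodes whose distances were queried
    best, diametral, queries = -1, None, 0
    while True:
        # Select a node with maximal upper bound.
        u = max(adj, key=lambda v: upper[v])
        dist_u = bfs_distances(adj, u)
        queries += 1
        ecc_u = max(dist_u.values())
        if ecc_u > best:
            best, diametral = ecc_u, u
        # Basic delegate choice (assumption of this sketch): x = u, which is
        # trivially tight since e(u) = d(u, u) + e(u).
        certificate.append(u)
        for v in adj:
            upper[v] = min(upper[v], dist_u[v] + ecc_u)
        # Stop when no node can have eccentricity above the best one seen.
        if max(upper.values()) <= best:
            return best, diametral, certificate, queries
```

On a path with five nodes, this sketch queries both endpoints and then the middle node, whose small eccentricity is what finally brings all upper bounds down to the diameter. This is consistent with the observation that low-eccentricity certificates are the useful ones.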

8 All eccentricities

We now present a novel algorithm for all eccentricities. It relies on minimum eccentricity selection and properties of tight upper certificates.

8.1 Optimal tight upper certificate

We first characterize the minimum tight upper certificate of a graph which is tightly related to the notion of tight upper certificate.

Proposition 2

Given a graph $G$, being a tight upper certificate defines a binary relation $\preceq$ which is a partial order ($u \preceq x$ stands for $e(u) = d(u,x) + e(x)$). Moreover, the set of all maximal elements of this partial order is the unique tight upper certificate of $G$ with minimum size.

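Proof. We first prove that the relation is a partial order. It is obviously reflexive as the distance from a node to itself is zero, implying $e(u) = d(u,u) + e(u)$. It is antisymmetric: if $u$ and $x$ are both tight upper certificates one for the other, we then have $e(u) = d(u,x) + e(x)$ and $e(x) = d(x,u) + e(u)$, which gives $d(u,x) = 0$ and thus $u = x$. We finally show transitivity. Suppose that $x$ is a tight upper certificate for $u$ and that $y$ is a tight upper certificate for $x$, that is $e(u) = d(u,x) + e(x)$ and $e(x) = d(x,y) + e(y)$. We thus have $e(u) = d(u,x) + d(x,y) + e(y)$. As triangle inequality implies $e(u) \le d(u,y) + e(y)$, we obtain $d(u,x) + d(x,y) \le d(u,y)$, and thus $d(u,y) = d(u,x) + d(x,y)$ by triangle inequality again. We finally get $e(u) = d(u,y) + e(y)$ and $y$ is a tight upper certificate for $u$.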

We now show that the set of maximal elements for is the unique optimal tight upper certificate of . For any non-maximal element , we can build a chain where has a tight upper certificate , if is not in , it has a tight upper certificate , and so on. As the partial order is finite, the chain must be finite and must be in for some . The transitivity of implies that is a tight upper certificate for implying . This shows that is a tight upper certificate. As each element of is the only tight upper certificate for itself (as a maximal element), is included in any tight upper certificate of . In particular, any minimum tight upper certificate must indeed equal .  

Note that this minimum tight upper certificate includes in particular all centers of the graph: a center $c$ cannot have a tight upper certificate $x \neq c$ (otherwise we have $e(x) = e(c) - d(c,x) < e(c)$, in contradiction with the minimality of $e(c)$).
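Proposition 2 can be checked by brute force on small graphs. The following Python sketch is an illustration only: it assumes an unweighted connected graph stored as an adjacency dictionary, takes the relation "$x$ is a tight upper certificate for $u$" to mean $e(u) = d(u,x) + e(x)$, and uses the hypothetical names `bfs_distances` and `minimum_tight_upper_certificate`. It runs one BFS per node, so it is quadratic and not meant as an efficient method.

```python
from collections import deque


def bfs_distances(adj, source):
    """One-to-all distance query on an unweighted, connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def minimum_tight_upper_certificate(adj):
    """Return (maximal elements of the relation, eccentricities, distances)."""
    dist = {u: bfs_distances(adj, u) for u in adj}
    ecc = {u: max(dist[u].values()) for u in adj}

    def is_tight(u, x):
        # x is a tight upper certificate for u.
        return ecc[u] == dist[u][x] + ecc[x]

    # Maximal elements: nodes certified by no node other than themselves.
    maximal = {u for u in adj
               if not any(x != u and is_tight(u, x) for x in adj)}
    return maximal, ecc, dist


# Usage on a 5-node path and on a 4-node cycle.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path_cert, path_ecc, path_dist = minimum_tight_upper_certificate(path)
cycle_cert, _, _ = minimum_tight_upper_certificate(cycle)
```

On the path, only the middle node is maximal and it certifies every other node tightly. On the cycle, all nodes have the same eccentricity, so no node certifies another and the whole node set is returned.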

8.2 All eccentricity computation and certification

We now propose to compute all eccentricities of a graph as follows (see Algorithm 4 for a formal description). We maintain both a lower certificate and an upper certificate . As long as some node has untight upper bound, we select a node with untight upper bound and minimal eccentricity using our minimum eccentricity selection procedure which then additionally ensures . (We use for that purpose a function returning when the eccentricity value equals the upper bound.) We claim that is in (see Lemma 1 below). We thus add to the upper certificate and update upper bounds accordingly. When our minimum eccentricity selection procedure detects that all nodes have tight upper bounds, lower bounds must also be tight. The algorithm then terminates with the following guarantees.
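The procedure described in this paragraph can be sketched in Python as follows. This is a hedged illustration, not the authors' Algorithm 4: it assumes an unweighted connected graph stored as an adjacency dictionary, BFS as the one-to-all distance query, iteration order as tie-breaking rule, and a selection key that is infinite for nodes whose lower and upper bounds already coincide. The names `bfs_distances` and `all_eccentricities` are hypothetical.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """One-to-all distance query on an unweighted, connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def all_eccentricities(adj):
    """Return (eccentricities, lower certificate, upper certificate)."""
    lower = {u: 0 for u in adj}
    upper = {u: math.inf for u in adj}
    lower_cert, upper_cert = [], []

    def key(v):
        # Nodes whose bounds already meet are excluded from the selection.
        return math.inf if lower[v] == upper[v] else lower[v]

    while True:
        u = min(adj, key=key)
        if lower[u] == upper[u]:
            # Every node has matching bounds: all eccentricities are known.
            return dict(lower), lower_cert, upper_cert
        dist_u = bfs_distances(adj, u)
        ecc_u = max(dist_u.values())
        if ecc_u > lower[u]:
            # Lower bound not tight: improve lower bounds from an antipode.
            antipode = max(adj, key=lambda v: dist_u[v])
            dist_a = bfs_distances(adj, antipode)
            lower_cert.append(antipode)
            for v in adj:
                lower[v] = max(lower[v], dist_a[v])
        else:
            # Lower bound tight, upper bound not: add u as upper certificate.
            upper_cert.append(u)
            for v in adj:
                upper[v] = min(upper[v], dist_u[v] + ecc_u)
```

Each iteration of this sketch makes either the lower bound or the upper bound of the selected node tight, so the loop stops after at most two iterations per node. On a path with five nodes, the only node added to the upper certificate is the middle one.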

Input: A graph and a ranking of .
Output: All eccentricities, a tight lower certificate of and a tight upper certificate of .
/* Lower certificate (tentative hitting set for ) */
Maintain (initially 0) for all .
/* Upper certificate (maximal nodes for ) */
Maintain (initially ) for all .
Function
       If then return else return
While do
      
       /* Distances from . */
       /* Eccentricity of . */
      
       For do