# Shortest path queries, graph partitioning and covering problems in worst and beyond worst case settings

In this thesis, we design algorithms for several NP-hard problems in both worst and beyond worst case settings. In the first part of the thesis, we apply the traditional worst case methodology and design approximation algorithms for the Hub Labeling problem; Hub Labeling is a preprocessing technique introduced to speed up shortest path queries. Before this work, Hub Labeling had been extensively studied mainly in the beyond worst case analysis setting, and in particular on graphs with low highway dimension. In this work, we significantly improve our theoretical understanding of the problem and design (worst-case) algorithms for various classes of graphs, such as general graphs, graphs with unique shortest paths and trees, as well as provide matching inapproximability lower bounds for the problem in its most general settings. Finally, we demonstrate a connection between computing a Hub Labeling on a tree and searching for a node in a tree. In the second part of the thesis, we turn to beyond worst case analysis and extensively study the stability model introduced by Bilu and Linial in an attempt to describe real-life instances of graph partitioning and clustering problems. Informally, an instance of a combinatorial optimization problem is stable if it has a unique optimal solution that remains the unique optimum under small multiplicative perturbations of the parameters of the input. Utilizing the power of convex relaxations for stable instances, we obtain several results for problems such as Edge/Node Multiway Cut, Independent Set (and its equivalent, in terms of exact solvability, Vertex Cover), clustering problems such as k-center and k-median and the symmetric Traveling Salesman problem. We also provide strong lower bounds for certain families of algorithms for covering problems, thus exhibiting potential barriers towards the design of improved algorithms in this framework.



#### 2.1 Introduction

Computing shortest-path queries has become an essential part of many modern-day applications. A typical setting is a sparse input graph $G=(V,E)$ of millions of nodes (we denote $n=|V|$) with a length function $l: E \to \mathbb{R}^+$, and a very large number of queries that need to be answered in (essentially) real time. Two classical approaches to such a problem are the following. One could precompute all pairwise distances and store them in an $n \times n$ matrix, and then respond to any distance query in constant time (and, by using the appropriate data structures of size $O(n^2)$, one can also recover the vertices of the shortest path in time linear in the number of vertices that the path contains). This approach, although optimal with respect to the query time, is potentially wasteful with respect to space. Moreover, in many applications quadratic space is simply prohibitive. A second approach then would be to simply store an efficient representation of the graph, which for sparse graphs would result in a representation of size $O(n)$. In this second approach, whenever a query arrives, one can run Dijkstra’s algorithm and retrieve the distance and the corresponding shortest path in linear time for undirected graphs with integer weights [123], and, more generally, in time $O(m + n\log n)$, where $m=|E|$, for arbitrary weighted directed graphs [71]. Two obvious problems with this latter approach are that linear time is nowhere close to real time, and moreover, such an approach requires a network representation that is global in nature, and so it does not allow for a more distributed way of computing shortest-path queries. Thus, a natural question that arises is whether we can get a trade-off between space and query time complexity, and whether we can obtain data structures that inherently allow for distributed computations as well (the latter is a desirable property in many applications, when, ideally, one would not want a central coordination system).

Data structures that allow for responding to distance queries are usually called distance oracles, and have been intensively studied in the last few decades, mainly focusing on the optimal trade-offs between space and query time, as well as exact/approximate recovery (e.g. see [124, 118, 95, 59, 77]).

Here, we will mainly be interested in a slightly different approach based on vertex labelings, that allows for simple schemes that are easy to implement in a distributed setting. The starting point is the observation that in an explicit representation of (parts of) the adjacency matrix of a graph, the names of the vertices are simply place holders, not revealing any information about the structure of the graph. This motivates the search for more informative names (or labels) for each vertex, that would allow us to derive some information about the vertex.

The first such approach was introduced by Breuer and Folkman [40, 41], and involves using more localized labeling schemes that allow one to infer the adjacency of two nodes directly from their labels, without using any additional information, while achieving sublinear space bounds. A classic work of Kannan, Naor and Rudich [93] further explored the feasibility of efficient adjacency labeling schemes for various families of graphs. Taking this line of research a step further, Graham and Pollak [80] were among the first to consider the problem of labeling the nodes of an unweighted graph such that the distance between two vertices can be computed using these two labels alone. They proposed to label each node with a word of $q_G$ symbols from the set $\{0,1,*\}$, such that the distance between two nodes corresponds to the Hamming distance of the two words (the distance between $*$ and any symbol being zero). Settling what was known as the Squashed Cube Conjecture, Winkler [126] proved that $q_G \le n-1$ for every graph $G$, where $n$ is the number of vertices of the graph (note though that the scheme requires linear query time to decode the distance of a pair).

Moving towards the end of the 90s, Peleg [119] revisited the problem of existence of efficient labeling schemes of any kind that could answer shortest-path queries. The setting now is quite general, in that we are allowed as much preprocessing time as needed for the whole network, and the goal is to precompute labels for each vertex of the graph such that any shortest-path query between two vertices can be computed by looking only at the corresponding labels of the two vertices (and applying some efficiently computable “decoding” function on them that actually computes the distance). If the labels are short enough on average, then the average query time can be sublinear. In [119], Peleg gave polylogarithmic upper bounds for the size of the labels needed to answer exact shortest-path queries for weighted trees and chordal graphs, and also gave some bounds for distance approximating schemes. Gavoille et al. [75] continued along similar lines and proved various upper and lower bounds for the label size for various classes of (undirected) graphs. They also modified the objective function and, besides getting bounds for the size of the largest label, they also obtained bounds for the average size of the labels. Shortly after that work, Cohen et al. [58] presented their approach for the problem, proposing what is now known as the Hub Labeling framework for generating efficient labeling schemes for both undirected and directed weighted graphs.

###### Definition 2.1 (Hub Labeling [58, 75]).

Consider an undirected graph $G=(V,E)$ with edge lengths $l(e)>0$. Suppose that we are given a set system $\{H_u\}_{u\in V}$ with one set $H_u\subseteq V$ for every vertex $u$. We say that $\{H_u\}_{u\in V}$ is a hub labeling if it satisfies the following covering property: for every pair of vertices $u,v\in V$ ($u$ and $v$ are not necessarily distinct), there is a vertex in $H_u\cap H_v$ (a common “hub” for $u$ and $v$) that lies on a shortest path between $u$ and $v$. We call vertices in sets $H_u$ hubs: a vertex $v\in H_u$ is a hub for $u$.

In the Hub Labeling problem (HL), our goal is to find a hub labeling with a small number of hubs; specifically, we want to minimize the $\ell_p$-cost of a hub labeling.

###### Definition 2.2.

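The $\ell_p$-cost of a hub labeling $\{H_u\}_{u\in V}$ equals $\left(\sum_{u\in V}|H_u|^p\right)^{1/p}$ for $p\in[1,\infty)$; the $\ell_\infty$-cost is $\max_{u\in V}|H_u|$. The hub labeling problem with the $\ell_p$-cost, which we denote by HL$_p$, asks to find a hub labeling with the minimum possible $\ell_p$-cost.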

We note here that, although our presentation will only involve undirected graphs, most of our results extend to the directed setting as well (see Section 2.7). In the next few sections, we will study HL and design approximation algorithms for various classes of graphs, as well as show strong lower bounds for general graphs. But first, we will explain why we care about the Hub Labeling problem, and how it is related to the shortest-path problem.

Nowadays hundreds of millions of people worldwide use web mapping services and GPS devices to get driving directions. That creates a huge demand for fast algorithms for computing shortest paths (algorithms that are even faster than the classic Dijkstra’s algorithm). Hub labelings provide a highly efficient way of computing shortest paths and are used in many state-of-the-art algorithms (see also the paper of Bast et al. [28] for a review and discussion of various methods for computing shortest paths that are used in practice).

We will now demonstrate the connection between Hub Labeling and the problem of computing shortest paths. Consider a graph $G=(V,E)$ with edge lengths $l(e)>0$. Let $d(u,v)$ be the shortest-path metric on $G$. Suppose that we have a hub labeling $\{H_u\}_{u\in V}$. During the preprocessing step, we compute and store the distance $d(u,w)$ between every vertex $u$ and each hub $w\in H_u$ of $u$. Observe that we can now quickly answer a distance query: to find $d(u,v)$ we compute $\min_{w\in H_u\cap H_v}\left(d(u,w)+d(w,v)\right)$. By the triangle inequality, $d(u,v)\le\min_{w\in H_u\cap H_v}\left(d(u,w)+d(w,v)\right)$, and the covering property guarantees that there is a hub $w\in H_u\cap H_v$ on a shortest path between $u$ and $v$; so $d(u,v)=\min_{w\in H_u\cap H_v}\left(d(u,w)+d(w,v)\right)$. We can compute this minimum and answer the query in time $O(\max(|H_u|,|H_v|))$. We need to keep a lookup table of size $O\left(\sum_{u\in V}|H_u|\right)$ to store the distances between the vertices and their hubs. So, if, say, all hub sets are of polylogarithmic size, the algorithm answers a distance query in polylogarithmic time and requires $n\,\mathrm{polylog}\,n$ space. The outlined approach can be used not only for computing distances but also shortest paths between vertices. It is clear from this discussion that it is important to have a hub labeling of small size, since both the query time and storage space depend on the number of hubs.
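The query step described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the cited systems; it assumes each label is stored as a list of `(hub, distance)` pairs sorted by hub id, and the names `query_distance` and `labels` are hypothetical.

```python
def query_distance(label_u, label_v):
    """Return the minimum of d(u, w) + d(w, v) over common hubs w.

    Each label is a list of (hub, distance) pairs sorted by hub id
    (an assumed storage layout). Returns infinity if no hub is shared.
    """
    i = j = 0
    best = float("inf")
    # Merge-style scan over the two sorted labels.
    while i < len(label_u) and j < len(label_v):
        hub_u, dist_u = label_u[i]
        hub_v, dist_v = label_v[j]
        if hub_u == hub_v:
            best = min(best, dist_u + dist_v)
            i += 1
            j += 1
        elif hub_u < hub_v:
            i += 1
        else:
            j += 1
    return best


# Path graph 0 - 1 - 2 with unit edge lengths; vertex 1 is a hub for everyone.
labels = {
    0: [(0, 0), (1, 1)],
    1: [(1, 0)],
    2: [(1, 1), (2, 0)],
}
```

The scan touches each label entry at most once, so a query costs time linear in the two label sizes, in line with the bound stated above.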

Recently, there has been a lot of research on algorithms for computing shortest paths using the hub labeling framework (see e.g. the following papers by Abraham et al. [6, 4, 3, 5, 1, 2]). It was noted that these algorithms perform really well in practice (see e.g. [4]). A systematic attempt to explain why this is the case led to the introduction of the notion of highway dimension [6]. Highway dimension is an interesting concept that managed to explain, at least partially, the success of the above methods: it was proved that graphs with small highway dimension have hub labelings with a small number of hubs; moreover, there is evidence that most real-life road networks have low highway dimension [29]. Even more recently, Kosowski and Viennot [100], inspired by the notion of highway dimension, introduced another related notion, the skeleton dimension, that is a slightly more tractable and elegant notion that again explains, to some extent, why the hub labeling framework is successful for distance queries.

However, most papers on Hub Labeling offer only algorithms with absolute guarantees on the cost of the hub labeling they find (e.g. they show that a graph with a given highway dimension has a hub labeling of a certain size and provide an algorithm that finds such a hub labeling); they do not relate the cost of the hub labeling to the cost of the optimal hub labeling. There are very few results on the approximability of the Hub Labeling problem. Only very recently, Babenko et al. [20] and White [125] proved respectively that HL$_1$ and HL$_\infty$ are NP-hard. Cohen et al. [58] gave an $O(\log n)$-approximation algorithm for HL$_1$ by reducing the problem to a Set Cover instance and using the greedy algorithm for Set Cover to solve the obtained instance (the latter step is non-trivial since the reduction gives a Set Cover instance of exponential size); later, Babenko et al. [19] gave a combinatorial $O(\log n)$-approximation algorithm for HL$_p$, for any $p\in[1,\infty]$.

###### Our results.

In this thesis, we will present the following results (most of which were published in 2017 [14]). We prove an $\Omega(\log n)$ hardness for HL$_1$ and HL$_\infty$ on graphs that have multiple shortest paths between some pairs of vertices (assuming that $\mathrm{P}\neq\mathrm{NP}$). The result (which easily extends to HL$_p$ for $p=\Omega(\log n)$ on graphs with $n$ vertices) shows that the algorithms by Cohen et al. and Babenko et al. are optimal, up to constant factors. Since it is impossible to improve the approximation guarantee of $O(\log n)$ for arbitrary graphs, we focus on special families of graphs. We consider the family of graphs with unique shortest paths, that is, graphs in which there is only one shortest path between every pair of vertices. This family of graphs appears in the majority of prior works on Hub Labeling (see e.g. [1, 20, 5]) and is very natural, in our opinion, since in real life all edge lengths are somewhat random, and, therefore, any two paths between two vertices $u$ and $v$ have different lengths. For such graphs, we design an approximation algorithm with approximation guarantee $O(\log D)$, where $D$ is the shortest-path diameter of the graph (which equals the maximum hop length of a shortest path; see Section 2.2.1 for the definition); the algorithm works for every fixed $p\in[1,\infty)$ (the constant in the $O$-notation depends on $p$). In particular, this algorithm gives an $O(\log\log n)$ factor approximation for graphs of diameter $\mathrm{polylog}(n)$, while previously known algorithms give only an $O(\log n)$ approximation. Our algorithm crucially relies on the fact that the input graph has unique shortest paths; in fact, our lower bounds of $\Omega(\log n)$ on the approximation ratio apply to graphs of constant diameter (with non-unique shortest paths). We also extensively study HL on trees. Somewhat surprisingly, the problem is not at all trivial on trees. In particular, the standard LP relaxation for the problem is not integral. In [14] we presented the following results for trees.

1. Design a polynomial-time approximation scheme (PTAS) for HL$_p$ for every $p\in[1,\infty]$.

2. Design an exact quasi-polynomial time algorithm for HL for every , with running time .

3. Analyze a simple combinatorial heuristic for trees, proposed by Peleg in 2000, and prove that it gives a 2-approximation for HL$_1$ (we also show that this heuristic does not work well for HL$_p$ when $p$ is large).

After the publication of our work [14], Gawrychowski et al. [76] observed that an algorithm of Onak and Parys [116], combined with a structural result of ours, shows that HL can be solved exactly on trees in polynomial time. Their main observation is that the problem of computing an optimal hub labeling on trees can be cast as a problem of “binary search” in trees; this implies that the algorithm of Onak and Parys [116] solves HL optimally, and moreover, the work of Jacob et al. [90] can be adapted in order to obtain a polynomial-time algorithm for HL on trees for fixed and for (for any fixed . Since we believe that our original DP approach might still be of interest, we will present it, and then we will formally state and analyze the algorithm of [90] and how it can be used to solve HL on trees.

###### Organization of material.

In Section 2.3 we start with a simple rounding scheme that gives a relaxation-based $O(\log n)$-approximation algorithm for HL$_p$ for every $p\in[1,\infty]$, thus matching the guarantees of the known combinatorial algorithms. In Section 2.5 we present an approximation algorithm for graphs with unique shortest paths; we first present the (slightly simpler) algorithm for HL$_1$, and then the algorithm for HL$_p$ for any fixed $p\ge 1$. Then, in Section 2.6, we prove an $\Omega(\log n)$-hardness for HL$_1$ and HL$_\infty$ by constructing a reduction from Set Cover. As mentioned, the result easily extends to HL$_p$ for $p=\Omega(\log n)$ on graphs with $n$ vertices. Chapter 2 concludes with a brief section that explains how our results extend to the case of directed graphs (see Section 2.7). Finally, in Chapter 3 we present several algorithms for HL on trees, and also discuss the equivalence of HL on trees with the problem of searching for a node in a tree.

#### 2.2 Preliminaries

##### 2.2.1 Definitions

Throughout the rest of this chapter, we always assume (unless stated otherwise) that we have an undirected graph $G=(V,E)$ with positive edge lengths $l(e)>0$, $e\in E$. We denote the number of vertices as $n=|V|$. We will say that a graph $G$ has unique shortest paths if there is a unique shortest path between every pair of vertices. We note that if the lengths of the edges are obtained by measurements, which are naturally affected by noise, the graph will satisfy the unique shortest path property with probability 1.

One parameter that our algorithms’ performance will depend on is the shortest path diameter $D$ of a graph $G$, which is defined as the maximum hop length of a shortest path in $G$ (i.e. the minimum number $D$ such that every shortest path contains at most $D$ edges). Note that $D$ is upper bounded by the aspect ratio $\rho$ of the graph:

$$D\le\rho\equiv\frac{\max_{u,v\in V}d(u,v)}{\min_{(u,v)\in E}l(u,v)}.$$

Here, $d(u,v)$ is the shortest path distance in $G$ w.r.t. edge lengths $l(e)$. In particular, if all edges in $G$ have length at least $1$, then $D\le\mathrm{diam}(G)$, where $\mathrm{diam}(G)=\max_{u,v\in V}d(u,v)$.

We will use the following observation about hub labelings: the covering property for the pair $(u,u)$ (technically) requires that $u\in H_u$, and from now on, we will always assume that $u\in H_u$, for every $u\in V$.

##### 2.2.2 Linear/Convex programming relaxations for HL

In this section, we introduce a natural LP formulation for HL$_1$. Let $I$ be the set of all (unordered) pairs of vertices, including pairs $(u,u)$, which we also denote as $\{u,u\}$, $u\in V$. We use indicator variables $x_{uv}$, for all $(u,v)\in V\times V$, that represent whether $v\in H_u$ or not. Let $S_{uv}\subseteq V$ be the set of all vertices that appear in any of the (possibly many) shortest paths between $u$ and $v$ (including the endpoints $u$ and $v$). We also define . Note that, although the number of shortest paths between $u$ and $v$ might, in general, be exponential in $n$, the set $S_{uv}$ can always be computed in polynomial time. In case there is a unique shortest path between $u$ and $v$, we use both $S_{uv}$ and $P_{uv}$ to denote the vertices of that unique shortest path. One way of expressing the covering property as a constraint is “$\sum_{w\in S_{uv}}\min\{x_{uw},x_{vw}\}\ge 1$, for all $\{u,v\}\in I$”. The resulting LP relaxation is given in Figure 2.1.

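We note that the constraint “$\sum_{w\in S_{uv}}\min\{x_{uw},x_{vw}\}\ge 1$” can be equivalently rewritten as follows: $\sum_{w\in S_{uv}}y_{uvw}\ge 1$, $y_{uvw}\le x_{uw}$ and $y_{uvw}\le x_{vw}$ for all $\{u,v\}\in I$ and $w\in S_{uv}$, where we introduce variables $y_{uvw}$ for every pair $\{u,v\}\in I$ and every $w\in S_{uv}$. Observe that these constraints are linear, and moreover, the total number of variables and constraints remains polynomial in $n$. Thus, an optimal solution can always be found efficiently.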

One indication that the above LP is indeed an appropriate relaxation for HL$_1$ is that we can reproduce the result of [58] and get an $O(\log n)$-approximation algorithm for HL$_1$ by using a very simple rounding scheme. However, we will also use the above LP in more refined ways, mainly in conjunction with the notion of pre-hubs, which we introduce later on.

We also generalize the above LP to a convex relaxation for HL$_p$, for any $p\in[1,\infty]$. The only difference with the above relaxation is that we use a convex objective function and not a linear one. More concretely, the convex program for HL$_p$, for any $p\in[1,\infty)$, is given in Figure 2.2. In the case of $p=\infty$, we end up with an LP, whose objective is simply “$\min t$”, and there are $n$ more constraints of the form “$\sum_{v\in V}x_{uv}\le t$”, for each $u\in V$. To make our presentation more uniform, we will always refer to the convex relaxation of Figure 2.2, even when $p=\infty$.

##### 2.2.3 Hierarchical hub labeling

We now define and discuss the notion of hierarchical hub labeling (HHL), introduced by Abraham et al. [5]. The presentation in this section follows closely the one in [5].

###### Definition 2.3.

Consider a set system . We say that if . Then, the set system is a hierarchical hub labeling if it is a hub labeling, and is a partial order.

We will say that is higher ranked than if . Every two vertices and have a common hub , and thus there is a vertex such that and . Therefore, there is the highest ranked vertex in .

We now define a special type of hierarchical hub labelings. Given a total order , a canonical labeling is the hub labeling that is obtained as follows: if and only if for all . It is easy to see that a canonical labeling is a feasible hierarchical hub labeling. We say that a hierarchical hub labeling respects a total order if the implied (by ) partial order is consistent with . Observe that there might be many different total orders that respects. In [5], it is proved that all total orders that respects have the same canonical labeling , and is a subset of . Therefore, is a minimal hierarchical hub labeling that respects the partial order that implies.

From now on, all hierarchical hub labelings we consider will be canonical hub labelings. Any canonical hub labeling can be obtained by the following process [5]. Start with empty sets $H_u$, $u\in V$, choose a vertex $u_1$ and add it to each hub set $H_u$. Then, choose another vertex $u_2$. Consider all pairs $u$ and $v$ that currently do not have a common hub, such that $u_2$ lies on a shortest path between $u$ and $v$. Add $u_2$ to $H_u$ and $H_v$. Then, choose $u_3$, …, $u_n$, and perform the same step. We get a hierarchical hub labeling. (The hub labeling, of course, depends on the order in which we choose vertices of $V$.)
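The process above can be sketched as follows. This is a minimal illustration under stated assumptions: vertices are numbered `0..n-1`, the graph is connected with integer edge lengths (so equality tests on distances are exact), and distances come from a plain Floyd-Warshall pass. The function names are hypothetical, and "currently do not have a common hub" is interpreted as "the pair is not yet covered by a hub on one of its shortest paths".

```python
from itertools import combinations_with_replacement


def all_pairs_distances(n, edges):
    """Floyd-Warshall on an undirected graph with vertices 0..n-1."""
    inf = float("inf")
    d = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v, length in edges:
        d[u][v] = d[v][u] = min(d[u][v], length)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def canonical_labeling(n, edges, order):
    """Build hub sets by processing vertices in the given order."""
    d = all_pairs_distances(n, edges)
    hubs = [set() for _ in range(n)]
    covered = set()  # pairs (u, v), u <= v, that already have a valid common hub
    for w in order:
        for u, v in combinations_with_replacement(range(n), 2):
            # w lies on a shortest u-v path iff d(u,w) + d(w,v) = d(u,v).
            if (u, v) not in covered and d[u][w] + d[w][v] == d[u][v]:
                hubs[u].add(w)
                hubs[v].add(w)
                covered.add((u, v))
    return hubs


# Path 0 - 1 - 2 with unit lengths, processing the middle vertex first.
example = canonical_labeling(3, [(0, 1, 1), (1, 2, 1)], [1, 0, 2])
```

Pairs of the form `(u, u)` are included, so every vertex ends up in its own hub set once it is processed.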

This procedure is particularly simple if the input graph is a tree $T$. In a tree, we choose a vertex $u_1$ and add it to each hub set $H_u$. We remove $u_1$ from the tree and recursively process each connected component of $T-u_1$. No matter how we choose the vertices, we get a canonical hierarchical hub labeling; given a hierarchical hub labeling $H$, in order to get a canonical hub labeling $H'$, we need to choose the vertex of highest rank in $T'$ (w.r.t. the order defined by $H$) when our recursive procedure processes subinstance $T'$. A canonical hub labeling gives a recursive decomposition of the tree into subproblems of gradually smaller size.

#### 2.3 Warm-up: a relaxation-based $O(\log n)$-approximation algorithm for HL$_p$

In this section, we describe and analyze a simple rounding scheme (inspired by Set Cover) for the convex relaxation for HL$_p$ (see Figure 2.2), that gives an $O(\log n)$-approximation for HL$_p$, for every $p\in[1,\infty]$, and works on all graphs (even with multiple shortest paths). This matches the approximation guarantee of the combinatorial algorithms of Cohen et al. [58] and Babenko et al. [19]. For any graph with $n$ vertices, the rounding scheme is the following (see Algorithm 1).

###### Theorem 2.4.

For every $p\in[1,\infty]$, Algorithm 1 is an $O(\log n)$-approximation algorithm for HL$_p$ that succeeds with high probability.

###### Proof.

First, it is easy to see that for each , we can write , where if , and 0 otherwise. We have , and so, by linearity of expectation, we get . We now observe that for each , the variables are independent. Thus, we can use the standard Chernoff bound, which, for any , gives

$$\Pr\big[|H_u|\ge(1+\delta)\cdot E[|H_u|]\big]\le\left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{E[|H_u|]}.$$

We set and get (where the last inequality holds since and thus ). Taking a union bound, we get that with probability at least , for all , . We conclude that with probability at least ,

$$\Big(\sum_{u\in V}|H_u|^p\Big)^{1/p}\le\Big((9\ln n)^p\sum_{u\in V}\Big(\sum_{v\in V}x_{uv}\Big)^p\Big)^{1/p}=9\ln n\cdot\mathrm{OPT}_{\mathrm{CP}},$$

where is the optimal value of the convex program.

We will now prove that the sets are indeed a feasible hub labeling with high probability. It is easy to verify that we always get . So, let . We have

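$$\begin{aligned}
\Pr[H_u\cap H_v\cap S_{uv}=\emptyset] &=\prod_{w\in S_{uv}}\Pr\big[t_w>\min\{x_{uw},x_{vw}\}\big]=\prod_{w\in S_{uv}}\big(1-3\ln n\cdot\min\{x_{uw},x_{vw}\}\big)\\
&\le\prod_{w\in S_{uv}}e^{-3\ln n\cdot\min\{x_{uw},x_{vw}\}}=e^{-3\ln n\cdot\sum_{w\in S_{uv}}\min\{x_{uw},x_{vw}\}}\\
&\le e^{-3\ln n}=1/n^3.
\end{aligned}$$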

Taking a union bound over all pairs of vertices, we get that the probability that the algorithm does not return a feasible hub labeling is at most . Thus, we conclude that the algorithm returns a feasible solution of value at most with probability at least . ∎
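The rounding step analyzed in this proof can be sketched as follows. The sketch infers the rule from the probabilities used above, namely $\Pr[t_w>a]=1-3\ln n\cdot a$, which corresponds to one threshold $t_w$ per vertex drawn uniformly from $[0,1/(3\ln n))$ and to the rule "$w\in H_u$ iff $x_{uw}\ge t_w$". That reading, the function name `threshold_rounding`, and the matrix layout of `x` are assumptions; solving the relaxation itself is not shown.

```python
import math
import random


def threshold_rounding(x, rng=random):
    """Round a fractional solution x (n x n, n >= 2) to hub sets.

    x[u][w] is the fractional value of "w is a hub of u".
    """
    n = len(x)
    scale = 3.0 * math.log(n)
    # One shared threshold per candidate hub w, uniform on [0, 1/(3 ln n)).
    # Sharing t_w across all u is what makes a pair (u, v) covered as soon
    # as min(x[u][w], x[v][w]) >= t_w for some w.
    t = [rng.random() / scale for _ in range(n)]
    return [{w for w in range(n) if x[u][w] >= t[w]} for u in range(n)]


# Integral solution for the path 0 - 1 - 2 (vertex 1 is everyone's hub).
x_example = [
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
]
rounded = threshold_rounding(x_example, random.Random(0))
```

Since $x_{uu}=1$ in any feasible solution and every threshold is below $1$ for $n\ge 2$, each vertex is always selected as its own hub.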

#### 2.4 Pre-hub labeling

We now introduce the notion of a pre-hub labeling that we will use in designing algorithms for HL. From now on, we will only consider graphs with unique shortest paths.

###### Definition 2.5 (Pre-hub labeling).

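Consider a graph $G=(V,E)$ and a length function $l$; assume that all shortest paths are unique. A family of sets $\{\widehat{H}_u\}_{u\in V}$, with $\widehat{H}_u\subseteq V$, is called a pre-hub labeling, if for every pair $\{u,v\}$, there exist $u'\in\widehat{H}_u\cap P_{uv}$ and $v'\in\widehat{H}_v\cap P_{uv}$ such that $u'\in P_{v'v}$; that is, vertices $u$, $v'$, $u'$, and $v$ appear in the following order along $P_{uv}$: $u,v',u',v$ (possibly, some of the adjacent, with respect to this order, vertices coincide).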

Observe that any feasible HL is a valid pre-hub labeling. We now show how to find a pre-hub labeling given a feasible LP solution.

###### Lemma 2.6.

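Consider a graph $G=(V,E)$ and a length function $l$; assume that all shortest paths are unique. Let $\{x_{uv}\}$ be a feasible solution to the LP relaxation (see Figure 2.1). Then, there exists a pre-hub labeling $\{\widehat{H}_u\}_{u\in V}$ such that $|\widehat{H}_u|\le 2\sum_{v\in V}x_{uv}$ for every $u\in V$. In particular, if $\{x_{uv}\}$ is an optimal LP solution with value $\mathrm{OPT}_{\mathrm{LP}}$ and $\mathrm{OPT}$ is the $\ell_1$-cost of the optimal hub labeling (for HL$_1$), then $\sum_{u\in V}|\widehat{H}_u|\le 2\cdot\mathrm{OPT}_{\mathrm{LP}}\le 2\cdot\mathrm{OPT}$. Furthermore, the pre-hub labeling can be constructed efficiently given the LP solution $\{x_{uv}\}$.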

###### Proof.

Let us fix a vertex $u$. We build the breadth-first search tree $T_u$ (w.r.t. edge lengths; i.e. the shortest path tree) from $u$; tree $T_u$ is rooted at $u$ and contains those edges $e$ that appear on a shortest path between $u$ and some vertex $v$. Observe that $T_u$ is indeed a tree and is uniquely defined, since we have assumed that shortest paths in $G$ are unique. For every vertex $v$, let $T'_v$ be the subtree of $T_u$ rooted at vertex $v$. Given a feasible LP solution $\{x_{uv}\}$, we define the weight of $T'_v$ to be $w(T'_v)=\sum_{w\in T'_v}x_{uw}$.

We now use the following procedure to construct set $\widehat{H}_u$. We process the tree $T_u$ bottom up (i.e. we process a vertex $v$ after we have processed all other vertices in the subtree rooted at $v$), and whenever we detect a subtree $T'_v$ of $T_u$ such that $w(T'_v)\ge 1/2$, we add vertex $v$ to the set $\widehat{H}_u$. We then set $x_{uw}=0$ for all $w\in T'_v$, and continue (with the updated values) until we reach the root $u$ of $T_u$. Observe that every time we add one vertex to $\widehat{H}_u$, we decrease the value of $\sum_{w\in V}x_{uw}$ by at least $1/2$. Therefore, $|\widehat{H}_u|\le 2\sum_{w\in V}x_{uw}$. We will now show that sets $\{\widehat{H}_u\}$ form a pre-hub labeling. To this end, we prove the following two claims.

###### Claim 2.7.

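Consider a vertex $u$ and two vertices $v_1,v_2$ such that $v_1\in P_{uv_2}$. If $\widehat{H}_u\cap P_{v_1v_2}=\emptyset$, then $\sum_{w\in P_{v_1v_2}}x_{uw}<1/2$.

###### Proof.

Consider the execution of the algorithm that defined $\widehat{H}_u$. Consider the moment $M$ when we processed vertex $v_1$. Since we did not add $v_1$ to $\widehat{H}_u$, we had $w(T'_{v_1})<1/2$. In particular, since $P_{v_1v_2}$ lies in $T'_{v_1}$, we have $\sum_{w\in P_{v_1v_2}}x'_{uw}\le w(T'_{v_1})<1/2$, where $x'_{uw}$ is the value of $x_{uw}$ at the moment $M$. Since none of the vertices on the path $P_{v_1v_2}$ were added to $\widehat{H}_u$, none of the variables $x'_{uw}$ for $w\in P_{v_1v_2}$ had been set to $0$. Therefore, $x'_{uw}=x_{uw}$ (where $x_{uw}$ is the initial value of the variable) for $w\in P_{v_1v_2}$. We conclude that $\sum_{w\in P_{v_1v_2}}x_{uw}<1/2$, as required. ∎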

###### Claim 2.8.

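For any pair $\{u,v\}$, let $u'$ be the vertex closest to $v$ among all vertices in $\widehat{H}_u\cap P_{uv}$ and $v'$ be the vertex closest to $u$ among all vertices in $\widehat{H}_v\cap P_{uv}$. Then $u'\in P_{v'v}$. (Note that $\widehat{H}_u\cap P_{uv}\neq\emptyset$, since we always have $u\in\widehat{H}_u$ and hence $u\in\widehat{H}_u\cap P_{uv}$; similarly, $\widehat{H}_v\cap P_{uv}\neq\emptyset$.)

###### Proof.

Let us assume that this is not the case; that is, $u'\notin P_{v'v}$. Then $u'\neq v$ and $v'\neq u$ (otherwise, we would trivially have $u'\in P_{v'v}$). Let $u''$ be the first vertex after $u'$ on the path $P_{u'v}$, and $v''$ be the first vertex after $v'$ on the path $P_{v'u}$. Since $u'\notin P_{v'v}$, every vertex of $P_{uv}$ lies either on $P_{u''v}$ or $P_{uv''}$, or both (i.e. $P_{uv}=P_{u''v}\cup P_{uv''}$).

By our choice of $u'$, there are no pre-hubs for $u$ on $P_{u''v}$. By Claim 2.7, $\sum_{w\in P_{u''v}}x_{uw}<1/2$. Similarly, $\sum_{w\in P_{uv''}}x_{vw}<1/2$. Thus,

$$1>\sum_{w\in P_{uv''}}x_{vw}+\sum_{w\in P_{u''v}}x_{uw}\ge\sum_{w\in P_{uv}}\min\{x_{uw},x_{vw}\}.$$

We get a contradiction since $\{x_{uv}\}$ is a feasible LP solution. ∎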

Claim 2.8 shows that $\{\widehat{H}_u\}_{u\in V}$ is a valid pre-hub labeling. ∎
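The bottom-up construction used in the proof of Lemma 2.6 can be sketched for a single root vertex as follows. This is an illustrative sketch, assuming the threshold $1/2$ implied by the inequality $1>\dots$ in the proof of Claim 2.8; the shortest-path tree is passed in as a `children` dictionary, and the names `prehubs_for_root`, `children` and `x_u` are hypothetical.

```python
def prehubs_for_root(root, children, x_u):
    """Pre-hub set of `root` from its shortest-path tree and LP values.

    children: dict vertex -> list of children in the tree rooted at `root`.
    x_u: dict w -> fractional value x_{root, w}.
    """
    prehubs = set()
    residual = {}  # remaining weight of each already-processed subtree

    # Pre-order traversal; its reverse visits children before parents.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    for v in reversed(order):
        weight = x_u[v] + sum(residual[c] for c in children.get(v, []))
        if weight >= 0.5:
            prehubs.add(v)
            weight = 0.0  # same effect as zeroing x_{uw} inside the subtree
        residual[v] = weight
    return prehubs


# Tree rooted at 0 that is a path 0 -> 1 -> 2 -> 3 -> 4.
path_children = {0: [1], 1: [2], 2: [3], 3: [4]}
x_root = {0: 1.0, 1: 0.2, 2: 0.2, 3: 0.2, 4: 0.2}
found = prehubs_for_root(0, path_children, x_root)
```

In this example the subtree at vertex 2 is the first to reach weight $1/2$, and the root is added because $x_{uu}=1$, so two pre-hubs are produced against a fractional weight of $1.8$.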

#### 2.5 Hub labeling on graphs with unique shortest paths

In this section, we present an $O(\log D)$-approximation algorithm for HL$_p$ on graphs with unique shortest paths, where $D$ is the shortest path diameter of the graph. The algorithm works for every fixed $p\ge 1$ (the hidden constant in the approximation factor depends on $p$). We will first present the (slightly simpler) algorithm for HL$_1$, and then extend the algorithm and make it work for HL$_p$, for arbitrary fixed $p\ge 1$.

##### 2.5.1 An $O(\log D)$-approximation algorithm for HL$_1$

Consider Algorithm 2. The algorithm solves the LP relaxation (see Figure 2.1) and computes a pre-hub labeling $\{\widehat{H}_u\}_{u\in V}$ as described in Lemma 2.6. Then it chooses a random permutation $\pi$ of $V$ and goes over all vertices one-by-one in the order specified by $\pi$: $\pi_1$, $\pi_2$, …, $\pi_n$. It adds $\pi_i$ to $H_u$ if there is a pre-hub $u'\in\widehat{H}_u$ such that the following conditions hold: $\pi_i$ lies on the path $P_{uu'}$, there are no pre-hubs for $u$ between $\pi_i$ and $u'$ (other than $u'$), and currently there are no hubs for $u$ between $\pi_i$ and $u'$.

###### Theorem 2.9.

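Algorithm 2 always returns a feasible hub labeling $\{H_u\}_{u\in V}$. The cost of the hub labeling is $O(\log D)\cdot\mathrm{OPT}_{\mathrm{LP}}$ in expectation, where $\mathrm{OPT}_{\mathrm{LP}}$ is the optimal value of the LP relaxation of Figure 2.1.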

###### Remark 2.10.

Algorithm 2 can be easily derandomized using the method of conditional expectations: instead of choosing a random permutation $\pi$, we first choose $\pi_1$, then $\pi_2$ and so on; each time we choose $\pi_i$ so as to minimize the conditional expectation $E\big[\sum_{u\in V}|H_u|\;\big|\;\pi_1,\dots,\pi_i\big]$.

###### Proof.

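We first show that the algorithm always finds a feasible hub labeling. Consider a pair of vertices $u$ and $v$. We need to show that they have a common hub on $P_{uv}$. The statement is true if $u=v$ since $u\in\widehat{H}_u$ and thus $u\in H_u$. So, we assume that $u\neq v$. Consider the path $P_{uv}$. Because of the pre-hub property, there exist $u'\in\widehat{H}_u\cap P_{uv}$ and $v'\in\widehat{H}_v\cap P_{uv}$ such that $u'\in P_{v'v}$. In fact, there may be several possible ways to choose such $u'$ and $v'$. We choose $u'$ and $v'$ so that $P_{v'u'}$ contains no pre-hubs of $u$ other than $u'$ and no pre-hubs of $v$ other than $v'$ (for instance, choose the closest pair of $u'$ and $v'$ among all possible pairs). Consider the first iteration $i$ of the algorithm such that $\pi_i\in P_{v'u'}$. We claim that the algorithm adds $\pi_i$ to both $H_u$ and $H_v$. Indeed, we have: (i) $\pi_i$ lies on $P_{uu'}$, (ii) there are no pre-hubs of $u$ on $P_{\pi_iu'}$ other than $u'$, (iii) $\pi_i$ is the first vertex we process on the path $P_{v'u'}$, thus currently there are no hubs on $P_{\pi_iu'}$. Therefore, the algorithm adds $\pi_i$ to $H_u$. Similarly, the algorithm adds $\pi_i$ to $H_v$.

Now we upper bound the expected cost of the solution. We will charge every hub that we add to $H_u$ to a pre-hub in $\widehat{H}_u$; namely, when we add $\pi_i$ to $H_u$ (see line 5 of Algorithm 2), we charge it to pre-hub $u'$. For every vertex $u$, we have $|\widehat{H}_u|\le 2\sum_{v\in V}x_{uv}$. We are going to show that every $u'\in\widehat{H}_u$ is charged at most $O(\log D)$ times in expectation. Therefore, the expected number of hubs in $H_u$ is at most $O(\log D)\cdot 2\sum_{v\in V}x_{uv}$.

Consider a vertex $u$ and a pre-hub $u'\in\widehat{H}_u$ ($u'\neq u$). Let $u''$ be the closest pre-hub to $u'$ on the path $P_{u'u}$. Observe that all hubs charged to $u'$ lie on the path $P_{u'u''}\setminus\{u''\}$. Let $k=|P_{u'u''}\setminus\{u''\}|$. Note that $k\le D$. Consider the order $\sigma$ in which the vertices of $P_{u'u''}\setminus\{u''\}$ were processed by the algorithm ($\sigma$ is a random permutation). Note that $\sigma_i$ charges $u'$ if and only if $\sigma_i$ is closer to $u'$ than $\sigma_1,\dots,\sigma_{i-1}$. The probability of this event is $1/i$. We get that the number of hubs charged to $u'$ is $\sum_{i=1}^{k}1/i=\ln k+O(1)\le\ln D+O(1)$, in expectation. Hence, $E\big[\sum_{u\in V}|H_u|\big]\le(\ln D+O(1))\sum_{u\in V}|\widehat{H}_u|\le 2(\ln D+O(1))\cdot\mathrm{OPT}_{\mathrm{LP}}$. ∎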
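The hub selection rule analyzed above can be sketched for a single vertex $u$ as follows. This is a simplified illustration, not Algorithm 2 verbatim: it assumes the shortest-path tree of $u$ is given as a `children` dictionary (so that "lies on the path from $u$ to a pre-hub" becomes "has the pre-hub in its subtree"), and the names `hubs_for_root`, `children`, `prehubs` and `order` are hypothetical.

```python
def hubs_for_root(children, prehubs, order):
    """Hub set of one vertex u, given its shortest-path tree.

    children: dict vertex -> list of children in the tree rooted at u.
    prehubs: pre-hub set of u.
    order: all vertices, in the order in which they are processed.
    """
    hubs = set()

    def reaches_prehub(v):
        # Search below v for a pre-hub u' such that the tree path from v
        # to u' contains no existing hub and no pre-hub other than u'.
        stack = [v]
        while stack:
            y = stack.pop()
            if y in hubs:
                continue  # this branch is blocked by an existing hub
            if y in prehubs:
                return True
            stack.extend(children.get(y, []))
        return False

    for v in order:
        if reaches_prehub(v):
            hubs.add(v)
    return hubs


# Tree of u = 0 is the path 0 -> 1 -> 2 -> 3 -> 4 with pre-hubs {0, 4}.
tree = {0: [1], 1: [2], 2: [3], 3: [4]}
selected = hubs_for_root(tree, {0, 4}, [2, 3, 1, 4, 0])
```

In the example, vertices 2, 3 and 4 are each closer to the pre-hub 4 than everything processed before them, so they are charged to it; vertex 1 is not, because hub 2 already sits between it and the pre-hub. This matches the "new record" event whose probability is $1/i$ in the analysis.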

##### 2.5.2 An $O_p(\log D)$-approximation algorithm for HL$_p$

In this section, we analyze Algorithm 2, assuming that we solve the convex program of Figure 2.2. To analyze the performance of Algorithm 2 in this case, we need the following theorem by Berend and Tassa [31].

###### Theorem 2.11 (Theorem 2.4, [31]).

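Let $X_1,\dots,X_t$ be a sequence of independent random variables for which $\Pr[0\le X_i\le 1]=1$, and let $X=\sum_{i=1}^{t}X_i$. Then, for all $p\ge 1$,

$$\left(E[X^p]\right)^{1/p}\le 0.942\cdot\frac{p}{\ln(p+1)}\cdot\max\left\{E[X]^{1/p},E[X]\right\}.$$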

In order to simplify our analysis, we slightly modify Algorithm 2 and get Algorithm 3.

###### Theorem 2.12.

For any $p\ge 1$, Algorithm 3 is an $O_p(\log D)$-approximation algorithm for HL$_p$.

###### Proof.

First, it is easy to see that, since all leaves of are pre-hubs of the set , we have , and so .

Let be the collection of subpaths of defined as follows: belongs to if is a path between consecutive pre-hubs and of , with being an ancestor of in , and no other pre-hub appears in . For convenience, we exclude the endpoint that is closer to : . Note that any such path is uniquely defined by the pre-hub of , and so we will denote as . The modification we made in the algorithm allows us now to observe that , for , .

Let be the cost of the solution that the modified algorithm (i.e. Algorithm 3) returns. We have (by Jensen’s inequality).

We can write , where is the random variable indicating how many vertices are added to “because of” the pre-hub (see line 6 of the algorithm). Observe that we can write as follows: , with being 1 if is added in , and 0 otherwise. The modification that we made in the algorithm implies, as already observed, that any variable , , is independent from , , for , as the corresponding paths and are disjoint.

Let , and let be the induced permutation when we restrict (see line 4 of the algorithm) to the vertices of . We can then write , , where is 1 if the vertex considered by the algorithm that belongs to (i.e. the vertex of permutation ) is added to and 0 otherwise. It is easy to see that . We now need one last observation. We have . To see this, note that the variables do not reveal which particular vertex is picked from the permutation at each step, but only the relative order of the current draw (i.e. random choice) with respect to the current best draw (where best here means the closest vertex to that we have seen so far, i.e. in positions ). Thus, regardless of the relative order of , there are exactly possibilities to extend that order when the permutation picks