RoutingKit is a C++ library that provides advanced route planning functionality.
We consider the problem of quickly computing shortest paths in weighted graphs given auxiliary data derived in an expensive preprocessing phase. By adding a fast weight-customization phase, we extend Contraction Hierarchies by Geisberger et al. to support the three-phase workflow introduced by Delling et al. Our Customizable Contraction Hierarchies use nested dissection orders as suggested by Bauer et al. We provide an in-depth experimental analysis on large road and game maps that clearly shows that Customizable Contraction Hierarchies are a very practicable solution in scenarios where edge weights often change.
Computing optimal routes in road networks has many applications such as navigation, logistics, traffic simulation or web-based route planning. Road networks are commonly formalized as weighted graphs and the optimal route is formalized as the shortest path in this graph. Unfortunately, road graphs tend to be huge in practice with vertex counts in the tens of millions, rendering Dijkstra’s algorithm impracticable for interactive use: It incurs running times in the order of seconds even for a single path query. For practical performance on large road networks, preprocessing techniques that augment the network with auxiliary data in a (possibly expensive) offline phase have proven useful. See  for an overview. Many techniques work by adding extra edges called shortcuts to the graph that allow query algorithms to bypass large regions of the graph efficiently. While variants of the optimal shortcut selection problem have been proven to be NP-hard , determining good shortcuts is feasible in practice even on large road graphs. Among the most successful speedup techniques using this building block are Contraction Hierarchies (CH) by [20, 21]. At its core, the technique consists of a systematic way of adding shortcuts by iteratively contracting vertices along a given order. Even though ordering heuristics exist that work well in practice, the problem of computing an optimal ordering is NP-hard in general . Worst-case bounds have been proven in  in terms of a weight-dependent graph measure called highway dimension, and  have shown that many of these bounds are tight on many graph classes.
A central restriction of CHs as proposed by  is that their preprocessing is metric-dependent, that is, edge weights, also called the metric, need to be known. Substantial changes to the metric, e. g., due to user preferences or traffic congestion, may require expensive recomputation. For this reason, a Customizable Route Planning (CRP) approach was proposed in , extending the multi-level overlay (MLD) techniques of [36, 26]. It works in three phases: In a first, expensive phase, auxiliary data is computed that solely exploits the topological structure of the network, disregarding its metric. In a second, much less expensive phase, this auxiliary data is customized to a specific metric, enabling fast queries in the third phase. In this work we extend CH to support such a three-phase approach.
Most existing CH papers focus solely on road graphs, with  being a notable exception, but there are many other applications with differently structured graphs in which fast shortest path computations are important. One such application is games. Think of a real-time strategy game where units quickly have to navigate across a large map with many choke points. The basic topology of the map is fixed; however, when buildings are constructed or destroyed, fields are rendered impassable or freed up. Furthermore, every player has his own knowledge of the map. Many games include a feature called fog of war: Initially only the fields around the player’s starting location are revealed. As his units explore the map, new fields are revealed. Since a unit must not route around a building that the player has not yet seen, every player needs his own metric. Furthermore, units such as hovercraft may traverse water and land, while other units are bound to land. This results in vastly different, evolving metrics for different unit types per player, making metric-dependent preprocessing difficult to apply. Contrary to road graphs, one-way streets tend to be extremely rare in game maps, and thus being able to exploit the symmetry of the underlying graph is a useful feature.
One of the central building blocks of this paper is the use of metric-independent nested dissection orders (ND-orders)  for CH precomputation instead of the metric-dependent order of . This approach was proposed by , and a preliminary case study can be found in . A similar idea was followed by , where the authors employ partial CHs to engineer subroutines of their customization phase. They also refer to preliminary experiments on full CH but did not publish results. Similar ideas have also appeared in : They consider graphs of low treewidth (see below) and leverage this property to compute CH-like structures, without explicitly using the term CH. Related techniques by [40, 11] work directly on the tree decomposition. Interestingly, our experiments show that even large road networks have relatively low treewidth: Real-world road networks with vertex counts in the millions have treewidth in the hundreds.
Customizable speedup techniques for shortest path queries are a very recent development, but the idea to order vertices along nested dissection orders is significantly older. To the best of our knowledge the idea first appeared in 1973 in  and was refined in . They use nested dissection orders to reorder the columns and rows of sparse matrices to assure that Gaussian elimination preserves as many zeros as possible. From the matrix they derive a graph and show that vertex contraction in this graph corresponds to Gaussian variable elimination. Inserting an extra edge in the graph destroys a zero in the matrix. The additional edges are called the fill-in. The minimum fill-in problem asks for a vertex order that results in a minimum number of extra edges. In CH terminology these extra edges are called shortcuts. The super graph constructed by adding the additional edges is a chordal graph. The treewidth of a graph can be defined using chordal super graphs: For every chordal super graph consider the number of vertices in its maximum clique minus one. The treewidth of a graph is the minimum of this number over all its chordal super graphs. This establishes a relation between sparse matrices and treewidth and in consequence with CHs. We refer to  and  for an introduction to the broad field of treewidth and tree decompositions.
Minimizing the number of extra edges, i. e., minimizing the fill-in, is NP-hard  but fixed parameter tractable in the number of extra edges . Note, however, that from the CH point of view the number of extra edges, i. e., the number of shortcuts, is not the sole optimization criterion. Consider for example a path graph as depicted in Figure 1: Optimizing the CH search space and the number of shortcuts are competing criteria. A tree relevant in the theory of treewidth is the elimination tree.  have shown that the maximum search space size in terms of vertices corresponds to the height of this elimination tree. Unfortunately, minimizing the elimination tree height is also NP-hard . For planar graphs, it has been shown that the number of additional edges is in O(n log n). However, this does not imply a search space bound in terms of vertices, as search spaces can share vertices.
Real-world road networks contain one-way streets and highways. Such networks are thus usually modeled as directed graphs. Our algorithms fully support direction of traffic; however, we introduce it at a different stage of the toolchain than most related techniques, which should not be confused with only supporting undirected networks. Our first preprocessing phase works exclusively on the underlying undirected and unweighted graph, obtained by dropping all edge directions and edge weights. Direction of traffic as well as traversal weights are only introduced in the second (customization) phase, where every edge can have two weights: an upward and a downward weight. If an edge corresponds to a one-way street, then one of these weights is set to ∞. Note that this setup is a strength of our algorithm: Throughout large parts of the toolchain we are not confronted with additional complexity induced by directed edges. This contrasts with many other techniques, where considering edge direction induces changes in nearly every step of the algorithm.
The main contribution of our work is to show that Customizable Contraction Hierarchies (CCH) solely based on the ND-principle are feasible and practical. Compared to CRP  we achieve a similar preprocessing–query tradeoff, albeit with slightly better query performance at slightly slower customization speed, and we need somewhat more space. Interestingly, for less well-behaved metrics such as travel distance, we achieve query times below the original metric-dependent CH of . Besides this main result, we show that given a fixed contraction order, a metric-independent CH can be constructed in time essentially linear in the size of the Contraction Hierarchy with working memory consumption linear in the input graph. Our specialized algorithm has a better theoretical worst-case running time and performs significantly better empirically than the dynamic adjacency arrays used in . Another contribution of our work is the perfect witness search. We show that for a fixed metric-independent vertex order it is possible to construct CHs with a provably minimum number of arcs in a few seconds on continental road graphs. The running time of our construction algorithm is completely independent of the weights used. We further show that an order based on nested dissection gives a constant-factor approximation of the maximum and average search space sizes, in terms of the number of arcs and vertices, for metric-independent CHs on a class of graphs with very regular recursive vertex separators. Experimentally, we show that road graphs have such a recursive separator structure.
Section 2 sets necessary notation. Section 3 discusses metric-dependent orders as used by , highlighting specifics of our implementation. Next, we discuss metric-independent orders in Section 4. In Section 5, we describe how to efficiently construct the arcs of the CH. Section 6 discusses how to efficiently enumerate triangles in the CH, an operation needed throughout the customization process detailed in Section 7. In Section 7 we further describe the details of the perfect witness search. Finally, Section 8 concludes the algorithm description by introducing the algorithms used in the query phase to compute shortest path distances and the corresponding paths in the input graph. We then present an extensive experimental study that thoroughly evaluates the proposed algorithm. We finish with a conclusion and directions for future work.
We denote by G = (V, E) an undirected n-vertex graph, where V is the set of vertices and E the set of edges. Furthermore, G = (V, A) denotes a directed graph, where A is the set of arcs. A graph is simple if it has no loops or multi-edges. Graphs in this paper are simple unless noted otherwise, e. g., in parts of Section 5. Furthermore, we assume that input graphs are strongly connected. We denote by N(v) the neighborhood of vertex v, i. e., the set of vertices adjacent to v; for directed graphs the neighborhood ignores arc direction. A vertex separator S is a vertex subset whose removal separates the graph into two disconnected subgraphs induced by the vertex sets V1 and V2. The sets V1, V2 and S are disjoint and their union forms V. Note that the subgraphs induced by V1 and V2 are not necessarily connected and may be empty. A separator is balanced if |V1|, |V2| ≤ 2n/3.
A vertex order is a bijection that assigns each rank a vertex; its inverse assigns each vertex a rank. Every undirected graph can be transformed into an upward directed graph with respect to a vertex order, i. e., every edge {u, v} where u has lower rank than v is replaced by an arc (u, v). Note that all upward directed graphs are acyclic. We denote by the upward neighborhood of a vertex its neighbors with a higher rank, and by the downward neighborhood its neighbors with a lower rank. The sizes of these sets are the upward degree and the downward degree of the vertex, respectively.
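The orientation step just described can be sketched in a few lines of Python; the function name and graph representation below are illustrative choices of ours, not taken from the paper:

```python
def upward_graph(n, edges, rank):
    """Orient each undirected edge {u, v} from the endpoint of lower rank to
    the endpoint of higher rank, yielding the upward directed graph as
    adjacency lists. The result is acyclic because the rank strictly
    increases along every arc."""
    up = [[] for _ in range(n)]
    for u, v in edges:
        if rank[u] < rank[v]:
            up[u].append(v)  # v is an upward neighbor of u
        else:
            up[v].append(u)  # u is an upward neighbor of v
    return up
```

For a path 0–1–2 ordered by vertex ID, every edge is oriented towards the higher ID, so vertex 2 has an empty upward neighborhood.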
Undirected edge weights are denoted using w. With respect to a vertex order we define for every arc an upward weight and a downward weight. For directed input graphs, one-way streets are modeled by setting the upward or the downward weight to ∞.
A path is a sequence of adjacent vertices and incident edges. Its hop-length is the number of edges on it. Its weight-length with respect to a weight w is the sum over all its edges’ weights. Unless noted otherwise, length always refers to weight-length in this paper. A shortest st-path is a path of minimum length between vertices s and t. The minimum length between two vertices is denoted by dist(s, t); we set dist(s, t) = ∞ if no path exists. An up-down path with respect to a vertex order is a path that can be split into an upward path followed by a downward path. The vertices in the upward path must occur by increasing rank and the vertices in the downward path by decreasing rank. The upward and downward paths meet at the vertex with the maximum rank on the path. We refer to this vertex as the meeting vertex.
The vertices of every directed acyclic graph (DAG) can be partitioned into levels such that for every arc (u, v) it holds that level(u) < level(v). We only consider level assignments in which each vertex has the lowest possible level. Note that such levels can be computed in linear time given a directed acyclic graph.
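The linear-time level computation can be sketched as a single topological sweep; the following Python helper is our own illustrative sketch, not code from the paper:

```python
from collections import deque

def dag_levels(n, arcs):
    """Assign each vertex of a DAG the lowest level consistent with
    level(u) < level(v) for every arc (u, v), in O(n + m) time using
    Kahn-style topological processing."""
    succ = [[] for _ in range(n)]
    indeg = [0] * n
    for u, v in arcs:
        succ[u].append(v)
        indeg[v] += 1
    level = [0] * n
    queue = deque(v for v in range(n) if indeg[v] == 0)
    while queue:
        u = queue.popleft()
        for v in succ[u]:
            level[v] = max(level[v], level[u] + 1)  # lowest feasible level
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return level
```

Each vertex and arc is touched a constant number of times, which gives the linear running time claimed above.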
The unweighted vertex contraction of in consists of removing and all incident edges and inserting edges between all neighbors if not already present. The inserted edges are referred to as shortcuts and the other edges are original edges. Given an order the core graph is obtained by contracting all vertices in order of their rank. We call the original graph augmented by the set of shortcuts a contraction hierarchy . Furthermore, we denote by the corresponding upward directed graph.
Given a fixed weight, one can exploit that in many applications it is sufficient to only preserve all shortest path distances. Weighted vertex contraction of a vertex v in the graph is defined as the operation of removing v and inserting a minimum number of shortcuts among the neighbors of v such that all shortest path distances between the remaining vertices are preserved. To compute the contraction of v, one iterates over all pairs {x, y} of neighbors of v by increasing shortcut length ℓ(x, v) + ℓ(v, y). For each pair one checks whether an x-y-path shorter than the potential shortcut exists in the graph without v, i. e., one checks whether removing v destroys the shortest x-y-path. This check is called witness search and the x-y-path, if it exists, is called a witness. If a witness is found, the considered vertex pair is skipped and no shortcut added. Otherwise, if an edge {x, y} already exists, its weight is decreased to the shortcut length, or a new shortcut edge with that weight is added. This new shortcut edge is considered in witness searches for subsequent neighbor pairs. If shortest paths are not unique, it is important to iterate over the pairs by increasing shortcut length, because otherwise more edges than strictly necessary can be inserted: Shorter shortcuts can make longer shortcuts superfluous. However, if we insert the shorter shortcut after the longer ones, the witness search will not consider them. See Figure 2 for an example. Note that witness searches are expensive and therefore a witness search is usually aborted after a certain number of steps. If no witness was found by then, we assume that none exists and add a shortcut. This does not affect the correctness of the technique but might result in slightly more shortcuts than necessary. To distinguish, a perfect witness search is one without such a one-sided error.
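A step-limited witness search can be sketched as a bounded Dijkstra run in the core graph with v removed. The following Python is an illustrative sketch under our own naming and data layout (adjacency as nested dicts), not the paper's implementation:

```python
import heapq

def witness_exists(adj, v, x, y, limit_weight, max_settled):
    """Step-limited Dijkstra from x in the core graph with vertex v removed.
    Returns True if an x-y path of length <= limit_weight survives, i.e. the
    shortcut {x, y} would be unnecessary. Aborting after max_settled settled
    vertices gives the one-sided error described in the text: False may mean
    'no witness found in time', in which case a shortcut is added anyway."""
    dist = {x: 0}
    heap = [(0, x)]
    settled = 0
    while heap and settled < max_settled:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        if u == y:
            return True  # witness found within the length budget
        settled += 1
        for nbr, wt in adj[u].items():
            if nbr == v:
                continue  # v is being contracted: ignore it
            nd = d + wt
            if nd <= limit_weight and nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return False
```

Pruning at `limit_weight` is safe because any path longer than the potential shortcut can never serve as a witness.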
For an order and a weight the weighted core graph is obtained by contracting all vertices . The original graph augmented by the set of weighted shortcuts is called a weighted contraction hierarchy . The corresponding upward directed graph is denoted by .
The search space of a vertex v is the subgraph of the upward directed graph reachable from v. For every vertex pair s and t, it has been shown that a shortest up-down path must exist. This up-down path can be found by running a bidirectional search from s and from t, each restricted to the corresponding upward search space. A graph is chordal if for every cycle of at least four vertices there exists a pair of vertices that are non-adjacent in the cycle but are connected by an edge. An alternative characterization is that a vertex order exists such that for every vertex v, the neighbors of v in the core graph just before the contraction of v form a clique. Such an order is called a perfect elimination order. Another way to formulate this characterization in CH terminology is as follows: A graph is chordal if and only if a contraction order exists such that the CH construction without witness search does not insert any shortcuts. A chordal super graph can be obtained by adding the CH shortcuts.
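The bidirectional up-down query can be sketched as two upward Dijkstra runs whose distance tables are joined over all candidate meeting vertices. This simplified Python sketch assumes our own data layout (upward adjacency lists plus per-arc upward/downward weights) and omits optimizations such as stall-on-demand:

```python
import heapq

def ch_query(up, w_up, w_down, s, t):
    """Basic CH query sketch: one upward search from s using upward weights
    and one upward search from t using downward weights; the shortest
    up-down path length is the best sum over all meeting vertices."""
    def upward_dijkstra(src, weight):
        dist = {src: 0}
        heap = [(0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale entry
            for v in up[u]:
                nd = d + weight[(u, v)]  # arc (u, v) goes to higher rank
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        return dist
    ds = upward_dijkstra(s, w_up)
    dt = upward_dijkstra(t, w_down)
    best = float("inf")
    for v, d in ds.items():          # every common vertex is a candidate
        if v in dt:                  # meeting vertex of an up-down path
            best = min(best, d + dt[v])
    return best
```

The backward search traverses upward arcs but charges the downward weight, since it walks the downward half of the up-down path in reverse.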
The elimination tree is a tree directed towards its root, the vertex of maximum rank. The parent of a vertex v is its upward neighbor of minimal rank. Note that this definition already yields a straightforward algorithm for constructing the elimination tree. As shown in , the set of vertices on the path from v to the root is exactly the set of vertices in the search space of v. Computing a contraction hierarchy without witness search of a graph consists of computing a chordal super graph with perfect elimination order. The height of the elimination tree corresponds to the maximum number of vertices in a search space. Note that the elimination tree is only defined for undirected unweighted graphs.
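The straightforward construction algorithm implied by the definition is a single scan over the upward neighborhoods. In this sketch, `up` is assumed to be the adjacency of the CH's upward directed graph and `rank` the vertex order; names are ours:

```python
def elimination_tree(up, rank):
    """parent[v] is v's upward neighbor of minimum rank; the root (a vertex
    with no upward neighbors, i.e. the highest-ranked one) gets None."""
    parent = [None] * len(up)
    for v, nbrs in enumerate(up):
        if nbrs:
            parent[v] = min(nbrs, key=lambda u: rank[u])
    return parent
```

On a path 0–1–2 contracted in ID order, the tree is again a path: parent(0) = 1, parent(1) = 2, and 2 is the root, matching the fact that the search space of 0 is {0, 1, 2}.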
Most publications on applications and extensions of Contraction Hierarchies use greedy orders in the spirit of , but details of vertex order computation and witness search vary. For reproducibility, we describe our precise approach in this section, extending on the general description of metric-dependent CH preprocessing given in Section 2. Our witness search aborts once it finds some path shorter than the shortcut, or when the forward and backward search each have settled a certain maximum number of vertices. For most experiments we use a fixed limit; the only exception is the distance metric on road graphs, where we use a higher limit. We found that a higher limit increases the time per witness search but leads to sparser cores. For the distance metric we needed a high value because otherwise our cores get too dense. This effect did not occur for the other weights considered in the experiments. Our weighting heuristic is similar to the one of . We denote by a value that approximates the level of vertex . Initially all are . If is contracted, then for every incident edge we perform . We further store for every arc a hop length . This is the number of arcs that the shortcut represents if fully unpacked. Denote by the set of arcs removed if is contracted and by the set of arcs that are inserted. Note that is not necessarily a full clique because of the witness search and because some edges may already exist. We greedily contract a vertex that minimizes its importance defined by
We maintain a priority queue that contains all vertices weighted by their importance. Initially all vertices are inserted with their exact importance. As long as the queue is not empty, we remove a vertex with minimum importance and contract it. This modifies the importance of other vertices. However, our weighting function is chosen such that only the importance of adjacent vertices is influenced, if the witness search was perfect. We therefore only update the importance values of all vertices in the queue that are adjacent to the contracted vertex. In practice, with a limited witness search, we sometimes choose a vertex with a slightly suboptimal importance. However, preliminary experiments have shown that this effect can be safely ignored. Hence, for the experiments presented in Section 9, we do not use lazy updates or periodic queue rebuilding as proposed in .
The metric-dependent orders presented in the previous section lead to very good results on road graphs with the travel time metric. However, the results for the distance metric are not as good, and the orders are completely impracticable for computing Contraction Hierarchies without witness search, as our experiments in Section 9 show. To support metric-independence, we therefore use nested dissection orders as suggested in , or ND-orders for short. An order is computed recursively by determining a balanced separator of minimum cardinality that splits the graph into two parts. The separator vertices are assigned the highest ranks, in an arbitrary relative order. Orders for the two parts are computed recursively and assigned to the remaining ranks. The base case of the recursion is reached when the subgraphs are empty. Computing ND-orders requires good graph bisectors, which in theory is NP-hard. However, recent years have seen heuristics that solve the problem very well even for continental road graphs [34, 15, 14]. This justifies assuming in our particular context that an efficient bisection oracle exists. We experimentally examine the performance of nested dissection orders computed by NDMetis  and KaHIP  in Section 9. After having obtained the nested dissection order, we reorder the in-memory vertex IDs of the input graph accordingly, i. e., the contraction order of the reordered graph is the identity function. This improves cache locality, and we have seen a resulting acceleration by a factor of 2 to 3 in query times. In the remainder of this section we prepare and provide a theoretical approximation result.
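The recursion can be sketched as follows, with the bisection oracle left abstract; in practice a partitioner such as NDMetis or KaHIP plays that role. All names here are illustrative, and the oracle is assumed to return the two sides and the separator:

```python
def nested_dissection_order(vertices, edges, bisect):
    """Return the vertices in ND-order: recursively order side A, then
    side B, and place the separator S last (i.e. at the highest ranks).
    `bisect(vertices, edges)` is an assumed oracle returning (A, B, S)."""
    if not vertices:
        return []
    if len(vertices) == 1:
        return list(vertices)
    A, B, S = bisect(vertices, edges)
    def induced(part):
        part = set(part)
        return [(u, v) for u, v in edges if u in part and v in part]
    return (nested_dissection_order(A, induced(A), bisect)
            + nested_dissection_order(B, induced(B), bisect)
            + list(S))
```

Placing separators last means the top-level separator vertices receive the highest ranks and are contracted last, which is what bounds every search space by the recursive separator sizes.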
For , let , be a class of graphs that is closed under subgraph construction and admits balanced separators of cardinality .
For every such graph, an ND-order results in vertices in the maximum search space.
The proof of this lemma is a straightforward argument using a geometric series as described in . As a direct consequence, the average number of vertices is also in and the number of arcs in .
For every connected graph with minimum balanced separator and every order , the chordal super graph contains a clique of vertices. Furthermore, there are at least vertices such that this clique is a subgraph of their search space in .
This lemma is a minor adaptation and extension of , who only prove that such a clique exists but not that it lies within enough search spaces. We provide the full proof to be self-contained.
Consider the subgraph of induced by the vertices . Do not confuse with the core graph . Choose the smallest , such that a connected component exists in such that . As is connected, such an must exist. We distinguish two cases:
: Consider the set of vertices adjacent to in but not in . Let be the set of all remaining vertices. is by definition a separator. It is balanced because and . As is minimum, we have that . For every pair of vertices and there exists a path through as is connected. The vertices and are not in as otherwise they could be added to . The ranks of and are thus strictly larger than . On the other hand, the ranks of the vertices in are at most as they are part of . The vertices and thus have the highest ranks on the path. They are therefore contracted last and therefore an edge in must exist. is therefore a clique. Furthermore, from every to every there exists a path such that has the highest rank. Hence, is in the search space of , i. e., there are at least vertices whose search space contains the full -clique.
: As is minimum, we know that and that removing it disconnects into connected subgraphs . We know that for all because is minimum. We further know that . We can therefore select a subset of components such that the number of their vertices is at most but at least . Denote by their union. Note that does not contain . Consider the vertices adjacent to in . The set contains . Using an argument similar to Case 1, one can show that . But since is not connected, we cannot directly use the same argument to show that forms a clique in . Observe that is connected and thus the argument can be applied to showing that it forms a clique. This clique can be enlarged by adding as for every a path through one of the components exists where and have the highest ranks and thus an edge must exist. The vertex set therefore forms a clique of at least the required size. It remains to show that enough vertices exist whose search space contains the clique. As has the lowest rank in the clique, the whole clique is contained within the search space of . It is thus sufficient to show that is contained in enough search spaces. As is adjacent to each component , a path from each vertex to exists such that has maximum rank showing that is contained in the search space of . This completes the proof as .
Let be a graph from with a minimum balanced separator with vertices. Then an ND-order gives an -approximation of the average and maximum search spaces of an optimal metric-independent contraction hierarchy in terms of vertices and arcs.
The key observation of this proof is that the top level separator solely dominates the performance. Denote by the ND-order and by an optimal order. First, we show a lower bound on the performance of . We then demonstrate that achieves this lower bound showing that is an -approximation.
As the minimum balanced separator has cardinality , we know by Lemma 2 that a clique with vertices exists in . As this clique is in the search space of at least one vertex with respect to , we know that the maximum number of vertices in the search space is at least . Further, as this clique contains arcs we also have a lower bound of on the maximum number of arcs in a search space. From these bounds for the worst case search space, we cannot directly derive bounds for the average search space. Fortunately, Lemma 2 does not only tell us that this clique exists but that it must also be inside the search space of at least vertices. For the remaining vertices we use a very pessimistic lower bound: We assume that their search space is empty. The resulting lower bound for the average number of vertices is and the lower bound for the average number of arcs is .
We required that , i. e., that recursive balanced separators exists. This allows us to apply Lemma 1. We therefore know that the number of vertices in the maximum search space of is in . In the worst-case this search space contains arcs. As the average case can never be better than the worst case, these upper bounds directly translate to upper bounds for the average search space.
As the given upper and lower bounds match, we can conclude that is a -approximation in terms of average and maximum search space in terms of vertices and arcs. ∎
In this section, we describe how to efficiently compute the hierarchy for a given graph and order . Weighted contraction hierarchies are commonly constructed using a dynamic adjacency array representation of the core graph. Our experiments show that this approach also works for the unweighted case, however, requiring more computational and memory resources because of the higher growth in shortcuts. It has been proposed  to use hash-tables on top of the dynamic graph structure to improve speed but at the cost of significantly increased memory consumption. In this section, we show that the contraction hierarchy construction can be done significantly faster on unweighted and undirected graphs. Note that in our toolchain, graph weights and arc directions are accounted for during the customization phase.
Denote by the number of vertices in (and ), by the number of edges in , by the number of arcs in , and by the inverse Ackermann function. For simplicity we assume that is connected. Our approach enumerates all arcs of in running time and has a memory consumption in . To store the arcs of , additional space in is needed. The approach is heavily based upon the method of the quotient graph . To the best of our knowledge it has not yet been applied in the context of route planning and there exists no complexity analysis for the specific variant employed by us. Therefore we discuss both the approach and present a running time analysis in the remainder of the section.
Recall that to compute the contraction hierarchy from a given input graph and order , one iteratively contracts each vertex, adding shortcuts between its neighbors. Let be the core graph in iteration . We do not store explicitly but employ a special data structure called contraction graph for efficient contraction and neighborhood enumeration. The contraction graph contains both yet uncontracted core vertices as well as an independent set of virtually contracted super vertices, see Figure 3 for an illustration. These super vertices enable us to avoid the overhead of dynamically adding shortcuts to . For each vertex in we store a marker bit indicating whether it is a super vertex. Note that can be obtained by contracting all super vertices in .
A vertex in the contraction graph is contracted by turning it into a super vertex. However, creating new super vertices can violate the independent set property. We restore it by merging neighboring super vertices: Denote by x a super vertex that is a neighbor of the newly contracted vertex v. We rewire all edges incident to x to be incident to v and remove x. To support efficient merging, we store a linked list of neighbors for each vertex. When merging two vertices we link these lists together. Unfortunately, combining these lists is not enough, as the former neighbors of x still have x in their lists of neighbors. We therefore further maintain a union-find data structure: Initially all vertices are within their own set. When merging v and x, the sets of v and x are united. We choose v as representative, as x was deleted. (Alternatively, we could let the union-find data structure choose the new representative. In that variant the new representative may be the old x, which can be confusing; for this reason we describe the simpler variant, where v is always chosen as representative and thus always refers to the same vertex.) When a former neighbor of x enumerates its neighbors, it finds a reference to x. It can then use the union-find data structure to determine that the representative of x’s set is v. The reference is thus interpreted as pointing to v.
Merging vertices can create multi-edges and loops. For example, consider that the neighborhood list of x contains v. After merging, the united list of v will therefore contain a reference to v itself. Similarly, it will contain a reference to x, which after looking up the representative is actually v. Two loops are thus created at v per merge. Furthermore, consider a vertex that is a neighbor of both v and x. In this case the neighborhood list of v will contain two references to it. These multi-edges and loops need to be removed. We do this lazily in the neighborhood enumeration instead of removing them in the merge operation.
Suppose that we want to enumerate the neighbors of a vertex v in the contraction graph. Note that v’s neighborhood in the contraction graph differs from its neighborhood in the core graph: The former can contain super vertices, as super vertices are only contracted in the core graph. We maintain a boolean marker that indicates which neighbors have already been enumerated. Initially no marker is set. We iterate over v’s neighborhood list. For each reference we look up the representative. If the representative was already marked or is v itself, we remove the reference from the list. Otherwise, we mark it and report it as a neighbor. After the enumeration we reset all markers by enumerating the neighbors again.
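The merge and lazy-cleanup mechanics can be sketched with a union-find structure. This compact Python sketch mirrors the description but simplifies the linked lists to Python lists and uses a set instead of marker bits; all names are ours:

```python
class ContractionGraph:
    """Sketch of the contraction-graph data structure: merged super vertices
    are tracked via union-find, and stale references (loops, duplicates) in
    the neighbor lists are cleaned up lazily during enumeration."""
    def __init__(self, n, edges):
        self.nbrs = [[] for _ in range(n)]
        for u, v in edges:
            self.nbrs[u].append(v)
            self.nbrs[v].append(u)
        self.rep = list(range(n))  # union-find parent array

    def find(self, v):
        while self.rep[v] != v:
            self.rep[v] = self.rep[self.rep[v]]  # path halving
            v = self.rep[v]
        return v

    def merge_into(self, x, v):
        """Merge super vertex x into v: concatenate the neighbor lists and
        let the union-find redirect old references to x towards v."""
        self.nbrs[v].extend(self.nbrs[x])
        self.nbrs[x] = []
        self.rep[x] = v  # v stays the representative, as in the text

    def neighbors(self, v):
        """Enumerate v's current neighbors, lazily dropping loops and
        duplicates and compacting the list in place (v must be its own
        representative when this is called)."""
        seen, out = set(), []
        for u in self.nbrs[v]:
            r = self.find(u)
            if r != v and r not in seen:
                seen.add(r)
                out.append(r)
        self.nbrs[v] = out
        return out
```

On a triangle 0–1–2, merging 1 into 0 leaves 0 with the raw list [1, 2, 0, 2]; the lazy enumeration then collapses it to the single neighbor 2, exactly as the loop and duplicate removal described above prescribes.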
However, during the execution of our algorithm, we are not interested in the neighborhood of v in the graph with super vertices, but in the neighborhood of v in the contracted graph, i.e., the enumeration should not list super vertices. Our algorithm conceptually first enumerates the neighborhood of v and then contracts v. We actually do this in reverse order: We first contract v. After the contraction, v is a super vertex. Because of the independent set property, we know that v has no super vertex neighbors. We can thus enumerate v's neighbors as described above and exploit that, in this particular situation, the two neighborhoods coincide.
As there are no memory allocations, it is clear that the working space consumption is in O(n+m). Proving the running time bound is less clear. Denote by d(v) the degree of v just before v is contracted; d(v) coincides with the upward degree of v in the CH, and thus the d(v) sum to the number of arcs in the CH. We first prove that we can account for the neighborhood cleanup operations outside of the actual algorithm. This allows us to assume that they are free within the main analysis. We then show that contracting a vertex v and enumerating its neighbors is in O(d(v)·α(n)), where α denotes the inverse Ackermann function of the union-find structure. Processing all vertices thus has a running time in O(α(n)) per arc of the CH.
The neighborhood list of v can contain duplicated references, and thus its length can be larger than the number of neighbors of v. Further, for each entry in the list, we need to perform a union-find lookup. The cost of a neighborhood enumeration can thus exceed O(d(v)·α(n)). Fortunately, the first neighborhood enumeration compactifies the neighborhood list, so every subsequent enumeration runs in O(d(v)·α(n)). Removing a reference has a cost in O(α(n)). Our algorithm never adds references. Initially, there are 2m references. The total cost for removing references over the whole algorithm is thus in O(m·α(n)). As our graph is assumed to be connected, we have n ∈ O(m), so this cleanup cost is dominated by the overall bound. We can therefore assume that removing references is free within the algorithm. As removing a reference is free, we can assume that even the first enumeration of the neighbors of v is within O(d(v)·α(n)). Merging two vertices consists of redirecting a constant number of references within a linked list. The merge operation is thus in O(1).
Our algorithm starts by enumerating all neighbors of v to determine all neighboring super vertices in O(d(v)·α(n)) time, where d(v) is v's degree at contraction time. There are at most d(v) neighboring super vertices, and therefore the cost of merging all super vertices into v is in O(d(v)). We subsequently enumerate all neighbors a second time to output the arcs of the CH. The cost of this second enumeration is also within O(d(v)·α(n)). As the d(v) sum to the number of arcs in the CH, the whole algorithm runs in O(α(n)) time per CH arc, which completes the proof.
While the described algorithm is efficient in theory, linked lists cause too many cache misses in practice. We therefore implemented a hybrid of a linked list and an adjacency array, which has the same worst-case performance but is more cache-friendly in practice. An element in the linked list does not hold a single reference, but a small set of references organized as small arrays called blocks. The neighbors of every original vertex form a single block. The initial linked neighborhood lists are therefore composed of a single block. We merge two vertices by linking their blocks together. If all references are deleted from a block, we remove it from the list.
A triangle {x,y,z} is a set of three pairwise adjacent vertices. A triangle can be an upper, intermediate, or lower triangle with respect to an arc {x,y}, as illustrated in Figure 4. A triangle {x,y,z} is a lower triangle of {x,y} if z has the lowest rank among the three vertices. Similarly, it is an upper triangle of {x,y} if z has the highest rank, and an intermediate triangle of {x,y} if z's rank is between the ranks of x and y. The triangles of an arc can be characterized using the upward and downward neighborhoods N↑ and N↓ of x and y. There is a lower triangle {x,y,z} of an arc {x,y} if and only if z ∈ N↓(x) ∩ N↓(y). Similarly, assuming x is ranked below y, there is an intermediate triangle {x,y,z} if and only if z ∈ N↑(x) ∩ N↓(y), and an upper triangle if and only if z ∈ N↑(x) ∩ N↑(y). The triangles of an arc can thus be enumerated by intersecting the neighborhoods of the arc's endpoints.
Efficiently enumerating all lower triangles of an arc is an important base operation of the customization (Section 7) and path unpacking algorithms (Section 8). It can be implemented using adjacency arrays or accelerated using extra preprocessing. Note that in addition to the vertices of a triangle we are interested in the IDs of the participating arcs as we need these to access the metric of an arc.
Triangles can be efficiently enumerated by exploiting their characterization via neighborhood intersections. We construct an upward and a downward adjacency array for the CH, where incident arcs are ordered by their head respectively tail vertex ID. The lower triangles of an arc {x,y} can be enumerated by simultaneously scanning the downward neighborhoods of x and y to determine their intersection. Intermediate and upper triangles are enumerated analogously using the upward adjacency arrays. For later access to the metric of an arc, we also store each arc's ID in the adjacency arrays. This approach requires space proportional to the number of arcs in the CH.
Instead of merging the neighborhoods on demand to find all lower triangles, we propose to create a triangle adjacency array structure that maps the arc ID of {x,y} to the pair of arc IDs of {z,x} and {z,y} for every lower triangle {x,y,z}. This requires space proportional to the number of triangles in the CH, but allows very fast access. Analogous structures allow us to efficiently enumerate all upper triangles and all intermediate triangles.
For less well-behaved graphs, the number of triangles can significantly outgrow the number of arcs. In the worst case, the CH is a complete graph: the number of triangles is in Θ(n³), whereas the number of arcs is in Θ(n²). It can therefore be prohibitive to store a list of all triangles. We therefore propose a hybrid approach: We only precompute the triangles for the arcs {x,y} whose lower-ranked endpoint lies below a certain level threshold. The threshold is a tuning parameter that trades space for time.
Triangle preprocessing has similarities with micro and macro code in CRP. In the following, we compare the space consumption of these two approaches against our lower triangles preprocessing scheme. Note, however, that at this stage we do not yet consider travel direction on arcs. Hence, let t be the number of undirected triangles and m the number of arcs in the CH; further, let t' be the number of directed triangles and m' the number of directed arcs used. If every street is a one-way street, then t' = t and m' = m; without any one-way streets, t' = 2t and m' = 2m.
Micro code stores an array of triples of pointers to the arc weights of the three arcs in a directed triangle, i.e., it stores the equivalent of 3t' arc IDs. Computing the exact space consumption of macro code is more difficult. However, it is easy to obtain a lower bound: Macro code must store for every triangle at least the pointer to the arc weight of the upper arc. This yields a space consumption equivalent to at least t' arc IDs. In comparison, our approach stores for each triangle the arc IDs of the two lower arcs, i.e., 2t arc IDs. Additionally, the index array of the triangle adjacency array, which maps each arc to the set of its lower triangles, maintains m+1 entries. Each entry has a size equivalent to an arc ID. Our total memory consumption is thus equivalent to 2t + m + 1 arc IDs.
Hence, our approach always requires less space than micro code. It has a similar space consumption as macro code if one-way streets are rare; otherwise, it needs at most twice as much data. However, the main advantage of our approach over macro code is that it allows random access, which is crucial for the algorithms presented in the following sections.
Up to now we only considered the metric-independent first preprocessing phase. In this section, we describe the second, metric-dependent preprocessing phase, known as customization. That is, we show how to efficiently extend the weights of the input graph to a corresponding metric m with weights m(x,y) for all arcs {x,y} in the CH. We consider three different distances between vertices: We refer to dist_G(s,t) as the shortest st-path distance in the input graph G. With dist_ud(s,t) we denote the shortest st-path distance in the CH when only considering up-down paths. Finally, let dist_CH(s,t) be the shortest st-path distance in the CH, i.e., when allowing arbitrary, not necessarily up-down, paths.
For correctness of the CH query algorithms (cf. Section 8), it is necessary that between any pair of vertices s and t a shortest up-down st-path in the CH exists with the same distance as the shortest st-path in the input graph G. In other words, dist_ud(s,t) = dist_G(s,t) must hold for all vertices s and t. We say that a metric m that fulfills dist_CH(s,t) = dist_G(s,t) respects the input weights. If additionally dist_ud(s,t) = dist_CH(s,t) holds, we call the metric customized. Note that customized metrics are not necessarily unique. However, there is a special customized metric, called the perfect metric m*, where for every arc {x,y} in the CH the weight of this arc equals the shortest path distance dist_G(x,y). We optionally use the perfect metric to perform a perfect witness search.
Constructing a respecting metric is trivial: Assign to all arcs of the CH that already exist in G their input weight, and to all other arcs ∞. Computing a customized metric is less trivial. We therefore describe in Section 7.1 the basic customization algorithm, which computes a customized metric given a respecting one. Afterwards, we describe the perfect customization algorithm, which computes the perfect metric m* given a customized one (such as the output of the basic customization). Finally, we show how to employ the perfect metric to perform a perfect witness search.
A central notion of the basic customization algorithm is the lower triangle inequality, which is defined as follows: A metric m fulfills it if, for all lower triangles {x,y,z} of each arc {x,y} of the CH, it holds that m(x,y) ≤ m(x,z) + m(z,y). We show that every respecting metric that also fulfills this inequality is customized. Our algorithm exploits this by transforming the given respecting metric in a coordinated way that maintains the respecting property and assures that the lower triangle inequality holds. The resulting metric is thus customized. We first describe the algorithm and prove that the resulting metric is respecting and fulfills the inequality. We then prove that this is sufficient for the resulting metric to be customized.
Our algorithm iterates over all arcs {x,y}, ordered increasingly by the rank of the lower-ranked endpoint x, in a bottom-up fashion. For each arc {x,y}, it enumerates all lower triangles {x,y,z} and checks whether the path x→z→y is shorter than the path x→y. If this is the case, it decreases m(x,y) so that both paths are equally long. Formally, it performs for every arc the operation m(x,y) ← min{m(x,y), m(x,z) + m(z,y)}. Note that this operation never assigns values that do not correspond to a path length, and therefore the metric remains respecting. By induction over the vertex levels, we can show that after the algorithm has finished, the lower triangle inequality holds for every arc, i.e., for every arc {x,y} and lower triangle {x,y,z} we have m(x,y) ≤ m(x,z) + m(z,y). The key observation is that, by construction, the rank of z must be strictly smaller than the ranks of x and y. The final weights of {x,z} and {z,y} have therefore already been computed when considering {x,y}. In other words, when the algorithm considers the arc {x,y}, the weights m(x,z) and m(z,y) are guaranteed to remain unchanged until termination.
Every respecting metric that additionally fulfills the lower triangle inequality is customized.
We need to show that between any pair of vertices s and t a shortest up-down st-path exists. As we assumed for simplicity that G is connected, there always exists a shortest, not necessarily up-down, path p from s to t. Either p is an up-down path, or a subpath x→z→y with r(z) < r(x) and r(z) < r(y) must exist. As z is contracted before x and y, an edge {x,y} must exist. Because of the lower triangle inequality, we further know that m(x,y) ≤ m(x,z) + m(z,y), and thus replacing x→z→y by x→y does not make the path longer. Either the path is now an up-down path, or we can apply the argument iteratively. As the path has only a finite number of vertices, this is guaranteed to eventually yield the up-down path required by the theorem, which completes the proof. ∎
Given a customized metric m, we want to compute the perfect metric m*. We first copy all values of m into m*. Our algorithm then iterates over all arcs {x,y}, ordered decreasingly by the rank of the lower-ranked endpoint x, in a top-down fashion. For every arc, it enumerates all intermediate and upper triangles {x,y,z} and checks whether the path over z is shorter, adjusting the value of m*(x,y) accordingly, i.e., it performs m*(x,y) ← min{m*(x,y), m*(x,z) + m*(z,y)}. After all arcs have been processed, m* is the perfect metric, as shown in the following theorem.
After the perfect customization, m*(x,y) corresponds to the shortest xy-path distance for every arc {x,y}, i.e., m* is the perfect metric.
We have to show that after the algorithm has finished processing a vertex x, all of its upward arcs in the CH are weighted by the shortest path distance. We prove this by induction over the levels of the processed vertices. The top-most vertex is the only vertex in the top level. It does not have any upward arcs, so the algorithm has nothing to do; this forms the base case of the induction. In the inductive step, we assume that all vertices with a strictly higher level than x have already been processed. As a consequence, we know that the upward neighbors of x form a clique weighted by shortest path distances. Denote these neighbors by z1, ..., zk. The situation is depicted in Figure 5. The weights of the clique arcs encode a complete shortest path distance table between the upward neighbors of x.
Pick some arbitrary arc {x,y}. We show the correctness of our algorithm by proving that either m*(x,y) is already the shortest path distance or a neighbor z_i must exist such that x→z_i→y is a shortest up-down path. For the rest of this paragraph, assume the existence of z_i; we prove its existence in the next paragraph. If m*(x,y) is already the shortest xy-path distance, then enumerating triangles will not change it, and the algorithm is thus correct. If m*(x,y) is not the shortest xy-path distance, then enumerating all intermediate and upper triangles of {x,y} is guaranteed to find the path x→z_i→y, and thus the algorithm is correct. The upper triangles correspond to paths with r(z_i) > r(y), while the intermediate triangles correspond to paths with r(z_i) < r(y).
It remains to show that the shortest up-down path x→z_i→y actually exists. As the metric is customized at every moment during the perfect customization, we know that a shortest up-down xy-path p exists. As p is an up-down path, we can conclude that the second vertex of p must be an upward neighbor of x. We denote this neighbor by z_i. p thus has the following structure: x→z_i→...→y. As z_i has a higher rank than x, m*(z_i,y) is guaranteed to be the shortest z_i-y-path distance, and therefore we can replace the subpath of p from z_i to y by the arc {z_i,y}. We have thus proven that the required shortest up-down path x→z_i→y exists, which completes the proof. ∎
Using the perfect customization algorithm, we can efficiently compute the weighted CH with a minimum number of arcs with respect to the same contraction order. We present two variants of our algorithm. The first variant consists of removing all arcs {x,y} whose weight after the basic customization does not correspond to the shortest xy-path distance. This variant is simple and always correct, but it does not remove as many arcs as possible if a pair of vertices exists in the input graph with multiple shortest paths between them. The second variant (note that it exploits that we defined weights as being non-zero; if zero weights are allowed, it may remove too many arcs — a workaround consists of replacing all zero weights with a very small but non-zero weight) also removes these additional arcs. An arc {x,y} is removed if and only if an upper or intermediate triangle {x,y,z} exists such that the path from x over z to y is no longer than the shortest xy-path. However, before we can prove the correctness of the second variant, we need to introduce some technical machinery, which will also be needed in the correctness proof of the stalling query algorithm. We define the "height" of a not necessarily up-down path in the CH. We show that, with respect to every customized metric, for every path that is not an up-down path, an up-down path must exist that is strictly higher and no longer.
The first algorithm variant consists of removing all arcs {x,y} from the CH for which m(x,y) ≠ dist_G(x,y). It is optimal if shortest paths are unique in the input graph, i.e., if between every pair of vertices s and t there is only one shortest st-path. This simple algorithm is correct, as the following theorem shows.
If the input graph has unique shortest paths between all pairs of vertices, then we can remove an arc {x,y} from the CH if and only if m(x,y) ≠ dist_G(x,y).
We need to show that, after removing all such arcs, a shortest up-down path still exists between every pair of vertices s and t. We know that before removing any arc a shortest up-down st-path p exists. We show that no arc of p is removed, and thus p also exists after removing all arcs. Every subpath of p must be a shortest path, as p is a shortest path. Every arc of p is a subpath. However, we only remove arcs {x,y} with m(x,y) ≠ dist_G(x,y), i.e., arcs that are not shortest paths.
To show that no further arcs can be removed, we need to show that if m(x,y) = dist_G(x,y), then the path consisting of the single arc {x,y} is the only shortest up-down xy-path. Denote this path by p. Suppose that another shortest up-down path q existed. q must be different from p, i.e., a vertex z must exist that lies on q but not on p. As z must be reachable from x, we know that z is ranked higher than x. Unpacking the path p in the input graph yields a path in which x and y are the highest-ranked vertices, and thus this unpacked path cannot contain z. Unpacking q yields a path that contains z and is therefore different. Both paths are shortest paths from x to y in the input graph. This contradicts the assumption that shortest paths are unique. We have thus proven that, if the input graph has unique shortest paths, we can remove an arc {x,y} if and only if m(x,y) ≠ dist_G(x,y). ∎
Using the first variant of our algorithm even when shortest paths are not unique in the input graph is not wrong. However, it is possible that some arcs that could be removed are not removed. Our second algorithm variant does not have this weakness. It removes all arcs {x,y} for which an intermediate or upper triangle {x,y,z} exists such that m*(x,z) + m*(z,y) ≤ m*(x,y). These arcs can be efficiently identified while running the perfect customization algorithm: an arc is marked for removal whenever an upper or intermediate triangle with m*(x,z) + m*(z,y) ≤ m*(x,y) is encountered. However, before we can prove the correctness of the second variant, we need to introduce some technical machinery.
We want to order paths by "height". To achieve this, we first define for each path p in the CH its rank sequence. We order paths by comparing the rank sequences lexicographically. Denote by x1, ..., xk the vertices in p. For each edge {xi, xi+1} in p, the rank sequence contains min{r(xi), r(xi+1)}. The numbers in the rank sequence are ordered decreasingly. Two paths have the same height if one rank sequence is a prefix of the other. Otherwise, we compare the rank sequences lexicographically. This ordering is illustrated in Figure 6. We prove the following technical lemma:
Let m be some customized metric. For every st-path p that is not an up-down path, an up-down st-path q exists such that q is strictly higher than p and no longer than p with respect to m.
Denote by x1, ..., xk the vertices on the path p. As p is not an up-down path, there must exist a vertex xi on p that has a lower rank than its neighbors xi-1 and xi+1. xi-1 and xi+1 are different vertices, because they are part of a shortest path and zero weights are not allowed. Further, as xi is contracted before its neighbors, there must be an edge between xi-1 and xi+1. As the metric is customized, m(xi-1, xi+1) ≤ m(xi-1, xi) + m(xi, xi+1) must hold. We can thus remove xi from p and replace the subpath with the arc {xi-1, xi+1} without making the path longer. Denote this new path by p'. p' is higher than p, as we replaced r(xi) in the rank sequence by min{r(xi-1), r(xi+1)}, which must be larger. Either p' is an up-down path, or we apply the argument iteratively. In each iteration the path loses a vertex, and therefore we can guarantee that we eventually obtain an up-down path that is no longer than p and strictly higher. ∎
Note that this lemma does not exploit any property inherent to CHs with a metric-independent contraction order and is thus applicable to every CH.
Given this technical lemma, we can prove the correctness of the second variant of our algorithm.
We can remove an arc {x,y} if and only if an upper or intermediate triangle {x,y,z} exists with m*(x,z) + m*(z,y) ≤ m*(x,y).
We need to show that for every pair of vertices s and t a shortest up-down st-path exists that uses no removed arc. We show that a highest shortest up-down st-path p has this property. As the metric is customized, we know that a shortest up-down st-path exists before removing any arcs. If p does not contain an arc {x,y} for which an upper or intermediate triangle {x,y,z} exists with m*(x,z) + m*(z,y) ≤ m*(x,y), then there is nothing to show. Otherwise, we modify p by inserting z between x and y. This does not increase the length of p, but we can no longer guarantee that p is an up-down path. If {x,y,z} was an intermediate triangle, then p is still an up-down path. However, it is strictly higher, as we added r(z) into the rank sequence, which is guaranteed to be larger than r(x). If {x,y,z} was an upper triangle, then p is no longer an up-down path. Fortunately, using Lemma 3 we can transform p into an up-down path that is no longer and strictly higher. In both cases, either the new p is an up-down path without removable arcs, or we apply the argument iteratively. As p gets strictly higher in each iteration and the number of up-down paths is finite, we know that we will eventually obtain a shortest up-down st-path of which no arc can be removed.
Further, we need to show that if no such triangle exists, then the arc {x,y} cannot be removed, i.e., we need to show that the only shortest up-down path from x to y is the path consisting of the arc alone. Assume that no such triangle exists but a further shortest up-down path q existed. q must contain a vertex besides x and y, and all vertices in q must have the rank of x or higher. Consider the vertex z that comes directly after x in q. As x is contracted before z and y, an arc between z and y must exist. Therefore, a triangle {x,y,z} must exist, which is an intermediate triangle if z has a lower rank than y and an upper triangle if z has a higher rank than y. However, we assumed that no such triangle exists. We have thus proven that we can remove an arc {x,y} if and only if an upper or intermediate triangle {x,y,z} exists with m*(x,z) + m*(z,y) ≤ m*(x,y). ∎
The basic customization can be parallelized by processing the arcs that depart within a level in parallel. Between levels, we need to synchronize all threads using a barrier. As all threads only write to the arc they are currently assigned to and only read from arcs processed in a strictly lower level, we can guarantee that no read/write conflict occurs. Hence, no locks or atomic operations are needed.
On most modern processors, the perfect customization can be parallelized analogously to the basic customization algorithm. However, seeing why this is correct is non-obvious, because the exact order in which the threads are executed influences intermediate results. Fortunately, the end result is always the same and independent of the execution order. Our algorithm works as follows: We iterate over all arcs departing within a level in parallel and synchronize all threads between levels. For every arc {x,y}, we enumerate all upper and intermediate triangles and update m*(x,y) accordingly. Consider the situation from Figure 5. Suppose that one thread processes the arc {x,y} at the same time as another thread processes the arc {x,z}. Further, suppose that the first thread updates m*(x,y) at the same moment as the second thread enumerates a triangle containing {x,y}. In this situation it is unclear which value the second thread will see. However, our algorithm is correct as long as it is guaranteed that the second thread sees either the old value or the new value.
In the proof of Theorem 3, we have shown that for every vertex x and arc {x,y}, either the arc already has the shortest path distance, or an upper or intermediate triangle {x,y,z} exists such that x→z→y is a shortest path. No matter in which order the threads process the arcs, they never modify weights that already correspond to shortest path distances. This implies that the shortest path x→z→y is retained regardless of the execution order. This shortest path is not modified and is guaranteed to exist before any arcs outgoing from the current level are processed. Every thread is thus guaranteed to see it. However, other weights can be modified. Fortunately, this is not a problem as long as we can guarantee that no thread sees a value below the corresponding shortest path distance. Therefore, if we can guarantee that a thread either sees the old value or the new value, as is the case on x86 processors, then the algorithm is correct. If a thread can see some mangled combination of the old value's bits and the new value's bits, then we need to use locks or make sure that all outgoing arcs of x are processed by the same thread.
Up to now we have focused on customizing undirected graphs. If the input graph is directed, our toolchain works as follows: Based on the undirected, unweighted graph induced by the input, we compute a vertex ordering (Section 4), build the upward directed Contraction Hierarchy (Section 5), and optionally perform triangle preprocessing (Section 6). For customization, however, we consider two weights per arc of the CH, one for each direction of travel. One-way streets are modeled by setting the weight corresponding to the forbidden traversal direction to ∞. With respect to the CH, we define an upward metric m↑ and a downward metric m↓. For each arc (x,y) in the directed input graph with input weight w, we set m↑(x,y) = w if x is ordered before y; otherwise, we set m↓(y,x) = w. All other values of m↑ and m↓ are set to ∞. In other words, each arc {x,y} of the Contraction Hierarchy (with x ordered before y) has upward weight w if (x,y) is an input arc, downward weight w if (y,x) is an input arc, and weight ∞ otherwise.
The basic customization considers both metrics m↑ and m↓ simultaneously. For every lower triangle {x,y,z} of an arc {x,y}, it sets m↑(x,y) ← min{m↑(x,y), m↓(z,x) + m↑(z,y)} and m↓(x,y) ← min{m↓(x,y), m↓(z,y) + m↑(z,x)}. The perfect customization can be adapted analogously. For every intermediate triangle {x,y,z} of {x,y}, it sets m↑(x,y) ← min{m↑(x,y), m↑(x,z) + m↑(z,y)} and m↓(x,y) ← min{m↓(x,y), m↓(z,y) + m↓(x,z)}. Similarly, for every upper triangle {x,y,z}, it sets m↑(x,y) ← min{m↑(x,y), m↑(x,z) + m↓(y,z)} and m↓(x,y) ← min{m↓(x,y), m↑(y,z) + m↓(x,z)}. The perfect witness search might need to remove an arc in only one direction. It therefore produces, just as in the original CHs, two search graphs: an upward search graph and a downward search graph. The forward search in the query phase is limited to the upward search graph and the backward search to the downward search graph, just as in the original CHs. The arc {x,y} is removed from the upward search graph if and only if an intermediate triangle with m↑(x,z) + m↑(z,y) ≤ m↑(x,y) exists or an upper triangle with m↑(x,z) + m↓(y,z) ≤ m↑(x,y) exists. Analogously, the arc {x,y} is removed from the downward search graph if and only if an intermediate triangle with m↓(z,y) + m↓(x,z) ≤ m↓(x,y) exists or an upper triangle with m↑(y,z) + m↓(x,z) ≤ m↓(x,y) exists.
The weights attached to each arc in the CH can be replaced by an interleaved set of k weights by storing for every arc a vector of k elements. Vectors allow us to customize all k metrics in one go, amortizing the triangle enumeration time, and they allow us to use single instruction multiple data (SIMD) operations. Further, as we essentially use two metrics to handle directed graphs, we can store both of them in a 2-dimensional vector. This allows us to handle both directions in a single processor instruction. Similarly, if we have k directed input weights, we can store them in 2k-dimensional vectors.
The processor needs to support component-wise minimum and saturated addition, i.e., x + y = ∞ must hold in the case of an overflow. For directed graphs it additionally needs to support efficiently swapping neighboring vector components. A current SSE-enabled processor supports all the necessary operations for 16-bit integer components. For 32-bit integers, saturated addition is missing. There are two possibilities to work around this limitation: The first is to emulate saturated addition using a combination of regular addition, comparison, and blend/if-then-else instructions. The second consists of using 31-bit weights and using 2^31 − 1 as the value for ∞ instead of 2^32 − 1. The algorithm only computes the saturated addition of two weights followed by taking the minimum of the result and some other weight, i.e., if computing min{a + b, c} for all weights a, b, and c is unproblematic, then the algorithm works correctly. We know that a and b are at most 2^31 − 1, and thus their sum is at most 2^32 − 2, which fits into a 32-bit integer. In the next step, we know that c is at most 2^31 − 1, and thus the resulting minimum is also at most 2^31 − 1.
Until now we have only considered computing metrics from scratch. However, in many scenarios this is overkill, as we know that only a few edge weights of the input graph have changed, and it is unnecessary to redo all computations. The ideas employed by our algorithm are somewhat similar to those presented in , but our situation differs, as we know that we do not have to insert or remove arcs. Denote by U the set of changes, containing pairs (a, w), where a is the ID of an arc whose weight should be updated and w its new weight. Note that modifying the weight of one arc can trigger further changes; however, these new changes have to be at higher levels. We therefore organize U as a priority queue ordered by the level of the arc a. We iteratively remove arcs from the queue and apply the change. If new changes are triggered, we insert these into the queue. The algorithm terminates once the queue is empty.
Denote by {x,y} the arc that was removed from the queue, by w its new weight, and by w_old its old weight. We first have to check whether {x,y} can be bypassed using a lower triangle. For this reason, we iterate over all lower triangles {x,y,z} and perform w ← min{w, m(x,z) + m(z,y)}. Furthermore, if {x,y} is an edge in the input graph G, we might have overwritten its weight with a shortcut weight, which after the update might not be shorter anymore. Hence, we additionally test that