Despite remarkable progress of route planning algorithms in road networks [4], public transit routing still requires specific algorithms due to its temporal nature. Various efficient methods have been proposed, such as CSA [12], RAPTOR [11], Transfer Patterns [3, 5], and PTL [8]. They all consider a graph with two types of edges: connections, which correspond to a vehicle traveling from a stop to the next one, and transfers, which correspond to walking from a stop to another nearby stop. While each connection is scanned only once per query, transfer edges from a stop are considered each time an event is detected at the stop. The efficiency of such techniques thus relies on the sparsity of the transfer graph. Additionally, they all require the graph resulting from walking transfers to be transitively closed, and they are generally evaluated with a sparse transfer graph obtained by restricting transfers to very short distances. Allowing unrestricted transfers, that is, walking from a stop to any other stop, is out of reach with these methods, although it would yield better answers. Indeed, recent work [19] shows the benefit of unrestricted walking over sparse transfers: it can reduce travel time by hours in the networks of Switzerland and Germany.
This paper is devoted to enabling unrestricted walking in efficient public transit routing. The motivation for considering unrestricted walking goes beyond the gain in answer quality. It is a fundamental step towards computing multimodal journeys, as it is identified as a main bottleneck in [7]. Note that bicycle or taxi transfers can be handled similarly to walking transfers, with different speeds and costs. Techniques developed for unrestricted walking can thus generalize to other modes of transportation.
A first step towards unrestricted walking was made by the MCR [7] and UCCH [13] algorithms, which both use a contracted version of the full walking graph, inspired by contraction hierarchies [14], to represent the much bigger full walking graph. However, this contracted graph is not transitively closed and has to be globally scanned several times during a query. Accelerating such computations with unrestricted walking is still challenging, as multi-criteria or profile queries take seconds on practical networks with these methods.
In static graphs such as road networks, hub labeling [1, 10] (also called 2-hop labeling [6]) is a remarkable technique that achieves state-of-the-art response times for shortest path queries. It consists in selecting for each node a small set of access nodes called hubs such that any shortest path can be described as a two-hop travel through a common hub of its endpoints. Intersecting the hub lists of a source and a destination then allows one to find the shortest path between them efficiently. This technique was used in PTL [8] on the time-expanded graph representation of a network to obtain fast transit routing. A similar approach is followed by TTL [20], which revisits hierarchical hub labeling [2] in the context of public transit networks. However, these approaches still assume sparse transfers. Note that the time-expanded graph representation duplicates the transfer edges from a stop for all events at that stop, so its size can blow up with dense transfer graphs.
In this work, we propose a new approach for handling unrestricted walking in public transit routing based on a different usage of hub labeling. It consists in decomposing walking transfers into two consecutive hops. We use hub labeling in the classical setting of a static graph, but in a novel manner compared to distance or shortest path queries: we scan hub lists to propagate reachability information. Interestingly, the technique can easily be adapted to both RAPTOR-based and CSA-based algorithms, which are the two main classical approaches with restricted transfers. HLRaptor, our variant of RAPTOR, obtains a significant speedup compared to MCR. HLCSA, our variant of CSA, obtains competitive running times for earliest arrival time and profile queries.
The paper is organized as follows. Section 2 defines public transit networks, describes briefly RAPTOR and CSA algorithms, and introduces hub labeling. Sections 3 and 4 present HLRaptor and HLCSA respectively. We describe in Section 5 public transit data used to evaluate our algorithms. The results of our experiments are presented in Section 6.
We define a public transit network as a triple $(S, T, R)$ representing trips of vehicles (buses, trains, etc.): $S$ is the set of stops where passengers can board or disembark from a vehicle, and $T$ is the set of trips made by vehicles, which are grouped into routes represented by the set $R$. More precisely, a trip is given by a sequence of stops served by a vehicle and, for each stop $s$ in the sequence, an arrival time of the vehicle at stop $s$ and a departure time of the vehicle from stop $s$. A route consists of a set of trips with the same stop sequence. This set of trips can be represented by a two-dimensional timetable where each line lists the arrival and departure times of a trip for all stops in the sequence. Note that the sequence of times listed in a line is non-decreasing. Similarly to the RAPTOR authors [11], we assume that no trip of a route can overtake another trip of the same route. In other words, the lines of the timetable can be sorted so that each column is non-decreasing. This property can easily be enforced by splitting the set of trips with the same stop sequence into smaller subsets of trips if necessary.
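The splitting step mentioned above can be sketched as follows. This is a minimal illustration, not the procedure used in the paper: the trip representation (a list of `(arrival, departure)` pairs in seconds) and the helper names `precedes` and `split_into_routes` are assumptions made for the example, and the greedy strategy is only one simple way to obtain non-overtaking groups.

```python
from typing import List, Tuple

# Hypothetical representation: one (arrival, departure) pair per stop,
# in seconds, for trips that all share the same stop sequence.
Trip = List[Tuple[int, int]]


def precedes(a: Trip, b: Trip) -> bool:
    """True if trip a is never later than trip b at any stop (no overtaking)."""
    return all(aa <= ba and ad <= bd for (aa, ad), (ba, bd) in zip(a, b))


def split_into_routes(trips: List[Trip]) -> List[List[Trip]]:
    """Greedily group trips so that, inside each group, every column of the
    timetable (arrival or departure at a given stop) is non-decreasing."""
    routes: List[List[Trip]] = []
    for trip in sorted(trips):  # sorted by times at the first stop, then later stops
        for route in routes:
            if precedes(route[-1], trip):
                route.append(trip)
                break
        else:
            routes.append([trip])  # no compatible group: open a new route
    return routes


trip_a = [(0, 0), (10, 11), (20, 20)]
trip_b = [(5, 5), (15, 16), (25, 25)]
trip_c = [(2, 2), (7, 7), (12, 12)]  # express trip that overtakes trip_a
routes = split_into_routes([trip_a, trip_b, trip_c])
```

In this toy input, the express trip ends up alone in its own route, while the two regular trips share one.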
The public transit network is complemented by a weighted footpath graph $G = (V, E, w)$ with $S \subseteq V$, where $w(u, v)$ denotes the time needed to walk from a node $u$ to a node $v$. In the unrestricted walking setting, the graph $G$ is not assumed to be transitively closed and we expect that $V$ is much larger than $S$. Each edge typically corresponds to a segment of street that can be traversed by walking. We let $d_G(u, v)$ denote the length of a shortest path in $G$ from $u$ to $v$, i.e., the minimum walking time from $u$ to $v$.
We consider several journey computation problems. Given a source stop $s$ and a target stop $t$, a journey from $s$ to $t$ is an alternating sequence of trips and footpaths in the public transit network, which starts at $s$ and ends at $t$. The goal of public transit planning is to compute journeys from $s$ to $t$ optimizing one or several criteria. Given a departure time $\tau$, an earliest arrival time query consists in computing a journey with minimum arrival time at $t$ that departs from $s$ at $\tau$ or later. In a multi-criteria query, we are additionally interested in the number of transfers and the overall walking time of the journey, and ask for all Pareto-optimal journeys. Recall that a journey is Pareto-optimal if no other journey is at least as good on all criteria and strictly better on at least one. In a profile (or range) query, we ask for journeys whose departure time falls within a given time interval while optimizing both departure time and arrival time (where a later departure time is considered better). The required answer again consists of all Pareto-optimal journeys within the given time interval.
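As an illustration of Pareto-optimality, the following sketch filters a list of journeys described by three criteria to minimize. The tuple layout `(arrival_time, transfers, walking_time)` and the function names are assumptions made for this example only.

```python
from typing import List, Tuple

# Hypothetical journey summary: (arrival_time, transfers, walking_time),
# where every criterion is to be minimized.
Criteria = Tuple[int, int, int]


def dominates(a: Criteria, b: Criteria) -> bool:
    """a dominates b if a is at least as good everywhere and strictly better once."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(journeys: List[Criteria]) -> List[Criteria]:
    """Keep the journeys that no other journey dominates (quadratic-time sketch)."""
    return [j for j in journeys if not any(dominates(o, j) for o in journeys)]


journeys = [
    (3600, 1, 300),
    (3600, 2, 300),  # dominated: same arrival and walking, one more transfer
    (3500, 3, 900),
    (3700, 0, 0),
    (3800, 1, 100),  # dominated by (3700, 0, 0)
]
front = pareto_front(journeys)
```

For profile queries, where a later departure is better, the same filter applies after negating the departure time so that every criterion is minimized.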
The RAPTOR algorithm and its variants [11, 7] compute journeys starting from a given stop at a given time in rounds, where each round extends partial journeys by one trip. More precisely, each round consists of two phases. The first phase explores each route of the public transit network and extends partial journeys arriving at stops served by a route using the first trip arriving at each stop. In the second phase, each partial journey arriving at a stop is extended by walking paths from that stop. In the regular RAPTOR version [11], only single-edge paths are considered and the footpath graph is assumed to be transitively closed. In the unrestricted walking setting [7], a multi-source Dijkstra search is performed on a contracted version of the footpath graph in order to find all stops whose arrival times can be improved by walking from the stops that were scanned during the first phase.
The Connection Scan Algorithm (CSA) [12] breaks each trip into consecutive connections, which represent a vehicle traveling from a stop to the next one in the stop sequence of the trip. All connections are sorted by departure time in a pre-computation step. The algorithm scans all the connections and transfers to update the earliest arrival time at each reachable stop. More precisely, for each connection $c$ in increasing order of departure time, we check whether a passenger can travel on $c$: either the trip containing $c$ has been reached earlier, or we can arrive at the departure stop of $c$ before its departure time. If so, we update the arrival time at the arrival stop of $c$ if necessary, and scan the footpath transfers from the arrival stop of $c$. Similarly to RAPTOR, CSA also requires the footpath graph to be transitively closed.
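The scan described above can be sketched as follows for earliest arrival queries. This is a simplified illustration rather than the reference implementation: the `Connection` record, the `footpaths` dictionary (assumed transitively closed), and the function name are assumptions, and minimum transfer times are ignored.

```python
import math
from typing import Dict, List, NamedTuple, Tuple


class Connection(NamedTuple):
    dep_stop: str
    arr_stop: str
    dep_time: int
    arr_time: int
    trip: str


def csa_earliest_arrival(
    connections: List[Connection],
    footpaths: Dict[str, List[Tuple[str, int]]],
    source: str,
    dep_time: int,
) -> Dict[str, int]:
    """Return the earliest arrival time found at each reached stop."""
    arrival = {source: dep_time}
    boarded = set()  # trips already reached by the passenger

    def walk_from(stop: str) -> None:
        # Single-edge transfers suffice because footpaths are transitively closed.
        for nxt, walk in footpaths.get(stop, []):
            arrival[nxt] = min(arrival.get(nxt, math.inf), arrival[stop] + walk)

    walk_from(source)
    for c in sorted(connections, key=lambda c: c.dep_time):
        reachable = arrival.get(c.dep_stop, math.inf) <= c.dep_time
        if c.trip in boarded or reachable:
            boarded.add(c.trip)
            if c.arr_time < arrival.get(c.arr_stop, math.inf):
                arrival[c.arr_stop] = c.arr_time
                walk_from(c.arr_stop)
    return arrival


connections = [
    Connection("A", "B", 100, 200, "t1"),
    Connection("B", "C", 210, 300, "t1"),
    Connection("D", "E", 350, 400, "t2"),
]
footpaths = {"C": [("D", 30)], "D": [("C", 30)]}
arrival = csa_earliest_arrival(connections, footpaths, "A", 90)
late = csa_earliest_arrival(connections, footpaths, "A", 150)
```

In the toy network, the passenger rides trip `t1` to `C`, walks to `D`, and catches `t2`. Departing at time 150 misses `t1`, so only the source is reached.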
Two-hop labeling [6], or equivalently, hub labeling [1, 10], for a (weighted, directed) graph $G = (V, E, w)$ consists in assigning two subsets of nodes $H^-(u)$ and $H^+(u)$ to each node $u$. Nodes in $H^-(u)$ (resp. $H^+(u)$) are called in-hubs (resp. out-hubs) and serve as intermediate nodes to reach (resp. to leave) $u$. The following two-hop property is required: for any pair $u, v$ of nodes, there must exist a common hub $h \in H^+(u) \cap H^-(v)$ lying on a shortest path from $u$ to $v$, i.e., satisfying $d_G(u, v) = d_G(u, h) + d_G(h, v)$. Equivalently, $H^+$ (resp. $H^-$) can be seen as a graph with vertex set $V$ and edges $uv$ with weight $d_G(u, v)$ for every pair $u, v$ such that $v \in H^+(u)$ (resp. $u \in H^-(v)$). The two-hop property can then be stated as $G^* = H^+ \cdot H^-$, where $G^*$ denotes the transitive closure of $G$ and $H^+ \cdot H^-$ denotes the graph product resulting from the $(\min, +)$-matrix product of adjacency matrices (the weight of an edge $uv$ in $H^+ \cdot H^-$ is $\min_{h \in H^+(u) \cap H^-(v)} d_G(u, h) + d_G(h, v)$). In other words, any shortest path in $G$ corresponds to a two-hop path in $H^+ \cdot H^-$. The interest of such a representation comes from the fact that it is possible to compute very small hub sets (fewer than 100 nodes on average) in large road networks and footpath graphs, and thereby obtain the fastest known practical oracles for computing distances and shortest paths in such networks.
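To make the two-hop property concrete, here is a small sketch of a distance query that intersects an out-hub list with an in-hub list. The dictionary-based labels and the hand-built labeling of a four-node path are assumptions chosen for illustration; practical implementations typically merge sorted arrays instead.

```python
import math
from typing import Dict


def hub_distance(out_label: Dict[str, int], in_label: Dict[str, int]) -> float:
    """Distance from u to v given out_label = {h: d(u, h)} for the out-hubs of u
    and in_label = {h: d(h, v)} for the in-hubs of v."""
    common = out_label.keys() & in_label.keys()
    return min((out_label[h] + in_label[h] for h in common), default=math.inf)


# Undirected path a - b - c - d with edge weights 2, 3, 4. Since the graph is
# undirected, the same label serves as both in-hub and out-hub list of a node.
labels = {
    "a": {"a": 0, "b": 2, "c": 5},
    "b": {"b": 0, "c": 3},
    "c": {"c": 0},
    "d": {"d": 0, "c": 4},
}
dist_ad = hub_distance(labels["a"], labels["d"])  # goes through hub c
dist_ab = hub_distance(labels["a"], labels["b"])  # goes through hub b
```

Every pair of nodes in this toy labeling shares a hub on its shortest path, which is exactly what the two-hop property requires.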
3 HLRaptor: RAPTOR with Two-Hop Transfers
Using a hub labeling of the footpath graph $G$, we propose the following modification of RAPTOR that we call HLRaptor. Let $\tau(v)$ denote the best arrival time currently known at a node $v$. We replace the second phase of a round by two sub-phases. In the first sub-phase, we scan every stop $s$ whose arrival time was improved in the regular first phase of the round, and update the arrival time at each of its out-hubs $h \in H^+(s)$ to $\min\{\tau(h), \tau(s) + d_G(s, h)\}$. In the second sub-phase, we scan every hub $h$ whose arrival time was improved in the first sub-phase and update the arrival time at each node $v$ such that $h \in H^-(v)$ to $\min\{\tau(v), \tau(h) + d_G(h, v)\}$.
The correctness of HLRaptor comes from the two-hop property of the hub labeling, which ensures $G^* = H^+ \cdot H^-$, where $G^*$ is the transitive closure of $G$. Our two sub-phases using the out-hub graph $H^+$ and the in-hub graph $H^-$ are thus equivalent to the second phase of the regular RAPTOR algorithm using the transitive closure of $G$. However, its performance depends on the out-degrees of $H^+$ and $H^-$ rather than that of $G^*$.
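The two sub-phases can be sketched as below. This is an illustrative reading of the description, not the authors' code: the function name, the separate `stop_arrival` and `hub_arrival` dictionaries, and the `hub_to_stops` reverse lists are assumptions of this sketch.

```python
import math
from typing import Dict, Iterable, List, Set, Tuple

HubList = Dict[str, List[Tuple[str, int]]]


def two_hop_transfer_phase(
    stop_arrival: Dict[str, int],
    hub_arrival: Dict[str, int],
    improved_stops: Iterable[str],
    out_hubs: HubList,      # out_hubs[s] = [(h, walking time s -> h), ...]
    hub_to_stops: HubList,  # hub_to_stops[h] = [(v, walking time h -> v), ...]
) -> Set[str]:
    """Propagate improved arrival times through hubs; return newly improved stops."""
    improved_hubs: Set[str] = set()
    # Sub-phase 1: from each improved stop to its out-hubs.
    for s in improved_stops:
        for h, walk in out_hubs.get(s, []):
            t = stop_arrival[s] + walk
            if t < hub_arrival.get(h, math.inf):
                hub_arrival[h] = t
                improved_hubs.add(h)
    # Sub-phase 2: from each improved hub to the stops having it as an in-hub.
    newly_improved: Set[str] = set()
    for h in improved_hubs:
        for v, walk in hub_to_stops.get(h, []):
            t = hub_arrival[h] + walk
            if t < stop_arrival.get(v, math.inf):
                stop_arrival[v] = t
                newly_improved.add(v)
    return newly_improved


stop_arrival = {"s1": 1000, "s3": 1100}
hub_arrival: Dict[str, int] = {}
out_hubs = {"s1": [("s1", 0), ("h", 60)]}
hub_to_stops = {"s1": [("s1", 0)], "h": [("s2", 30), ("s3", 120)]}
improved = two_hop_transfer_phase(stop_arrival, hub_arrival, ["s1"], out_hubs, hub_to_stops)
```

Here `s2` is reached at 1090 by walking through hub `h`, while `s3` keeps its better arrival time of 1100.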
3.0.1 Target pruning optimization.
The out-hub list $H^+(s)$ of each stop $s$ and, for each hub $h$, the list $\{v : h \in H^-(v)\}$ of nodes having $h$ as an in-hub can be pre-computed. Additionally, these lists can be sorted according to walking time from $s$ (resp. $h$) in non-decreasing order. This enables a target pruning optimization where we stop scanning a list as soon as the arrival time computed for a node in the list exceeds the best arrival time known at the target.
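The pruning rule can be sketched as a scan over one pre-sorted list that stops at the first entry that cannot beat the target. The function name and the returned counter are assumptions added for illustration.

```python
import math
from typing import Dict, List, Tuple


def scan_with_target_pruning(
    start_time: int,
    sorted_list: List[Tuple[str, int]],  # (node, walking time), non-decreasing
    arrival: Dict[str, int],
    best_at_target: float,
) -> int:
    """Relax arrival times along one sorted hub list; return how many entries
    were actually scanned before pruning kicked in."""
    scanned = 0
    for node, walk in sorted_list:
        t = start_time + walk
        if t > best_at_target:
            break  # all later entries have even larger walking times
        scanned += 1
        if t < arrival.get(node, math.inf):
            arrival[node] = t
    return scanned


arrival: Dict[str, int] = {}
hub_list = [("a", 0), ("b", 50), ("c", 200), ("d", 400)]
scanned = scan_with_target_pruning(1000, hub_list, arrival, best_at_target=1100)
```

Because the list is sorted, the scan stops at `c` and never touches `d`.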
3.0.2 HLprRaptor: profile queries with HLRaptor.
We can follow the same approach as [19] to compute all Pareto-optimal journeys with respect to departure time and arrival time in a given interval of time. The difference is that we use HLRaptor instead of MCR. The idea is to use HLRaptor to compute the best arrival time $\tau'$ when starting at a given time $\tau$. Then we use a reverse version of HLRaptor (or simply a reversed version of the transit data) to compute the last departure time $\tau'' \ge \tau$ such that arrival at $\tau'$ is still possible. We then repeat this procedure for departure time $\tau'' + \varepsilon$ for sufficiently small $\varepsilon$ (we simply use $\varepsilon = 1$ second, which is the time unit in our datasets). We iterate this until all Pareto-optimal journeys in the given time interval have been found.
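The iteration can be sketched with the forward and reverse searches abstracted as two callables. The function names, the toy timetable of direct journeys, and the choice to stop once the latest departure falls after the interval are assumptions of this sketch.

```python
import math
from typing import Callable, List, Tuple


def profile(
    earliest_arrival: Callable[[int], float],   # forward search from the source
    latest_departure: Callable[[float], int],   # reverse search from the target
    t_begin: int,
    t_end: int,
    eps: int = 1,
) -> List[Tuple[int, float]]:
    """Enumerate Pareto-optimal (departure, arrival) pairs within [t_begin, t_end]."""
    result = []
    dep = t_begin
    while dep <= t_end:
        arr = earliest_arrival(dep)
        if arr == math.inf:
            break  # no journey departs at dep or later
        last_dep = latest_departure(arr)
        if last_dep > t_end:
            break  # assumption: journeys departing after the interval are skipped
        result.append((last_dep, arr))
        dep = last_dep + eps
    return result


# Toy stand-in for the two searches: a list of direct (departure, arrival) journeys.
timetable = [(100, 200), (150, 190), (300, 400), (350, 420)]


def toy_earliest_arrival(dep: int) -> float:
    return min((a for d, a in timetable if d >= dep), default=math.inf)


def toy_latest_departure(arr: float) -> int:
    return max(d for d, a in timetable if a <= arr)


pareto = profile(toy_earliest_arrival, toy_latest_departure, 0, 400)
```

The journey `(100, 200)` never appears in the output, since leaving at 150 arrives earlier.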
3.0.3 HLmcRaptor: HLRaptor with multiple criteria.
To deal with more criteria than arrival time and number of transfers, we can keep multiple non-dominated labels for each stop in round $k$ in a bag structure, similarly to McRAPTOR [11]. For each route with a stop improved in the previous round, we scan the first trip departing after any improved arrival time at a stop $s$ of the route and update bags accordingly at the stops served by the trip after $s$. In the second phase of the round, each newly inserted label is first propagated along out-hub links, and then newly inserted labels at hubs are propagated along in-hub links similarly. We can adapt local and target pruning as in McRAPTOR. We can also adapt our target pruning optimization specific to HLRaptor to stop scanning hub lists as soon as the propagated label is dominated by the destination bag.
4 HLCSA: Connection Scan with Two-Hop Transfers
Given a hub labeling of the walking graph $G$, we propose the following modification of CSA. For an earliest arrival time query from $s$ to $t$, we first scan the out-hubs of $s$ and update the arrival time at each of them by walking from $s$. Similarly to CSA, we then scan connections by non-decreasing departure time. When considering a connection $c$, we first scan the in-hubs of its departure stop $u$ and update the arrival time at $u$ through walking from a hub. The connection can be boarded if its trip has been marked as boarded or if the arrival time at $u$ plus the minimum transfer time at $u$ is no later than the departure time of $c$. In that case, we update the arrival time at the arrival stop $v$ of $c$ and scan its out-hubs to update their arrival times through walking from $v$. Finally, we scan the in-hubs of the destination $t$ and update the arrival time at $t$ by walking from any of them.
The correctness of the algorithm comes again from the two-hop property of hubs. For any possible transfer from a connection $c$ to another connection $c'$ in a journey, $c$ must be considered before $c'$. Let $h$ denote a common hub for the arrival stop $v$ of $c$ and the departure stop $u'$ of $c'$ such that $d_G(v, u') = d_G(v, h) + d_G(h, u')$ according to the two-hop property. After $c$ is considered, the arrival time at $h$ is thus no more than $\tau_c + d_G(v, h)$, where $\tau_c$ is the arrival time of $c$ at $v$ and $d_G(v, h)$ is the walking time from $v$ to $h$. When $c'$ is then considered, the arrival time at $u'$ is updated to at most $\tau_c + d_G(v, h) + d_G(h, u') = \tau_c + d_G(v, u')$, as if a transfer from $v$ to $u'$ had been considered. A similar reasoning applies for a journey starting with a walk from $s$ or ending with a walk to $t$. HLCSA thus behaves as a regular CSA execution in which all transitive transfers in $G$ would be considered.
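The earliest arrival variant described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the released implementation: the `Connection` record, the label dictionaries, and the function name are hypothetical, minimum transfer times are taken to be zero, and no pruning is applied.

```python
import math
from typing import Dict, List, NamedTuple, Tuple


class Connection(NamedTuple):
    dep_stop: str
    arr_stop: str
    dep_time: int
    arr_time: int
    trip: str


HubList = Dict[str, List[Tuple[str, int]]]


def hlcsa_earliest_arrival(
    connections: List[Connection],
    out_hubs: HubList,  # out_hubs[s] = [(h, walking time s -> h), ...]
    in_hubs: HubList,   # in_hubs[s] = [(h, walking time h -> s), ...]
    source: str,
    target: str,
    dep_time: int,
) -> float:
    stop_arr: Dict[str, float] = {source: dep_time}
    hub_arr: Dict[str, float] = {}
    boarded = set()

    def relax_out(stop: str) -> None:  # first hop: stop -> out-hubs
        for h, walk in out_hubs.get(stop, []):
            hub_arr[h] = min(hub_arr.get(h, math.inf), stop_arr[stop] + walk)

    def relax_in(stop: str) -> None:  # second hop: in-hubs -> stop
        for h, walk in in_hubs.get(stop, []):
            via_hub = hub_arr.get(h, math.inf) + walk
            stop_arr[stop] = min(stop_arr.get(stop, math.inf), via_hub)

    relax_out(source)
    for c in sorted(connections, key=lambda c: c.dep_time):
        relax_in(c.dep_stop)
        if c.trip in boarded or stop_arr.get(c.dep_stop, math.inf) <= c.dep_time:
            boarded.add(c.trip)
            if c.arr_time < stop_arr.get(c.arr_stop, math.inf):
                stop_arr[c.arr_stop] = c.arr_time
                relax_out(c.arr_stop)
    relax_in(target)
    return stop_arr.get(target, math.inf)


# Toy network: walking B -> H takes 20 and H -> C takes 30, with street node H as hub.
out_hubs = {"A": [("A", 0)], "B": [("B", 0), ("H", 20)]}
in_hubs = {"A": [("A", 0)], "C": [("C", 0), ("H", 30)], "D": [("D", 0)]}
ride = Connection("A", "B", 100, 200, "t1")
ok = hlcsa_earliest_arrival(
    [ride, Connection("C", "D", 260, 300, "t2")], out_hubs, in_hubs, "A", "D", 90)
missed = hlcsa_earliest_arrival(
    [ride, Connection("C", "D", 240, 300, "t2")], out_hubs, in_hubs, "A", "D", 90)
```

In the first call the passenger reaches `C` at 250 through hub `H` and boards the second trip. In the second call that trip leaves at 240, so the target is not reached.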
In addition to all classical CSA optimizations, we can again sort out-hub lists in non-decreasing order of walking time and apply target pruning as in HLRaptor. Moreover, we scan the in-hub list of the departure stop of a connection only when its trip is not marked as boarded. Again, this list can be sorted by non-decreasing walking time, and we stop scanning the list as soon as the walking time from the hub exceeds the estimated travel time to the departure stop (local pruning).
4.0.2 HLprCSA: Profile queries with HLCSA.
Similarly to the original extension of CSA to solve the profile problem [12], we store for each stop a bag containing Pareto-optimal pairs of departure time at the stop with associated arrival time at the destination. We also store such information for hubs. Connections are considered in non-increasing order of departure time. When scanning a connection $c$, we use the bags of the out-hubs of its arrival stop to obtain the best arrival time through walking after $c$. If the arrival time of the trip of $c$ is improved, we then update the bags of the in-hubs of the departure stop of $c$ for that arrival time, with the departure time corresponding to walking from the hub so as to board the connection just in time. We also scan the in-hubs of the destination at the beginning of the procedure and the out-hubs of the source at the end, in order to take care of walking from the source and to the destination. The correctness of the modification follows similar lines as for HLCSA.
5 Public Transit Data
To evaluate the algorithms, we use datasets from three locations: London, Paris, and Switzerland. The dataset for London was obtained from Transport for London [18]. The dataset for Paris was obtained from Open Data RATP [16]. The dataset for Switzerland was provided by Wagner & Zündorf [19] (https://i11www.iti.kit.edu/PublicTransitData/Switzerland/). The extracted dates are 2015-11-06 for London and 2018-03-30 for Paris.
The public transit data of Paris already includes transfers between stops, so we simply need to make the transfer graph transitively closed for appropriate use with RAPTOR and CSA. However, the dataset of London does not include transfers, so we created transfers by linking any pair of stops separated by at most 75 meters of walking. This threshold was chosen to obtain a transitively closed transfer graph of similar size as in previous works. The graph obtained by transitive closure of restricted transfers is called the transfer graph in the sequel.
The footpath graphs for London and Paris were extracted from Geofabrik’s data [15], which is itself extracted from OpenStreetMap’s data [17]. We call walking graph the union of this unrestricted footpath graph and the transfer graph. The method to merge a stop of the public transit network into the walking graph is the following. For each stop $s$, we find the closest node $v$ in the walking graph. If the distance between $s$ and $v$ is less than 5 meters, we identify $s$ and $v$, connecting $s$ with the in- and out-neighbors of $v$ using the same weights. Otherwise, we find the 5 closest nodes of $s$ in the walking graph, and connect $s$ with those at distance at most 100 meters. If there are no nodes in the walking graph within a radius of 100 meters from $s$, then $s$ is isolated. Walking times are computed according to a walking speed of 4 km/h. Table 1 provides statistics concerning the datasets. The columns stops and transfers provide the number of nodes and edges in the transitively closed restricted walking graph, while the last two columns give the numbers of vertices and edges in the unrestricted walking graphs.
We computed hub labelings of the walking graphs using the sampling-based algorithm by Delling et al. [9] (1-2h of pre-computation per graph). Table 2 provides statistics on the degrees of the transfer graphs versus the in-hub and out-hub graphs: the average and maximum out-degree of the transfer graph, the average and maximum out-degree of the out-hub graph, and the average and maximum in-degree of the in-hub graph. It also gives the number of edges in the out-hub and in-hub graphs, as well as the number of hubs (including stops). We note that the size of the hub lists is comparable to the number of events, so storing them does not increase space requirements too much.
We also prepared two sets of roughly 1000 queries for each dataset. In the first one, sources and destinations are selected independently and uniformly at random among all stops, similarly to the experiments in [11, 12, 7]. In the second one, we select sources and destinations similarly to [19]: one hundred sources are selected uniformly at random. For each source, we order the destinations by increasing walking distance and select one uniformly at random among those whose rank lies in a given range, for each of a series of rank ranges. For Switzerland, we use exactly the same pairs as in [19], where sources are selected with probability proportional to the number of trips serving them. In both sets, we additionally selected a departure time uniformly at random for each source-destination pair. We refer to the two sets of queries as “uniform” and “rank” respectively. Note that most of the uniform queries (those in the uniform set) correspond to high-rank pairs, while the rank set of queries has a strong bias towards low-rank pairs.
The datasets are made publicly available (https://files.inria.fr/gang/graphs/public_transport/).
Our algorithms were implemented in C++ and compiled with GCC version 7.2.0 (with flag -O3). Experiments were conducted on one core of a dual 10-core Intel Xeon E5-2670-v2 with 25 MiB of L3 cache and 64 GiB of DDR3-1866 RAM. The code is made available (https://github.com/lviennot/hl-csa-raptor).
Table 3 presents the average running times in milliseconds of HLRaptor and HLCSA variants on the three datasets. We indicate for each algorithm which criteria are optimized: arrival time (Arr.), number of transfers (Nb. tr.), overall walking time (Walk), and whether the query spans a range of departure times (Range).
In the restricted walking setting, our algorithms are equivalent to the corresponding Raptor-based or CSA-based version. On the London instance with restricted walking and uniform queries, we obtain results similar to Raptor [11] for earliest arrival, multi-criteria and 2h range queries: 5.1ms vs. 7.3ms, 87.3ms vs. 107ms, and 76.5ms vs. 87ms, respectively (we compare times reported in Table 3 to times reported in [11]). Our running times are 15-30% faster, probably due to the use of more recent hardware. We also obtain results similar to CSA [12] for earliest arrival and 24h range queries: 2.2ms vs. 1.2ms and 312ms vs. 107ms. Our running times are 2-3 times slower than those reported in [12], probably due to less optimized code.
In the unrestricted walking setting, our algorithms are significantly faster than previous works. On the London instance with uniform queries and unrestricted walking, HLmcRaptor is 3.4 times faster than the times reported for MCR in [7] (417ms vs. 1438ms) and HLRaptor is 1.7 times faster than the MR-$\infty$ variant of MCR (26.4ms vs. 44.4ms). On the Switzerland instance with rank-based queries and unrestricted walking, HLprRaptor computes profile queries roughly 7 times faster than the profile variant of MCR proposed in [19]: 751ms vs. approximately 5.5s. Most uniform queries have high rank, and HLprRaptor obtains their profiles in roughly 3.5s, compared to approximately 20s as reported in [19].
Interestingly, our hub-labeling-based versions of CSA obtain rather good performances with respect to Raptor based versions in the unrestricted walk setting: they are nearly as fast or even faster on uniform queries, and at most 2-3 times slower on rank queries. (Note that on low rank queries, Raptor-based solutions benefit from target pruning.)
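Table 4: Travel time gained with unrestricted walking compared to restricted walking (average / median).

|Dataset|Unif.|Unif. 6h-20h|Rank|Rank 6h-20h|
|---|---|---|---|---|
|London|12% / 5.8%|6.9% / 2.9%|24% / 13%|16% / 5.0%|
|Paris|22% / 15%|15% / 13%|31% / 21%|22% / 17%|
|Switzerland|47% / 46%|37% / 39%|47% / 47%|35% / 37%|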
6.0.1 Gain of unrestricted walking.
We confirm the results of [19] showing the benefit of considering unrestricted walking. Table 4 presents the percentage of time gained by using a journey with unrestricted walking compared to the travel time with restricted walking. The average gain ranges from 12% to 47% on uniform queries depending on the dataset. City networks (especially London) seem to benefit less from unrestricted walking than the train network of Switzerland. As observed in [19], the gain is smaller during daytime, that is, for queries with departure time in the range 6h-20h here. We observe a higher gain on low-rank queries, for which the median gain ranges from 13% to 47%. More precisely, the gain is at least 13% on half of the low-rank queries for London, 21% for Paris and 47% for Switzerland.
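For concreteness, the gain statistics can be computed along the following lines. This sketch assumes that the gain of a query is the relative reduction in travel time and that averages and medians are taken over per-query gains; the function name and the sample travel times are made up for illustration.

```python
from statistics import mean, median
from typing import List


def relative_gains(restricted: List[float], unrestricted: List[float]) -> List[float]:
    """Per-query fraction of travel time saved by allowing unrestricted walking."""
    return [(r - u) / r for r, u in zip(restricted, unrestricted)]


# Hypothetical travel times in seconds for four queries.
restricted = [3600, 1800, 7200, 2400]
unrestricted = [3600, 1350, 3600, 2160]
gains = relative_gains(restricted, unrestricted)
average_gain = mean(gains)
median_gain = median(gains)
```

With these made-up values the per-query gains are 0%, 25%, 50% and 10%, which illustrates how a few large gains can pull the average above the median.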
We have demonstrated the efficiency of using a two-hop representation of unrestricted walking transfers in conjunction with the CSA and RAPTOR algorithms. This shows that it is possible to enable unrestricted walking in practical public transit routing engines and opens new perspectives for allowing complex multimodal scenarios. We also want to further investigate how this approach could be integrated into other efficient public transit routing algorithms.
- [1] Ittai Abraham, Daniel Delling, Andrew V. Goldberg, and Renato Fonseca F. Werneck. A hub-based labeling algorithm for shortest paths in road networks. In Experimental Algorithms - 10th International Symposium, SEA, volume 6630 of Lecture Notes in Computer Science, pages 230–241, 2011.
- [2] Ittai Abraham, Daniel Delling, Andrew V. Goldberg, and Renato Fonseca F. Werneck. Hierarchical hub labelings for shortest paths. In Algorithms - ESA European Symposium, volume 7501 of Lecture Notes in Computer Science, pages 24–35, 2012.
- [3] Hannah Bast, Erik Carlsson, Arno Eigenwillig, Robert Geisberger, Chris Harrelson, Veselin Raychev, and Fabien Viger. Fast routing in very large public transportation networks using transfer patterns. In Algorithms - ESA 2010, 18th Annual European Symposium, volume 6346 of Lecture Notes in Computer Science, pages 290–301. Springer, 2010.
- [4] Hannah Bast, Daniel Delling, Andrew V. Goldberg, Matthias Müller-Hannemann, Thomas Pajor, Peter Sanders, Dorothea Wagner, and Renato F. Werneck. Route planning in transportation networks. In Algorithm Engineering - Selected Results and Surveys, volume 9220 of Lecture Notes in Computer Science, pages 19–80. Springer, 2016.
- [5] Hannah Bast, Matthias Hertel, and Sabine Storandt. Scalable transfer patterns. In Proceedings of the Eighteenth Workshop on Algorithm Engineering and Experiments, ALENEX 2016, pages 15–29. SIAM, 2016.
- [6] Edith Cohen, Eran Halperin, Haim Kaplan, and Uri Zwick. Reachability and distance queries via 2-hop labels. SIAM J. Comput., 32(5):1338–1355, 2003.
- [7] Daniel Delling, Julian Dibbelt, Thomas Pajor, Dorothea Wagner, and Renato F. Werneck. Computing multimodal journeys in practice. In Experimental Algorithms, 12th International Symposium, SEA, volume 7933 of Lecture Notes in Computer Science, pages 260–271, 2013.
- [8] Daniel Delling, Julian Dibbelt, Thomas Pajor, and Renato F. Werneck. Public transit labeling. In Experimental Algorithms - 14th International Symposium, SEA 2015, Paris, France, June 29 - July 1, 2015, Proceedings, volume 9125 of Lecture Notes in Computer Science, pages 273–285. Springer, 2015.
- [9] Daniel Delling, Andrew V. Goldberg, Thomas Pajor, and Renato F. Werneck. Robust distance queries on massive networks. In Algorithms - ESA European Symposium, volume 8737 of Lecture Notes in Computer Science, pages 321–333, 2014.
- [10] Daniel Delling, Andrew V. Goldberg, and Renato F. Werneck. Hub labeling (2-hop labeling). In Encyclopedia of Algorithms, pages 932–938. Springer, 2016.
- [11] Daniel Delling, Thomas Pajor, and Renato F. Werneck. Round-based public transit routing. Transportation Science, 49(3):591–604, 2015.
- [12] Julian Dibbelt, Thomas Pajor, Ben Strasser, and Dorothea Wagner. Connection scan algorithm. ACM Journal of Experimental Algorithmics, 23, 2018.
- [13] Julian Dibbelt, Thomas Pajor, and Dorothea Wagner. User-constrained multimodal route planning. ACM Journal of Experimental Algorithmics, 19(1), 2014.
- [14] Robert Geisberger, Peter Sanders, Dominik Schultes, and Christian Vetter. Exact routing in large road networks using contraction hierarchies. Transportation Science, 46(3):388–404, 2012.
- [15] Geofabrik. http://download.geofabrik.de/.
- [16] Open Data RATP. https://data.ratp.fr/.
- [17] OpenStreetMap. https://www.openstreetmap.org/.
- [18] Transport for London Unified API. https://api.tfl.gov.uk/.
- [19] Dorothea Wagner and Tobias Zündorf. Public transit routing with unrestricted walking. In 17th Workshop on Algorithmic Approaches for Transportation Modelling, Optimization, and Systems, ATMOS, volume 59 of OASICS, pages 7:1–7:14, 2017.
- [20] Sibo Wang, Wenqing Lin, Yi Yang, Xiaokui Xiao, and Shuigeng Zhou. Efficient route planning on public transportation networks: A labelling approach. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD ’15, pages 967–982, New York, NY, USA, 2015. ACM.