1 Introduction
Data-driven mobility modeling and prediction are important aspects of modern urban planning. With respect to travel forecasting, the two major areas of research are demand modeling and travel-time estimation [1], where demand modeling involves generating accurate statistics of the number of trips between origin–destination (O–D) pairs, and travel-time estimation involves predicting the travel times for trips between O–D pairs. The focus of this work is on the latter, specifically, street-level travel-time estimation. In existing research on travel-time estimation, interstate link models have received disproportionate attention from the transportation research community, due primarily to the availability of large amounts of freeway sensor data. Arterial models are equally important but have not received the same attention, because coverage is limited by the costs of installing probe sensors and associated infrastructure. Under these circumstances, significant insights can be gained at a fraction of the cost with coarse-grained data sets such as the summary statistics of Uber trips.
Uber Movement datasets [22] provide anonymized, aggregated, and coarse-grained O–D travel time statistics at the TAZ (traffic analysis zone) level for many metropolitan areas around the world. TAZs are small geographical units into which a given metropolitan area is divided, characterized by factors such as the total population, type of population, and employment. While Uber Movement datasets can be coarse-grained and have high uncertainty, they cover large geographical regions and are available for multiple metropolitan areas, allowing for generalizability. Similar datasets are also available through other sources (for specific cities) such as the New York City taxicab data [14].
Our work is concerned with filling the gap in arterial travel-time estimation. While our prior work [20] focused on travel-time estimation at the TAZ level, this work is concerned with travel-time estimation at the street level. We start with the graph representation of a given road network, where intersections and road segments are represented as vertices and edges, respectively. Next, using the Uber Movement data and the street network graphs as inputs, we iteratively estimate the travel time on each edge for a given time window. At each iteration, we solve a constrained least-squares problem on the pseudo-sparsified graph. While the examples here utilize the coarse-grained, high-coverage Uber Movement data, the developed approach could seamlessly incorporate high-quality, low-uncertainty datasets, such as those from loop counters or probe sensors, by including these elements as constraints. While this work uses methods similar to those in [2], which combines shortest-path routing with a convex optimization formulation, we make the following contributions:

We utilize trip sampling in order to leverage coarse-grained, aggregated TAZ-level data. Finer-grained data from sensors as well as street-level trip data can easily be absorbed into our approach.

We propose a biasing scheme for sampling travel times from the statistics of the aggregated data, which can significantly improve predicted travel time distributions.

We make use of constraints and the convex combinations of sequential iterates as a means of stabilizing solutions and improving convergence rates.

We demonstrate the efficacy of a graph pseudo-sparsification technique that can improve scalability with little loss of accuracy. We are currently building on this approach to scale to much larger networks that encompass entire metropolitan areas with the aid of high-performance computing resources.
The paper is organized as follows. Section 2 describes the Uber Movement and LA road network data used in this work. Section 3 describes the forward model and simulated trips, followed by the optimization methodology in Section 4. Section 5 presents our principal experimental results, followed by the methods and scaling results related to our pseudo-sparsification technique in Section 6. Section 7 describes related work in the area of graph analytics, optimization, and machine learning applied to road networks. We conclude the paper with a note on future work in Section 8.
2 Datasets
Our primary data sources for this work are from Uber Movement [22] and the road network for the LA area. Uber has released a trove of aggregated and anonymized data on travel-time and average-speed statistics for a large number of cities around the world [22]. Because gathering transportation data is an expensive and cumbersome process, leveraging these surrogate datasets is expected to help researchers and city planners conduct quick and fairly detailed studies of the various aspects of vehicle mobility in urban settings. In this work, we focus on Uber’s travel time data, which includes statistics for travel times between pairs of TAZs or census tracts for each hour of the day and day of the week.
The graph representation of the LA city road network is made available to us as part of the project. A similar analysis is possible with the open-source maps provided by OpenStreetMap [16]. Table 1 lists some basic network properties of the full LA road network, along with subnetworks formed with different radii around downtown LA. Other than the number of TAZs, most of the structural properties are similar across all of the networks. The number of TAZs included increases with the size of the graph.

Table 1: Basic network properties of the full LA road network and of subnetworks around downtown LA.

Area      Vertices  Edges    TAZs  min(Deg)  max(Deg)  avg(Deg)  Clust. Coeff.
LA_DT+1   3239      7138     25    2         12        2.2       0.018
LA_DT+2   9879      22756    79    2         12        2.3       0.025
LA_DT+3   16906     40929    159   2         12        2.4       0.026
LA_full   368419    905622   2205  2         14        2.46      0.034
3 The graph-based forward model
In this section we describe a forward model for travel-time prediction, and in the following section we show how the parameters of this model can be estimated by means of an optimizer. Let $P_{ij}$ denote a set of edges that forms a path from vertex $i$ to vertex $j$ in the road network graph. The expected travel time between the vertices $i$ and $j$ can be computed as
$$T_{ij} = \sum_{e \in P_{ij}} x_e = a_{ij}^\top x, \tag{1}$$
where $x_e$ represents the expected travel time along edge $e$, the vector $x$ represents the travel time along all edges in the graph, and the binary vector $a_{ij}$ encodes the edges in $P_{ij}$. The choice of optimal routing ($P_{ij}$) between the origin and destination vertices is a variable in the model. Routing is typically done with special routing software such as the Open Source Routing Machine [13]. However, in this work, we use shortest-path routes where the edges are weighted by travel times [2, 12, 24]. Specifically, for initialization, we use weights determined by free-flow travel times $x^{\mathrm{ff}}$, the time it takes to travel along a road segment free of congestion, computed by dividing the length of the road segment by the posted speed limit.
3.1 Training and testing tasks
Eq. 1 is based on O–D pairs in the road network graph. However, the Uber Movement data is provided at the much coarser granularity of TAZ O–D pairs, consisting of travel-time statistics computed over all trips originating at a given TAZ and ending in another during a particular hour of the day. Furthermore, not all TAZ pairs are included in the data.
For a given hour of the day and geographic area, we first collect all the TAZ O–D pair statistics available from Uber Movement. We then choose a random 90–10 split of the available TAZ O–D pairs for training and testing, respectively. For each of the TAZ O–D pairs, we simulate trips by sampling vertices from the origin and destination TAZs to form vertex O–D pairs, assigning trip times by sampling from a lognormal distribution based on the geometric mean and geometric standard deviation of the travel times given in the Uber dataset.
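The trip simulation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in this work: the function name `simulate_trips`, the vertex identifiers, and the statistics are hypothetical. It relies on the fact that a lognormal distribution with geometric mean $g$ and geometric standard deviation $s$ has log-scale parameters $\mu = \log g$ and $\sigma = \log s$.

```python
import math
import random


def simulate_trips(origin_vertices, dest_vertices, geo_mean, geo_std, n_trips, rng):
    """Sample vertex O-D pairs and lognormal travel times for one TAZ O-D pair."""
    mu = math.log(geo_mean)      # mean of the log travel time
    sigma = math.log(geo_std)    # standard deviation of the log travel time
    trips = []
    for _ in range(n_trips):
        origin = rng.choice(origin_vertices)   # uniform over origin TAZ vertices
        dest = rng.choice(dest_vertices)       # uniform over destination TAZ vertices
        travel_time = rng.lognormvariate(mu, sigma)
        trips.append((origin, dest, travel_time))
    return trips


# Hypothetical TAZ pair: three origin vertices, two destination vertices,
# geometric mean 600 s and geometric standard deviation 1.5.
rng = random.Random(0)
trips = simulate_trips([1, 2, 3], [7, 8], geo_mean=600.0, geo_std=1.5,
                       n_trips=2000, rng=rng)
```

With enough simulated trips, the geometric mean of the sampled times should land close to the value supplied for the TAZ pair.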
After estimating edge travel times using our vertex-level training data, testing is done at the TAZ level using the geometric mean travel time for all vertex O–D pairs present in a given test TAZ O–D pair. For example, suppose there are $n_{IJ}$ simulated trips from TAZ $I$ to TAZ $J$. The estimated geometric mean travel time would then be computed as
$$\hat{\mu}_{IJ} = \exp\left( \frac{1}{n_{IJ}} \sum_{r=1}^{n_{IJ}} \log\left( a_r^\top x \right) \right), \tag{2}$$
where vector $a_r$ encodes the edges in the weighted shortest path between sampled vertex O–D pair $r$ and $x$ is the vector of estimated edge travel times. The vertex O–D sampling and travel-time sampling are described in detail below.
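As a small worked illustration of Eqs. 1 and 2, the sketch below builds binary path vectors, evaluates the forward model as an inner product, and takes the geometric mean over the trips of one TAZ O–D pair. The helper names and the edge travel times are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def path_vector(path_edges, n_edges):
    """Binary vector with a 1 for each edge index on the path."""
    vec = [0] * n_edges
    for e in path_edges:
        vec[e] = 1
    return vec


def trip_time(path_vec, edge_times):
    """Forward model (Eq. 1): inner product of path vector and edge times."""
    return sum(a * x for a, x in zip(path_vec, edge_times))


def geometric_mean_time(path_vecs, edge_times):
    """Geometric mean of predicted trip times for one TAZ O-D pair (Eq. 2)."""
    logs = [math.log(trip_time(a, edge_times)) for a in path_vecs]
    return math.exp(sum(logs) / len(logs))


edge_times = [30.0, 45.0, 60.0, 20.0]   # hypothetical seconds per edge
paths = [path_vector([0, 1], 4), path_vector([2, 3], 4)]
times = [trip_time(a, edge_times) for a in paths]
estimate = geometric_mean_time(paths, edge_times)
```

Here the two trips take 75 s and 80 s, so the estimate is the square root of their product.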
3.1.1 Vertex sampling
For each iteration of our edge travel-time estimation algorithm, we sample $n$ vertex O–D pairs each from our training and test sets, letting the number of simulated trips for each TAZ O–D pair be proportional to the size of the two TAZs. Specifically, for each TAZ O–D pair $(I, J)$ in the current subset of the Uber dataset, we let $s_{IJ}$ be the product of the number of vertices in origin TAZ $I$ and destination TAZ $J$, then we set the number of simulated trips $n_{IJ}$ in proportion to $s_{IJ}$.
We sample the $n_{IJ}$ origin and destination vertices for a given TAZ O–D pair uniformly from all vertices within their respective TAZ. We note that this process selects the shortest-path edges with a probability proportional to their local betweenness centrality, which in turn correlates with the importance of the edge with respect to traffic flow [5, 10, 17].
3.1.2 Travel-time sampling
Let $A^{\mathrm{ff}}$ represent the free-flow shortest-path matrix for our simulated trips. We assign travel times to trips based on the ordering of free-flow shortest-path travel times $A^{\mathrm{ff}} x^{\mathrm{ff}}$ for all vertex O–D pairs within a TAZ O–D pair, summarized in Algorithm 1. Here $x^{\mathrm{ff}}$ denotes the vector of free-flow travel times for each of the edges.
In practice, biased travel-time sampling appears to improve results when there is a relatively large number of trips for a given TAZ O–D pair. For example, Fig. 1 illustrates the difference between estimated trip travel-time distributions with and without biased travel-time sampling. For 2000 trip times sampled from the lognormal distribution of a given TAZ O–D pair (left), the distribution of estimated trip travel times deviates less from the target distribution (red line, same across panels) when we bias the travel times assigned to each vertex O–D pair (center) than when we do not (right).
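One plausible reading of the ordering-based assignment is rank matching: sort the sampled travel times and hand the $r$-th smallest sample to the trip with the $r$-th shortest free-flow time. The sketch below illustrates that reading only; the function name `biased_travel_times` and the numbers are hypothetical, and the exact steps of Algorithm 1 may differ.

```python
import math
import random


def biased_travel_times(free_flow_times, geo_mean, geo_std, rng):
    """Assign sampled lognormal times so their order matches free-flow order."""
    n = len(free_flow_times)
    mu, sigma = math.log(geo_mean), math.log(geo_std)
    samples = sorted(rng.lognormvariate(mu, sigma) for _ in range(n))
    # Trip indices ordered from shortest to longest free-flow time.
    order = sorted(range(n), key=lambda i: free_flow_times[i])
    assigned = [0.0] * n
    for rank, trip in enumerate(order):
        assigned[trip] = samples[rank]   # r-th shortest trip gets r-th smallest sample
    return assigned


rng = random.Random(1)
free_flow = [310.0, 120.0, 540.0, 275.0]   # hypothetical free-flow trip times (s)
assigned = biased_travel_times(free_flow, geo_mean=400.0, geo_std=1.4, rng=rng)
```

The sampled values still follow the lognormal distribution of the TAZ pair; only their assignment to individual trips is biased.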
4 The optimization process
Our approach solves a series of constrained least-squares problems to fit edge travel times to simulated trips (vertex O–D pairs, travel times, and routes) with travel-time statistics consistent with the Uber dataset. For each iteration $k$, we sample $n$ vertex O–D pairs with sampled travel-times vector $b^{k}$ and weighted shortest-paths matrix $A^{k}$. Our goal is then to estimate the edge travel-times vector $x$ so that $A^{k} x \approx b^{k}$, satisfying our forward model in Eq. 1. Thus we have a system of $n$ equations with $m$ unknowns, where $m$ is the number of road segments (or edges in the graph). The estimates are then updated using a convex combination of the previous estimates and the solution found by minimizing the mean-squared error with constraints on the unknown coefficients, illustrated by Eqs. 3–5 below.
$$\hat{x}^{k} = \arg\min_{x} \; \frac{1}{n} \left\lVert A^{k} x - b^{k} \right\rVert_2^2 \tag{3}$$
$$\text{subject to} \quad \ell \le x \le u \tag{4}$$
$$x^{k+1} = (1 - \alpha_k)\, x^{k} + \alpha_k\, \hat{x}^{k} \tag{5}$$
In our implementation, we initialize with the free-flow estimates $x^{0} = x^{\mathrm{ff}}$, we constrain the elements of the vector $x$ to be bounded below by $\ell$ and above by $u$, and we update our weight using the constant $\alpha_k = 0.9$ for all $k$. Our weighted shortest-paths matrix $A^{k}$ is recomputed each iteration using our current estimates $x^{k}$ as weights, and our constrained least-squares subproblems are solved using the lsq_linear function from the well-known Python scientific computing library SciPy. We run the optimizer until the estimates converge (measured by the magnitude of the average change in the solution vector between iterations, Eq. 6 with tolerance $\tau$), or we reach a maximum number of iterations $K$. This iterative scheme is similar to the one proposed in [2] but with a choice of optimizer that favors scalability in conjunction with a number of heuristics that improve convergence. The workflow of our proposed travel-time estimation algorithm is given in Algorithm 2.
$$\left| \frac{1}{m} \sum_{e=1}^{m} \left( x_e^{k+1} - x_e^{k} \right) \right| < \tau \tag{6}$$
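The update step can be sketched with SciPy's `lsq_linear`, which accepts elementwise bounds. This is a toy illustration under stated assumptions, not the workflow of Algorithm 2: the path matrix is held fixed instead of being recomputed from shortest paths, the bounds (free-flow time below, five times free-flow above) are illustrative only, and the constant 0.9 is assumed to weight the new least-squares solution in the convex combination.

```python
import numpy as np
from scipy.optimize import lsq_linear


def update_edge_times(A, b, x_prev, lower, upper, alpha=0.9):
    """One iteration: bounded least squares, then a convex combination."""
    # Bounded least-squares fit of edge times to the sampled trip times.
    result = lsq_linear(A, b, bounds=(lower, upper))
    # Blend with the previous iterate (assumption: alpha weights the new solution).
    return (1.0 - alpha) * x_prev + alpha * result.x


def has_converged(x_new, x_old, tol):
    """Magnitude of the average change in the solution vector."""
    return abs(np.mean(x_new - x_old)) < tol


# Toy problem: 3 edges, 4 trips; each row of A marks the edges a trip uses.
A = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
x_free_flow = np.array([10.0, 20.0, 30.0])   # hypothetical free-flow times (s)
x_true = np.array([12.0, 30.0, 33.0])        # hypothetical congested times (s)
b = A @ x_true                               # noise-free trip times

x = x_free_flow.copy()
lower, upper = x_free_flow, 5.0 * x_free_flow   # illustrative bounds only
for _ in range(50):
    x_next = update_edge_times(A, b, x, lower, upper)
    done = has_converged(x_next, x, tol=1e-6)
    x = x_next
    if done:
        break
```

Because the toy system is noise-free and well determined, the iterates approach `x_true`; with sampled trips and recomputed routes, each iteration would see a different matrix and right-hand side.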
5 Experimental results
We implement the workflow in Algorithm 2 on a road network encompassing a three-mile radius of downtown Los Angeles (network LA_DT+3 in Table 1). Fig. 2 shows the convergence of the travel-time error (both training and test sets) for the road network at 3am and 6pm, measured by
$$\mathrm{RMSLE} = \sqrt{ \frac{1}{N} \sum_{(I,J)} n_{IJ} \left( \log \hat{\mu}_{IJ} - \log \mu_{IJ} \right)^2 }, \tag{7}$$
where $N$ is the total number of trips in the training (test) data, $n_{IJ}$ is the number of trips for TAZ O–D pair $(I, J)$ in the training (test) subset, $\hat{\mu}_{IJ}$ is the current estimated geometric mean travel time (Eq. 2), and $\mu_{IJ}$ is the geometric mean travel time from the Uber Movement dataset. This RMSLE error metric represents the relative mean-squared error in the logarithm of the estimated geometric mean travel time [2]. Fig. 3 shows the estimated travel times (as a percent of free-flow travel time $x^{\mathrm{ff}}$) for both 3am and 6pm, where congested regions (red) are clearly visible in the 6pm plot.
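The error metric is simple to compute once the estimated and observed geometric means are available per TAZ O–D pair. The sketch below is a minimal, trip-weighted version; the function name `rmsle` and the sample values are hypothetical.

```python
import math


def rmsle(estimated, observed, n_trips):
    """Trip-weighted root-mean-squared log error over TAZ O-D pairs."""
    total = sum(n_trips)
    weighted = sum(
        n * (math.log(est) - math.log(obs)) ** 2
        for est, obs, n in zip(estimated, observed, n_trips)
    )
    return math.sqrt(weighted / total)


observed = [600.0, 900.0]   # hypothetical geometric mean travel times (s)
perfect = rmsle([600.0, 900.0], observed, [10, 30])
# Overestimate the first pair by a factor of e: its log error is exactly 1.
off = rmsle([600.0 * math.e, 900.0], observed, [10, 30])
```

In the second call, 10 of the 40 trips carry a unit log error, so the result is the square root of 10/40.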
6 Scalability and graph sparsification
The computational complexity of our proposed model is proportional to the number of trips and the number of edges (Fig. 4). Due to the amount of data available, including 368,419 vertices and 905,622 edges in the full LA road network graph (Table 1) and the TAZ O–D pairs available for each hour of the day (e.g., 3am and 6pm) in the 2019 Quarter 1 Weekdays-Only Uber dataset, the constrained least-squares optimizer becomes a computational bottleneck, limiting our ability to solve large systems (i.e., street-level travel-time estimation of a large geographical area). Since our goal is to generate a data-driven solution, we aim to use as much data as possible, which leads us to systematically reduce the number of unknowns instead. One strategy would be to sparsify the graph by dropping edges according to a predetermined criterion while still maintaining graph connectivity. This strategy, however, changes routing, which can result in skewed travel-time estimates. For example, consider two paths between vertices $u$ and $v$: the shortest route through intermediate vertices $a$ and $b$, and an alternative, but longer, route through vertices $c$ and $d$. If a sparsification algorithm removes the road segment between $a$ and $b$ based on some metric (e.g., low edge betweenness centrality), our model would incorrectly estimate the travel time from $u$ to $v$ using the available road segment through $c$ and $d$.
In light of this, our approach is to pseudo-sparsify the underlying road network. Rather than remove edges from the graph, we sort road segments into two sets based on their significance with respect to traffic flow (e.g., betweenness centrality). We then estimate the travel time along the edges with the top $p\%$ significance, setting the travel times along the remaining edges to the free-flow travel times. In our implementation, we sort edges based on their betweenness centrality, computed using shortest paths weighted by free-flow travel times. For instance, if $S$ are the indices of the edges with the top $p\%$ betweenness and $R$ are the indices of the remaining edges, we solve a modified version of Eqs. 3–4 each iteration:
$$\hat{x}_S^{k} = \arg\min_{x_S} \; \frac{1}{n} \left\lVert A_S^{k} x_S + A_R^{k} x_R^{\mathrm{ff}} - b^{k} \right\rVert_2^2 \tag{8}$$
$$\text{subject to} \quad \ell_S \le x_S \le u_S \tag{9}$$
Here $A_S^{k}$ and $A_R^{k}$ are the columns of the weighted shortest-paths matrix for the edges in $S$ and $R$, $b^{k}$ is the vector of sampled travel times for the $n$ trips, $x_R^{\mathrm{ff}}$ holds the free-flow travel times of the edges in $R$, and $\ell_S$ and $u_S$ are the lower and upper bounds restricted to $S$.
In this way, we can control the number of unknowns and scale our method for larger geographical areas.
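To make the reduction concrete, the sketch below splits edges by betweenness rank and moves the contribution of the fixed (free-flow) edges to the right-hand side, leaving a smaller least-squares system in the estimated edges only. It is a minimal illustration with hypothetical helper names and values; the reduced system would then be passed to a bounded least-squares solver.

```python
def split_edges(betweenness, top_fraction):
    """Return (estimated, fixed) edge indices by betweenness rank."""
    order = sorted(range(len(betweenness)), key=lambda e: betweenness[e], reverse=True)
    n_top = round(top_fraction * len(order))
    return sorted(order[:n_top]), sorted(order[n_top:])


def reduced_system(A, b, free_flow, estimated, fixed):
    """Keep columns of estimated edges; subtract fixed-edge times from b."""
    A_red = [[row[e] for e in estimated] for row in A]
    b_red = [
        b_i - sum(row[e] * free_flow[e] for e in fixed)
        for row, b_i in zip(A, b)
    ]
    return A_red, b_red


betweenness = [0.40, 0.05, 0.30, 0.10]   # hypothetical edge scores
free_flow = [10.0, 20.0, 30.0, 40.0]     # hypothetical free-flow times (s)
A = [[1, 1, 0, 0], [0, 1, 1, 1]]         # two trips over four edges
b = [35.0, 100.0]                        # sampled trip times (s)
estimated, fixed = split_edges(betweenness, top_fraction=0.5)
A_red, b_red = reduced_system(A, b, free_flow, estimated, fixed)
```

In this toy case edges 0 and 2 are estimated, edges 1 and 3 keep their free-flow times, and the number of unknowns is halved.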
We test our pseudo-sparsification approach using different values of $p$, the percentage of top-betweenness edges whose travel times are estimated (Figs. 5–6). In Fig. 5 (left), we first run our algorithm on the full graph (i.e., $p = 100$), and plot the relative difference between our estimated travel times and the initial free-flow travel times versus the percentile of the edges sorted by betweenness centrality. We observe that a portion of the lowest-betweenness edges retain their free-flow travel time as the final estimate, suggesting that we can set these edges to their free-flow travel times to reduce the number of unknowns.
Of course, drivers rarely maintain the exact speed limit. To account for this uncertainty in free-flow travel times, we sort the edges into bins based on their betweenness percentiles and plot the fraction of edges within each bin that have a relative difference within a factor of $\epsilon$ of their free-flow travel times (Fig. 5, right). We observe that the high-betweenness edges are less likely to have an estimated travel time close to their free-flow times irrespective of the level of uncertainty $\epsilon$. Therefore, we can pseudo-sparsify the graph to different extents depending upon the level of uncertainty that can be tolerated.
In Fig. 6, we vary the number of edges by estimating only the top 50%, 75%, and 100% of edges by betweenness. Here the time per iteration scales proportionally with the number of unknown edges included in the problem, and the travel-time error given by Eq. 7 increases as we assign more edges to their free-flow travel times.
7 Related work
Urban traffic modeling has seen a recent surge of interest in the use of graph analytics and machine learning methods. Reference [23] provides an overview of many of these methods. In this section we summarize the prior work related to both of these themes as applied to road transportation networks.
The authors in [21] use graph analytics on networks with three different weighting schemes to perform a statistical characterization of the Beijing road network. Many prior publications consider betweenness centrality [6] to be an important metric when applied to road networks, as it is argued to be a direct predictor of important links in urban transport. It has been shown that betweenness centrality is highly correlated with the traffic flow count on a road network [5, 10, 17], a natural result of including travel time as a factor when selecting trip routes. In real-world scenarios, however, route choices are also influenced by time of day and other socioeconomic factors. Using these observations, the authors in [18] define an augmented betweenness centrality measure where shortest paths are weighted according to a traffic demand model based on census tracts and traffic analysis zones. The authors show that this new centrality measure correlates better with traffic flow than other centrality measures. Similarly, in [4] the authors employ analytics in the form of novel graph centrality measures to derive insights into the traffic flow patterns in Singapore. Graph models, along with heterogeneous data sources, were leveraged to understand the urban traffic patterns in [15]. Finally, the authors in [7] utilize a grid-based fabric and cellular automata for modeling arterial traffic, resulting in gains in computational efficiency.
Many approaches make use of machine learning models and optimization methods to model various aspects of urban traffic flow. In [11], the authors leverage a deep-learning approach in the form of a diffusion convolutional recurrent neural network (DCRNN) to forecast short-term freeway traffic counts in the LA and San Francisco Bay Area networks. The authors in [3] also propose a deep-learning approach that brings together convolutional neural networks and recurrent neural networks with long short-term memory (LSTM) units, utilizing their architecture for short-term traffic count extrapolation at 349 locations on the Beijing road network. In a set of articles [19, 25], the authors leverage data from Bluetooth and GPS probe sensors for travel-time estimation and validation. Coupled hidden Markov models (CHMM) were used in [8] to model the evolution of traffic states, applied to a sparse taxi-fleet dataset for the San Francisco Bay Area road network. In a subsequent publication [9] leveraging the same dataset, the authors employ a dynamic Bayesian network to learn arterial dynamics.
8 Conclusions and future work
In this work we leveraged coarse-grained Uber Movement data in the form of TAZ O–D pair summary statistics to provide estimates of fine-grained, street-level travel times. Our techniques for trip simulation and biased travel-time sampling were used in conjunction with weighted shortest-path routing to set up a system of linear equations with unknown edge travel times. The travel times were iteratively refined using a constrained least-squares optimizer for multiple batches of simulated trips. In our largest road network (40,522 edges and 24,472 TAZ O–D pairs), we achieved an RMSLE of 0.28 on our test set after 13 iterations with 4.7 hours total runtime (32-core Intel Xeon E7-8860 system with 2.27 GHz processor speed and 512 GB primary memory). We demonstrated a graph pseudo-sparsification algorithm in order to improve the computational efficiency of our estimation routines. Our future work involves improving the optimization process, formalizing the sparsification approach, and scaling the approach to metropolitan-sized road networks using high-performance computing systems.
References
 [1] E. Beimborn et al., A transportation modeling primer, (2006).
 [2] D. Bertsimas, A. Delarue, P. Jaillet, and S. Martin, Travel time estimation in the age of big data, Operations Research, 67 (2019), pp. 498–515.
 [3] X. Cheng, R. Zhang, J. Zhou, and W. Xu, DeepTransport: Learning spatial-temporal dependency for traffic condition forecasting, in 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, 2018, pp. 1–8.
 [4] Y.-Y. Cheng, R. K.-W. Lee, E.-P. Lim, and F. Zhu, Measuring centralities for transportation networks beyond structures, in Applications of Social Media and Social Network Analysis, Springer, 2015, pp. 23–39.
 [5] P. Crucitti, V. Latora, and S. Porta, Centrality in networks of urban streets, Chaos: An Interdisciplinary Journal of Nonlinear Science, 16 (2006), p. 015113.
 [6] L. C. Freeman, A set of measures of centrality based on betweenness, Sociometry, (1977), pp. 35–41.
 [7] P. Gundaliya, T. V. Mathew, and S. L. Dhingra, Heterogeneous traffic flow modelling for an arterial using grid based approach, Journal of Advanced Transportation, 42 (2008), pp. 467–491.
 [8] R. Herring, A. Hofleitner, P. Abbeel, and A. Bayen, Estimating arterial traffic conditions using sparse probe data, in 13th International IEEE Conference on Intelligent Transportation Systems, IEEE, 2010, pp. 929–936.
 [9] A. Hofleitner, R. Herring, P. Abbeel, and A. Bayen, Learning the dynamics of arterial traffic from probe data using a dynamic Bayesian network, IEEE Transactions on Intelligent Transportation Systems, 13 (2012), pp. 1679–1693.
 [10] B. Jiang, Street hierarchies: a minority of streets account for a majority of traffic flow, International Journal of Geographical Information Science, 23 (2009), pp. 1033–1048.
 [11] Y. Li, R. Yu, C. Shahabi, and Y. Liu, Diffusion convolutional recurrent neural network: Data-driven traffic forecasting, in International Conference on Learning Representations, 2018, https://openreview.net/forum?id=SJiHXGWAZ.
 [12] E. H.-C. Lu, C.-C. Lin, and V. S. Tseng, Mining the shortest path within a travel time constraint in road network environments, in 2008 11th International IEEE Conference on Intelligent Transportation Systems, IEEE, 2008, pp. 593–598.
 [13] D. Luxen and C. Vetter, Real-time routing with OpenStreetMap data, in Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS ’11, New York, NY, USA, 2011, ACM, pp. 513–516, https://doi.org/10.1145/2093973.2094062.
 [14] NYC Taxi and Limousine Commission, NYC yellow and green taxi trip records, https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page, 2019.
 [15] K. S. Oberoi, G. Del Mondo, Y. Dupuis, and P. Vasseur, Spatial modeling of urban road traffic using graph theory, in Spatial Analysis and GEOmatics 2017, 2017.
 [16] OpenStreetMap contributors, Planet dump retrieved from https://planet.osm.org, https://www.openstreetmap.org, 2019.
 [17] S. Porta, P. Crucitti, and V. Latora, The network analysis of urban streets: a primal approach, Environment and Planning B: Planning and Design, 33 (2006), pp. 705–725.
 [18] R. Puzis, Y. Altshuler, Y. Elovici, S. Bekhor, Y. Shiftan, and A. Pentland, Augmented betweenness centrality for environmentally aware traffic monitoring in transportation networks, Journal of Intelligent Transportation Systems, 17 (2013), pp. 91–105.
 [19] H. A. Rakha, H. Chen, A. Haghani, X. Zhang, M. Hamedi, et al., Use of probe data for arterial roadway travel time estimation and freeway medium-term travel time prediction, tech. report, Mid-Atlantic Universities Transportation Center, 2015.
 [20] A. V. Sathanur, V. Amatya, A. Khan, R. Rallo, and K. Maass, Graph analytics and optimization methods for insights from the Uber Movement data, in Proceedings of the 2nd ACM/EIGSCC Symposium on Smart Cities and Communities, SCC ’19, ACM, 2019, pp. 2:1–2:7.
 [21] Z. Tian, L. Jia, H. Dong, F. Su, and Z. Zhang, Analysis of urban road traffic network based on complex network, Procedia Engineering, 137 (2016), pp. 537–546.
 [22] Uber Technologies, Inc., Data retrieved from Uber Movement, (c) 2019, https://movement.uber.com, 2019.
 [23] E. I. Vlahogianni, M. G. Karlaftis, and J. C. Golias, Short-term traffic forecasting: Where we are and where we’re going, Transportation Research Part C: Emerging Technologies, 43 (2014), pp. 3–19.
 [24] L. Wu, X. Xiao, D. Deng, G. Cong, A. D. Zhu, and S. Zhou, Shortest path and distance queries on road networks: An experimental evaluation, Proceedings of the VLDB Endowment, 5 (2012), pp. 406–417.
 [25] X. Zhang, M. Hamedi, and A. Haghani, Arterial travel time validation and augmentation with two independent data sources, Transportation Research Record, 2526 (2015), pp. 79–89.