Quasi-Dynamic Traffic Assignment using High Performance Computing

04/26/2021 ∙ by Cy Chan, et al. ∙ berkeley college Berkeley Lab

Traffic assignment methods are some of the key approaches used to model flow patterns that arise in transportation networks. Since static traffic assignment does not have a notion of time, it is not designed to represent the temporal dynamics that arise as vehicles flow through the network and demand varies through the day. Dynamic traffic assignment methods attempt to resolve these issues, but require significant computational resources when modeling urban-scale regions (on the order of millions of links and vehicles) and often take days of compute time to complete. The focus of this work is two-fold: 1) to introduce a new traffic assignment approach, a quasi-dynamic traffic assignment (QDTA) model, and 2) to describe how we parallelized the QDTA algorithms to leverage High-Performance Computing (HPC) and scale to large metropolitan areas while dramatically reducing compute time. We examine and compare different scenarios, including a baseline static traffic assignment (STA) and a quasi-dynamic scenario inspired by the user-equilibrium (UET). Results are presented for the San Francisco Bay Area, which accounts for 19M trips/day and an urban road network of 1M links. We utilize an iterative gradient descent method, where the step size is selected using a Quasi-Newton method with parallelized cost function evaluations, and compare it to using pre-defined step sizes (MSA). Using the parallelized line search provides a 16 percent reduction in total execution time due to a reduction in the number of gradient descent iterations required for convergence. The full-day QDTA comprising 96 optimization steps over 15-minute intervals runs in about 4 minutes on 1,024 cores of the NERSC Cori computer, which represents a speedup of over 36x versus serial execution. To our knowledge, this compute time is significantly lower than other traffic assignment solutions for a problem of this scale.

1 Introduction

Transportation planners often use static traffic assignment solutions to estimate traffic states over the course of one day in their cities [1]. Static traffic assignment does not realistically capture the time-varying behavior that results from network dynamics; it simply assigns an origin/destination (O/D) routing solution that minimizes auto travel time for all mobile entities so that no driver can unilaterally reduce their travel cost by shifting to another route. This is known as static user equilibrium [2]. To accommodate the temporal changes in the demand profile over an entire day, the problem is typically partitioned into time slots of interest, and static traffic assignment solutions are used to estimate average speeds and average flows on the network for each time slot. Example time slots are early morning, morning rush hour, mid-day, evening rush hour, and late evening, each accounting for two to four hours. Because of the scale of the network and the complexity of the algorithms, these models often take many days to run, depending upon the hardware and software solutions available to city planners. Consequently, planners will often make adjustments to the model to make the computation tractable within their compute capabilities and time-to-solution requirements. For example, they may remove lower functional class roads from the network and aggregate travel demand to higher functional class roads. The results are then compromised by these adjustments [3]. Several limitations of STA have also been identified and discussed in many studies [3, 1, 4]. The underlying assumptions of STA, related to stationary demand, long aggregation intervals, and static network loading, can lead to unrealistic results when variations in traffic conditions are high [1].

The alternative to static traffic assignment is dynamic traffic assignment. Dynamic traffic assignment has been a topic of research for almost 50 years, beginning with the famous M-N model [5, 6]. The focus of this work is to use static traffic assignment algorithms implemented on high-performance computing (HPC) to generate results that more effectively estimate a dynamic traffic assignment. We do this by refining the traditional approach in both time and trip O/D locations and introducing a quasi-dynamic optimization, which we call Quasi-Dynamic Traffic Assignment (QDTA). QDTA assigns traffic demand in time segments by discretizing the traffic demand that changes continuously over time. We further capture the dynamic impact of trips that span multiple time segments by introducing two mechanisms: path truncation and residual demand. This approach addresses the complexity of dynamic traffic assignment and removes the concern that static traffic assignment cannot provide value for practical traffic flow due to its single-time assignment characteristic. In general, while continuous-time DTA models satisfy the requirements of traffic flow theory, the uniqueness of their network equilibrium flows is not necessarily guaranteed [7]. Quasi-dynamic models, on the other hand, are closer to static traffic assignment and share some of its formulation and network equilibrium properties. They are also referred to as discrete-time DTA or semi-dynamic traffic assignment in some literature [8]. QDTA provides a better understanding of traffic dynamics than STA, as it can propagate traffic flows between time periods using the residual demand.

We use high-performance computing to accelerate the quasi-dynamic traffic assignment algorithm to generate high-fidelity traffic assignments with significantly improved computational efficiency. Optimization on distributed-memory platforms is achieved by utilizing efficient routing algorithms and parallelizing multiple components of the computational workload across threads, including: 1) trip demand routing, 2) cost function evaluation, 3) network flow and weight updates, and 4) the residual demand calculations. The improvements gained using HPC allow us to shorten the time segment duration to 15-minute intervals and address more complex road networks and increased travel demand profiles. Our computational approach scales to one thousand compute cores, allowing us to run assignments with 19 million vehicle trips on a network with 1 million links in under 6 minutes. This capability enables urban-scale traffic assignments in which a variety of scenarios can be addressed with less concern for computational complexity. Examples include comparisons among counties or cities, integration with parallel discrete event simulation scenarios that implement and explore infrastructure implications of QDTA, and other objective functions for the optimization. For brevity, this paper only addresses the introduction of QDTA in a user equilibrium travel time context. The baseline model used for comparison is the static traffic assignment. Other scenarios will be described in subsequent papers.

First, we discuss the general form of the proposed quasi-dynamic traffic assignment in its pure mathematical form and introduce a proper instantiation of this class of problems, based on the well-known static traffic assignment. Second, we provide the QDTA model's computational flow and algorithmic representations that estimate the dynamics using the discretized time steps. Third, we present the parallelization of the QDTA model and demonstrate the significant computational gains associated with this approach, while providing markedly better fidelity than traditional approaches. Finally, we discuss the scenario comparison for the STA and QDTA models for the entire urban region of the San Francisco Bay Area.

2 Quasi-dynamic traffic assignment and high-performance computing in the literature

In the literature, the dynamics of QDTA come in two components: demand/routing and dynamic/quasi-dynamic network loading. QDTA differs from STA by the use of much smaller assignment intervals. The implication of having this temporal dimension is the presence of residual demand, which is handled differently across previous work. [9, 10] use a point-queue model to include "capacity constraints", preventing flow from exiting the link if it is larger than the capacity. This gives more realistic travel times, but the authors did not demonstrate its application in multi-time-step situations. In other cases, the residual demand is calculated through "path truncation". Some models do not appear to consider the residual demand explicitly and instead preprocess the demand. Operational models (closer to the DTA side) predict the residual demand more precisely, but are computationally expensive. Traffic assignment in general may be link-based or path-based [10], though path-based methods must address the combinatorial growth of the path set size.

| Ref. | Routing | Network loading | Residual demand | Advantages | Limitations | Case studies |
| --- | --- | --- | --- | --- | --- | --- |
| [9] | Route-based SUE formulated as VI, solved using MSA | First-order node model (i.e., reduction factor based on link demand and supply), formulated as a fixed-point problem solved iteratively | | Models node capacity at diverges and merges; correct representation of congestion upstream of bottleneck | Does not have temporal dynamics ([9] distinguishes quasi-dynamic assignment from semi-static assignment) | Gold Coast, Australia: 9,565 links, 2,987 nodes, 0.3 hour on a personal computer. Sydney, Australia: 75,379 links, 30,573 nodes, 1.1 hour on a personal computer |
| [10] | System Optimal based on a set of reasonable and independent paths (path-based), assigned using the Method of Successive Averages (MSA) | Same as above | | | | Sioux Falls |
| [11] | SUE, from a shortest path set, no rerouting | Move by link cost function (e.g., BPR function) | | | No rerouting, limited path set | Rome, Italy: 15,000 links, 6,000 nodes, simulates several hours of traffic in a few minutes on a personal computer |
| [12] | | | | | | |
| [13] | Dijkstra shortest path | Link cost constructed from average of GPS speed measurements, weighted by traffic volume reconstructed from the Licence Plate Recognition (LPR) data | No residual demand, likely the OD data is pre-partitioned | Data-driven link cost and demand | Not applicable where high-resolution data (e.g., GPS or toll demand) is limited (e.g., outside of the highways) | Expressway network in Hunan, China: 530 edges and 490 vertices, total length 6725.5 km. Computational performance unclear. |

Table 1: QDTA in the literature

2.1 Dynamic traffic assignment and routing

In this section, we give a short survey of existing modeling and routing approaches. Due to the vast number of publications in this area of research, the presented list cannot be complete. However, we aim to describe the different approaches specifically and at a technical level for each class. We begin this overview with the well-known books in this field: [14, 15, 16].

Time continuous modeling

Generally, in a time-continuous setting, the link dynamics can be described via ODEs (ordinary differential equations) or PDEs (partial differential equations). The articles [5, 6, 17, 18, 19] consider dynamic traffic assignment as an optimal control problem over a given time horizon. The link dynamics are represented by a (system of) ODEs with in- and outflow functions. An objective function is specified and optimality conditions are deduced. However, there is no natural delay, and the optimization is considered over the full time horizon, so that one has to know the inflow over the full time horizon to determine the routing, i.e., the traffic assignment. Moreover, ODEs are generally quite far from modeling traffic in a physically accurate way (congestion patterns, the aforementioned delay, and more).

In [20, 21, 22], more general classes of ODE models (sometimes also with delay) are considered and time-dependent variational inequalities are determined to describe the routing at the junctions. Depending on whether the variational inequality is considered over the full time horizon or at every time (with the ordinary Euclidean scalar product and the variational inequality holding at every time), the computation of a solution again requires full information of the input datum over the full time horizon considered. In [16], a more detailed mathematical analysis concerning the existence of solutions is provided. [23] considers the well-posedness of an ODE model with delay and routing operators at every intersection, which can depend on the entire network state up to real time. In [24], the routing is realized by a specific routing function that takes into account the status of the network; the work also proves some stability estimates with regard to the routing considered. Another modeling approach, also considered in [25, 26], consists of using the Vickrey or point-queue model (or a modified version) [27, 28, 29], again aiming for a routing (and departure time choice) based on variational inequalities. These modeling approaches can be generalized to dynamics prescribed by partial differential equations (PDEs). We refer to [30, 31, 32, 33, 34, 35] for an overview. Usually, the underlying link dynamics are modeled by the LWR (Lighthill, Whitham, Richards) PDE [36, 37], a hyperbolic conservation law, allowing spillback and congestion. Another approach uses non-local PDE models to simulate traffic flow at junctions [38, 39, 40], and there are also higher-order models available, see for instance [41, 42]. However, as mentioned before, a reasonable routing at the intersections needs to be prescribed, and the underlying models need to be solved for a given space-time discretization on the entire network, which is computationally quite expensive.
As stated before, the (time-dependent) routing itself, which might depend on the entire network state at a given time (or also in the future), makes the problem even more computationally challenging, not to mention the problem of calibrating the entire system reasonably. These are the reasons why we consider in this work a class of simpler models that still have a notion of time and delay, but not at the level of detail that the previously mentioned models provide. Roughly, the model parameters required for a static traffic assignment are enough for the proposed QDTA. Before we investigate the QDTA in more detail, it is reasonable to also mention other time-discretized models:

Time-discretized modeling

There is a significant number of articles that deal with already time-discretized dynamical models. For a more exhaustive overview, see [43, 44, 45, 46, 47, 48]. The articles [49, 50, 51] consider a discretized traffic flow model based on ODEs and a time-optimal control problem to obtain the routing. [48] studies another discretized ODE model, but this time the routing is implemented by a probabilistic approach depending on the status of the network. Most of the existing literature considers the assignment as an optimization problem across the entire period of interest. In a discretized formulation, the choice of the time step interval is important. Generally, it should be much longer than the link travel time ([50]), but not so long that each time slice becomes close to a static assignment. In addition, the size of the time step sometimes also depends on the availability of the temporal demand data, which may be available only at a coarse grain ([12]). The time-dependent demand is usually considered fixed and known, though [46] proposed algorithms for cases with uncertainty in the demands (random with certain probability distributions). In terms of solution algorithms, [51] formulates the System Optimum (SO) problem as a system of linear equations after substituting the non-linear functions (e.g., the link exit function) with linear segments. [44] incorporates the DYNASMART simulator to generate performance measures under a given route assignment in the iterative search step, and uses the simulator outputs to inform the search directions.

2.2 High-performance computing for transportation modeling

Parallel computation can be divided broadly into two categories: shared memory (e.g. a workstation or a single server node) and distributed memory (e.g. clusters, cloud platforms, and HPC systems) [52]. The key distinction is that all cores in a shared memory system have direct access to data in the same address space. That is, data that is written by each compute core is immediately visible to all other cores in the same memory. In contrast, distributed memory systems have separate address spaces that require explicit message passing to share data between cores connected to distinct memory spaces. Programming distributed memory systems is more difficult due to the separate memory, as it requires mechanisms to handle data and load distribution, synchronization, and data movement between processes. However, the benefits of distributed memory parallelism are two-fold: access to more compute cores and a greater total memory capacity to enable solving larger problems more efficiently. High-performance computing systems differ from typical clusters and cloud-based systems in that they have high performance network interconnects to accelerate message passing and synchronization between the compute nodes in the system. This network hardware is critical to the performance for latency-sensitive applications where cores are frequently communicating and/or synchronizing.

The use of high-performance computing (HPC) in transportation modeling and simulation tools has not yet achieved widespread adoption. Most simulation tools in this domain support either only sequential execution or shared memory parallelism, which limits parallelism to the number of cores on a single compute node. Examples that use this type of parallelism include the Aimsun [53] and SUMO [54] simulators. Some traffic simulation software projects, such as FastTrans [55], BEAM [56] and POLARIS [57], have also enabled the use of distributed memory systems, such as HPC and cloud computing platforms. Within the domain of traffic assignment, there have been some previous efforts using parallel computation on smaller networks than explored in this paper [58, 59]. The authors of [58] analyze the Nagoya network consisting of 152K links and 38K nodes, using embarrassingly parallel path finding and processing only the active sub-networks to achieve up to 10x speedups using 25 processors. The authors of [59] utilize multi-threading and MPI to parallelize a link transmission model (LTM) based dynamic traffic assignment for a network with 11K nodes and 23K links, achieving up to 3x speedup on a few nodes. As we demonstrate in our paper, the combination of an algorithm that lends itself to efficient parallelization along with the use of distributed computing presents a large opportunity to increase the performance of traffic assignment algorithms for large-scale systems with millions of links and vehicles.

3 A tractable dynamic assignment problem: QDTA formulation

The QDTA model proposed in this article divides the analysis period into small time steps and uses a sequence of STA steps to obtain an optimized route assignment for each time step. The main distinction between the proposed model and the conventional STA framework is the inclusion of route truncation and residual demand, to address the fact that some trip legs cannot be finished in a single analysis time step and need to be split across multiple steps. As the analysis time step gets shorter to capture short-term traffic dynamics (e.g., 15 minutes), the fraction of trips spanning multiple time steps becomes particularly prominent. Section 3.2 describes route truncation and residual demand in detail. The integration of the residual demand into the well-understood framework of the STA is given in Section 3.3. Lastly, Section 3.4 discusses key questions such as the choice of time step length, the connections with simulation modules, and the potential usage in traffic optimization.

3.1 Problem specification

The road network is represented by a directed connected graph $G = (V, E)$, where $V$ is the set of graph nodes/vertices and $E$ is the set of graph links/edges. A road intersection is represented by a node $v \in V$ in the graph, and the stretch of road between two intersections is represented by a graph link $e \in E$. Each link has several static properties, such as the traffic flow capacity $c_e$ and the traversal time under free-flow conditions (free-flow travel time) $\tau^0_e$. Other static node and link properties, such as the coordinates of the nodes and the geometries of the links, are often stored (e.g., for visualization and post-processing analysis), but they are not described here since they are not integral to the formulation of the QDTA framework.

On the travel demand side, the time-dependent travel demand on this road network is represented by time-stamped origin-destination (OD) flows, $D_{o,d,t}$. Here $o$ and $d$ denote the origin and destination, with $o \in V$ and $d \in V$, and $t \in \{1, \dots, T\}$ is the analysis time step. The integer $T$ is the total number of time steps considered, and each time step is assumed to last for a duration of $\Delta t$. For example, for a traffic assignment period of 24 hours with a time interval of 15 minutes, $T$ equals 96 and $\Delta t$ is 15 minutes.
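As a small illustration of this discretization, the sketch below computes the number of time steps for a 24-hour horizon and maps a departure time to the step that contains it. The function names and the 0-based step indexing are assumptions made for this example only.

```python
def num_time_steps(horizon_minutes: int, step_minutes: int) -> int:
    """Total number of time steps covering the horizon (hypothetical helper)."""
    if horizon_minutes % step_minutes != 0:
        raise ValueError("horizon must be a whole multiple of the step length")
    return horizon_minutes // step_minutes


def time_step_index(departure_minute: float, step_minutes: int) -> int:
    """0-based index of the time step containing a departure time (assumed convention)."""
    return int(departure_minute // step_minutes)


T = num_time_steps(24 * 60, 15)          # full day in 15-minute steps
step = time_step_index(7 * 60 + 20, 15)  # step containing a 7:20 AM departure
```

With these inputs, `T` is 96 and a 7:20 AM departure falls in step 29 (minutes 435 to 450).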

For trips in $D_{o,d,t}$, let $P_{o,d,t}$ represent the set of acyclic paths in the graph $G$ that connect $o$ and $d$ at time interval $t$. The path set is time dependent, as certain links may be out of service due to dynamic events such as earthquakes. The traffic flow on each path $p \in P_{o,d,t}$ is denoted by $f_{p,t}$, or $f_t$ for all path flows at a particular time step. Parallelization is accomplished by partitioning the travel demand (trips) across compute threads, where $D^{(i)}_t$ represents thread $i$'s partition of the corresponding demand. Furthermore, $f^{(i)}_t$ represents the flows on the paths that correspond to thread $i$'s partition, and $x^{(i)}_t$ represents the flows on all links that result from that partition.
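The partition-and-reduce pattern described here can be sketched as follows. This is a minimal illustration, not the paper's implementation: the round-robin split, the toy trips, and the helper names are assumptions, and the partitions are evaluated one after another rather than on real threads.

```python
from collections import Counter

# Hypothetical toy data: each trip is (path as a list of links, number of vehicles).
trips = [
    (["a", "b"], 10.0),
    (["a", "c"], 5.0),
    (["b"], 2.0),
    (["c", "d"], 4.0),
]


def partition(items, n_parts):
    """Round-robin split of the trip list into n_parts partitions (an assumed scheme)."""
    return [items[i::n_parts] for i in range(n_parts)]


def partial_link_flows(part):
    """Link flows produced by one partition of the trips."""
    flows = Counter()
    for path, volume in part:
        for link in path:
            flows[link] += volume
    return flows


# Each partition could be handled by a separate thread or process;
# here they are evaluated sequentially for clarity.
partials = [partial_link_flows(p) for p in partition(trips, 2)]

# Reduction step: the total link flow is the sum of the partial link flows.
total = Counter()
for part_flows in partials:
    total.update(part_flows)
```

Because link flows are additive over trips, the reduced result matches what a single worker would compute over the whole trip list.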

The QDTA framework proposes an efficient and parallelizable method to generate an optimized solution of $f_t$ according to dynamic extensions of widely used traffic assignment principles, such as Wardrop's equilibrium, as shown in Sections 3.2 and 3.3. The solution can be used to offer centralized, real-time routing guidance to improve traffic flow on a large network. Specifically, Wardrop's equilibrium has two variations, namely the user equilibrium (UE) solution and the system optimal (SO) solution (see literature review). In this paper, the dynamic extension of the UE solution is adopted as an example in the mathematical formulation of the QDTA framework, with the detailed formulation given in Equation (4) in Section 3.3.

| Symbol | Meaning |
| --- | --- |
| Network properties | |
| $G$ | Road network graph. |
| $V$, $v$ | Set of road network vertices, one vertex in the road network. |
| $E$, $e$ | Set of road network edges, one edge in the road network. |
| $c_e$ | Flow capacity of edge $e$. |
| $\tau^0$, $\tau^0_e$ | Free-flow travel time/cost of all edges, or a specific edge $e$. |
| $\tau_t$, $\tau_{e,t}$ | Time/cost of traversing all edges, or a specific edge $e$, at time $t$. |
| Travel demand | |
| $o$ | Trip origin. |
| $d$ | Trip destination. |
| $s$ | Intermediate stop location. |
| $D_{o,d,t}$ | Travel demand (trips) starting at node $o$, ending at node $d$, departing at time $t$. |
| $D_t$ | All trips traveling at time $t$. |
| $D^{\mathrm{orig}}_t$, $D^{\mathrm{res}}_t$ | All original, residual trips traveling at time $t$. |
| $D^{(i)}_t$, $D^{\mathrm{orig},(i)}_t$, $D^{\mathrm{res},(i)}_t$ | Thread $i$'s partition of corresponding trips at time $t$. |
| Time-step related variables | |
| $\{1, \dots, T\}$, $t$ | Time steps, one time step. |
| $T$ | Total number of time steps. |
| $\Delta t$ | Duration of each time step, e.g., 15 minutes. |
| $P_{o,d,t}$, $p$ | All acyclic paths in the road network that connect $o$ and $d$ at time $t$, a single path in the set. |
| $n_p$, $v^k_p$ | Total number of vertices along path $p$, the $k$-th vertex along path $p$. |
| $f_t$, $f_{p,t}$ | Traffic flow assigned to all paths, or path $p$, at time step $t$. |
| $x_t$, $x_{e,t}$ | Traffic flow assigned to all links, or link $e$, at time step $t$. |
| $f^{(i)}_t$, $x^{(i)}_t$ | Traffic flow assigned to paths corresponding to thread $i$'s partition of trips, and their resulting partial link flows. |

Table 2: Meanings of mathematical expressions

3.2 Residual demand and route truncation

In QDTA, travel demands are supplied in stages (time steps) according to the analysis period in which the departure times fall. For example, if the analysis time period begins at 7 AM with a time step interval $\Delta t$ of 15 minutes, a separate demand set will be used for 7:00 - 7:15 AM, 7:15 - 7:30 AM, and so on. The assignment framework presented in this paper does not constrain the duration of the time step. As a result, different time step lengths can be used, e.g., 5 minutes or 1 hour. However, a time step that is too long would approximate the STA solution and lack the desired temporal variations in the modeling results, while too short a time step usually means more computational effort and higher requirements regarding the temporal resolution of the demand inputs.

The complication of dividing the travel demand into smaller analysis periods (e.g., 15 minutes) is that some trips cannot finish within one assignment period. This is problematic, as established optimization algorithms for obtaining traffic assignment solutions (e.g., the Frank-Wolfe algorithm) require the stop positions of the trips to be specified. As a result, a modification is introduced to the STA analysis framework that aims to mitigate the inconsistency in the optimized routing caused by the uncertainty in the stop location after each analysis period. This modification involves a route truncation operation that is used repeatedly in the algorithm. In this section, the mathematical formulation of the route truncation operation is introduced. Its integration into the QDTA framework is given in Section 3.3.

The route truncation operation estimates the intermediate stop location $s$ that a trip in $D_{o,d,t}$ can reach in time $\Delta t$. For long trips with short time step intervals, $s$ is usually different from the trip destination $d$, and the remaining leg of the trip, i.e., from $s$ to $d$, will enter the next time step as the residual demand. The determination of $s$ relies on knowing the travel time (or other general cost) of each link, $\tau_{e,t}$, which is often a function of the link flow $x_{e,t}$. Since the flow and residual demand depend on each other, an iterative procedure is needed to obtain converged results.

The first set of equations maps the link flow $x_{e,t}$ to the link-level travel time $\tau_{e,t}$. The general form of the function, as shown in Equation (1a), satisfies both the monotonicity and the separability assumptions. Monotonicity reflects the fact that the more flow is assigned to a road, the longer it takes on average to traverse the link. The separability assumption states that the travel time on a link depends only on the current flow assignment on that link. The separability assumption holds if the traffic flow is in a steady state, but is not valid in the case of congestion spill-back. A typical choice of travel time function that satisfies both assumptions is the well-known Bureau of Public Roads (BPR) curve [60], as shown in Equation (1b). In Equation (1b), $\tau_{e,t}$ is the travel time on link $e$ at time step $t$; $\tau^0_e$ and $c_e$ are the free-flow travel time and capacity associated with the link; $\alpha$ and $\beta$ are calibration parameters. $x_{e,t}$, as stated before, is the link-level traffic flow at time step $t$.

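\begin{align}
\text{General form:} \quad & \tau_{e,t} = \tau_e\left(x_{e,t}\right) \tag{1a} \\
\text{Example instantiation:} \quad & \tau_{e,t} = \tau^0_e \left[ 1 + \alpha \left( \frac{x_{e,t}}{c_e} \right)^{\beta} \right] \tag{1b}
\end{align}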
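A minimal sketch of the BPR curve in Equation (1b) is given below. The default values `alpha=0.15` and `beta=4` are the commonly quoted BPR defaults and are an assumption here, since the text treats both as calibration parameters; flow and capacity are assumed to be expressed in the same units.

```python
def bpr_travel_time(flow, free_flow_time, capacity, alpha=0.15, beta=4.0):
    """Link travel time from the BPR curve.

    alpha and beta are calibration parameters; the defaults are the commonly
    quoted values and are an assumption of this sketch.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + alpha * (flow / capacity) ** beta)


empty = bpr_travel_time(0.0, 60.0, 1000.0)          # free-flow travel time
at_capacity = bpr_travel_time(1000.0, 60.0, 1000.0)  # 15 percent slower than free flow
```

The function is monotonically increasing in the flow and depends only on the flow of the link itself, which matches the monotonicity and separability assumptions stated above.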

Based on the link-level travel time $\tau_{e,t}$, the stop location $s$ can be determined using Equation (2). Here $p$ is the path of the trip to its destination $d$. If the time step duration $\Delta t$ is at least the total time required to traverse $p$ (first row of Equation (2)), the intermediate stop location $s$ is the trip destination $d$. If this condition cannot be met, the stop location is a node along the path $p$. Let $n_p$ denote the total number of nodes along path $p$, and $v^k_p$ the $k$-th node on the path; the second row of Equation (2) then gives the furthest node that can be reached within a duration of $\Delta t$.

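\begin{equation}
s =
\begin{cases}
d, & \text{if } \sum_{j=1}^{n_p - 1} \tau_{(v^j_p, v^{j+1}_p),t} \le \Delta t, \\
v^{k^*}_p \text{ with } k^* = \max\Big\{ k : \sum_{j=1}^{k-1} \tau_{(v^j_p, v^{j+1}_p),t} \le \Delta t \Big\}, & \text{otherwise.}
\end{cases}
\tag{2}
\end{equation}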

Thus, for a trip in $D_{o,d,t}$ with route $p$, only the first part of the trip, up to vertex $s$, contributes to the path/link flow in the time interval $t$, while the second part of the trip is added to the next time step as the carry-over demand $D^{\mathrm{res}}_{s,d,t+1}$.
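The truncation rule can be sketched as a single pass along the path that accumulates link travel times until the time step length is exceeded. The function name, the node-list path representation, and the dictionary of link times are assumptions made for this example.

```python
def truncate_path(path, link_time, dt):
    """Split a path at the furthest node reachable within dt.

    path: list of nodes; link_time: {(u, v): current travel time}.
    Returns (stop index, part travelled in this step, part left for later steps).
    """
    elapsed = 0.0
    stop = 0  # index of the furthest node reachable within dt
    for k in range(len(path) - 1):
        elapsed += link_time[(path[k], path[k + 1])]
        if elapsed > dt:
            break
        stop = k + 1
    return stop, path[: stop + 1], path[stop:]


times = {("A", "B"): 6.0, ("B", "C"): 7.0, ("C", "D"): 5.0}
result = truncate_path(["A", "B", "C", "D"], times, 15.0)
```

In this toy case the trip reaches node C after 13 minutes but cannot complete the last link within 15 minutes, so the leg from C to D becomes residual demand. When the step length is shorter than the first link's travel time, the stop index stays at 0 and the whole trip is carried over.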

3.3 Full formulation

In this section, the temporal update steps to obtain the path/link traffic flow assignment are presented. At each time step, the travel demand $D_t$ consists of two parts: the original trips $D^{\mathrm{orig}}_t$, whose starting time is $t$, and the residual demand trips $D^{\mathrm{res}}_t$, which started before $t$ but did not reach their destinations in previous time steps. This applies to all time steps except the first time step $t = 1$, where there is no residual demand. Let $\mathcal{F}$ be a function that maps the travel demand to path flows:

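\begin{equation}
f_t = \mathcal{F}\left(D_t\right), \qquad D_t = D^{\mathrm{orig}}_t \cup D^{\mathrm{res}}_t
\tag{3}
\end{equation}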

Equation (4) is an instantiation of $\mathcal{F}$ that finds an optimum assignment of path flows $f_t$ satisfying the UE condition, with proof in [60].

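\begin{align}
\min_{f_t} \quad & \sum_{e \in E} \int_0^{x_{e,t}} \tau_e(\omega)\, d\omega \tag{4a} \\
\text{subject to} \quad & \sum_{p \in P_{o,d,t}} f_{p,t} = D_{o,d,t}, \quad \forall\, o, d \tag{4b} \\
& f_{p,t} \ge 0, \quad \forall\, p \in P_{o,d,t},\ \forall\, o, d \tag{4c} \\
& x_{e,t} = \sum_{o,d} \sum_{p \in P_{o,d,t}} f_{p,t}\, \delta_{e,p,t}, \quad \forall\, e \in E \tag{4d}
\end{align}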

$\delta_{e,p,t}$ is the incidence matrix, whose value is 1 if the link $e$ is on the path $p$ before the intermediate stop $s$, and 0 otherwise:

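\begin{equation}
\delta_{e,p,t} =
\begin{cases}
1, & \text{if link } e \text{ is on path } p \text{ before the intermediate stop } s, \\
0, & \text{otherwise.}
\end{cases}
\tag{5}
\end{equation}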

Equation (4) is essentially a dynamic extension of the static UE formulation in [60], with the static travel demand, link flow, and path flow replaced by the time-dependent $D_t$, $x_t$, and $f_t$. Specifically, in the original static formulation, the path-link incidence matrix depends only on whether a link is on a path, whereas in the quasi-dynamic formulation a link on a path may not be traversed in the current time step. As a result, the path flow in QDTA is first truncated at the stop location $s$ using Equation (2), and then mapped to the link flow using Equations (4d)-(5). Given that $x_t$ is used in the determination of $s$, an iterative process is needed to reach convergence of these quantities.
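To make the truncated path-to-link mapping concrete, the sketch below accumulates link flows only over the links that lie before each path's stop node, and evaluates a Beckmann-type objective for the BPR curve using its closed-form integral. The data layout, function names, and default BPR parameters are assumptions for illustration, not the paper's implementation.

```python
def link_flows(path_flows, stop_index):
    """Map path flows to link flows, counting only links before each path's stop node.

    path_flows: {path (tuple of nodes): flow}
    stop_index: {path: index of the intermediate stop node on that path}
    """
    x = {}
    for path, flow in path_flows.items():
        s = stop_index[path]
        for k in range(s):  # links (path[k], path[k + 1]) with k < s are traversed
            link = (path[k], path[k + 1])
            x[link] = x.get(link, 0.0) + flow
    return x


def beckmann_objective(x, t0, cap, alpha=0.15, beta=4.0):
    """Sum over links of the integral of the BPR curve from 0 to the link flow.

    Closed form per link: t0 * (x + alpha * x**(beta + 1) / ((beta + 1) * cap**beta)).
    """
    total = 0.0
    for link, flow in x.items():
        total += t0[link] * (
            flow + alpha * flow ** (beta + 1) / ((beta + 1) * cap[link] ** beta)
        )
    return total


# Two paths share link (A, B); the longer path is truncated at B in this step.
flows = {("A", "B", "C"): 10.0, ("A", "B"): 5.0}
stops = {("A", "B", "C"): 1, ("A", "B"): 1}
x = link_flows(flows, stops)
```

Here link (B, C) receives no flow in the current step because the longer path is truncated at B, which is the difference from the static incidence matrix.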

3.4 Discussions of the QDTA framework

Some further discussions are provided regarding the proposed QDTA framework. First, the choice of the time step length, $\Delta t$, reflects a trade-off among several factors. If the time step length is too long, the solution will approximate the STA solution and fail to capture the changing temporal traffic dynamics. However, if the time step length is too short, it will not only increase the computational time significantly, but also lead to less accurate results as microscopic, sub-link dynamics start to show. For example, consider an extreme scenario where $\Delta t$ is shorter than the free-flow time of the road link, $\tau^0_e$: Equation (2) indicates that the intermediate stop node is always the origin node. In other words, the trips can never propagate through the link. This limitation can potentially be overcome by using alternative formulations of Equation (2) that track the distance traveled by the traffic flow on a link and accumulate it across time steps if the traffic flow cannot be propagated to the next link.

A key feature of the proposed QDTA framework is the inclusion of in-flight rerouting. At every time step $t$, the routing is updated to reflect the current traffic conditions and potential changes in the network (e.g., road closures due to earthquake damage). The route assignment is obtained through an iterative process between the path flow, $f_t$, and the path truncation, and reflects the optimum (e.g., UE) traffic distribution in each time step. In-flight rerouting is not considered in some previous DTA algorithms, but it is particularly relevant in the era of navigation applications that dynamically update routes for users, and should therefore be included in traffic modeling and predictions.

4 Algorithmic Representations

The mathematical formulation of the QDTA framework in Section 3 is solved using a number of scalable algorithms, whose pseudocode is presented in this section. At the outermost level, the QDTA implementation loops through each time step (Algorithm 1). Inside each time step, an approximate STA-based flow solution is first obtained using the Frank-Wolfe algorithm (Algorithm 2, Algorithm 3). Note that path truncation is performed in each iterative step of the Frank-Wolfe algorithm (Algorithm 3). This ensures that, for each path, only the approximate portion traversed within the current time segment is considered, so as to prevent erroneous congestion effects from traffic that occurs outside the current time segment. Based on the converged solution from Algorithm 2, a last round of route truncation is performed to obtain a more accurate flow assignment and the residual demand to be carried over to the next time step (Algorithm 4).

Data: Network graph
   Time step length
   Total time step count
   Original travel demand of all time steps
Initialize the residual demand as an empty nested associative array;   // No residual demand in the first time step
for each time step, in sequence do   // Sequential discrete time step simulation
   demand ← original demand of the time step + residual demand;   // Get original and residual demand
   path flow ← Traffic_assignment(network graph, time step length, demand);   // Path flow assignment using STA approximation
   path flow, residual demand ← Residual_demand(network graph, time step length, path flow);   // Truncate path flows and get residual demand based on time step length
end for
Result: Path flow for all time steps
Algorithm 1 Quasi-dynamic traffic assignment
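Data: Network graph
   Time step length
   Travel demand of current step
   Free-flow travel time of each link
   Frank-Wolfe iteration maximum steps
path flow ← All_or_nothing(network graph, time step length, demand, free-flow travel time);   // Set initial path flow in the free-flow condition
for each Frank-Wolfe iteration, up to the maximum number of steps do   // Gradient descent step
   edge travel time ← link travel time under the current path flow;   // Calculate the edge travel time
   all-or-nothing path flow ← All_or_nothing(network graph, time step length, demand, edge travel time);   // All-or-nothing path flow with new edge weights
   step size ← minimizer of Cost_function between the current and all-or-nothing path flows;   // Line search
   path flow ← path flow moved toward the all-or-nothing path flow by the step size;   // Update path flow
   if Converged() then
      break
   end if
end for
Result: Path flow for the current time step using STA approximation
Algorithm 2 Traffic_assignment: using the Frank-Wolfe algorithm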
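To make the loop in Algorithm 2 concrete, the following is a minimal sketch on a toy network with two parallel links between one origin and one destination. It is not the authors' implementation: the network data, the BPR parameters (0.15 and 4), the use of the standard Beckmann objective for user equilibrium, and the bisection line search are all assumptions made for illustration, and path truncation is omitted.

```python
# Toy Frank-Wolfe user-equilibrium assignment on two parallel links.
# All numbers and the BPR parameters (0.15, 4) are illustrative assumptions.

FREE_FLOW = [10.0, 15.0]   # free-flow travel times in minutes (hypothetical)
CAPACITY = [100.0, 100.0]  # link capacities in vehicles (hypothetical)
DEMAND = 200.0             # vehicles travelling from origin to destination


def link_times(flows):
    """BPR travel time on each link for the given link flows."""
    return [t0 * (1.0 + 0.15 * (x / c) ** 4)
            for t0, c, x in zip(FREE_FLOW, CAPACITY, flows)]


def all_or_nothing(times):
    """Put the whole demand on the currently fastest link."""
    best = times.index(min(times))
    return [DEMAND if i == best else 0.0 for i in range(len(times))]


def line_search(flows, target, iterations=60):
    """Bisection on the slope of the Beckmann objective along flows -> target."""
    direction = [y - x for x, y in zip(flows, target)]

    def slope(alpha):
        moved = [x + alpha * d for x, d in zip(flows, direction)]
        return sum(t * d for t, d in zip(link_times(moved), direction))

    if slope(1.0) <= 0.0:      # objective still decreasing at the full step
        return 1.0
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if slope(mid) > 0.0:
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)


def frank_wolfe(max_steps=50, tolerance=1e-9):
    """Iterate all-or-nothing, line search and flow update until flows settle."""
    flows = all_or_nothing(FREE_FLOW)            # initial free-flow assignment
    for _ in range(max_steps):
        target = all_or_nothing(link_times(flows))
        alpha = line_search(flows, target)
        updated = [x + alpha * (y - x) for x, y in zip(flows, target)]
        change = max(abs(u - x) for u, x in zip(updated, flows))
        flows = updated
        if change < tolerance:
            break
    return flows
```

At the returned flows both links carry traffic and have (numerically) equal travel times, which is the user-equilibrium condition for this toy case.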
Data: Network graph
   Time step length
   Travel demand of current step
   Edge travel time/cost
Initialize the path flow vector as an empty associative array;
for each origin-destination pair in the travel demand do
   path ← Get_shortest_path(network graph, edge cost, origin, destination);   // Get shortest path to destination
   truncated path ← Truncate_path(path, time step length, edge cost);   // Truncate path according to time step length
   add the trips of the origin-destination pair to the path flow of the truncated path;   // Add trips to the path flow vector
end for
Result: Path flow for the current time step from all-or-nothing assignment
Algorithm 3 All_or_nothing: iterative step of the Frank-Wolfe algorithm
Data: Network graph
   Time step length
   Intermediate path flow results from the STA approximation
   All paths used in the time step
Initialize the residual demand as an empty nested associative array;   // Will be added to the demand of the next step
Initialize the truncated path flow vector as an empty associative array;
edge travel time ← travel time based on path/link flow from the STA approximation;
for each path used in the time step do   // Parallelizable for loop
   truncated path ← Truncate_path(path, time step length, edge travel time);   // Truncate path to what can be traversed in the time step
   assign the flow of the path to the truncated path;   // Populate the path flow vector
   if the truncated path ends before the destination of the path then   // Flow has not reached its destination within the time slice
      new origin ← Get_last_vertex(truncated path);
      destination ← Get_last_vertex(path);
      add the flow of the path to the residual demand from the new origin to the destination;   // Add to residual demand
   end if
end for
Result: Path flow for the current time step using QDTA, and residual demand to be added to the next time step
Algorithm 4 Residual_demand: trips that cannot finish in one assignment interval

The repeated use of route truncation handles traffic that is resident in the network for longer than the QDTA time segment by re-assigning the residual traffic to the next time segment. The demand entering the network in the next time interval is combined with the residual demand and introduced into the following time segment’s traffic assignment. This process is repeated until the final time of the simulation is reached. The model can be interpreted as a time-expanded network approach where the expansion of the network represents the evolution with respect to time.

In the algorithms, the following operations are used but not explicitly defined:

  • Get_shortest_path returns the shortest path between the origin $p$ and the destination $q$ given the road network graph $G(\mathcal{V}, \mathcal{A})$ with cost $\mathbf{c} = \{c_a\}$, with $a \in \mathcal{A}$.
  • Truncate_path returns the sub-path of a path $r$ that can be traversed in time $T$ given the edge costs $\mathbf{c}$.
  • Get_last_vertex returns the last vertex of a path $r$.
  • Cost_function is the function to be minimized to obtain the UE traffic flow, with one instantiation shown in the path flow formulation of the UE problem.
  • Converged checks the convergence of the Frank-Wolfe algorithm. The algorithm is considered to have converged if the relative absolute change in total system travel time is less than 0.1%.
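The truncation and residual operations can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's code: paths are assumed to be tuples of vertex identifiers, edge costs are assumed to be a dictionary keyed by (tail, head) pairs, and time spent part-way along a link at the end of the step is simply dropped, which mirrors the node-based truncation described in the text.

```python
def truncate_path(path, edge_cost, time_budget):
    """Return the prefix of `path` whose links can be fully traversed in `time_budget`."""
    reached = [path[0]]
    elapsed = 0.0
    for tail, head in zip(path, path[1:]):
        elapsed += edge_cost[(tail, head)]
        if elapsed > time_budget:
            break                      # this link cannot be completed in the step
        reached.append(head)
    return tuple(reached)


def get_last_vertex(path):
    """Return the last vertex of a path."""
    return path[-1]


def residual_demand(path_flows, edge_cost, time_budget):
    """Split path flows into truncated path flows and residual demand (as in Algorithm 4)."""
    truncated_flows, residual = {}, {}
    for path, flow in path_flows.items():
        sub_path = truncate_path(path, edge_cost, time_budget)
        truncated_flows[sub_path] = truncated_flows.get(sub_path, 0.0) + flow
        if get_last_vertex(sub_path) != get_last_vertex(path):
            # Trip is still in transit: it re-enters the next step from where it stopped.
            od_pair = (get_last_vertex(sub_path), get_last_vertex(path))
            residual[od_pair] = residual.get(od_pair, 0.0) + flow
    return truncated_flows, residual
```

For a serial path of four 6-minute links and a 15-minute step, the flow is truncated after the second link and the remainder becomes demand from that node to the original destination in the next step. With a 5-minute step the truncated path contains only the origin, which is the degenerate case discussed earlier for time steps shorter than a link's free-flow time.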
High-performance computational solutions for QDTA
Our overall goal is to reduce the compute time for modeling urban mobility so that city planners can investigate a large number of scenarios in a reasonably short period of time, while still preserving the fidelity of the inherent traffic dynamics. The scope of the models considered in our work is full urban networks that include all functional road classes and full urban-scale travel demand. We are currently working with models for two major urban regions: the San Francisco Bay Area and the Los Angeles Area. The road network for the San Francisco Bay Area is shown in Fig. 1(a) and accounts for 0.5M nodes and 1M links [61]. We use the SFCTA CHAMP 6 model [62] for the Bay Area, which accounts for 19 million trips during a 24-hour period. We have also successfully applied this computational framework to the Los Angeles road network, with 1M nodes, 2M links, and 40M trips in a 24-hour period. For brevity, we present results for the Bay Area only.

    (a) Bay Area Network 1M Links
    Figure 1: Road Network for the SF Bay Area

    Figure 2 shows the computational flow of our implementation of QDTA. Mobiliti [63], our computational platform in which the QDTA is integrated, also provides parallel discrete event traffic simulations (PDES) for urban-scale regions. The travel demand representation and road network graphs are shared between both QDTA and the PDES simulation capability. As will be discussed in Section 4.1, the core algorithms described in Section 4 are parallelized and implemented within this computational flow. Specifically, we have developed an interval-based dynamic traffic assignment solver with residual carry-over where each interval is solved using a parallelized Frank-Wolfe algorithm to minimize the desired cost function (e.g. user-equilibrium, social optimum or fuel-efficient routing). This solution for traffic assignment is sufficient for optimizing convex objective functions. We are able to achieve high computation performance through parallelization of many parts of the QDTA algorithm with both distributed memory (multi-node) and shared memory (multi-thread) computation.

    Figure 2: Computational Flow for QDTA : blue box indicates the parts which have been parallelized. This flow diagram corresponds to the pseudo-algorithm described in Algorithm 3.

    4.1 Description of QDTA Algorithm Parallelization

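Data: Network graph
   Time step length
   Total time step count
   Original travel demand of all time steps
Initialize the residual demand as an empty nested associative array;   // No residual demand in the first time step
for each time step, in sequence do
   Let the local original demand be this thread’s partition of the original demand of the time step;
   local demand ← local original demand + local residual demand;   // Combine original and residual demand
   local path flow, global link flow ← Parallel_traffic_assignment(network graph, time step length, local demand);
   local path flow, local residual demand ← Parallel_residual_demand(network graph, time step length, local path flow);
end for
Result: Distributed path flows for all time steps
Algorithm 5 Parallel quasi-dynamic traffic assignment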

Algorithm 5 describes the distributed-memory parallel QDTA algorithm. When parallelizing any algorithm, one of the design choices is how to partition both the data and the computational work across the available compute resources. This choice may differ depending on the algorithmic step being parallelized (see Figure 2). Due to the high computational cost of computing shortest-path routes over a large network, the most critical step to optimize for performance is the routing of all active vehicles in the current time segment (Algorithm 3). To parallelize the routing step effectively, and to distribute storage and management of the resulting routes, each thread is assigned a subset of the demand; it computes the corresponding routes and flows, manages the corresponding residual demand, and finally stores its results to disk at program completion.

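Data: Network graph
   Time step length
   Local partition of travel demand of current step
   Free-flow travel time of each link
   Frank-Wolfe iteration maximum steps
local path flow, global link flow ← Parallel_all_or_nothing(network graph, time step length, local demand, free-flow travel time);   // Set initial path flow
for each Frank-Wolfe iteration, up to the maximum number of steps do   // Gradient descent step
   edge travel time ← link travel time under the current global link flow;   // Calculate the edge travel time
   local all-or-nothing path flow, global all-or-nothing link flow ← Parallel_all_or_nothing(network graph, time step length, local demand, edge travel time);
   step size ← Parallel_line_search(network graph, global link flow, global all-or-nothing link flow, current iteration);
   global link flow ← global link flow moved toward the all-or-nothing link flow by the step size;   // Update link flows
   if Relative_change_in_cost() < 0.0001 then
      break
   end if
   local path flow ← local path flow moved toward the local all-or-nothing path flow by the step size;   // Update path flows
end for
Result: Path flows (local) and link flows (global)
Algorithm 6 Parallel_traffic_assignment: using the Frank-Wolfe algorithm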
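Data: Network graph
   Time step length
   Travel demand of current step
   Edge travel time/cost
Pre-process (customize) network with updated link weights;
Initialize the local path flow vector as an empty associative array;
for each origin-destination pair in this thread’s partition of the demand do
   path ← Get_shortest_path(network graph, edge cost, origin, destination);   // Get shortest path
   truncated path ← Truncate_path(path, time step length, edge cost);   // Truncate path according to time step length
   add the trips of the origin-destination pair to the local path flow of the truncated path;   // Add trips to the local path flow vector
end for
local link flow ← link flow induced by the local path flow;   // Compute local link flows from local path flows
global link flow ← Global all-reduce of the local link flow;   // Compute global link flows from local link flows
Result: Path flows (local) and link flows (global)
Algorithm 7 Parallel_all_or_nothing: iterative step of the Frank-Wolfe algorithm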

Algorithm 6 describes the parallelized Frank-Wolfe algorithm that assigns traffic within each time segment. The first step in this algorithm is the parallel all-or-nothing routing step described in Algorithm 7. Note that each thread requires full knowledge of the network’s current link weights to be able to route its subset of trip legs. An important optimization we made for the routing step is the use of a multi-phase routing algorithm, in which the network is first pre-processed with connectivity and weight information so that subsequent routing queries can be computed very efficiently and in parallel. To this end, we leverage Customizable Contraction Hierarchies [64], which further split the preprocessing phase in two, allowing the weights of the existing links in the network to be updated with less computation time than a full topological update (where the connectivity of the links may also change). In Algorithm 7, the preprocessing is done first so that all threads can subsequently use the pre-processed network with updated weights to compute their assigned routes.

After the parallel routing and truncation step (the for loop in Algorithm 7), each process’s memory contains only the routes computed by the threads local to that process (implicit in the local path flows). However, in order to proceed to the next algorithmic step, the impact of all routes globally must be taken into consideration and reflected in the network flow data in every process. A key observation is that we do not need to globally broadcast the actual routes computed for every vehicle trip leg, as this would be a very large amount of data to communicate. Instead, the thread-local route flows are reduced to thread-local link flows in parallel, and the local link flows are then globally all-reduced to calculate the total link flows that result from all vehicle trip legs. In this way, we avoid communicating any route information between parallel processes and exchange only the resultant flows on the links themselves.
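The two-stage reduction described here can be illustrated with a small serial sketch. The workers are simulated as a list of dictionaries, and `all_reduce_sum` is a stand-in written for this example; a production code would presumably hold link flows in a dense array indexed by link and use an MPI-style all-reduce instead.

```python
def local_link_flows(path_flows):
    """Reduce one worker's path flows to flows on the individual links."""
    flows = {}
    for path, flow in path_flows.items():
        for link in zip(path, path[1:]):          # consecutive vertex pairs
            flows[link] = flows.get(link, 0.0) + flow
    return flows


def all_reduce_sum(per_worker_flows):
    """Simulated all-reduce: every worker receives the same summed link flows."""
    total = {}
    for flows in per_worker_flows:
        for link, flow in flows.items():
            total[link] = total.get(link, 0.0) + flow
    return [dict(total) for _ in per_worker_flows]


# Hypothetical path flows held by two workers; routes never leave their owner.
worker_a = {("A", "B", "C"): 10.0}
worker_b = {("A", "B"): 5.0, ("B", "C", "D"): 2.0}
global_flows = all_reduce_sum([local_link_flows(worker_a),
                               local_link_flows(worker_b)])
```

Only per-link totals cross worker boundaries, so the communicated volume scales with the number of links rather than with the number of routed trip legs.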

The next stage in Algorithm 6 after the all-or-nothing calculation is the line search to select the optimal step size. The parallelized algorithm selects the step size using the link flows directly instead of the path flows (as was done in Algorithm 2); the two are equivalent because multiplication with the incidence matrix distributes in the cost function expression, i.e.:

    (6)
    (7)
    (8)

In Equation 8, the link cost function maps the flow on a link to its per-vehicle cost (i.e., the link traversal time given by a BPR function). Thus, Equation 8 computes the total system cost, where the per-vehicle costs are multiplied by the link flows and then summed across the whole network. Furthermore, the Frank-Wolfe algorithm is considered converged when the relative change in total system cost falls below 0.001:

    (9)
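As a small illustration of the total system cost and the relative-change test described above, the sketch below uses the common BPR form with parameters 0.15 and 4. These parameter values, the function names, and the choice of the previous cost as the denominator of the relative change are assumptions made for this example, not values taken from the paper.

```python
def bpr_time(flow, free_flow_time, capacity, a=0.15, b=4.0):
    """Per-vehicle link traversal time from a BPR-style function (assumed parameters)."""
    return free_flow_time * (1.0 + a * (flow / capacity) ** b)


def total_system_cost(flows, free_flow_times, capacities):
    """Sum over links of link flow times per-vehicle cost."""
    return sum(x * bpr_time(x, t0, c)
               for x, t0, c in zip(flows, free_flow_times, capacities))


def converged(previous_cost, current_cost, threshold=0.001):
    """Relative change in total system cost between successive iterations."""
    return abs(current_cost - previous_cost) / previous_cost < threshold
```

For example, a single link at capacity with a 10-minute free-flow time and 100 vehicles contributes 100 x 11.5 = 1150 vehicle-minutes to the total.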

If evaluated sequentially, the line search is a very computationally expensive step, since it requires an iterative search in which the cost function is evaluated many times for candidate values of the step size until the minimum is found. Furthermore, each single evaluation of the cost function requires summing the cost contribution of every link in the network, which is substantial for large networks with millions of links. As a result, many computational traffic assignment implementations simply use the method of successive averages (MSA) [65] as a heuristic to select the gradient descent step size and avoid this high computational cost. However, MSA has a substantial drawback: taking sub-optimal step sizes in each gradient descent iteration means that more iterations are required overall for the traffic assignment to converge. We have found that by selecting the optimal step size for each iteration, the number of gradient descent iterations may be reduced significantly. Therefore, computing the line search efficiently through parallelization is a key capability of our approach.

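Data: Network graph
   Current link flows
   All-or-nothing link flows
   Current Frank-Wolfe gradient descent iteration
   Line search iteration maximum steps
   Convergence threshold (slope)
   Convergence threshold (step size)
Initialize the step size;   // Initial guess
for each line search iteration, up to the maximum number of steps do   // Newton line search iteration
   cost, first derivative, second derivative ← Parallel_cost_function(network graph, step size, current link flows, all-or-nothing link flows);
   if the first derivative is within the slope threshold then   // Threshold 1
      accept the current step size;
      break
   end if
   new step size ← Newton update of the step size from the first and second derivatives;
   Enforce the bounds on the new step size;
   if the change in step size is within the step size threshold then   // Threshold 2
      accept the new step size;
      break
   end if
end for
Result: Optimal step size
Algorithm 8 Parallel_line_search: Newton’s Line Search for the optimal step size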

Algorithm 8 describes our parallel line search algorithm, which is the implementation of Equation 7. For brevity, we introduce a short-hand function that is equivalent to the cost function evaluated at a given step size, with the current link flows and the all-or-nothing link flows as implicit arguments. The algorithm uses Newton’s method to identify the step size at which this function is minimized (i.e., where its derivative equals zero). Thus, at each step of Newton’s method, we require approximations of the cost function’s first two derivatives at the current guess. These approximations are calculated using the finite difference method by evaluating the function at three nearby step-size values determined by a small nominal offset. Then:

    (10)

The two convergence thresholds in the algorithm (on the slope and on the step size) are tunable parameters that control the quality of convergence. The values used in our experiments were chosen to ensure good convergence of the line search. Figure 3 shows the average and maximum number of Newton iterations required to conduct the line search for all gradient descent iterations taken within each time segment. The average number of line search iterations remains consistently between 1 and 4, while the maximum is as high as 8 for the most highly congested time segments.
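A serial sketch of a Newton line search with finite-difference derivatives is given below. The central-difference stencil, the offset `h`, the default thresholds, the initial guess and the clamping of the step size to the interval [0, 1] are assumptions made for this example; the paper's exact finite-difference formulas and parameter values are not reproduced here.

```python
def newton_line_search(cost, alpha=0.5, h=1e-3, slope_tol=1e-6,
                       step_tol=1e-6, max_steps=20):
    """Find the step size in [0, 1] that minimizes `cost` using Newton's method."""
    for _ in range(max_steps):
        below, here, above = cost(alpha - h), cost(alpha), cost(alpha + h)
        first = (above - below) / (2.0 * h)              # central first derivative
        second = (above - 2.0 * here + below) / (h * h)  # central second derivative
        if abs(first) < slope_tol:       # threshold on the slope
            break
        if second <= 0.0:                # not locally convex: stop rather than diverge
            break
        new_alpha = min(1.0, max(0.0, alpha - first / second))
        moved = abs(new_alpha - alpha)
        alpha = new_alpha
        if moved < step_tol:             # threshold on the step size change
            break
    return alpha
```

On a quadratic cost the search lands on the minimizer after a single Newton update, which is consistent with the small iteration counts reported for lightly congested time segments; a minimizer outside [0, 1] is clamped to the nearest bound.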

    Figure 3: Maximum and average number of Newton iterations to conduct the line search over all gradient descent iterations within each time segment for the San Francisco Bay Area network model.
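Data: Network graph
   Step size
   Current link flows
   All-or-nothing link flows
Initialize the local partial cost values to zero;
Let the local links be this thread’s partition of the network links;
Let the short-hand function denote Cost_function evaluated at a given step size;
for each local link do
   add the cost contribution of the link to each of the three partial cost values;
end for
Global all-reduce of the three partial cost values;   // Reduce values simultaneously
first derivative ← finite-difference estimate from the reduced cost values;
second derivative ← finite-difference estimate from the reduced cost values;
Result: Cost function value and its first two derivatives
Algorithm 9 Parallel_cost_function: evaluate the cost function and its first two derivatives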

The key to computing the line search efficiently is evaluating the cost function at all three step-size values in parallel and as a batch. Since the total cost function is the sum of partial cost contributions from each link in the network (see Equation 8), parallelization is achieved by partitioning the links in the network across the available threads. Algorithm 9 describes our approach to evaluating the cost function and its derivatives in parallel. For each set of three cost function evaluations, each thread computes its local contributions to the three cost values for the subset of links assigned to that thread. The resulting local cost contributions are then globally all-reduced across threads to obtain the total cost function values. To avoid the communication overhead of three separate global reductions, our implementation reduces all three values simultaneously with a single vectorized all-reduce operation. The derivatives are then estimated using the approximations in Equation 10. Due to the efficient parallel evaluation of the cost function derivatives, we observed that even for the cases with the highest number of Newton iterations (see Figure 3), the line search completes in less than 100 milliseconds using 512 cores of the Cori supercomputer (see Section 4.2 for details on Cori).
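The batched evaluation can be sketched serially as follows. The link data, the BPR parameters, the central-difference stencil and the helper names are hypothetical; worker partitions are simulated as lists of link identifiers, and `batched_all_reduce` stands in for a single vectorized all-reduce over three values.

```python
# Hypothetical link data: link id -> (free-flow time in minutes, capacity in vehicles).
LINKS = {"a": (10.0, 100.0), "b": (15.0, 100.0), "c": (5.0, 50.0)}


def link_cost(link, flow):
    """Cost contribution of one link: flow times BPR travel time (assumed 0.15, 4)."""
    free_flow_time, capacity = LINKS[link]
    return flow * free_flow_time * (1.0 + 0.15 * (flow / capacity) ** 4)


def partial_costs(link_subset, current, target, alphas):
    """One worker's contribution to the cost at each trial step size."""
    sums = [0.0] * len(alphas)
    for link in link_subset:
        for i, alpha in enumerate(alphas):
            flow = current[link] + alpha * (target[link] - current[link])
            sums[i] += link_cost(link, flow)
    return sums


def batched_all_reduce(partials):
    """Stand-in for one vectorized all-reduce: element-wise sum over workers."""
    return [sum(values) for values in zip(*partials)]


def parallel_cost_function(partitions, current, target, alpha, h=1e-3):
    """Return the cost at `alpha` and finite-difference first and second derivatives."""
    alphas = (alpha - h, alpha, alpha + h)
    partials = [partial_costs(part, current, target, alphas) for part in partitions]
    below, here, above = batched_all_reduce(partials)   # one reduction, three values
    first = (above - below) / (2.0 * h)
    second = (above - 2.0 * here + below) / (h * h)
    return here, first, second
```

Because each link's contribution is independent, the result does not depend on how the links are partitioned, up to floating-point summation order.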

The parallelized line search significantly improves the computational performance of the overall QDTA compared to using the method of successive averages (MSA). Figures 4(a) and 4(b) show the impact of using a line search versus MSA on the number of gradient descent iterations and the total time to solution for each time segment in our experiment on the San Francisco Bay Area network (see Section 4.2 for details). The line search method reduces the number of gradient descent iterations by up to 73 percent, and the total compute times are highly correlated with the number of gradient descent iterations, primarily because of the expensive all-or-nothing routing step required in each iteration. By finding the optimal step size for each iteration, the number of iterations required to converge is significantly reduced, resulting in a reduction of more than 49 percent in the total execution time compared to using the method of successive averages to select the step size.

    (a)
    (b)
Figure 4: Comparison of a) the number of gradient descent iterations and b) the total segment compute time required using the method of successive averages (MSA, blue circles) versus using an optimal step size via Newton’s method line search (orange squares). Data is for the San Francisco Bay Area network model, and execution times are measured when running on the Cori computer with 512 cores.

Once the line search has converged and the optimal step size has been identified, the link weights for the network must be updated in every process. This step is parallelized across threads by assigning each thread a partition of the links, for which it updates the flow values and computes the new weights. Because each process must update all of the links in the network for the subsequent routing step to perform correctly, the degree of parallelism in this step is reduced to the number of threads within each process (as opposed to across all processes). This step corresponds to the upper blue boxes in Figure 2.

Finally, after the Frank-Wolfe algorithm has converged, the residual demand allocation must be performed to forward residual vehicles into the next QDTA time segment (Algorithm 4). The parallelization strategy for this algorithm partitions the set of active vehicle trip legs across threads in the same manner as Algorithm 7. Each thread iterates over the subset of trip legs assigned to it (implicit in its local partition of the demand) and forwards the vehicles that are still in transit at the end of the current time segment into the next time segment. Forwarded trip legs are guaranteed to be assigned to the same process in the next time segment, so that the route incrementally assigned to each vehicle is maintained by a single process owner. Once the residual demand is computed, the non-residual demand in the next time segment is identified and combined with the residual demand in parallel, and the Frank-Wolfe optimization for the next time segment begins. As we have described, the majority of the algorithms required for the QDTA methodology (as shown in Figure 2) have been parallelized to achieve high performance on distributed-memory computer platforms.

    4.2 Evaluation of computational performance and parallel scalability

We use high performance computing to address the computational challenges of traffic assignment at urban scale. The solutions are implemented on the Cori supercomputer, a Cray XC40 at the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory. In order to evaluate the strong scaling performance (the improvement in program execution time for a fixed input as the number of cores is increased), we ran the QDTA algorithm for the San Francisco Bay Area network, utilizing up to 32 nodes of Cori with 1,024 cores in total (2 processes per node, 16 cores per process, 2 threads per core). As described in Section 4.1, the computational advantage is realized through parallelization of the various algorithmic steps in Fig. 2 across many compute cores. We segmented the 24-hour day into 96 time segments of 15 minutes each (versus the multiple-hour time segments that are traditionally used). For all runs, we solved for optimized route plans for all 19 million vehicle trips over the San Francisco Bay Area network (0.5 million nodes and 1 million links).

    (a)
    (b)
    Figure 5: Computational scaling performance for our QDTA approach as parallelism is increased.

Figure 5 shows the strong scaling performance results: a) when increasing the number of cores on a single node, and b) when scaling across multiple nodes. Running the 96 time segment QDTA on a single node, the execution time is reduced from more than three hours (188 minutes) on a single core to under 19 minutes utilizing all 32 cores. This represents a parallel speedup of about 10x compared to single-core execution. Running the QDTA on multiple nodes, the execution time is reduced further to under 6 minutes on 32 nodes (1,024 cores), representing a total speedup of over 34x compared to the single-core case. The speedup is sub-linear (smaller than the factor by which the number of compute cores is increased) due to a combination of parallelization overhead (communication and synchronization costs between cores and processes) and the parts of the code that are not fully parallelized (resulting in Amdahl bottlenecks [66]).
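The single-node figures quoted above can be turned into speedup and parallel efficiency with a few lines of arithmetic, and Amdahl's law gives a rough feel for how a serial fraction caps the speedup. The function names are chosen for this example, and the 90 percent parallel fraction used in the last line is purely illustrative, not a measured property of the QDTA code.

```python
def speedup(serial_minutes, parallel_minutes):
    """Ratio of single-core run time to parallel run time."""
    return serial_minutes / parallel_minutes


def efficiency(serial_minutes, parallel_minutes, cores):
    """Speedup divided by the number of cores used."""
    return speedup(serial_minutes, parallel_minutes) / cores


def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup when only `parallel_fraction` of the work scales."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


single_node_speedup = speedup(188.0, 19.0)            # reported: 188 min -> under 19 min
single_node_efficiency = efficiency(188.0, 19.0, 32)  # on 32 cores
illustrative_bound = amdahl_speedup(0.9, 32)          # hypothetical 90% parallel fraction
```

With the reported times, the single-node speedup is just under 10x on 32 cores, i.e. a parallel efficiency of roughly 31 percent.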

    5 Simple Network Example

To help illustrate the algorithms presented in Section 4 and in the section on high-performance computational solutions for QDTA, we describe the application of the QDTA model to a simple network with 4 links and 5 nodes and compare it with STA. For simplicity, all links are connected serially and no route choice decisions are involved. We consider the demand for 1 hour, divided into four 15-minute time periods. The demand for each 15-minute period is given in Fig. 6. The link travel time function is of BPR form, as shown in Equation (1b).

    Figure 6: Network attributes and demand in the example

    For QDTA model (Fig. 7), for time , demand traverses link and reaches node 2, but it cannot reach its destination in the same time interval and hence stored in the downstream node 3 as residual. In the next time period , the new demand and the residual from previous time segment will traverse link . All demand reaches its destination in time segment . Time periods and have no demand and hence all trips reaches destination in the first two time interval. The total system travel time is: . The congested links in the network are during time periods and respectively.

    Figure 7: QDTA assignment for simple network. The network is loaded in 2 time segments. represents the residual stored in the node at the end of each time period. The link travel time in minutes is noted above each link.

    The same demand is assigned using STA as shown in Fig. 8. The demand is assigned as average for the entire 1 hour duration. The total system travel time is: . None of the links are congested during any time interval in this assignment.

    Figure 8: STA assignment for simple network. All demand is assigned at the same time and gets averaged for the entire 1 hour time duration.

    6 Application to San Francisco Bay Area Network

    6.1 Analysis of Traffic Assignment Results

Two models are considered for the San Francisco Bay Area. The baseline model is a static traffic assignment (STA) for the 7-10 am morning peak, and the QDTA model is a user equilibrium with shortest travel time optimization, using the Frank-Wolfe solver, run for the entire day in 15-minute time intervals. The travel demand was obtained from the SFCTA CHAMP6 [62] model for 24 hours of a typical day. Each trip was identified with an origin and destination micro-analysis zone, and was then assigned to specific network nodes by weighting with the population density obtained from Global Human Settlement [67]. Fig. 9 shows the demand profiles for both cases under consideration. While STA assumes a constant demand for the entire morning peak period, QDTA considers a variable demand for the same duration. A professional map from HERE Technologies [61] is a core part of the foundation for the Mobiliti platform. The map information is transformed into a different representation in order to integrate it into the algorithms discussed earlier. However, we maintain the definitions of functional road classes as defined by HERE Technologies [61]. Specifically, functional classes classify roads according to the speed, importance and connectivity of the road. A road can be one of five functional classes, which are defined in Table 3. The analysis presented here uses these functional classes to help explore the results that compare QDTA to STA.

    Figure 9: Temporal demand interaction and model capabilities
    Functional Class Definition
    1 Allowing for high volume, maximum speed traffic movement
    2 Allowing for high volume, high speed traffic movement
    3 Providing a high volume of traffic movement
    4 Providing for a high volume of traffic movement at moderate speeds between neighbourhoods
    5 Roads whose volume and traffic movement are below the level of any other functional class
    Table 3: Functional Road Classes

Table 4 provides a comparison of the system level metrics using vehicle miles travelled (VMT), average volume over capacity (VOC) ratios, and vehicle hours of delay (VHD), categorised by functional class. Only links with positive flow are included in the analysis. Additionally, a comparison was made between congested and non-congested links (Table 5). For the QDTA analysis, congestion is calculated for the 7.45-8 AM time period, when the demand is highest.

Category | VMT, STA (millions) | VMT, QDTA (millions) | Average VOC, STA | Average VOC, QDTA | VHD, STA (thousands) | VHD, QDTA (thousands)
FC2 | 16.44 | 18.09 | 0.52 | 0.69 | 17.57 | 96.84
FC3 | 4.67 | 5.00 | 0.23 | 0.30 | 2.65 | 14.43
FC4 | 4.96 | 5.24 | 0.14 | 0.17 | 1.13 | 6.42
FC5 | 2.43 | 2.60 | 0.03 | 0.48 | 0.36 | 17.78
Total | 28.51 | 30.95 | - | - | 21.73 | 135.49
Table 4: System Level Metrics
Category | Length, STA (km) | Length, QDTA (km) | VMT, STA (millions) | VMT, QDTA (millions) | Average VOC, STA | Average VOC, QDTA
FC2 | 129 | 546 | 1.77 | 7.56 | 1.15 | 1.26
FC3 | 37 | 86 | 0.18 | 0.42 | 1.18 | 1.27
FC4 | 14 | 55 | 0.04 | 0.15 | 1.19 | 1.20
FC5 | 5 | 23 | 0.01 | 0.06 | 1.17 | 1.23
Total | 185 | 710 | 2.0 | 8.19 | - | -
Table 5: Congested Network Metrics

The main distinction between the two models is how well each can replicate congestion on links. As seen in Table 5, for the links that are congested in STA, QDTA predicts even higher congestion. Because link performance functions are convex, a non-uniform demand distribution results in higher travel times on links than a uniform one, so QDTA with varying demand is expected to produce more congestion than STA with uniform demand. QDTA also captures the dynamics of congestion better than STA for all functional class categories. Fig. 10 shows the VOC over time by functional class (left) and a VOC comparison of the two models by functional class (right). For all classes of roads, QDTA predicts congestion dynamically over time, whereas STA either overestimates or underestimates the values irrespective of the demand dynamics. This difference is highly pronounced for FC2, where STA significantly underestimates the peak hour congestion.
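The convexity argument can be checked numerically on a single link. In the sketch below, the BPR parameters (0.15 and 4), the free-flow time, the capacity and the four interval flows are all hypothetical values chosen for illustration; the point is only that the travel time evaluated at the averaged flow is lower than the average of the travel times evaluated at the time-varying flows.

```python
def bpr_time(flow, free_flow_time=10.0, capacity=1000.0, a=0.15, b=4.0):
    """BPR-style link travel time in minutes (all defaults are assumed values)."""
    return free_flow_time * (1.0 + a * (flow / capacity) ** b)


# Hypothetical flow rates (vehicles per hour) on one link in four 15-minute intervals.
interval_flows = [400.0, 1600.0, 1400.0, 600.0]

# Static view: one travel time computed from the hourly average flow.
average_flow = sum(interval_flows) / len(interval_flows)
time_at_average_flow = bpr_time(average_flow)

# Time-varying view: a travel time per interval, then averaged.
average_of_interval_times = (
    sum(bpr_time(flow) for flow in interval_flows) / len(interval_flows)
)
```

Because the travel time function is convex, the second quantity is never smaller than the first (Jensen's inequality), and the gap grows as the demand profile becomes more peaked.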

    Figure 10: Congestion profiling for QDTA and STA. Figure on the left shows the average VOC ratio for all links over time categorised by functional classes. The figure on right is a scatter plot showing the VOC ratio comparison for every link in the network for the 2 models at 7.45-8 am.

It can also be seen from Table 4 that the total system delay is significantly higher in QDTA than in STA. This is due to the dynamic demand distribution and the modelling capability in QDTA that allows interaction of demand between time intervals, thus allowing for a more realistic modelling of traffic. In QDTA, the residual demand from each time interval is carried over to the next interval (Fig. 11). Note that, due to the route truncation mechanism, not all portions of a trip that starts in a particular time period are completed in that time period, so portions of a trip are loaded onto the network in multiple time periods. In QDTA, each trip leg only contributes flow to the links it traverses within the time segment, whereas in STA, every leg contributes flow to its entire route. For the time interval 7.45-8 AM, STA has 185 km of congested network, compared with 710 km for QDTA (Table 5). Fig. 12 shows the Bay Area network with congestion locations for the two modelling cases.

Figure 11: QDTA demand modelling with residuals (left) and the length of network congested over time (right). A link is considered congested if the VOC ratio is greater than or equal to 1.
Figure 12: VOC ratios for congested links for QDTA (left) and STA (right) for the time period 7.45 - 8 AM. The links that are predicted to be congested in STA have even higher congestion in QDTA. STA produces 185 km of congested network, compared with 710 km for QDTA. Only links with VOC greater than or equal to 1 are shown.

    6.2 Validation

Validation was performed on the QDTA results to test how effectively they represent the real-world traffic environment. We conducted validation of traffic volume, speed and system metrics using multiple data sources. Stage 1 of the validation procedure involves checking the traffic counts for eight link corridors. The traffic count for each link was compared against field data for the entire day in 15-minute increments. The field data for city roads and highways were collected from the city of San Jose and the Caltrans PeMS website [68], respectively, for the year 2019. Each corridor provided information regarding traffic volumes and speeds by time of day and direction. An R value of 0.7 is used as the criterion for a satisfactory link count check. Fig. 13 shows the R values for the eight corridors under consideration. The modeled corridors indicate a close match with the field data, with the lowest R value observed being 0.68 for Zanker Road.

Figure 13: Validation of traffic counts in 15-minute increments for links in different functional classes. All links except one have a satisfactory R value greater than 0.7.

    Stage 2 is a speed comparison with Uber Movement speed data for the San Francisco region for Q4 2019 [69]. Links from the Uber network were matched to our network for 139,495 links (20% of the total). The speeds were compared for 8 am to 9 am for different speed limits. Figure 14 shows the speed distributions from the QDTA results and from Uber on links with 60 mph and 70 mph speed limits. Figure 15 shows the average speeds from Mobiliti and Uber across all speed limits. We believe the observed discrepancies in the speed distributions may be reduced in future work by adding real-world traffic signal location and timing data and by further refining the link flow congestion and timing models.

    Figure 14: Kernel density plot comparing QDTA model and Uber speed distributions at 60 mph (left) and 70 mph (right).
    Figure 15: Distribution of average Uber (left) and QDTA model (right) speeds across specific speed limits between 8 and 9 am.

    The final stage of validation includes system-level metric comparisons, network validation, and error checking. Model visualization is used to check for unusual activity in traffic flows and odd roadway network attributes. Error checking and model verification consist of several smaller tasks such as checks for link geometry and connectivity, number of lanes, speeds, ramps and intersection geometry. Since our travel demand data was obtained from SFCTA, which conducts its own validation, we did not conduct additional behavior checks. We conducted system metric checks for VMT and total demand and validated them against the 2017 Environmental Impact Report for the Bay Area [70] in Table 6.

    Metric        QDTA           Field Data     Relative Error (%)
    VMT           150,453,402    158,406,800    -5
    Daily Trips   19,167,301     21,227,800     -10
    Table 6: System level metrics validation
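The relative errors in Table 6 can be reproduced from the two value columns. The short sketch below assumes the error is defined as (model - field) / field, which matches the signs and rounded values in the table. The function name is illustrative.

```python
def relative_error_pct(model_value, field_value):
    """Signed relative error of the model against field data, in percent."""
    return 100.0 * (model_value - field_value) / field_value

# Values copied from Table 6.
vmt_error = relative_error_pct(150_453_402, 158_406_800)
trips_error = relative_error_pct(19_167_301, 21_227_800)

print(f"VMT: {vmt_error:.1f}%  Daily trips: {trips_error:.1f}%")
```

Both values are negative, meaning the model slightly underestimates VMT and daily trips relative to the reference report.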

    7 Conclusions

    In this paper, we have presented a quasi-dynamic traffic assignment methodology to capture temporal dynamics in large-scale transportation networks and described its parallelization to run efficiently on distributed-memory high performance computing systems. Two key mechanisms, route truncation and residual demand, were implemented to provide more realistic demand profiles as the dynamic assignment interval is reduced and a greater percentage of vehicle trips span multiple time segments. Our approach divides the simulated day into 15-minute intervals and uses a modified static traffic assignment within each interval to assign the active traffic present in the network. The QDTA assignment step differs from a traditional STA in that the assigned routes are truncated to fit in the active time interval in each Frank-Wolfe iteration, so that the resulting solution only includes traffic that occurs within the current interval. Furthermore, residual demand for each time interval is calculated based on the estimated travel time on the links and then carried over to the next time interval. The combination of these techniques resolves traffic that is resident in the network for a longer time horizon than a single static assignment period and captures dynamic network behavior across multiple time segments. The model can be interpreted as a time-expanded network approach in which the expansion of the network represents its evolution with respect to time.

    We have also described how the quasi-dynamic traffic assignment algorithm can be parallelized for efficient execution on high-performance distributed-memory computing platforms. The algorithm is parallelized through a combination of partitioning trip legs and network links across available compute threads to speed up the calculation of shortest paths, optimization cost functions, residual trip legs, and other program data. We described our parallelized line search algorithm which enables the identification of the optimal step size for each gradient descent iteration using Newton’s method, taking less than 100 milliseconds for a network of 1 million links. Using the optimal step size provides a reduction of more than 49 percent in total execution time compared to using the method of successive averages, due to a decrease in gradient descent iterations required for convergence. We demonstrated that a quasi-dynamic traffic assignment of the San Francisco Bay Area (19 million trip legs, 0.5 million nodes, and 1 million links) using 96 15-minute time segments runs in under 6 minutes on 1,024 cores of the Cori supercomputer at NERSC, corresponding to a speedup of 34x compared to single core performance.
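To make the contrast between a Newton line search and the method of successive averages (MSA) concrete, here is a minimal serial sketch on a toy network of two parallel links. It is not the authors' parallel implementation. The BPR coefficients (0.15, 4), the link parameters, the demand, and all function names are assumptions chosen for illustration.

```python
def link_cost(x, t0, cap):
    """BPR travel time with assumed coefficients alpha=0.15, beta=4."""
    return t0 * (1.0 + 0.15 * (x / cap) ** 4)

def link_cost_deriv(x, t0, cap):
    """Derivative of the BPR travel time with respect to flow."""
    return t0 * 0.15 * 4.0 * x ** 3 / cap ** 4

def newton_step_size(x, y, t0s, caps, iters=50):
    """Newton's method on the derivative of the Beckmann objective along y - x."""
    d = [yi - xi for xi, yi in zip(x, y)]
    lam = 0.0
    for _ in range(iters):
        flows = [xi + lam * di for xi, di in zip(x, d)]
        g = sum(di * link_cost(f, t0, c)
                for di, f, t0, c in zip(d, flows, t0s, caps))
        h = sum(di * di * link_cost_deriv(f, t0, c)
                for di, f, t0, c in zip(d, flows, t0s, caps))
        if h <= 0.0:
            break  # flat curvature: keep the current step size
        new_lam = min(1.0, max(0.0, lam - g / h))  # stay inside [0, 1]
        converged = abs(new_lam - lam) < 1e-12
        lam = new_lam
        if converged:
            break
    return lam

def frank_wolfe(rule, demand, t0s, caps, tol=1e-4, max_steps=200):
    """Return (flows, steps taken) for rule 'newton' or 'msa'."""
    n = len(t0s)
    best = min(range(n), key=lambda a: t0s[a])
    x = [demand if a == best else 0.0 for a in range(n)]  # free-flow all-or-nothing
    for k in range(1, max_steps + 1):
        costs = [link_cost(xa, t0, c) for xa, t0, c in zip(x, t0s, caps)]
        best = min(range(n), key=lambda a: costs[a])
        y = [demand if a == best else 0.0 for a in range(n)]
        total = sum(xa * ca for xa, ca in zip(x, costs))
        gap = (total - demand * costs[best]) / total  # relative gap
        if gap < tol:
            return x, k - 1
        lam = newton_step_size(x, y, t0s, caps) if rule == "newton" else 1.0 / (k + 1)
        x = [xa + lam * (ya - xa) for xa, ya in zip(x, y)]
    return x, max_steps

T0S, CAPS, DEMAND = [10.0, 15.0], [1000.0, 1500.0], 3000.0  # hypothetical network
newton_flows, newton_steps = frank_wolfe("newton", DEMAND, T0S, CAPS)
msa_flows, msa_steps = frank_wolfe("msa", DEMAND, T0S, CAPS)
print(newton_steps, msa_steps)
```

On this toy problem the exact step size reaches equilibrium in far fewer descent steps than the predefined MSA steps, which is the effect the text attributes to the line search. The sketch does not attempt to reproduce the reported runtime figures.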

    Finally, we presented an analysis of the traffic assignment results across functional classes, illustrating how the QDTA more accurately resolves the increased congestion patterns and dynamic behavior of the traffic system compared to a static traffic assignment approach, especially under peak congestion conditions. We presented a validation of the QDTA assignment counts and speeds against field data from Caltrans, San Jose, and Uber Movement, showing field count correlation values of 0.68 or greater. Future work includes evaluating a variety of optimization objective functions, including fuel optimization and system-level optimizations for both travel time and fuel use. We hope to develop surrogate models using the results from the supercomputer implementation so that this capability can be made widely available.

    Acknowledgements

    This report and the work described were sponsored by the U.S. Department of Energy (DOE) Vehicle Technologies Office (VTO) under the Big Data Solutions for Mobility Program, an initiative of the Energy Efficient Mobility Systems (EEMS) Program. The following DOE Office of Energy Efficiency and Renewable Energy (EERE) managers played important roles in establishing the project concept, advancing implementation, and providing ongoing guidance: David Anderson and Prasad Gupte. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.

    References

    • [1] Nikolaos Tsanakas, Joakim Ekström, and Johan Olstam. Estimating emissions from static traffic models: Problems and solutions. Hindawi, volume 2020, e5401792. ISSN 0197-6729.
    • [2] Michael G. H. Bell and Yasunori Iida. Transportation Networks, chapter 2, pages 17–40. John Wiley and Sons, Ltd, 1997.
    • [3] Stefan Flügel and Gunnar Flötteröd. Traffic assignment for strategic urban transport model systems.
    • [4] Michiel Bliemer, Mark Raadsen, Erik de Romph, and Erik-Sander Smits. Requirements for traffic assignment models for strategic transport planning: A critical assessment. page 25.
    • [5] D.K. Merchant and G.L. Nemhauser. A model and an algorithm for the dynamic traffic assignment problems. Transportation Science, 12(3):183–199, 1978.
    • [6] D.K. Merchant and G.L. Nemhauser. Optimality conditions for a dynamic traffic assignment model. Transportation Science, 12(3):200–207, 1978.
    • [7] Takamasa Iryo. Multiple equilibria in a dynamic traffic network. 45(6):867–879.
    • [8] Genetics of traffic assignment models for strategic transport planning. 37:56–78, January 2017.
    • [9] Michiel CJ Bliemer, Mark PH Raadsen, Erik-Sander Smits, Bojian Zhou, and Michael GH Bell. Quasi-dynamic traffic assignment with residual point queues incorporating a first order node model. Transportation Research Part B: Methodological, 68:363–384, 2014.
    • [10] Hasti Tajtehranifard, Ashish Bhaskar, Neema Nassir, Md Mazharul Haque, and Edward Chung. A path marginal cost approximation algorithm for system optimal quasi-dynamic traffic assignment. Transportation Research Part C: Emerging Technologies, 88:91–106, 2018.
    • [11] Gaetano Fusco, Chiara Colombaroni, Andrea Gemma, and Stefano Lo Sardo. A quasi-dynamic traffic assignment model for large congested urban road networks. International Journal of Mathematical Models and Methods in Applied Sciences, 7(4):341–349, 2013.
    • [12] Shoichiro Nakayama and Richard Connors. A quasi-dynamic assignment model that guarantees unique network equilibrium. Transportmetrica A: Transport Science, 10(7):669–692, 2014.
    • [13] Xing Zeng, Xuefeng Guan, Huayi Wu, and Heping Xiao. A data-driven quasi-dynamic traffic assignment model integrating multi-source traffic sensor data on the expressway network. ISPRS International Journal of Geo-Information, 10(3):113, 2021.
    • [14] H.-K. Chen. Dynamic travel choice models: a variational inequality approach. Springer Science & Business Media, 2012.
    • [15] A. Nagurney and D. Zhang. Projected dynamical systems and variational inequalities with applications, volume 2. Springer Science & Business Media, 2012.
    • [16] B. Ran and D.E. Boyce. Dynamic urban transportation network models: theory and implications for intelligent vehicle-highway systems, volume 417. Springer Science & Business Media, 2012.
    • [17] B. Ran and T. Shimazaki. A general model and algorithm for the dynamic traffic assignment problems. In Transport Policy, Management & Technology Towards 2001: Selected Proceedings of the Fifth World Conference on Transport Research, volume 4, 1989.
    • [18] B. Ran, D.E. Boyce, and L.J. LeBlanc. A new class of instantaneous dynamic user-optimal traffic assignment models. Operations Research, 41(1):192–202, 1993.
    • [19] T.L. Friesz, J. Luque, R.L. Tobin, and B.-W. Wie. Dynamic network traffic assignment considered as a continuous time optimal control problem. Operations Research, 37(6):893–901, 1989.
    • [20] B. Ran and D.E. Boyce. A link-based variational inequality formulation of ideal dynamic user-optimal route choice problem. Transportation Research Part C: Emerging Technologies, 4(1):1–12, 1996.
    • [21] D.E. Boyce, B. Ran, and L.J. Leblanc. Solving an instantaneous dynamic user-optimal route choice model. Transportation Science, 29(2):128–142, 1995.
    • [22] T.L. Friesz, D. Bernstein, T.E. Smith, R.L. Tobin, and B.-W. Wie. A variational inequality formulation of the dynamic network user equilibrium problem. Operations Research, 41(1):179–191, 1993.
    • [23] Alexandre Bayen, Alexander Keimer, Emily Porter, and Michele Spinola. Time-continuous instantaneous and past memory routing on traffic networks: A mathematical analysis on the basis of the link-delay model. SIAM Journal on Applied Dynamical Systems, 18(4):2143–2180, 2019.
    • [24] S. Peeta and T.-H. Yang. Stability issues for dynamic traffic assignment. Automatica, 39(1):21–34, 2003.
    • [25] R. Ma, X.J. Ban, and J.-S. Pang. Continuous-time dynamic system optimum for single-destination traffic networks with queue spillbacks. Transportation Research Part B: Methodological, 68:98–122, 2014.
    • [26] X.J. Ban, J.-S. Pang, H.X. Liu, and R. Ma. Modeling and solving continuous-time instantaneous dynamic user equilibria: A differential complementarity systems approach. Transportation Research Part B: Methodological, 46(3):389–408, 2012.
    • [27] K. Han, T.L. Friesz, and T. Yao. A partial differential equation formulation of Vickrey’s bottleneck model, part I: Methodology and theoretical analysis. Transportation Research Part B: Methodological, 49:55–74, 2013.
    • [28] K. Han, T.L. Friesz, and T. Yao. A partial differential equation formulation of Vickrey’s bottleneck model, part II: Numerical analysis and computation. Transportation Research Part B: Methodological, 49:75–93, 2013.
    • [29] K. Han, T.L. Friesz, and T. Yao. Existence of simultaneous route and departure choice dynamic user equilibrium. Transportation Research Part B: Methodological, 53:17–30, 2013.
    • [30] A. Bressan and K.T. Nguyen. Optima and equilibria for traffic flow on networks with backward propagating queues. NHM, 10(4):717–748, 2015.
    • [31] A. Bressan and K.T. Nguyen. Conservation law models for traffic flow on a network of roads. NHM, 10(2):255–293, 2015.
    • [32] M. Garavello, K. Han, and B. Piccoli. Models for vehicular traffic on networks, volume 9. American Institute of Mathematical Sciences (AIMS), Springfield, MO, 2016.
    • [33] M. Garavello and B. Piccoli. Traffic flow on networks, volume 1. American institute of mathematical sciences Springfield, 2006.
    • [34] H. Holden and N. Risebro. A mathematical model of traffic flow on a network of unidirectional roads. SIAM Journal on Mathematical Analysis, 26(4):999–1017, 1995.
    • [35] G. Bretti, R. Natalini, and B. Piccoli. Numerical approximations of a traffic flow model on networks. NHM, 1(1):57–84, 2006.
    • [36] MJ Lighthill and GB Whitham. On kinematic waves. i. flood movement in long rivers. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences, 229(1178):281–316, 1955.
    • [37] P. I. Richards. Shock waves on the highway. Operations research, 4(1):42–51, 1956.
    • [38] A. Keimer, N. Laurent-Brouty, F. Farokhi, H. Signargout, V. Cvetkovic, A.M. Bayen, and K.H. Johansson. Information patterns in the modeling and design of mobility management services. Proceedings of the IEEE, 106(4):554–576, 2018.
    • [39] M. Gugat, A. Keimer, G. Leugering, and Z. Wang. Analysis of a system of nonlocal conservation laws for multi-commodity flow on networks. Networks and Heterogeneous Media, 10(4):749–785, 2015.
    • [40] A. Keimer, L. Pflug, and M. Spinola. Nonlocal scalar conservation laws on bounded domains and applications in traffic flow. accepted in SIAM SIMA, 2018.
    • [41] A. Aw and M. Rascle. Resurrection of "second order" models of traffic flow. SIAM Journal on Applied Mathematics, 60(3):916–938, 2000.
    • [42] H.M. Zhang. A non-equilibrium traffic model devoid of gas-like behavior. Transportation Research Part B: Methodological, 36(3):275–290, 2002.
    • [43] M.O. Ghali and M.J. Smith. A model for the dynamic system optimum traffic assignment problem. Transportation Research Part B: Methodological, 29(3):155–170, 1995.
    • [44] S. Peeta and H.S. Mahmassani. System optimal and user equilibrium time-dependent traffic assignment in congested networks. Annals of Operations Research, 60(1):81–113, 1995.
    • [45] M.J. Smith. A new dynamic traffic model and the existence and calculation of dynamic user equilibria on congested capacity-constrained road networks. Transportation Research Part B: Methodological, 27(1):49–63, 1993.
    • [46] S.T. Waller and A.K. Ziliaskopoulos. A chance-constrained based stochastic dynamic traffic assignment model: Analysis, formulation and solution algorithms. Transportation Research Part C: Emerging Technologies, 14(6):418–427, 2006.
    • [47] H.-K. Chen and C.-F. Hsueh. A model and an algorithm for the dynamic user-optimal route choice problem. Transportation Research Part B: Methodological, 32(3):219–234, 1998.
    • [48] J. Long, W.Y. Szeto, H.-J. Huang, and Z. Gao. An intersection-movement-based stochastic dynamic user optimal route choice model for assessing network performance. Transportation Research Part B: Methodological, 74:182–217, 2015.
    • [49] B.N. Janson. Dynamic traffic assignment for urban road networks. Transportation Research Part B: Methodological, 25(2-3):143–161, 1991.
    • [50] B.N. Janson. Convergent algorithm for dynamic traffic assignment. Transportation Research Record, (1328), 1991.
    • [51] J.K. Ho. A successive linear optimization approach to the dynamic traffic assignment problem. Transportation Science, 14(4):295–305, 1980.
    • [52] Ricardo Correa, Ines de Castro Dutra, Mario Fiallos, and Luiz Fernando Gomes da Silva. Models for Parallel and Distributed Computation: Theory, Algorithmic Techniques and Applications, volume 67. Springer Science & Business Media, 2013.
    • [53] Transportation Simulation Systems. Aimsun. https://www.aimsun.com.
    • [54] Institute of Transportation Systems SUMO, German Aerospace Center (DLR). http://www.sumo.dlr.de, 2018.
    • [55] Sunil Thulasidasan, Shiva Kasiviswanathan, Stephan Eidenbenz, Emanuele Galli, Susan Mniszewski, and Phillip Romero. Designing systems for large-scale, discrete-event simulations: Experiences with the fasttrans parallel microsimulator. In 2009 International Conference on High Performance Computing (HiPC), pages 428–437. IEEE, 2009.
    • [56] Colin Sheppard et al. Modeling plug-in electric vehicle charging demand with BEAM, the framework for behavior energy autonomy mobility. Technical report, May 2017.
    • [57] Polaris.
    • [58] Wasuwat Petprakob, Lalith Wijerathne, Takamasa Iryo, Junji Urata, Kazuki Fukuda, and Muneo Hori. On the implementation of high performance computing extension for day-to-day traffic assignment. Transportation Research Procedia, 34:267–274, 2018.
    • [59] Willem Himpe, Romain Ginestou, and MJ Chris Tampère. High performance computing applied to dynamic traffic assignment. Procedia Computer Science, 151:409–416, 2019.
    • [60] M. Patriksson. The traffic assignment problem: models and methods. Courier Dover Publications, 2015.
    • [61] HERE Technologies. https://www.here.com/, 2019. [Online; accessed 06-Feb-2019].
    • [62] SFCTA. SF-CHAMP 6.1: ConnectSF Needs Assessment 2015 Base Year Model Run. Technical report, San Francisco County Transportation Authority, February 2019.
    • [63] Cy Chan, Bin Wang, John Bachan, and Jane Macfarlane. Mobiliti: scalable transportation simulation using high-performance parallel computing. In 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pages 634–641. IEEE, 2018.
    • [64] Julian Dibbelt, Ben Strasser, and Dorothea Wagner. Customizable contraction hierarchies. In International Symposium on Experimental Algorithms, pages 271–282. Springer, 2014.
    • [65] Hayssam Sbayti, Chung-Cheng Lu, and Hani S Mahmassani. Efficient implementation of method of successive averages in simulation-based dynamic traffic assignment models for large-scale network applications. Transportation Research Record, 2029(1):22–30, 2007.
    • [66] John L. Gustafson. Amdahl’s Law, pages 53–60. Springer US, Boston, MA, 2011.
    • [67] European Commission, Joint Research Centre (JRC); Columbia University, Center for International Earth Science Information Network - CIESIN. Ghs population grid, derived from gpw4, multitemporal (2015). http://data.europa.eu/89h/jrc-ghsl-ghs_pop_gpw4_globe_r2015a, 2015. European Commission, Joint Research Centre (JRC) [Dataset].
    • [68] Caltrans, state of california. https://dot.ca.gov/programs/traffic-operations/census/traffic-volumes, 2019. [Online; accessed 03-January-2021].
    • [69] Uber movement. https://movement.uber.com/cities/san_francisco/downloads/speeds, 2019. [Online; accessed 03-January-2021].
    • [70] Environmental impact report plan bay area. https://www.planbayarea.org/2040-plan/environmental-impact-report, 2017. [Online; accessed 05-November-2020].

5 Simple Network Example

To help illustrate the algorithms presented in Section 4 and in the HPC implementation section, we describe the application of the QDTA model to a simple network with 4 links and 5 nodes and compare it with STA. For simplicity, all links are connected in series and the network involves no route choice decisions. We consider one hour of demand divided into four 15-minute time periods. The demand for each 15-minute period is given in Fig. 6. The link travel time function has the BPR form shown in Equation (1b).

Figure 6: Network attributes and demand in the example

For the QDTA model (Fig. 7), in the first time period the demand traverses link and reaches node 2, but it cannot reach its destination in the same time interval and is hence stored in the downstream node 3 as residual. In the second time period, the new demand and the residual from the previous time segment traverse link . All demand reaches its destination in the second time segment. The third and fourth time periods have no demand, and hence all trips reach their destination in the first two time intervals. The total system travel time is: . The congested links in the network are during time periods and respectively.

Figure 7: QDTA assignment for simple network. The network is loaded in 2 time segments. represents the residual stored in the node at the end of each time period. The link travel time in minutes is noted above each link.

The same demand is assigned using STA as shown in Fig. 8. The demand is assigned as an average over the entire 1-hour duration. The total system travel time is: . None of the links are congested during any time interval in this assignment.

Figure 8: STA assignment for simple network. All demand is assigned at the same time and gets averaged for the entire 1 hour time duration.
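The comparison in this section can be sketched in a few lines of code. The example below is a simplified illustration, not the authors' implementation. The demand values, free-flow time, capacity and BPR coefficients are hypothetical because the values in Fig. 6 are not reproduced here. Nodes are numbered from 0 rather than 1. The truncation rule (a cohort may start a link only if its elapsed time in the interval is below 15 minutes, and its clock resets when it becomes residual) is an assumption.

```python
ALPHA, BETA = 0.15, 4.0   # assumed standard BPR coefficients
T0_MIN = 6.0              # free-flow time per link in minutes (hypothetical)
CAPACITY = 1000.0         # link capacity in vehicles/hour (hypothetical)
N_LINKS = 4               # serial links 0..3 joining nodes 0..4
INTERVAL_MIN = 15.0

def bpr_minutes(rate_veh_per_hour):
    """Link travel time in minutes for a given hourly flow rate."""
    return T0_MIN * (1.0 + ALPHA * (rate_veh_per_hour / CAPACITY) ** BETA)

def qdta_serial(demand_per_interval):
    """Load each interval separately, carrying unfinished cohorts as residual."""
    periods_per_hour = 60.0 / INTERVAL_MIN
    cohorts = []      # each cohort: [current node, vehicles, minutes used this interval]
    tstt = 0.0        # total system travel time in vehicle-minutes
    congested = []    # (period, link) pairs with volume/capacity >= 1
    for period, new_demand in enumerate(demand_per_interval):
        if new_demand > 0:
            cohorts.append([0, new_demand, 0.0])
        for link in range(N_LINKS):
            # Cohorts waiting at the link's upstream node with time left in the interval.
            active = [c for c in cohorts if c[0] == link and c[2] < INTERVAL_MIN]
            vehicles = sum(c[1] for c in active)
            if vehicles == 0:
                continue
            rate = vehicles * periods_per_hour
            minutes = bpr_minutes(rate)
            tstt += vehicles * minutes
            if rate / CAPACITY >= 1.0:
                congested.append((period, link))
            for c in active:
                c[0] += 1
                c[2] += minutes
        # Drop arrived cohorts; the rest become residual with a fresh clock.
        cohorts = [[node, veh, 0.0] for node, veh, _ in cohorts if node < N_LINKS]
    residual = sum(c[1] for c in cohorts)
    return tstt, congested, residual

def sta_serial(demand_per_interval):
    """Assign the hourly average demand to every link of the route at once."""
    hours = len(demand_per_interval) * INTERVAL_MIN / 60.0
    total = sum(demand_per_interval)
    rate = total / hours
    return total * N_LINKS * bpr_minutes(rate), rate / CAPACITY

DEMAND = [300, 200, 0, 0]  # vehicles per 15-minute period (hypothetical)
qdta_tstt, qdta_congested, qdta_residual = qdta_serial(DEMAND)
sta_tstt, sta_voc = sta_serial(DEMAND)
print(qdta_tstt, qdta_congested, sta_tstt, sta_voc)
```

With these hypothetical inputs the averaged STA loading stays below capacity on every link, while the interval-by-interval loading with residuals produces congested links and a larger total system travel time, which is the qualitative behaviour described above.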

6 Application to San Francisco Bay Area Network

6.1 Analysis of Traffic Assignment Results

Two models are considered for the San Francisco Bay Area. The baseline model is a static traffic assignment (STA) for the 7-10 am morning peak, and the QDTA model is a user equilibrium with shortest travel time optimization, solved with the Frank-Wolfe solver and run for the entire day in 15-minute time intervals. The travel demand was obtained from the SFCTA CHAMP6 model [62] for 24 hours of a typical day. Each trip was identified with an origin and destination micro-analysis zone, and was then assigned to specific network nodes by weighting with the population density obtained from the Global Human Settlement population grid [67]. Fig. 9 shows the demand profiles for both cases under consideration. While STA assumes a constant demand for the entire morning peak period, QDTA considers a variable demand for the same duration. A professional map from HERE Technologies [61] is a core part of the foundation for the Mobiliti platform. The map information is transformed into a different representation in order to integrate it into the algorithms discussed earlier. However, we maintain the functional class definitions for roads used by HERE Technologies [61]. Specifically, functional classes classify roads according to the speed, importance and connectivity of the road. A road can be one of five functional classes, which are defined in Table 3. The analysis presented here uses these functional classes to help explore the results that compare QDTA to STA.

Figure 9: Temporal demand interaction and model capabilities
Functional Class   Definition
1                  Allowing for high volume, maximum speed traffic movement
2                  Allowing for high volume, high speed traffic movement
3                  Providing a high volume of traffic movement
4                  Providing for a high volume of traffic movement at moderate speeds between neighbourhoods
5                  Roads whose volume and traffic movement are below the level of any other functional class
Table 3: Functional Road Classes

Table 4 provides a comparison of the system level metrics using vehicle miles travelled (VMT), average volume over capacity (VOC) ratios, and vehicle hours of delay (VHD) categorised by functional class. Only links with positive flow are included in the analysis. Additionally, a comparison was made between congested and non-congested links (Table 5). For QDTA analysis, congestion is calculated for 7.45-8 AM time period when the demand is the highest.

Category   VMT (in millions)   Average VOC     VHD (in thousands)
           STA      QDTA       STA     QDTA    STA      QDTA
FC2        16.44    18.09      0.52    0.69    17.57    96.84
FC3        4.67     5.00       0.23    0.30    2.65     14.43
FC4        4.96     5.24       0.14    0.17    1.13     6.42
FC5        2.43     2.60       0.03    0.48    0.36     17.78
Total      28.51    30.95      -       -       21.73    135.49
Table 4: System Level Metrics
Category   Length (km)     VMT (in millions)   Average VOC
           STA     QDTA    STA      QDTA       STA     QDTA
FC2        129     546     1.77     7.56       1.15    1.26
FC3        37      86      0.18     0.42       1.18    1.27
FC4        14      55      0.04     0.15       1.19    1.20
FC5        5       23      0.01     0.06       1.17    1.23
Total      185     710     2.0      8.19       -       -
Table 5: Congested Network Metrics
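As a rough sketch of how the congested-length figures in Table 5 could be tallied, the snippet below flags links whose volume-over-capacity ratio is at least 1 and sums their lengths by functional class. The link records and the function name are hypothetical. The final lines only cross-check the per-class lengths reported in Table 5 against its totals.

```python
from collections import defaultdict

def congested_km_by_class(links, threshold=1.0):
    """links: iterable of (functional_class, length_km, volume, capacity)."""
    totals = defaultdict(float)
    for fc, length_km, volume, capacity in links:
        if volume <= 0 or capacity <= 0:
            continue  # only links with positive flow are analysed
        if volume / capacity >= threshold:
            totals[fc] += length_km
    return dict(totals)

# Hypothetical link records for illustration only.
sample_links = [
    ("FC2", 1.2, 2300.0, 2000.0),  # VOC 1.15, congested
    ("FC2", 0.8, 1500.0, 2000.0),  # VOC 0.75, not congested
    ("FC3", 0.5, 900.0, 900.0),    # VOC 1.00, congested (threshold is inclusive)
    ("FC5", 0.3, 0.0, 400.0),      # no flow, skipped
]
sample_result = congested_km_by_class(sample_links)

# Cross-check of the per-class congested lengths reported in Table 5.
sta_km = {"FC2": 129, "FC3": 37, "FC4": 14, "FC5": 5}
qdta_km = {"FC2": 546, "FC3": 86, "FC4": 55, "FC5": 23}
sta_total = sum(sta_km.values())
qdta_total = sum(qdta_km.values())
print(sample_result, sta_total, qdta_total)
```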

The main distinction between the two models is how well each replicates congestion on links. As seen in Table 5, for the links that are congested in STA, QDTA predicts even higher congestion. Because link performance functions are convex, a non-uniform demand distribution results in higher travel times on links than a uniform distribution of the same total demand, so QDTA with time-varying demand is expected to produce more congestion than STA with uniform demand. QDTA also captures congestion more dynamically than STA for all functional class categories. Fig. 10 shows the VOC over time by functional class (left) and the VOC comparison for the two models by functional class (right). For all classes of roads, QDTA predicts congestion dynamically over time, whereas STA either overestimates or underestimates the values irrespective of the demand dynamics. This difference is most pronounced for FC2, where STA significantly underestimates the peak hour congestion.
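The convexity argument can be made explicit with a short worked example. It assumes the standard BPR form with coefficients 0.15 and 4, which may differ from the coefficients used in Equation (1b). The two demand levels are hypothetical.

```latex
% Assumed BPR link performance function (coefficients are an assumption)
\[
  t(v) = t_0\left[1 + 0.15\left(\frac{v}{c}\right)^{4}\right]
\]
% Jensen's inequality for a convex function t over n time intervals
\[
  \frac{1}{n}\sum_{i=1}^{n} t(v_i) \;\ge\; t\!\left(\frac{1}{n}\sum_{i=1}^{n} v_i\right)
\]
% Hypothetical example: two intervals with v_1/c = 0.5 and v_2/c = 1.5 (mean 1.0)
\[
  \frac{t(0.5c) + t(1.5c)}{2}
  = t_0\,\frac{1.009375 + 1.759375}{2}
  = 1.384375\,t_0
  \;>\; t(1.0c) = 1.15\,t_0
\]
```

The averaged demand yields a travel time of 1.15 times free flow, while the time-varying demand yields an average of about 1.38 times free flow, so averaging demand over the peak period understates delay.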

Figure 10: Congestion profiling for QDTA and STA. Figure on the left shows the average VOC ratio for all links over time categorised by functional classes. The figure on right is a scatter plot showing the VOC ratio comparison for every link in the network for the 2 models at 7.45-8 am.

It can also be seen from Table 4 that the total system delay is significantly higher in QDTA over STA. This is due to the dynamic demand distribution and the modelling capability in QDTA that allows interaction of demand between time intervals, thus allowing for a more realistic modelling of traffic. In QDTA, the residual demand from each time interval is carried over to the next phase (Fig. 11). It needs to be noted that due to the route truncation mechanism, all portions of a trip for a particular time period is not completed in that time period. So portions of a trip is loaded onto network in multiple time periods. In QDTA, each trip leg only contributes flow to the links it traverses within the time segment, whereas for STA, every leg contributes flow to the entire route. For the time interval 7.45-8 AM, STA has 185 km of congested network in comparison to QDTA with 720 km of congested network. Fig. 12 shows the Bay area network with congestion locations for the two modelling cases.

Figure 11: QDTA demand modelling with residuals (left). The length of network congested over time. A link is considered congested if the VOC ratio is greater than or equal 1.
Figure 12: VOC ratios for congested links for QDTA (left) and STA (right) is shown for time period 7.45 - 8 AM. The links that are predicted to be congested in STA has even higher congestion in QDTA. STA produces 185 km of congested network in comparison to QDTA with 710 km of congested network. Only links with VOC greater than or equal to 1 is shown.

6.2 Validation

Validation was performed for QDTA results to test the effectiveness of representing the real world traffic environment. We conducted validation for traffic volume, speed and system metrics using multiple data sources. Stage 1 of the validation procedure involves checking the traffic counts for eight link corridors. The traffic count for each link was compared against the field data for the entire day in 15 minutes increments. The field data for city roads and highways were collected from the city of San Jose and Caltrans PeMS website [68] respectively for the year 2019. Each corridor provided information regarding traffic volumes and speeds by time of the day and direction. R of 0.7 is used as a satisfactory criterion for link count checks. The Fig. 13 shows R values for the eight corridors under consideration. The modeled corridors indicate a close match with the field data with the lowest R value observed being 0.68 for Zanker Road.

Figure 13: Validation of traffic counts in 15 minute increments for links in different functional classes. All links except one have satisfactory R of greater than 0.7

Stage 2 is speed comparison with Uber Movement speed data for San Francisco region for Q4, 2019 [69]. Links from Uber network were matched to our network for 139,495 links (20% of total). The speeds were compared for 8 am to 9 am for different speed limits. Figure 14 shows the speed distributions from QDTA results and Uber on links with 60 mph and 70 mph speed limit. Figure 15 shows the average speeds from Mobiliti and Uber across all speed limits. We believe the observed discrepancies in the speed distributions may be improved in future work with the addition of real world traffic signal location and timing data and further refinement of the link flow congestion and timing models.

Figure 14: Kernel density plot comparing QDTA model and Uber speed distributions at 60 mph (left) and 70 mph (right).
Figure 15: Distribution of average Uber (left) and QDTA model (right) speeds across specific speed limits between 8-9am in the morning.

Final stage of validation includes system level metrics comparisons, network validation, and error checking. Model visualization is used to check for unusual activities in traffic flows and odd roadway network attributes. Error checking and model verification consist of several smaller tasks such as checks for link geometry and connectivity, number of lanes, speeds, ramps and intersection geometry. Since our travel demand data was obtained from SFCTA, which conducts their own validation, we did not conduct additional behavior checks. We conducted system metric checks for VMT and total demand and validated them against the 2017 Environmental Impact report for the Bay Area

[70] in Table 6.

Metric QDTA Field Data Relative Error(%)
VMT 150,453,402 158,406,800 -5
Daily Trips 19,167,301 21,227,800 -10
Table 6: System level metrics validation

7 Conclusions

In this paper, we have presented a quasi-dynamic traffic assignment methodology to capture temporal dynamics in large-scale transportation networks and described its parallelization to run efficiently on distributed-memory high performance computing systems. Two key mechanisms, route truncation and residual demand, were implemented to provide more realistic demand profiles as the dynamic assignment interval is reduced and a greater percentage of vehicle trips span across multiple time segments. Our approach divides the simulated day into 15 minute intervals and utilizes a modified static traffic assignment within each interval to assign the active traffic present in the network. The QDTA assignment step differs from a traditional STA in that the assigned routes are truncated to fit in the active time interval in each Frank-Wolfe iteration so that the resulting solution only includes traffic that occurs within the current interval. Furthermore, residual demand for each time interval is calculated based on estimated travel time on the links and then carried over to the next time interval. The combination of these techniques resolves traffic that is resident in the network for a longer time horizon than a single static assignment period and captures dynamic network behavior across multiple time segments. The model can be interpreted as an time expanded network approach where the expansion of the network represents the evolution with respect to time.

We have also described how the quasi-dynamic traffic assignment algorithm can be parallelized for efficient execution on high-performance distributed-memory computing platforms. The algorithm is parallelized through a combination of partitioning trip legs and network links across available compute threads to speed up the calculation of shortest paths, optimization cost functions, residual trip legs, and other program data. We described our parallelized line search algorithm which enables the identification of the optimal step size for each gradient descent iteration using Newton’s method, taking less than 100 milliseconds for a network of 1 million links. Using the optimal step size provides a reduction of more than 49 percent in total execution time compared to using the method of successive averages, due to a decrease in gradient descent iterations required for convergence. We demonstrated that a quasi-dynamic traffic assignment of the San Francisco Bay Area (19 million trip legs, 0.5 million nodes, and 1 million links) using 96 15-minute time segments runs in under 6 minutes on 1,024 cores of the Cori supercomputer at NERSC, corresponding to a speedup of 34x compared to single core performance.

Finally, we presented an analysis of the traffic assignment results across functional classes, illustrating how the QDTA more accurately resolves the increased congestion patterns and dynamic behavior of the traffic system compared to a static traffic assignment approach, especially under peak congestion conditions. We presented a validation of the QDTA assignment counts and speeds against field data from Caltrans, San Jose, and Uber Movement, showing field count correlation values of 0.68 or greater. Future work includes evaluating a variety of optimization objective functions, including fuel optimization and system-level optimizations for both travel time and fuel use. We hope to develop surrogate models using the results from the supercomputer implementation so that this capability can be made widely available.

Acknowledgements

This report and the work described were sponsored by the U.S. Department of Energy (DOE) Vehicle Technologies Office (VTO) under the Big Data Solutions for Mobility Program, an initiative of the Energy Efficient Mobility Systems (EEMS) Program. The following DOE Office of Energy Efficiency and Renewable Energy (EERE) managers played important roles in establishing the project concept, advancing implementation, and providing ongoing guidance: David Anderson and Prasad Gupte. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.

References

  • [1] Nikolaos Tsanakas, Joakim Ekström, and Johan Olstam. Estimating emissions from static traffic models: Problems and solutions. ISSN: 0197-6729 Pages: e5401792 Publisher: Hindawi Volume: 2020.
  • [2] Michael G. H. Bell and Yasunori Iida. Transportation Networks, chapter 2, pages 17–40. John Wiley and Sons, Ltd, 1997.
  • [3] Stefan Flügel and Gunnar Flötteröd. Traffic assignment for strategic urban transport model systems.
  • [4] Michiel Bliemer, Mark Raadsen, Erik de Romph, and Erik-Sander Smits. Requirements for traffic assignment models for strategic transport planning: A critical assessment. page 25.
  • [5] D.K. Merchant and G.L. Nemhauser. A model and an algorithm for the dynamic traffic assignment problems. Transportation Science, 12(3):183–199, 1978.
  • [6] D.K. Merchant and G.L. Nemhauser. Optimality conditions for a dynamic traffic assignment model. Transportation Science, 12(3):200–207, 1978.
  • [7] Takamasa Iryo. Multiple equilibria in a dynamic traffic network. 45(6):867–879.
  • [8] Genetics of traffic assignment models for strategic transport planning. 37:56–78, January 2017.
  • [9] Michiel CJ Bliemer, Mark PH Raadsen, Erik-Sander Smits, Bojian Zhou, and Michael GH Bell. Quasi-dynamic traffic assignment with residual point queues incorporating a first order node model. Transportation Research Part B: Methodological, 68:363–384, 2014.
  • [10] Hasti Tajtehranifard, Ashish Bhaskar, Neema Nassir, Md Mazharul Haque, and Edward Chung. A path marginal cost approximation algorithm for system optimal quasi-dynamic traffic assignment. Transportation Research Part C: Emerging Technologies, 88:91–106, 2018.
  • [11] Gaetano Fusco, Chiara Colombaroni, Andrea Gemma, and Stefano Lo Sardo. A quasi-dynamic traffic assignment model for large congested urban road networks. International Journal of Mathematical Models and Methods in Applied Sciences, 7(4):341–349, 2013.
  • [12] Shoichiro Nakayama and Richard Connors. A quasi-dynamic assignment model that guarantees unique network equilibrium. Transportmetrica A: Transport Science, 10(7):669–692, 2014.
  • [13] Xing Zeng, Xuefeng Guan, Huayi Wu, and Heping Xiao. A data-driven quasi-dynamic traffic assignment model integrating multi-source traffic sensor data on the expressway network. ISPRS International Journal of Geo-Information, 10(3):113, 2021.
  • [14] H.-K. Chen. Dynamic travel choice models: a variational inequality approach. Springer Science & Business Media, 2012.
  • [15] A. Nagurney and D. Zhang. Projected dynamical systems and variational inequalities with applications, volume 2. Springer Science & Business Media, 2012.
  • [16] B. Ran and D.E. Boyce. Dynamic urban transportation network models: theory and implications for intelligent vehicle-highway systems, volume 417. Springer Science & Business Media, 2012.
  • [17] B. Ran and T. Shimazaki. A general model and algorithm for the dynamic traffic assignment problems. In Transport Policy, Management & Technology Towards 2001: Selected Proceedings of the Fifth World Conference on Transport Research, volume 4, 1989.
  • [18] B. Ran, D.E. Boyce, and L.J. LeBlanc. A new class of instantaneous dynamic user-optimal traffic assignment models. Operations Research, 41(1):192–202, 1993.
  • [19] T.L. Friesz, J. Luque, R.L. Tobin, and B.-W. Wie. Dynamic network traffic assignment considered as a continuous time optimal control problem. Operations Research, 37(6):893–901, 1989.
  • [20] B. Ran and D.E. Boyce. A link-based variational inequality formulation of ideal dynamic user-optimal route choice problem. Transportation Research Part C: Emerging Technologies, 4(1):1–12, 1996.
  • [21] D.E. Boyce, B. Ran, and L.J. Leblanc. Solving an instantaneous dynamic user-optimal route choice model. Transportation Science, 29(2):128–142, 1995.
  • [22] T.L. Friesz, D. Bernstein, T.E. Smith, R.L. Tobin, and B.-W. Wie. A variational inequality formulation of the dynamic network user equilibrium problem. Operations Research, 41(1):179–191, 1993.
  • [23] Alexandre Bayen, Alexander Keimer, Emily Porter, and Michele Spinola. Time-continuous instantaneous and past memory routing on traffic networks: A mathematical analysis on the basis of the link-delay model. SIAM Journal on Applied Dynamical Systems, 18(4):2143–2180, 2019.
  • [24] S. Peeta and T.-H. Yang. Stability issues for dynamic traffic assignment. Automatica, 39(1):21–34, 2003.
  • [25] R. Ma, X.J. Ban, and J.-S. Pang. Continuous-time dynamic system optimum for single-destination traffic networks with queue spillbacks. Transportation Research Part B: Methodological, 68:98–122, 2014.
  • [26] X.J. Ban, J.-S. Pang, H.X. Liu, and R. Ma. Modeling and solving continuous-time instantaneous dynamic user equilibria: A differential complementarity systems approach. Transportation Research Part B: Methodological, 46(3):389–408, 2012.
  • [27] K. Han, T.L. Friesz, and T. Yao. A partial differential equation formulation of Vickrey’s bottleneck model, part I: Methodology and theoretical analysis. Transportation Research Part B: Methodological, 49:55–74, 2013.
  • [28] K. Han, T.L. Friesz, and T. Yao. A partial differential equation formulation of Vickrey’s bottleneck model, part II: Numerical analysis and computation. Transportation Research Part B: Methodological, 49:75–93, 2013.
  • [29] K. Han, T.L. Friesz, and T. Yao. Existence of simultaneous route and departure choice dynamic user equilibrium. Transportation Research Part B: Methodological, 53:17–30, 2013.
  • [30] A. Bressan and K.T. Nguyen. Optima and equilibria for traffic flow on networks with backward propagating queues. NHM, 10(4):717–748, 2015.
  • [31] A. Bressan and K.T. Nguyen. Conservation law models for traffic flow on a network of roads. NHM, 10(2):255–293, 2015.
  • [32] M. Garavello, K. Han, and B. Piccoli. Models for vehicular traffic on networks, volume 9. American Institute of Mathematical Sciences (AIMS), Springfield, MO, 2016.
  • [33] M. Garavello and B. Piccoli. Traffic flow on networks, volume 1. American Institute of Mathematical Sciences, Springfield, 2006.
  • [34] H. Holden and N. Risebro. A mathematical model of traffic flow on a network of unidirectional roads. SIAM Journal on Mathematical Analysis, 26(4):999–1017, 1995.
  • [35] G. Bretti, R. Natalini, and B. Piccoli. Numerical approximations of a traffic flow model on networks. NHM, 1(1):57–84, 2006.
  • [36] MJ Lighthill and GB Whitham. On kinematic waves. i. flood movement in long rivers. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences, 229(1178):281–316, 1955.
  • [37] P. I. Richards. Shock waves on the highway. Operations Research, 4(1):42–51, 1956.
  • [38] A. Keimer, N. Laurent-Brouty, F. Farokhi, H. Signargout, V. Cvetkovic, A.M. Bayen, and K.H. Johansson. Information patterns in the modeling and design of mobility management services. Proceedings of the IEEE, 106(4):554–576, 2018.
  • [39] M. Gugat, A. Keimer, G. Leugering, and Z. Wang. Analysis of a system of nonlocal conservation laws for multi-commodity flow on networks. Networks and Heterogeneous Media, 10(4):749–785, 2015.
  • [40] A. Keimer, L. Pflug, and M. Spinola. Nonlocal scalar conservation laws on bounded domains and applications in traffic flow. accepted in SIAM SIMA, 2018.
  • [41] A. Aw and M. Rascle. Resurrection of "second order" models of traffic flow. SIAM Journal on Applied Mathematics, 60(3):916–938, 2000.
  • [42] H.M. Zhang. A non-equilibrium traffic model devoid of gas-like behavior. Transportation Research Part B: Methodological, 36(3):275–290, 2002.
  • [43] M.O. Ghali and M.J. Smith. A model for the dynamic system optimum traffic assignment problem. Transportation Research Part B: Methodological, 29(3):155–170, 1995.
  • [44] S. Peeta and H.S. Mahmassani. System optimal and user equilibrium time-dependent traffic assignment in congested networks. Annals of Operations Research, 60(1):81–113, 1995.
  • [45] M.J. Smith. A new dynamic traffic model and the existence and calculation of dynamic user equilibria on congested capacity-constrained road networks. Transportation Research Part B: Methodological, 27(1):49–63, 1993.
  • [46] S.T. Waller and A.K. Ziliaskopoulos. A chance-constrained based stochastic dynamic traffic assignment model: Analysis, formulation and solution algorithms. Transportation Research Part C: Emerging Technologies, 14(6):418–427, 2006.
  • [47] H.-K. Chen and C.-F. Hsueh. A model and an algorithm for the dynamic user-optimal route choice problem. Transportation Research Part B: Methodological, 32(3):219–234, 1998.
  • [48] J. Long, W.Y. Szeto, H.-J. Huang, and Z. Gao. An intersection-movement-based stochastic dynamic user optimal route choice model for assessing network performance. Transportation Research Part B: Methodological, 74:182–217, 2015.
  • [49] B.N. Janson. Dynamic traffic assignment for urban road networks. Transportation Research Part B: Methodological, 25(2-3):143–161, 1991.
  • [50] B.N. Janson. Convergent algorithm for dynamic traffic assignment. Transportation Research Record, (1328), 1991.
  • [51] J.K. Ho. A successive linear optimization approach to the dynamic traffic assignment problem. Transportation Science, 14(4):295–305, 1980.
  • [52] Ricardo Correa, Ines de Castro Dutra, Mario Fiallos, and Luiz Fernando Gomes da Silva. Models for Parallel and Distributed Computation: Theory, Algorithmic Techniques and Applications, volume 67. Springer Science & Business Media, 2013.
  • [53] Transportation Simulation Systems. Aimsun. https://www.aimsun.com.
  • [54] Institute of Transportation Systems SUMO, German Aerospace Center (DLR). http://www.sumo.dlr.de, 2018.
  • [55] Sunil Thulasidasan, Shiva Kasiviswanathan, Stephan Eidenbenz, Emanuele Galli, Susan Mniszewski, and Phillip Romero. Designing systems for large-scale, discrete-event simulations: Experiences with the fasttrans parallel microsimulator. In 2009 International Conference on High Performance Computing (HiPC), pages 428–437. IEEE, 2009.
  • [56] Colin Sheppard et al. Modeling plug-in electric vehicle charging demand with BEAM, the framework for behavior energy autonomy mobility. Technical report, May 2017.
  • [57] Polaris.
  • [58] Wasuwat Petprakob, Lalith Wijerathne, Takamasa Iryo, Junji Urata, Kazuki Fukuda, and Muneo Hori. On the implementation of high performance computing extension for day-to-day traffic assignment. Transportation Research Procedia, 34:267–274, 2018.
  • [59] Willem Himpe, Romain Ginestou, and MJ Chris Tampère. High performance computing applied to dynamic traffic assignment. Procedia Computer Science, 151:409–416, 2019.
  • [60] M. Patriksson. The traffic assignment problem: models and methods. Courier Dover Publications, 2015.
  • [61] HERE Technologies. https://www.here.com/, 2019. [Online; accessed 06-Feb-2019].
  • [62] SFCTA. SF-CHAMP 6.1: ConnectSF Needs Assessment 2015 Base Year Model Run. Technical report, San Francisco County Transportation Authority, February 2019.
  • [63] Cy Chan, Bin Wang, John Bachan, and Jane Macfarlane. Mobiliti: scalable transportation simulation using high-performance parallel computing. In 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pages 634–641. IEEE, 2018.
  • [64] Julian Dibbelt, Ben Strasser, and Dorothea Wagner. Customizable contraction hierarchies. In International Symposium on Experimental Algorithms, pages 271–282. Springer, 2014.
  • [65] Hayssam Sbayti, Chung-Cheng Lu, and Hani S Mahmassani. Efficient implementation of method of successive averages in simulation-based dynamic traffic assignment models for large-scale network applications. Transportation Research Record, 2029(1):22–30, 2007.
  • [66] John L. Gustafson. Amdahl’s Law, pages 53–60. Springer US, Boston, MA, 2011.
  • [67] European Commission, Joint Research Centre (JRC); Columbia University, Center for International Earth Science Information Network - CIESIN. Ghs population grid, derived from gpw4, multitemporal (2015). http://data.europa.eu/89h/jrc-ghsl-ghs_pop_gpw4_globe_r2015a, 2015. European Commission, Joint Research Centre (JRC) [Dataset].
  • [68] Caltrans, state of california. https://dot.ca.gov/programs/traffic-operations/census/traffic-volumes, 2019. [Online; accessed 03-January-2021].
  • [69] Uber movement. https://movement.uber.com/cities/san_francisco/downloads/speeds, 2019. [Online; accessed 03-January-2021].
  • [70] Environmental impact report plan bay area. https://www.planbayarea.org/2040-plan/environmental-impact-report, 2017. [Online; accessed 05-November-2020].

6 Application to San Francisco Bay Area Network

6.1 Analysis of Traffic Assignment Results

Two models are considered for the San Francisco Bay Area. The baseline model is a static traffic assignment (STA) for the 7-10 AM morning peak. The QDTA model is a user equilibrium assignment that minimizes travel time using the Frank-Wolfe solver, run for the entire day in 15-minute time intervals. The travel demand was obtained from the SFCTA CHAMP6 model [62] for 24 hours of a typical day. Each trip was identified with an origin and destination micro-analysis zone, which was then assigned to specific network nodes by weighting with the population density obtained from the Global Human Settlement dataset [67]. Fig. 9 shows the demand profiling for both cases under consideration. While STA assumes a constant demand for the entire morning peak period, QDTA considers a variable demand for the same duration. A professional map from HERE Technologies [61] is a core part of the foundation for the Mobiliti platform. The map information is transformed into a different representation in order to integrate it into the algorithms discussed earlier. However, we maintain the definitions of functional class roads as defined by HERE Technologies [61]. Specifically, functional classes classify roads according to the speed, importance and connectivity of the road. A road can be one of five functional classes, which are defined in Table 3. The analysis presented here will use these functional classes to help explore the results that compare QDTA to STA.
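The population-weighted assignment of zone-level trip ends to network nodes mentioned above could look roughly like the following Python sketch. It assumes that each micro-analysis zone has a list of candidate nodes and that a population value is available per node; the `assign_node` function, the node ids, and the sampling scheme are hypothetical and only illustrate the idea of weighting by population density.

```python
import random
from collections import Counter


def assign_node(zone_nodes, population, rng):
    """Pick one node of a zone with probability proportional to population."""
    weights = [population[n] for n in zone_nodes]
    return rng.choices(zone_nodes, weights=weights, k=1)[0]


# Hypothetical zone with three candidate nodes and their nearby population.
zone_nodes = ["n1", "n2", "n3"]
population = {"n1": 0.0, "n2": 10.0, "n3": 30.0}

rng = random.Random(0)  # fixed seed so the example is repeatable
counts = Counter(assign_node(zone_nodes, population, rng) for _ in range(4000))
```

With these weights, node `n3` receives roughly three quarters of the trip ends and node `n1`, which has no population, receives none.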

Figure 9: Temporal demand interaction and model capabilities

Functional Class   Definition
1                  Allowing for high volume, maximum speed traffic movement
2                  Allowing for high volume, high speed traffic movement
3                  Providing a high volume of traffic movement
4                  Providing for a high volume of traffic movement at moderate speeds between neighbourhoods
5                  Roads whose volume and traffic movement are below the level of any other functional class
Table 3: Functional Road Classes

Table 4 provides a comparison of the system-level metrics using vehicle miles travelled (VMT), average volume over capacity (VOC) ratios, and vehicle hours of delay (VHD), categorised by functional class. Only links with positive flow are included in the analysis. Additionally, a comparison was made between congested and non-congested links (Table 5). For the QDTA analysis, congestion is calculated for the 7:45-8:00 AM time period, when the demand is the highest.

Category   VMT (in millions)   Average VOC     VHD (in thousands)
           STA      QDTA       STA     QDTA    STA      QDTA
FC2        16.44    18.09      0.52    0.69    17.57    96.84
FC3        4.67     5.00       0.23    0.30    2.65     14.43
FC4        4.96     5.24       0.14    0.17    1.13     6.42
FC5        2.43     2.60       0.03    0.48    0.36     17.78
Total      28.51    30.95      -       -       21.73    135.49
Table 4: System Level Metrics

Category   Length (km)      VMT (in millions)   Average VOC
           STA     QDTA     STA      QDTA       STA     QDTA
FC2        129     546      1.77     7.56       1.15    1.26
FC3        37      86       0.18     0.42       1.18    1.27
FC4        14      55       0.04     0.15       1.19    1.20
FC5        5       23       0.01     0.06       1.17    1.23
Total      185     710      2.0      8.19       -       -
Table 5: Congested Network Metrics
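The metrics reported in Tables 4 and 5 can be computed from per-link assignment results along the lines of the following Python sketch. The definitions used here (VMT as flow times link length, VHD as flow times the difference between congested and free-flow travel time, and congestion as a VOC ratio of at least 1) are the conventional ones and are assumed rather than taken from the implementation; the function and array names are hypothetical.

```python
import numpy as np


def system_metrics(flow, length_mi, capacity, t_free_h, t_cong_h):
    """Aggregate link-level results into VMT, average VOC, VHD and congested length.

    flow and capacity must use the same units (vehicles per assignment period);
    travel times are in hours and lengths in miles.
    """
    used = flow > 0                       # only links with positive flow
    f = flow[used]
    voc = f / capacity[used]              # volume over capacity ratio per link
    congested = voc >= 1.0                # congested if VOC >= 1
    return {
        "vmt": float(np.sum(f * length_mi[used])),
        "avg_voc": float(voc.mean()),
        "vhd": float(np.sum(f * (t_cong_h[used] - t_free_h[used]))),
        "congested_length": float(length_mi[used][congested].sum()),
    }


# Three hypothetical links; the second carries no flow and is excluded.
metrics = system_metrics(
    flow=np.array([100.0, 0.0, 50.0]),
    length_mi=np.array([2.0, 1.0, 4.0]),
    capacity=np.array([80.0, 100.0, 100.0]),
    t_free_h=np.array([0.05, 0.02, 0.10]),
    t_cong_h=np.array([0.08, 0.02, 0.11]),
)
```

Grouping the same computation by functional class, for example with a boolean mask per class, would give the per-category rows of the tables.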

The main distinction between the two models is how well each replicates congestion on links. As seen in Table 5, for the links that are congested in STA, QDTA predicts even higher congestion. Because link performance functions are convex, a non-uniform demand distribution results in higher travel times on links than a uniform distribution of the same total demand, so QDTA with varying demand is expected to produce more congestion than STA with uniform demand. QDTA also resolves congestion dynamically, unlike STA, for all functional class categories. Fig. 10 shows the VOC over time by functional class (left) and the VOC comparison for the two models by functional class (right). For all classes of roads, QDTA predicts congestion dynamically over time, whereas STA either overestimates or underestimates the values irrespective of the demand dynamics. This difference is most pronounced for FC2, where STA significantly underestimates the peak-hour congestion.
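The convexity argument above can be checked numerically with a small Python example. It assumes a BPR-style link performance function with alpha = 0.15 and beta = 4, a single link with a capacity of 100 vehicles per interval, and two made-up demand profiles with the same total; none of these values come from the Bay Area model.

```python
def bpr_time(v, t0=1.0, cap=100.0, alpha=0.15, beta=4.0):
    """Travel time on one link under an assumed BPR performance function."""
    return t0 * (1.0 + alpha * (v / cap) ** beta)


def mean_time(profile):
    """Average link travel time over the intervals of a demand profile."""
    return sum(bpr_time(v) for v in profile) / len(profile)


uniform = [100.0, 100.0, 100.0, 100.0]  # STA-like: constant demand
peaked = [60.0, 140.0, 140.0, 60.0]     # QDTA-like: same total, time-varying

uniform_time = mean_time(uniform)
peaked_time = mean_time(peaked)
```

Under these assumptions the uniform profile gives an average travel time of 1.15 times free flow, while the peaked profile gives about 1.30, which is the behaviour expected from Jensen's inequality for a convex function.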

Figure 10: Congestion profiling for QDTA and STA. The figure on the left shows the average VOC ratio for all links over time, categorised by functional class. The figure on the right is a scatter plot comparing the VOC ratio of every link in the network for the two models at 7:45-8:00 AM.

It can also be seen from Table 4 that the total system delay is significantly higher in QDTA than in STA. This is due to the dynamic demand distribution and the modelling capability in QDTA that allows interaction of demand between time intervals, thus allowing for a more realistic modelling of traffic. In QDTA, the residual demand from each time interval is carried over to the next interval (Fig. 11). Note that, due to the route truncation mechanism, not all portions of a trip are completed in the time period in which the trip starts, so portions of a trip are loaded onto the network in multiple time periods. In QDTA, each trip leg only contributes flow to the links it traverses within the time segment, whereas in STA every leg contributes flow to its entire route. For the 7:45-8:00 AM time interval, STA has 185 km of congested network, compared to 710 km for QDTA (Table 5). Fig. 12 shows the Bay Area network with congestion locations for the two modelling cases.

Figure 11: QDTA demand modelling with residuals (left). The length of network congested over time (right). A link is considered congested if the VOC ratio is greater than or equal to 1.
Fi