1 Introduction
Transportation planners often use static traffic assignment (STA) solutions to estimate traffic states over the course of one day in their cities [1]. Static traffic assignment does not realistically capture the time-varying behavior that results from network dynamics; it simply assigns an origin/destination (O/D) routing solution that minimizes auto travel time for all mobile entities, so that no driver can unilaterally reduce his or her travel cost by shifting to another route. This is known as static user equilibrium [2]. To accommodate the temporal changes in the demand profile over an entire day, the problem is typically partitioned into time slots of interest, and static traffic assignment solutions are used to estimate average speeds and average flows on the network for each time slot. Example time slots are early morning, morning rush hour, midday, evening rush hour, and late evening, each accounting for two to four hours. Because of the scale of the network and the complexity of the algorithms, these models often take many days to run, depending on the hardware and software solutions available to city planners. Consequently, planners often adjust the model to make the computation tractable within their compute capabilities and time-to-solution requirements. For example, they may remove lower functional class roads from the network and aggregate travel demand to higher functional class roads. The results are then compromised by these adjustments [3]. Several limitations of STA have also been identified and discussed in many studies [1, 3, 4]. The underlying assumptions of STA, related to stationary demand, long aggregation intervals, and static network loading, can lead to unrealistic results when variations in traffic conditions are high [1]. The alternative to static traffic assignment is dynamic traffic assignment (DTA). Dynamic traffic assignment has been a topic of research for almost 50 years, beginning with the well-known Merchant-Nemhauser (M-N) model [5, 6].
The focus of this work is to use static traffic assignment algorithms implemented on high-performance computing (HPC) systems to generate results that more effectively estimate a dynamic traffic assignment. We do this by refining the traditional approach in both time and trip O/D locations and by introducing a quasi-dynamic optimization, which we call Quasi-Dynamic Traffic Assignment (QDTA). Quasi-dynamic traffic assignment here refers to assigning traffic demand in time segments by discretizing the traffic demand, which changes continuously over time. We further capture the dynamic impact of trips that span multiple time segments by introducing two mechanisms: path truncation and residual demand. This approach avoids much of the complexity of dynamic traffic assignment and addresses the concern that static traffic assignment cannot provide practical value for traffic flow analysis because it assigns demand for a single time period. In general, while continuous-time dynamic traffic assignment (DTA) models satisfy the requirements of traffic flow theory, the uniqueness of their network equilibrium flows is not necessarily guaranteed [7]. Quasi-dynamic models, on the other hand, are closer to static traffic assignment and share, to some extent, its formulation and network equilibrium properties. They are also referred to as discrete-time DTA or semi-dynamic traffic assignment in some of the literature [8]. QDTA provides a better understanding of traffic dynamics than STA because it can propagate traffic flows between time periods using the residual demand.
We use high-performance computing to accelerate the quasi-dynamic traffic assignment algorithm and generate high-fidelity traffic assignments with significantly improved computational efficiency. Optimization on distributed-memory platforms is achieved by utilizing efficient routing algorithms and parallelizing multiple components of the computational workload across threads, including: 1) trip demand routing, 2) cost function evaluation, 3) network flow and weight updates, and 4) the residual demand calculations. The improvements gained using HPC allow us to shorten the time segment duration to 15-minute intervals and to address more complex road networks and larger travel demand profiles. Our computational approach scales to one thousand compute cores, allowing us to run assignments with 19 million vehicle trips on a network with 1 million links in under 6 minutes. This capability enables urban-scale traffic assignments in which a variety of scenarios can be addressed with less concern for computational complexity. Examples include comparisons among counties or cities, integration with parallel discrete event simulation scenarios that implement and explore infrastructure implications of QDTA, and other objective functions for the optimization. For brevity, this paper only addresses the introduction of QDTA in a user equilibrium travel time context. The baseline model used for comparison is the static traffic assignment. Other scenarios will be described in subsequent papers.
First, we discuss the general form of the proposed quasi-dynamic traffic assignment in its pure mathematical form and introduce a proper instantiation of this class of problems, based on the well-known static traffic assignment. Second, we provide the QDTA model's computational flow and algorithmic representations, which estimate the dynamics using discretized time steps. Third, we present the parallelization of the QDTA model and demonstrate the significant computational gains associated with this approach, which also provides markedly better fidelity than traditional approaches. Finally, we discuss the scenario comparison between the STA and QDTA models for the entire urban region of the San Francisco Bay Area.
2 Quasi-dynamic traffic assignment and high-performance computing in the literature
In the literature, the dynamics of QDTA come in two components: demand/routing and dynamic/quasi-dynamic network loading. QDTA differs from STA by the use of much smaller assignment intervals. The implication of this temporal dimension is the presence of residual demand, which is handled differently across previous work. The authors of [9, 10] use a point-queue model to include "capacity constraints", preventing flow from exiting a link if it exceeds the capacity. This gives more realistic travel times, but the authors did not demonstrate its application in multi-time-step situations. In other cases, the residual demand is calculated through "path truncation". Some models do not appear to consider the residual demand explicitly and instead preprocess the demand. Operational models (closer to the DTA side) predict the residual demand more precisely but are computationally expensive. Traffic assignment in general may be link-based or path-based [10], though path-based approaches must address the combinatorial growth of the path set.
| Reference | Routing | Network loading | Residual demand | Advantages | Limitations | Case studies |
|---|---|---|---|---|---|---|
| [9] | Route-based SUE formulated as a VI, solved using MSA | First-order node model (i.e., reduction factor based on link demand and supply) formulated as a fixed-point problem, solved iteratively | … | Models node capacity at diverges and merges; correct representation of congestion upstream of a bottleneck | Does not have temporal dynamics ([9] distinguishes quasi-dynamic assignment from semi-static assignment) | Gold Coast, Australia: 9,565 links, 2,987 nodes, 0.3 hour on a personal computer. Sydney, Australia: 75,379 links, 30,573 nodes, 1.1 hour on a personal computer |
| [10] | System optimal, based on a set of reasonable and independent paths (path-based), assigned using the Method of Successive Averages (MSA) | Same as above | … | … | … | Sioux Falls |
| [11] | SUE, from a shortest path set, no rerouting | Move by link cost function (e.g., BPR function) | … | … | No rerouting, limited path set | Rome, Italy: 15,000 links, 6,000 nodes; simulates several hours of traffic in a few minutes on a personal computer |
| [12] | | | | | | |
| [13] | Dijkstra shortest path | Link cost constructed from the average of GPS speed measurements, weighted by traffic volume reconstructed from Licence Plate Recognition (LPR) data | No residual demand; the OD data is likely pre-partitioned | Data-driven link cost and demand | Not applicable where high-resolution data (e.g., GPS or toll demand) is limited (e.g., outside of the highways) | Expressway network in Hunan, China: 530 edges and 490 vertices, total length 6,725.5 km. Computational performance unclear. |
2.1 Dynamic traffic assignment and routing
In this section, we give a short survey of existing modeling and routing approaches. Due to the vast number of publications in this area of research, the list presented here cannot be complete. However, we aim to describe the approaches in each class specifically and at a technical level. We begin this overview with the well-known books in this field: [14, 15, 16].
Time-continuous modeling
Generally, in a time-continuous setting, the link dynamics can be described via ordinary differential equations (ODEs) or partial differential equations (PDEs). The articles [5, 6, 17, 18, 19] consider dynamic traffic assignment as an optimal control problem over a given time horizon. The link dynamics are represented by a system of ODEs with inflow and outflow functions. An objective function is specified and optimality conditions are deduced. However, there is no natural delay, and the optimization is considered over the full time horizon, so one must know the inflow over the full time horizon to determine the routing, i.e., the traffic assignment. Moreover, ODEs are generally quite far from modelling traffic in a physically accurate way (congestion patterns, the aforementioned delay, and more).
In [20, 21, 22], more general classes of ODE models (sometimes also with delay) are considered, and time-dependent variational inequalities are determined to describe the routing at the junctions. Depending on whether the variational inequality is considered over the full time horizon or at every time (with the ordinary scalar product and the variational inequality required to hold at every time), the computation of a solution may again require full information about the input data over the entire time horizon considered. In [16], a more detailed mathematical analysis concerning the existence of solutions is provided. The work in [23] considers the well-posedness of an ODE model with delay and routing operators at every intersection, which can depend on the entire network state up to real time. In [24], the routing is realized by a specific routing function that takes into account the status of the network; that work also proves some stability estimates with regard to the routing considered. Another modeling approach, also considered in [25, 26], uses the Vickrey or point-queue model (or a modified version) [27, 28, 29], again aiming for a routing (and departure time choice) based on variational inequalities. These modeling approaches can be generalized to dynamics prescribed by partial differential equations (PDEs).
We refer to [30, 31, 32, 33, 34, 35] for an overview. Usually, the underlying link dynamics are modelled by the LWR (Lighthill, Whitham, Richards) PDE [36, 37], a hyperbolic conservation law that allows for spillback and congestion. Another approach uses nonlocal PDE models to simulate traffic flow at junctions [38, 39, 40], and higher-order models are also available; see, for instance, [41, 42]. However, as mentioned before, a reasonable routing at the intersections needs to be prescribed, and the underlying models need to be solved for a given space-time discretization on the entire network, which is computationally expensive. Furthermore, the (time-dependent) routing itself, which might depend on the entire network state at a given time (or even in the future), makes the problem even more computationally challenging, not to mention the problem of calibrating the entire system reasonably. For these reasons, we consider in this work a class of simpler models that still have a notion of time and delay, but not at the level of detail that the previously mentioned models provide. Roughly, the model parameters required for a static traffic assignment are sufficient for the proposed QDTA. Before we investigate the QDTA in more detail, we also mention other time-discretized models.
Time-discretized modeling
There is a significant number of articles that deal with models whose dynamics are already discretized in time. For a more exhaustive overview, see [43, 44, 45, 46, 47, 48]. The articles [49, 50, 51] consider a discretized traffic flow model based on ODEs and a time optimal control problem to obtain the routing. The work in [48] studies another discretized ODE model, but in this case the routing is implemented by a probabilistic approach that depends on the status of the network. Most of the existing literature considers the assignment as an optimization problem across the entire period of interest. In a discretized formulation, the choice of the time step interval is important. Generally, it should be much longer than the link travel time [50], but not so long that each time slice approaches a static assignment. In addition, the size of the time step sometimes also depends on the availability of the temporal demand data, which may be available only at a coarse granularity [12]. The time-dependent demand is usually considered fixed and known, though [46] proposed algorithms for cases with uncertainty in the demand (random with certain probability distributions). In terms of solution algorithms, [51] formulates the System Optimum (SO) problem as a system of linear equations after substituting the nonlinear functions (e.g., the link exit function) with linear segments. The approach in [44] incorporates the DYNASMART simulator to generate performance measures under a given route assignment in the iterative search step, and uses the simulator outputs to inform the search directions.
2.2 High-performance computing for transportation modeling
Parallel computation can be divided broadly into two categories: shared memory (e.g., a workstation or a single server node) and distributed memory (e.g., clusters, cloud platforms, and HPC systems) [52]. The key distinction is that all cores in a shared memory system have direct access to data in the same address space. That is, data written by one compute core is immediately visible to all other cores sharing that memory. In contrast, distributed memory systems have separate address spaces that require explicit message passing to share data between cores connected to distinct memory spaces. Programming distributed memory systems is more difficult because of the separate memory, as it requires mechanisms to handle data and load distribution, synchronization, and data movement between processes. However, the benefits of distributed memory parallelism are twofold: access to more compute cores and a greater total memory capacity, which together enable solving larger problems more efficiently. High-performance computing systems differ from typical clusters and cloud-based systems in that they have high-performance network interconnects to accelerate message passing and synchronization between the compute nodes in the system. This network hardware is critical to the performance of latency-sensitive applications in which cores communicate and/or synchronize frequently.
The use of high-performance computing (HPC) in transportation modeling and simulation tools has not yet achieved widespread adoption. Most simulation tools in this domain support either only sequential execution or shared memory parallelism, which limits parallelism to the number of cores on a single compute node. Examples that use this type of parallelism include the Aimsun [53] and SUMO [54] simulators. Some traffic simulation software projects, such as FastTrans [55], BEAM [56], and POLARIS [57], have also enabled the use of distributed memory systems, such as HPC and cloud computing platforms. Within the domain of traffic assignment, there have been some previous efforts using parallel computation on smaller networks than those explored in this paper [58, 59]. The authors of [58] analyze the Nagoya network, consisting of 152K links and 38K nodes, using embarrassingly parallel path finding and processing only the active subnetworks to achieve up to 10x speedup using 25 processors. The authors of [59] use multithreading and MPI to parallelize a link transmission model (LTM) based dynamic traffic assignment for a network with 11K nodes and 23K links, achieving up to 3x speedup on a few nodes. As we demonstrate in this paper, the combination of an algorithm that lends itself to efficient parallelization with the use of distributed computing presents a large opportunity to increase the performance of traffic assignment algorithms for large-scale systems with millions of links and vehicles.
3 A tractable dynamic assignment problem: QDTA formulation
The QDTA model proposed in this article divides the analysis period into small time steps and uses a sequence of STA steps to obtain an optimized route assignment for each time step. The main distinction between the proposed model and the conventional STA framework is the inclusion of route truncation and residual demand, which address the fact that some trip legs cannot be finished in a single analysis time step and need to be split across multiple steps. As the analysis time step gets shorter to capture short-term traffic dynamics (e.g., 15 minutes), the fraction of trips spanning multiple time steps becomes particularly prominent. Section 3.2 describes route truncation and residual demand in detail. The integration of the residual demand into the well-understood framework of the STA is given in Section 3.3. Lastly, Section 3.4 discusses key questions such as the choice of time step length, the connections with simulation modules, and the potential usage in traffic optimization.
3.1 Problem specification
The road network is represented by a directed connected graph $G(\mathcal{V}, \mathcal{A})$, where $\mathcal{V}$ is the set of graph nodes/vertices and $\mathcal{A}$ is the set of graph links/edges. A road intersection is represented by a node $v \in \mathcal{V}$ in the graph, and the stretch of road between two intersections is represented by a graph link $a \in \mathcal{A}$. Each link has several static properties, such as the traffic flow capacity $\kappa_a$ and the traversal time under free-flow conditions (free-flow travel time) $c_a^0$. Other static node and link properties, such as the coordinates of the nodes and the geometries of the links, are often stored (e.g., for visualization and post-processing analysis), but they are not described here since they are not integral to the formulation of the QDTA framework.
On the travel demand side, the time-dependent travel demand on this road network is represented by time-stamped origin-destination (OD) flows, $D_{p,q,t}$. Here $p$ and $q$ denote the origin and destination, with $p \in \mathcal{V}$ and $q \in \mathcal{V}$, and $t \in \{1, \dots, N\}$ is the analysis time step. The integer $N$ is the total number of time steps considered, and each time step is assumed to last for a duration of $T$. For example, for a traffic assignment period of 24 hours with a time interval of 15 minutes, $N$ equals 96 and $T$ is 15 minutes.
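The discretization of departure times into analysis time steps can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `time_step_index`, the use of seconds as the time unit, and the 1-based step numbering are assumptions made for the example and are not taken from the implementation described in this paper.

```python
import math


def time_step_index(departure_s: float, start_s: float, step_s: float) -> int:
    """Return the 1-based analysis time step that contains a departure time.

    All arguments are in seconds; `start_s` is the start of the analysis period.
    """
    if departure_s < start_s:
        raise ValueError("departure precedes the start of the analysis period")
    return int((departure_s - start_s) // step_s) + 1


# A 24-hour analysis period with 15-minute steps gives 96 time steps.
step_s = 15 * 60
n_steps = math.ceil(24 * 3600 / step_s)

# A trip departing at 7:00 AM falls into the 29th time step of the day.
seven_am_step = time_step_index(7 * 3600, 0, step_s)
```

Grouping the trips by this index yields one demand set per time step, which is the form of input assumed in the rest of this section.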
For trips in $D_{p,q,t}$, let $R_{p,q,t}$ represent the set of acyclic paths in the graph $G$ that connect $p$ and $q$ at time interval $t$. The path set is time dependent, as certain links may be out of service due to dynamic events such as earthquakes. The traffic flow on each path $r \in R_{p,q,t}$ is denoted by $h_{r,t}$, or by $\mathbf{h}_t$ for all path flows at a particular time step. Parallelization is accomplished by partitioning the travel demand (trips) across compute threads, where $D_t^k$ represents thread $k$'s partition of the corresponding demand. Furthermore, $\mathbf{h}_t^k$ represents the flows on the paths that correspond to thread $k$'s partition, and $\mathbf{f}_t^k$ represents the flows on all links that result from that partition.
The QDTA framework proposes an efficient and parallelizable method to generate an optimized solution of $\mathbf{h}_t$ according to dynamic extensions of widely used traffic assignment principles, such as Wardrop's equilibrium, as shown in Sections 3.2 and 3.3. The solution can be used to offer centralized, real-time routing guidance to improve the traffic flow of a large network. Specifically, Wardrop's equilibrium has two variations, namely the user equilibrium (UE) solution and the system optimal (SO) solution (see the literature review in Section 2). In this paper, the dynamic extension of the UE solution is adopted as an example in the mathematical formulation of the QDTA framework, with the detailed formulation given in Equation (4) in Section 3.3.
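| Symbol | Meaning |
|---|---|
| **Network properties** | |
| $G$ | Road network graph. |
| $\mathcal{V}$, $v$ | Set of road network vertices, one vertex in the road network. |
| $\mathcal{A}$, $a$ | Set of road network edges, one edge in the road network. |
| $\kappa_a$ | Flow capacity of edge $a$. |
| $\mathbf{c}^0$, $c_a^0$ | Free-flow travel time/cost of all edges, or a specific edge $a$. |
| $\mathbf{c}_t$, $c_{a,t}$ | Time/cost of traversing all edges, or a specific edge $a$, at time $t$. |
| **Travel demand** | |
| $p$ | Trip origin. |
| $q$ | Trip destination. |
| $s$ | Intermediate stop location. |
| $D_{p,q,t}$ | Travel demand (trips) starting at node $p$, ending at node $q$, departing at time $t$. |
| $D_t$ | All trips traveling at time $t$. |
| $D_t^{\mathrm{orig}}$, $D_t^{\mathrm{res}}$ | All original, residual trips traveling at time $t$. |
| $D_t^k$, $D_t^{\mathrm{orig},k}$, $D_t^{\mathrm{res},k}$ | Thread $k$'s partition of the corresponding trips at time $t$. |
| **Time-step related variables** | |
| $\{1, \dots, N\}$, $t$ | Time steps, the $t$-th time step. |
| $N$ | Total number of time steps. |
| $T$ | Duration of each time step, e.g., 15 minutes. |
| $R_{p,q,t}$, $r$ | All acyclic paths in the road network that connect $p$ and $q$ at time $t$, a single path in the set. |
| $n_r$, $v_r^j$ | Total number of vertices along path $r$, the $j$-th vertex along path $r$. |
| $\mathbf{h}_t$, $h_{r,t}$ | Traffic flow assigned to all paths, or path $r$, at time step $t$. |
| $\mathbf{f}_t$, $f_{a,t}$ | Traffic flow assigned to all links, or link $a$, at time step $t$. |
| $\mathbf{h}_t^k$, $\mathbf{f}_t^k$ | Traffic flow assigned to paths corresponding to thread $k$'s partition of trips, and their resulting partial link flows. |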
3.2 Residual demand and route truncation
In QDTA, travel demand is supplied in stages (time steps) according to the analysis period in which each departure time falls. For example, if the analysis period begins at 7:00 AM with a time step interval of 15 minutes, a separate demand set is used for 7:00 to 7:15 AM, 7:15 to 7:30 AM, and so on. The assignment framework presented in this paper does not constrain the duration of the time step. As a result, different time step lengths can be used, e.g., 5 minutes or 1 hour. However, a time step that is too long would approximate the STA solution and lack the desired temporal variation in the modeling results, while a time step that is too short usually means more computational effort and stricter requirements on the temporal resolution of the demand inputs.
The complication of dividing the travel demand into smaller analysis periods (e.g., 15 minutes) is that some trips cannot finish within one assignment period. This is problematic because established optimization algorithms for obtaining traffic assignment solutions (e.g., the Frank-Wolfe algorithm) require the stop positions of the trips to be specified. As a result, a modification is introduced to the STA analysis framework that aims to mitigate the inconsistency in the optimized routing caused by the uncertainty in the stop location after each analysis period. This modification involves a route truncation operation that is used repeatedly in the algorithm. This section introduces the mathematical formulation of the route truncation operation. Its integration into the QDTA framework is given in Section 3.3.
The route truncation operation estimates the intermediate stop location $s$ that a trip in $D_{p,q,t}$ can reach in time $T$. For long trips with short time step intervals, $s$ is usually different from the trip destination $q$, and the remaining leg of the trip, i.e., from $s$ to $q$, enters the next time step as residual demand. The determination of $s$ relies on knowing the travel time (or other general cost) of each link, $c_{a,t}$, which is often a function of the link flow $f_{a,t}$. Since the flow and the residual demand depend on each other, an iterative procedure is needed to obtain converged results.
The first set of equations maps the link flow $f_{a,t}$ to the link-level travel time $c_{a,t}$. The general form of the function, shown in Equation (1a), satisfies both the monotonicity and the separability assumptions. Monotonicity reflects the fact that the more flow is assigned to a road, the longer it takes on average to traverse the link. The separability assumption states that the travel time on a link depends only on the current flow assignment on that link. The separability assumption holds if the traffic flow is in a steady state, but it is not valid in the case of congestion spillback. A typical choice of travel time function that satisfies both assumptions is the well-known Bureau of Public Roads (BPR) curve [60], shown in Equation (1b). In Equation (1b), $c_{a,t}$ is the travel time on link $a$ at time step $t$; $c_a^0$ and $\kappa_a$ are the free-flow travel time and capacity associated with the link; and $\alpha$ and $\beta$ are calibration parameters. As stated before, $f_{a,t}$ is the link-level traffic flow at time step $t$.
\begin{align}
\text{General form:} \quad & c_{a,t} = c_a(f_{a,t}) \tag{1a} \\
\text{Example instantiation:} \quad & c_{a,t} = c_a^0 \left[ 1 + \alpha \left( \frac{f_{a,t}}{\kappa_a} \right)^{\beta} \right] \tag{1b}
\end{align}
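As a concrete illustration of the BPR curve, the following Python sketch evaluates the link travel time for a given flow. The function name `bpr_travel_time` is hypothetical, and the default calibration values of 0.15 and 4 are the commonly quoted BPR defaults, used here as an assumption; the values calibrated for the networks in this paper are not stated in this section.

```python
def bpr_travel_time(flow: float, free_flow_time: float, capacity: float,
                    alpha: float = 0.15, beta: float = 4.0) -> float:
    """Link travel time from the BPR curve.

    `alpha` and `beta` default to the commonly quoted BPR values (an assumption
    of this sketch, not necessarily the calibration used by the authors).
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + alpha * (flow / capacity) ** beta)


# Travel time grows monotonically with flow and depends only on the link's own
# flow, which matches the monotonicity and separability assumptions.
empty_link = bpr_travel_time(0.0, 60.0, 1000.0)        # free-flow time
at_capacity = bpr_travel_time(1000.0, 60.0, 1000.0)    # 15% above free flow
over_capacity = bpr_travel_time(2000.0, 60.0, 1000.0)  # steep increase
```

With these defaults, a link at capacity is 15% slower than under free flow, while a link loaded to twice its capacity takes 3.4 times the free-flow time.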
Based on the link-level travel time $c_{a,t}$, the stop location $s$ can be determined using Equation (2). Here $r$ is the path of the trip to its destination $q$. If the time step duration $T$ is at least the total time required to traverse $r$, i.e., the first row in Equation (2), the intermediate stop location $s$ is the trip destination $q$. If this condition is not met, the stop location is a node along the path $r$. Let $n_r$ denote the total number of nodes along path $r$, and let $v_r^j$ be the $j$-th node on the path; the second row of Equation (2) then gives the furthest node that can be reached within a duration of $T$.
\begin{equation}
s = \begin{cases}
q, & \text{if } \sum_{a \in r} c_{a,t} \le T, \\
v_r^{j^*} \text{ with } j^* = \max \left\{ j \in \{1, \dots, n_r\} : \sum_{i=1}^{j-1} c_{(v_r^i, v_r^{i+1}),t} \le T \right\}, & \text{otherwise.}
\end{cases} \tag{2}
\end{equation}
Thus, for a trip in $D_{p,q,t}$ with route $r$, only the first part of the trip, up to vertex $s$, contributes to the path/link flow in time interval $t$, while the second part of the trip, from $s$ to $q$, is added to the next time step as carryover demand in $D_{t+1}^{\mathrm{res}}$.
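The truncation rule of Equation (2) can be sketched as a single pass over the links of a path. The sketch below is an illustration under stated assumptions: paths are lists of vertex labels, link travel times are stored in a dictionary keyed by (from, to) vertex pairs, and the function name `truncate_path` is borrowed from the operation of the same name in Section 4 without claiming to match its actual signature.

```python
def truncate_path(path, link_cost, step_duration):
    """Return (sub-path traversed within the time step, stop vertex).

    `path` is a list of vertices, `link_cost` maps (u, v) to the travel time of
    link u -> v, and `step_duration` is the length of the time step in the same
    unit as the link costs.
    """
    elapsed = 0.0
    last = 0  # index of the furthest vertex reached so far
    for j in range(len(path) - 1):
        elapsed += link_cost[(path[j], path[j + 1])]
        if elapsed > step_duration:
            break  # this link cannot be completed within the time step
        last = j + 1
    return path[: last + 1], path[last]


cost = {("A", "B"): 5.0, ("B", "C"): 5.0, ("C", "D"): 5.0}
route = ["A", "B", "C", "D"]

partial, stop = truncate_path(route, cost, 12.0)  # stops at "C" after 10 time units
full, end = truncate_path(route, cost, 15.0)      # reaches the destination "D"

# The remaining leg (stop, destination) becomes residual demand for the next step.
residual_od = (stop, route[-1]) if stop != route[-1] else None
```

Only whole links are counted here, so time spent partway along an unfinished link is discarded at the end of the time step.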
3.3 Full formulation
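In this section, the temporal update steps to obtain the path/link traffic flow assignment are presented. At each time step $t$, the travel demand $D_t$ consists of two parts: the original trips $D_t^{\mathrm{orig}}$, whose starting time is in $t$, and the residual demand trips $D_t^{\mathrm{res}}$, which started before $t$ but did not reach their destinations in previous time steps. This applies to all time steps except the first time step $t = 1$, where there is no residual demand. Let $\mathcal{F}$ be a function that maps the travel demand to path flows:
\begin{equation}
\mathbf{h}_t = \mathcal{F}(D_t), \qquad D_t = D_t^{\mathrm{orig}} \cup D_t^{\mathrm{res}} \tag{3}
\end{equation}
Equation (4) is an instantiation of $\mathcal{F}$ that finds an optimum assignment of path flow $\mathbf{h}_t$ that satisfies the UE condition, with proof in [60].
\begin{align}
\min_{\mathbf{h}_t} \quad & \sum_{a \in \mathcal{A}} \int_0^{f_{a,t}} c_a(\omega) \, d\omega \tag{4a} \\
\text{subject to} \quad & \sum_{r \in R_{p,q,t}} h_{r,t} = D_{p,q,t}, \quad \forall p, q \in \mathcal{V} \tag{4b} \\
& h_{r,t} \ge 0, \quad \forall r \in R_{p,q,t}, \ \forall p, q \in \mathcal{V} \tag{4c} \\
& f_{a,t} = \sum_{p, q \in \mathcal{V}} \sum_{r \in R_{p,q,t}} \delta_{a,r,t} \, h_{r,t}, \quad \forall a \in \mathcal{A} \tag{4d}
\end{align}
Here $\delta_{a,r,t}$ is the incidence matrix, whose value is 1 if the link $a$ is on the path $r$ before the intermediate stop $s$, and 0 otherwise:
\begin{equation}
\delta_{a,r,t} = \begin{cases}
1, & \text{if link } a \text{ is on path } r \text{ before the intermediate stop } s, \\
0, & \text{otherwise.}
\end{cases} \tag{5}
\end{equation}
Equation (4) is essentially a dynamic extension of the static UE formulation in [60], with the static travel demand, link flow, and path flow replaced by the time-dependent $D_t$, $\mathbf{f}_t$, and $\mathbf{h}_t$. Unlike the original static formulation, where the path-link incidence matrix depends only on whether a link is on a path, in the quasi-dynamic formulation a link on a path may not be traversed in the current time step. As a result, the path flow in QDTA is first truncated at the stop location $s$ using Equation (2), and then mapped to the link flow using Equations (4d) and (5). Given that the link travel time $\mathbf{c}_t$, which itself depends on the link flow, is used in the determination of $s$, an iterative process is needed to reach convergence of these quantities.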
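The mapping from truncated path flows to link flows described by Equations (4d) and (5) can be illustrated with a short Python sketch. It assumes that path flows are given as (path, flow) pairs and that link travel times are held fixed for one evaluation; the function name `link_flows` and the data layout are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def link_flows(path_flows, link_cost, step_duration):
    """Accumulate link flows from path flows truncated to one time step.

    A link receives the flow of a path only if it can be completed within
    `step_duration`, which plays the role of the truncated incidence indicator.
    """
    flows = defaultdict(float)
    for path, flow in path_flows:
        elapsed = 0.0
        for u, v in zip(path, path[1:]):
            elapsed += link_cost[(u, v)]
            if elapsed > step_duration:
                break  # indicator is 0 for this link and every later link
            flows[(u, v)] += flow
    return dict(flows)


cost = {("A", "B"): 5.0, ("B", "C"): 5.0, ("C", "D"): 5.0}
flows = link_flows(
    [(["A", "B", "C", "D"], 10.0), (["B", "C"], 4.0)],
    cost,
    step_duration=12.0,
)
```

In this example the link from C to D receives no flow in the current time step, because the first path is truncated at C. Since the link costs themselves depend on the resulting flows, such an evaluation would be repeated inside the iterative procedure rather than performed once.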
3.4 Discussions of the QDTA framework
Some further discussion of the proposed QDTA framework is provided here. First, the choice of the time step length $T$ reflects a trade-off among several factors. If the time step length is too long, the solution will approximate the STA solution and fail to capture the changing temporal traffic dynamics. However, if the time step length is too short, it will not only increase the computational time significantly, but also lead to less accurate results as microscopic, sub-link dynamics start to show. For example, consider an extreme scenario where $T$ is shorter than the free-flow time of a road link, $c_a^0$. Equation (2) then indicates that the intermediate stop node is always the origin node of that link. In other words, the trips can never propagate through the link. This limitation can potentially be overcome by using alternative formulations of Equation (2) that track the distance covered by the traffic flow on a link and accumulate that distance across time steps if the traffic flow cannot be propagated to the next link.
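One possible form of such an alternative truncation is sketched below in Python. It is a hypothetical variant written only to illustrate the idea: instead of distance, it carries over the time already spent on the unfinished link to the next time step. The function name `truncate_with_carryover` and its return convention are assumptions of this sketch and do not describe the implementation used in this paper.

```python
def truncate_with_carryover(path, link_cost, step_duration, carried=0.0):
    """Advance along `path` for one time step, remembering partial progress.

    `carried` is the time already spent on the first link of `path` during
    earlier time steps. Returns (index of the stop vertex in `path`, time
    already spent on the next, unfinished link).
    """
    budget = step_duration + carried
    elapsed = 0.0
    for j, (u, v) in enumerate(zip(path, path[1:])):
        cost = link_cost[(u, v)]
        if elapsed + cost > budget:
            return j, budget - elapsed  # partial progress on link j -> j + 1
        elapsed += cost
    return len(path) - 1, 0.0  # destination reached


cost = {("A", "B"): 20.0, ("B", "C"): 20.0}
route = ["A", "B", "C"]

# Each link takes longer than the 15-unit time step, yet the trip still advances.
i1, carry1 = truncate_with_carryover(route, cost, 15.0)               # still at A
i2, carry2 = truncate_with_carryover(route[i1:], cost, 15.0, carry1)  # reaches B
i3, carry3 = truncate_with_carryover(route[i1 + i2:], cost, 15.0, carry2)  # reaches C
```

In this example the trip completes its 40 time units of travel over three 15-unit steps, whereas a truncation that discards partial progress would leave it at the origin indefinitely.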
A key feature of the proposed QDTA framework is the inclusion of in-flight rerouting. At every time step $t$, the routing is updated to reflect the current traffic conditions and potential changes in the network (e.g., road closures due to earthquake damage). The route assignment is obtained through an iterative process between the path flow, $\mathbf{h}_t$, and the path truncation, and reflects the optimum (e.g., UE) traffic distribution in each time step. In-flight rerouting is not considered in some previous DTA algorithms, but it is particularly relevant to include in traffic modeling and prediction in the era of navigation applications that dynamically update routes for their users.
4 Algorithmic Representations
The mathematical formulation of the QDTA framework in Section 3 is solved using a number of scalable algorithms, whose pseudocode is presented in this section. At the outermost level, the QDTA implementation (Algorithm 1) loops through each time step. Inside each time step, an approximate STA-based flow solution is first obtained using the Frank-Wolfe algorithm (Algorithm 2, Algorithm 3). Note that path truncation is performed in each iterative step of the Frank-Wolfe algorithm (Algorithm 3). This ensures that, for each path, only the approximate portion traversed within the current time segment is considered, which prevents erroneous congestion effects from traffic that occurs outside the current time segment. Based on the converged solution from Algorithm 2, a last round of route truncation is performed to obtain a more accurate flow assignment and the residual demand to be carried over to the next time step (Algorithm 4).
The repeated use of route truncation resolves the issue of traffic that is resident in the network for longer than the given QDTA time segment duration by reassigning the residual traffic to the next time slot. The demand entering the network at the next time interval, together with the residual demand, is then introduced into the following time segment's traffic assignment. This process is repeated until the final time of the simulation is reached. The model can be interpreted as a time-expanded network approach, where the expansion of the network represents the evolution with respect to time.
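The overall flow described above can be condensed into a small, sequential Python sketch. Several simplifications are assumptions of this sketch rather than features of Algorithms 1 to 4: the step size follows the method of successive averages (1/(k+1)) instead of a Frank-Wolfe line search, link costs use the BPR curve with the common defaults 0.15 and 4, the network is assumed to be connected, and the residual demand is derived from shortest paths under the converged costs. All function names are hypothetical.

```python
import heapq
from collections import defaultdict


def shortest_path(adj, cost, src, dst):
    """Dijkstra's algorithm; assumes dst is reachable from src."""
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v in adj.get(u, ()):
            nd = d + cost[(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def truncate(path, cost, T):
    """Sub-path that can be traversed within one time step of length T."""
    elapsed, last = 0.0, 0
    for j, (u, v) in enumerate(zip(path, path[1:])):
        elapsed += cost[(u, v)]
        if elapsed > T:
            break
        last = j + 1
    return path[: last + 1]


def assign_step(adj, t0, cap, demand, T, max_iter=50, tol=1e-3):
    """Approximate UE link flows for one time step (MSA step sizes)."""
    def bpr(f):
        return {a: t0[a] * (1 + 0.15 * (f[a] / cap[a]) ** 4) for a in t0}

    flow = {a: 0.0 for a in t0}
    cost, prev_tstt = bpr(flow), None
    for k in range(max_iter):
        aon = defaultdict(float)  # all-or-nothing loading on truncated paths
        for o, d, n in demand:
            sub = truncate(shortest_path(adj, cost, o, d), cost, T)
            for a in zip(sub, sub[1:]):
                aon[a] += n
        step = 1.0 / (k + 1)
        flow = {a: (1 - step) * flow[a] + step * aon[a] for a in t0}
        cost = bpr(flow)
        tstt = sum(flow[a] * cost[a] for a in t0)  # total system travel time
        if prev_tstt and abs(tstt - prev_tstt) <= tol * prev_tstt:
            break  # relative change below 0.1%
        prev_tstt = tstt
    return flow, cost


def qdta(adj, t0, cap, demand_by_step, T):
    """Loop over time steps, carrying unfinished trips forward as residual demand."""
    residual, flows = [], []
    for original in demand_by_step:
        demand = original + residual
        flow, cost = assign_step(adj, t0, cap, demand, T)
        flows.append(flow)
        residual = []
        for o, d, n in demand:
            stop = truncate(shortest_path(adj, cost, o, d), cost, T)[-1]
            if stop != d:
                residual.append((stop, d, n))
    return flows, residual
```

For a two-link corridor with 10-minute links and a 15-minute time step, trips from the first vertex would load only the first link in the first time step and the second link in the next one, which is the carryover behavior described above.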
In the algorithm, the following operations are used but not explicitly defined:

\begin{itemize}
\item \verb|Shortest_path| returns the shortest path between the origin $p$ and the destination $q$ given the road network graph $G(\V, \A)$ with cost $\bc = \{c_a\}$, with $a \in \A$.
\item \verb|Truncate_path| returns the subpath of a path $r$ that can be traversed in time $T$ given the edge costs $\bc$.
\item \verb|Get_last_vertex| returns the last vertex of a path $r$.
\item \verb|Cost_function| is the function to be minimized to obtain the UE traffic flow, with one instantiation shown in Equation (\ref{eq:path_flow_formulation_ue}).
\item \verb|Converged| checks the convergence of the Frank-Wolfe algorithm. The algorithm is considered converged if the relative absolute change in total system travel time is less than 0.1\%.
\end{itemize}

\section{High-performance computational solutions for QDTA} \label{sec:hpc_qdta}

Our overall goal is to reduce the compute time for modeling urban mobility so that city planners can investigate a large number of scenarios in a reasonably small period of time, while still preserving the fidelity of the inherent traffic dynamics. The models considered in our work cover full urban networks that include all functional road classes and a full urban-scale travel demand. We are currently working with models for two major urban regions: the San Francisco Bay Area and the Los Angeles Area. The road network for San Francisco is shown in Fig. 0(a) and accounts for 0.5M nodes and 1M links [61]. We use the SFCTA CHAMP 6 model [62] for the Bay Area model, which accounts for 19 million trips during a 24-hour period. We have also successfully applied this computational framework to the Los Angeles road network with 1M nodes and 2M links with 40M trips in a 24-hour period. For brevity, we will present results for the Bay Area only.

Figure 2 shows the computational flow of our implementation of QDTA. Mobiliti [63], our computational platform in which the QDTA is integrated, also provides parallel discrete event traffic simulations (PDES) for urban-scale regions. The travel demand representation and road network graphs are shared between both QDTA and the PDES simulation capability. As will be discussed in Section 4.1, the core algorithms described in Section 4 are parallelized and implemented within this computational flow. Specifically, we have developed an interval-based dynamic traffic assignment solver with residual carryover where each interval is solved using a parallelized Frank-Wolfe algorithm to minimize the desired cost function (e.g. user-equilibrium, social optimum or fuel-efficient routing). This solution for traffic assignment is sufficient for optimizing convex objective functions. We are able to achieve high computation performance through parallelization of many parts of the QDTA algorithm with both distributed memory (multi-node) and shared memory (multi-thread) computation.
Figure 2: Computational flow for QDTA. Blue boxes indicate the parts that have been parallelized. This flow diagram corresponds to the pseudo-algorithm described in Algorithm 3.

4.1 Description of QDTA Algorithm Parallelization
Algorithm 5: Parallel quasi-dynamic traffic assignment
    Data: network graph; time step length; total time step count; original travel demand of all time steps
    Initialize an empty nested associative array for the residual demand  // No residual demand in the first time step
    for each time step do
        Let the local demand be this thread's partition of the demand of the current time step
        Combine original and residual demand
        Parallel_traffic_assignment
        Parallel_residual_demand
    end for
    Result: distributed path flows for all time steps

Algorithm 5 describes the distributed-memory parallel QDTA algorithm. When parallelizing any algorithm, one of the design choices is how to partition both the data and computational work across available compute resources. This choice may differ depending on the algorithmic step being parallelized (see Figure 2). Due to the high computational cost of computing shortest path routes over a large network, the most critical step to optimize for performance is the routing of all active vehicles in the current time segment (Algorithm 3). To parallelize the routing step effectively, and to distribute storage and management of the resulting routes, each thread is assigned a subset of the demand. Each thread computes the routes and flows for its subset, manages the corresponding residual demand, and finally stores its results to disk at program completion.
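As a rough illustration of the demand handling in Algorithm 5, the sketch below combines new and residual demand for a time step and splits it across workers. The partitioning rule (trip id modulo worker count) and the dictionary layout of a trip are assumptions made for this example; the text does not specify how partitions are formed.

```python
from collections import defaultdict


def combine_demand(original, residual):
    """Demand entering a time step: new departures plus carried-over trips."""
    return list(original) + list(residual)


def partition_demand(trips, n_workers):
    """Split trips into per-worker lists by trip id (hypothetical rule)."""
    parts = [[] for _ in range(n_workers)]
    for trip in trips:
        parts[trip["id"] % n_workers].append(trip)
    return parts


# Hypothetical demand: trips 0-5 depart in step 0, trips 6-7 in step 1.
original = {0: [{"id": i} for i in range(6)], 1: [{"id": i} for i in (6, 7)]}
residual = defaultdict(list)          # no residual demand in the first time step
residual[1] = [{"id": 2}, {"id": 5}]  # suppose trips 2 and 5 did not finish in step 0

step1 = combine_demand(original[1], residual[1])
parts = partition_demand(step1, n_workers=2)
```

A deterministic rule of this kind keeps a given trip on the same worker in every time step, which is one simple way to obtain the single-owner property for residual trips.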
Algorithm 6: Parallel_traffic_assignment, using the Frank-Wolfe algorithm
    Data: network graph; time step length; local partition of travel demand of current step; free-flow travel time of each link; Frank-Wolfe iteration maximum steps
    Set the initial path flows with Parallel_all_or_nothing
    for each gradient descent step, up to the iteration maximum, do
        Calculate the edge travel times
        Parallel_all_or_nothing
        Parallel_line_search
        Update link flows
        if Relative_change_in_cost() < 0.0001 then
            break
        end if
        Update path flows
    end for
    Result: path flows (local) and link flows (global)

Algorithm 7: Parallel_all_or_nothing, iterative step of the Frank-Wolfe algorithm
    Data: network graph; time step length; travel demand of current step; edge travel time/cost
    Preprocess (customize) network with updated link weights
    Initialize an empty associative array  // Initialize the path flow vector
    for each trip leg in the local demand do
        Get_shortest_path  // Get shortest path
        Truncate_path  // Truncate path according to time step length
        Add trips to the local path flow vector
    end for
    Compute local link flows from local path flows
    Global allreduce  // Compute global link flows from local link flows
    Result: path flows (local) and link flows (global)

Algorithm 6 describes the parallelized Frank-Wolfe algorithm that assigns traffic to each time segment. The first step in this algorithm is the parallel all-or-nothing routing step described in Algorithm 7. Note that each thread requires full knowledge of the network's current link weights to be able to route its subset of trip legs. An important optimization we made for the routing step involves utilizing a multi-phase routing algorithm, where the network is first preprocessed with connectivity and weight information to enable subsequent routing queries to be computed very efficiently and in parallel.
To this end, we leverage Customizable Contraction Hierarchies [64], which further split the preprocessing phase in two, allowing the weights of the existing links in the network to be updated with less computation time than a full topological update (where the connectivity of the links may also change). In Algorithm 7, the preprocessing is done first so that all threads can subsequently use the preprocessed network with updated weights to compute their assigned routes.
After the parallel routing and truncation step (the for loop in Algorithm 7), each process's memory contains only the routes computed by the threads local to that process (implicit in the local path flow vector). However, in order to proceed to the next algorithmic step, the impact of all routes globally must be taken into consideration and reflected in the network flow data in every process. A key observation is that we do not need to globally broadcast the actual routes computed for every vehicle trip leg, as this would be a very large amount of data to communicate. Instead, the thread-local route flows are reduced to thread-local link flows in parallel, and then the local link flows are globally all-reduced to calculate the total link flows that result from all vehicle trip legs. In this way, we avoid communicating any route information between parallel processes and exchange only the resultant flows on the links themselves.
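The reduction from routes to link flows can be sketched in a few lines. The example below simulates the workers inside one process, so `allreduce_sum` is only a stand-in for a real collective operation, and the path representation (a tuple of link ids) is an assumption.

```python
def link_flows_from_paths(path_flows, n_links):
    """Collapse {path (tuple of link ids): flow} into a dense per-link flow vector."""
    x = [0.0] * n_links
    for path, flow in path_flows.items():
        for a in path:
            x[a] += flow
    return x


def allreduce_sum(local_vectors):
    """Stand-in for a global sum-allreduce: every worker receives the element-wise sum."""
    total = [sum(vals) for vals in zip(*local_vectors)]
    return [list(total) for _ in local_vectors]


# Two hypothetical workers, each holding routes only for its own trips.
worker_paths = [
    {(0, 1, 2): 30.0, (0, 3): 10.0},
    {(1, 2): 5.0, (3,): 20.0},
]
local = [link_flows_from_paths(p, n_links=4) for p in worker_paths]
global_flows = allreduce_sum(local)
```

Each worker contributes a vector whose length is the number of links, regardless of how many routes it computed, which is why the communication volume stays bounded by the network size.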
The next stage in Algorithm 6 after the all-or-nothing calculation is the line search to select the optimal step size. The parallelized algorithm selects the step size using the link flows directly instead of the path flows (as was done in Algorithm 2). The two are equivalent because multiplication with the incidence matrix distributes in the cost function expression, i.e.:
(6) (7) (8)

In Equation 8, the link cost function maps the flow on a link to its per-vehicle cost (i.e., the link traversal time using a BPR function). Thus, Equation 8 computes the total system cost, where the per-vehicle costs are multiplied by the link flows and then summed across the whole network. Furthermore, the Frank-Wolfe algorithm is considered converged when the relative change in total system cost falls below 0.001:
(9)

If evaluated sequentially, the line search is a very computationally expensive step, since it requires an iterative search in which the cost function is evaluated many times for candidate step sizes until the minimum is found. Furthermore, each single evaluation of the cost function requires summing the cost contribution of every link in the network, which is substantial for large networks with millions of links. As a result, many computational traffic assignment implementations simply use the method of successive averages (MSA) [65] as a heuristic to select the gradient descent step size and avoid this high computational cost. However, MSA has a substantial drawback: taking suboptimal step sizes in each gradient descent iteration means that more iterations are required overall for the traffic assignment to converge. We have found that by selecting the optimal step size for each iteration, the number of gradient descent iterations may be reduced significantly. Therefore, computing the line search efficiently through parallelization is a key capability of our approach.
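To make the difference between a predetermined and an optimized step concrete, here is a small sketch on a hypothetical two-link network. The BPR coefficients (0.15 and 4) are the commonly used defaults, the `1 / (k + 1)` rule is one common form of the MSA step, and a brute-force grid search stands in for the line search; none of these choices is taken from the paper.

```python
def bpr_time(flow, t0, cap, a=0.15, b=4.0):
    """Per-vehicle link travel time (BPR form with assumed default coefficients)."""
    return t0 * (1.0 + a * (flow / cap) ** b)


def total_cost(x, t0, cap):
    """Total system cost: per-vehicle cost times flow, summed over links."""
    return sum(xa * bpr_time(xa, ta, ca) for xa, ta, ca in zip(x, t0, cap))


def cost_at_step(step, x, y, t0, cap):
    """Cost after moving a fraction `step` from flows x toward all-or-nothing flows y."""
    return total_cost([xa + step * (ya - xa) for xa, ya in zip(x, y)], t0, cap)


def msa_step(k):
    """Predetermined MSA step for iteration k = 1, 2, ... (one common choice)."""
    return 1.0 / (k + 1)


# Two identical parallel links; all flow currently on link 0, all-or-nothing picks link 1.
t0, cap = [10.0, 10.0], [50.0, 50.0]
x, y = [100.0, 0.0], [0.0, 100.0]

steps = [i / 100 for i in range(101)]
best_step = min(steps, key=lambda s: cost_at_step(s, x, y, t0, cap))
cost_best = cost_at_step(best_step, x, y, t0, cap)
cost_msa = cost_at_step(msa_step(3), x, y, t0, cap)
```

In this toy case the cost-minimizing step splits the flow evenly, while an MSA step of 0.25 at the third iteration leaves the first link overloaded and the total cost noticeably higher.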
Algorithm 8: Parallel_line_search, Newton's line search for the optimal step size
    Data: network graph; current link flows; all-or-nothing link flows; current Frank-Wolfe gradient descent iteration; line search iteration maximum steps; convergence threshold (slope); convergence threshold (step size)
    Initialize the step size  // Initial guess
    for each Newton line search iteration, up to the maximum, do
        Parallel_cost_function
        if the slope threshold is met then  // Threshold 1
            break
        end if
        Take the Newton step
        Enforce the bounds on the step size
        if the step size threshold is met then  // Threshold 2
            break
        end if
    end for
    Result: optimal step size

Algorithm 8 describes our parallel line search algorithm, which is the implementation of Equation 7. For brevity, we introduce a shorthand function that is equivalent to the cost function evaluated at a given step size, with the current and all-or-nothing link flows as implicit arguments. The algorithm uses Newton's method to identify where this function is minimized (i.e. where its derivative equals zero). Thus, at each step in Newton's method, we require approximations of the cost function's first two derivatives at the current guess. These approximations are calculated using the finite difference method by evaluating the function at three nearby step sizes separated by a small nominal offset. Then:

(10)

The slope and step-size thresholds in the algorithm are tunable parameters that control the quality of convergence. We chose their values in our experiments to ensure good convergence of the line search. Figure 3 shows the average and maximum number of Newton iterations required to conduct the line search for all gradient descent iterations taken within each time segment. The average number of line search iterations remains consistently between 1 and 4, while the maximum is as high as 8 for the most highly congested time segments.
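A Newton line search with finite-difference derivatives can be sketched as below. This is an illustration under stated assumptions rather than the authors' code: central differences are used, the step size is assumed to be bounded to the interval from 0 to 1, and the offset `delta`, both tolerances, and the non-positive curvature guard are hypothetical choices.

```python
def newton_line_search(f, alpha0=0.5, delta=1e-3, tol_slope=1e-6,
                       tol_step=1e-6, max_iter=20):
    """Minimize f over [0, 1] with Newton's method and finite-difference derivatives."""
    alpha = alpha0
    for iteration in range(1, max_iter + 1):
        f_minus, f_mid, f_plus = f(alpha - delta), f(alpha), f(alpha + delta)
        slope = (f_plus - f_minus) / (2.0 * delta)                    # first derivative
        curvature = (f_plus - 2.0 * f_mid + f_minus) / delta ** 2     # second derivative
        # Threshold 1: slope is flat. The curvature guard is an added safety assumption.
        if abs(slope) < tol_slope or curvature <= 0.0:
            break
        new_alpha = min(1.0, max(0.0, alpha - slope / curvature))     # enforce bounds
        converged = abs(new_alpha - alpha) < tol_step                 # Threshold 2
        alpha = new_alpha
        if converged:
            break
    return alpha, iteration


# Hypothetical convex cost along the search direction, minimized at 0.3.
alpha_opt, n_iter = newton_line_search(lambda a: (a - 0.3) ** 2 + 1.0)
```

Each iteration needs exactly three cost evaluations, which is what makes it attractive to compute them together as a batch.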
Figure 3: Maximum and average number of Newton iterations to conduct the line search over all gradient descent iterations within each time segment for the San Francisco Bay Area network model.

Algorithm 9: Parallel_cost_function, evaluate the cost function and its first two derivatives
    Data: network graph; step size; current link flows; all-or-nothing link flows
    Initialize the local cost sums
    Let the local link set be this thread's partition of network links
    Let the cost be given by Cost_function
    for each link in the local link set do
        Add the link's contribution to the local cost sums
    end for
    Global allreduce of the local cost sums  // Reduce values simultaneously
    Estimate the first derivative
    Estimate the second derivative
    Result: cost function value and its first two derivatives

The key to computing the line search efficiently is evaluating the three required cost function values in parallel and as a batch. Since the total cost function is the sum of partial cost contributions from each link in the network (see Equation 8), parallelization is achieved by partitioning the links in the network across available threads. Algorithm 9 describes our approach to evaluate the cost function and its derivatives in parallel. For each set of three cost function evaluations, each thread computes its local contributions to the three cost functions for the subset of links assigned to that thread. The resulting local cost contributions are then globally all-reduced across threads to obtain the total cost function values. In order to avoid the communication overhead of three separate global reductions for each of the three evaluations, our implementation simultaneously reduces all three values with a single vectorized allreduce operation. The derivatives are then estimated using the approximations in Equation 10. Due to the efficient parallel evaluation of the cost function derivatives, we observed that even for the cases with the highest number of Newton iterations (see Figure 3), the line search completes in less than 100 milliseconds using 512 cores of the Cori supercomputer (see Section 4.2 for details on Cori).
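The batched evaluation can be illustrated with a single-process sketch in which link partitions play the role of threads. The BPR-based link cost, the central-difference offsets, and all numeric values are assumptions; the three-element sum stands in for the single vectorized allreduce.

```python
def link_cost(flow, t0, cap, a=0.15, b=4.0):
    """Cost contribution of one link: flow times BPR travel time (assumed coefficients)."""
    return flow * t0 * (1.0 + a * (flow / cap) ** b)


def local_costs(link_ids, x, y, t0, cap, alpha, delta):
    """Partial cost sums at alpha - delta, alpha, alpha + delta over one link partition."""
    sums = [0.0, 0.0, 0.0]
    for i, step in enumerate((alpha - delta, alpha, alpha + delta)):
        for a in link_ids:
            flow = x[a] + step * (y[a] - x[a])
            sums[i] += link_cost(flow, t0[a], cap[a])
    return sums


def parallel_cost_function(partitions, x, y, t0, cap, alpha, delta=1e-3):
    """Combine per-partition sums in one 3-element reduction, then form the derivatives."""
    partials = [local_costs(p, x, y, t0, cap, alpha, delta) for p in partitions]
    f_minus, f_mid, f_plus = (sum(col) for col in zip(*partials))
    d1 = (f_plus - f_minus) / (2.0 * delta)
    d2 = (f_plus - 2.0 * f_mid + f_minus) / (delta * delta)
    return f_mid, d1, d2


# Hypothetical 4-link network.
t0 = [10.0, 10.0, 5.0, 8.0]
cap = [50.0, 50.0, 100.0, 80.0]
x = [100.0, 0.0, 60.0, 40.0]   # current link flows
y = [0.0, 100.0, 40.0, 60.0]   # all-or-nothing link flows

serial = parallel_cost_function([[0, 1, 2, 3]], x, y, t0, cap, alpha=0.5)
split = parallel_cost_function([[0, 1], [2, 3]], x, y, t0, cap, alpha=0.5)
```

Because the cost is a plain sum over links, the result does not depend on how the links are partitioned, up to floating-point rounding.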
The parallelized line search significantly improves the computational performance of the overall QDTA compared to using the method of successive averages (MSA). Figures 4(a) and 4(b) show the impact of using a line search versus MSA on the number of gradient descent iterations and the total time to solution for each time segment in our experiment on the San Francisco Bay Area network (see Section 4.2 for details). The line search method reduces the number of gradient descent iterations by up to 73 percent, and the total compute times are highly correlated with the number of gradient descent iterations, primarily because of the expensive all-or-nothing routing step required for each iteration. By finding the optimal step size for each iteration, the number of iterations required to converge is significantly reduced, resulting in a reduction of more than 49 percent in the total execution time compared to using MSA to select the step size.
Figure 4: Comparison of (a) the number of gradient descent iterations and (b) the total segment compute time required using the method of successive averages (MSA, blue circles) versus using an optimal step size via Newton's method line search (orange squares). Data is for the San Francisco Bay Area network model, and execution times are measured when running on the Cori computer with 512 cores.

Once the line search has converged and the optimal step size has been identified, the link weights for the network must be updated in every process. This step is parallelized across threads by assigning each thread a partition of the links, for which it updates the flow values and computes the new weights. Because each process must update all of the links in the network for the subsequent routing step to perform correctly, the degree of parallelism in this step is reduced to the number of threads within each process (as opposed to across all processes). This step corresponds to the upper blue boxes in Figure 2.
Finally, after the Frank-Wolfe algorithm has converged, the residual demand allocation must be performed to forward residual vehicles into the next QDTA time segment (Algorithm 4). The parallelization strategy for this algorithm partitions the set of active vehicle trip legs across threads in the same manner as Algorithm 7. Each thread iterates over the subset of trip legs assigned to it (implicit in its local demand partition) and forwards the vehicles that are still in transit at the end of the current time segment into the next time segment. Forwarded trip legs are guaranteed to be assigned to the same process in the next time segment, so that the route incrementally assigned to each vehicle is maintained by a single process owner. Once the residual demand is computed, the non-residual demand in the next time segment is identified and combined with the residual demand in parallel, and the Frank-Wolfe optimization for the next time segment begins. As we have described, the majority of the algorithms required for the QDTA methodology (as shown in Figure 2) have been parallelized to achieve high performance on distributed memory computer platforms.
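The forwarding of in-transit trip legs might look like the following sketch. The record layout (`id`, `path`, `dest`, `owner`) is hypothetical and only serves to show that a forwarded leg restarts at the last vertex it reached and keeps its owner.

```python
def parallel_residual_demand(assigned_legs, owner):
    """Forward trip legs that did not reach their destination within the segment.

    Each leg is a dict with its truncated `path` and final `dest` (assumed layout).
    """
    residual = []
    for leg in assigned_legs:
        last_vertex = leg["path"][-1]
        if last_vertex != leg["dest"]:  # still in transit at the end of the segment
            residual.append({"id": leg["id"], "origin": last_vertex,
                             "dest": leg["dest"], "owner": owner})
    return residual


legs = [
    {"id": 7, "path": [1, 2, 3], "dest": 5},  # truncated at vertex 3
    {"id": 9, "path": [2, 3, 4], "dest": 4},  # arrived within the segment
]
carried = parallel_residual_demand(legs, owner=0)
```

Only trip 7 is carried into the next segment, starting from vertex 3 and staying with owner 0.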
4.2 Evaluation of computational performance and parallel scalability
We use high performance computing to address the computational challenges of traffic assignment at urban scale. The solutions are implemented on the Cori supercomputer, a Cray XC40 at the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory. In order to evaluate the computational strong scaling performance (improvement in program execution time for a fixed input as the number of cores is increased), we ran the QDTA algorithm for the San Francisco Bay Area network, utilizing up to 32 nodes of Cori with 1,024 cores total (2 processes per node, 16 cores per process, 2 threads per core). As we described in Section 4.1, the computational advantage is realized through parallelization of the various algorithmic steps in Fig. 2 across many compute cores. We segmented the 24-hour day into 96 time segments of 15 minutes each (versus the multiple-hour time segments that are traditionally used). For all runs, we solved for optimized route plans for all 19 million vehicle trips over the San Francisco Bay Area network (0.5 million nodes and 1 million links).
Figure 5: Computational scaling performance for our QDTA approach as parallelism is increased.

Figure 5 shows the strong scaling performance results: (a) when increasing the number of cores on a single node, and (b) when scaling across multiple nodes. Running the 96 time segment QDTA on a single node, the execution time is reduced from more than three hours (188 minutes) when run on a single core to under 19 minutes utilizing all 32 cores. This represents a parallel speedup of about 10x compared to single-core execution. Running the QDTA on multiple nodes, we see that the execution time is reduced further to under 6 minutes when run on 32 nodes (1,024 cores), representing a total speedup of over 34x compared to the single-core case. The speedup is sublinear (less than N times speedup when using N times the compute cores) due to a combination of parallelization overhead (communication and synchronization costs between cores and processes) and the parts of the code which are not fully parallelized (resulting in Amdahl bottlenecks [66]).
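The reported timings can be turned into speedup and parallel efficiency with a few lines of arithmetic. The sketch below uses the rounded figures quoted above (188 minutes, 19 minutes, 34x) and inverts Amdahl's law to obtain an implied parallel fraction. Since Amdahl's law ignores communication and synchronization overhead, that fraction is only a rough indication, and the function names are illustrative.

```python
def speedup(t_serial, t_parallel):
    return t_serial / t_parallel


def efficiency(s, n_cores):
    """Fraction of ideal linear speedup actually achieved."""
    return s / n_cores


def amdahl_speedup(p, n_cores):
    """Amdahl's law: speedup when a fraction p of the work is perfectly parallel."""
    return 1.0 / ((1.0 - p) + p / n_cores)


def implied_parallel_fraction(s, n_cores):
    """Invert Amdahl's law for p, given an observed speedup s on n_cores."""
    return (1.0 - 1.0 / s) / (1.0 - 1.0 / n_cores)


s_node = speedup(188.0, 19.0)           # single node, 32 cores
eff_node = efficiency(s_node, 32)
p_node = implied_parallel_fraction(s_node, 32)

s_multi = 34.0                          # reported speedup on 1,024 cores
eff_multi = efficiency(s_multi, 1024)
p_multi = implied_parallel_fraction(s_multi, 1024)
```

With these inputs the single-node run achieves roughly 31 percent of ideal scaling and the 1,024-core run roughly 3 percent, which is consistent with the sublinear behaviour described in the text.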
5 Simple Network Example
To help illustrate the algorithms presented in Section 4 and in the section on high-performance computational solutions for QDTA, we describe the application of the QDTA model to a simple network with 4 links and 5 nodes and compare it with STA. For simplicity, all links are connected serially and no route choice decisions are involved in the network. We consider a demand period of 1 hour, divided into four time periods of 15 minutes each. The demand for every 15 minutes is given in Fig. 6. The link travel time function is of BPR form as shown in Equation (1b).
Figure 6: Network attributes and demand in the example

For the QDTA model (Fig. 7), the demand of the first time period enters the network and reaches node 2, but it cannot reach its destination in the same time interval and is hence stored in the downstream node 3 as residual. In the next time period, the new demand and the residual from the previous time segment traverse the network, and all demand reaches its destination in that time segment. The remaining two time periods have no demand, and hence all trips reach their destination in the first two time intervals. The total system travel time is: . The congested links in the network are during time periods and respectively.
Figure 7: QDTA assignment for simple network. The network is loaded in 2 time segments. The residual stored in the node at the end of each time period is indicated. The link travel time in minutes is noted above each link.

The same demand is assigned using STA as shown in Fig. 8. The demand is assigned as an average over the entire 1-hour duration. The total system travel time is: . None of the links is congested during any time interval in this assignment.
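The effect of averaging demand over the hour can be reproduced on a single link. The sketch below is not the example from the figures: the free-flow time, capacity, demand rates, and BPR coefficients are hypothetical, and residual carryover is ignored. It only shows why a convex travel time function yields a lower total travel time, and no congestion, once the peak is averaged out.

```python
def bpr_time(rate, t0, cap, a=0.15, b=4.0):
    """Link travel time in minutes for a flow rate in veh/h (assumed BPR coefficients)."""
    return t0 * (1.0 + a * (rate / cap) ** b)


t0_min, cap_vph = 5.0, 1200.0              # hypothetical link attributes
rates_vph = [2000.0, 2000.0, 0.0, 0.0]     # demand rate in each 15-minute period
interval_h = 0.25

# Time-varying loading: each period is assigned with its own flow rate.
dynamic_total = sum(r * interval_h * bpr_time(r, t0_min, cap_vph) for r in rates_vph)

# Static loading: the same vehicles spread evenly over the hour.
avg_rate = sum(rates_vph) / len(rates_vph)
vehicles = avg_rate * interval_h * len(rates_vph)
static_total = vehicles * bpr_time(avg_rate, t0_min, cap_vph)

peak_voc = max(rates_vph) / cap_vph
static_voc = avg_rate / cap_vph
```

Here the same 1,000 vehicles accumulate about twice as many vehicle-minutes under time-varying loading, and the link exceeds capacity only in that case.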
Figure 8: STA assignment for simple network. All demand is assigned at the same time and gets averaged for the entire 1 hour time duration.

6 Application to San Francisco Bay Area Network
6.1 Analysis of Traffic Assignment Results
Two models are considered for the San Francisco Bay Area. The baseline model is a static traffic assignment (STA) for the 7-10 am morning peak. The QDTA model is a user equilibrium with shortest travel time optimization, using the Frank-Wolfe solver, run for the entire day in 15-minute time intervals. The travel demand was obtained from the SFCTA CHAMP6 model [62] for 24 hours of a typical day. Each trip was identified with an origin and destination micro-analysis zone, which was then assigned to specific network nodes by weighting with the population density obtained from Global Human Settlement [67]. Fig. 9 shows the demand profiling for both cases under consideration. While STA assumes a constant demand for the entire morning peak period, QDTA considers a variable demand for the same duration.

A professional map from HERE Technologies [61] is a core part of the foundation of the Mobiliti platform. The map information is transformed into a different representation in order to integrate it into the algorithms discussed earlier. However, we maintain the definitions of functional class roads as defined by HERE Technologies [61]. Specifically, functional classes classify roads according to the speed, importance and connectivity of the road. A road can be one of five functional classes, which are defined in Table 3. The analysis presented here will use these functional classes to help explore the results that compare QDTA to STA.

Figure 9: Temporal demand interaction and model capabilities

Functional Class   Definition
1                  Allowing for high volume, maximum speed traffic movement
2                  Allowing for high volume, high speed traffic movement
3                  Providing a high volume of traffic movement
4                  Providing for a high volume of traffic movement at moderate speeds between neighbourhoods
5                  Roads whose volume and traffic movement are below the level of any other functional class
Table 3: Functional Road Classes

Table 4 provides a comparison of the system level metrics using vehicle miles travelled (VMT), average volume over capacity (VOC) ratios, and vehicle hours of delay (VHD), categorised by functional class. Only links with positive flow are included in the analysis. Additionally, a comparison was made between congested and non-congested links (Table 5). For the QDTA analysis, congestion is calculated for the 7:45-8 AM time period, when the demand is the highest.
Category   VMT (in millions)   Average VOC      VHD (in thousands)
           STA      QDTA       STA     QDTA     STA      QDTA
FC2        16.44    18.09      0.52    0.69     17.57    96.84
FC3         4.67     5.00      0.23    0.30      2.65    14.43
FC4         4.96     5.24      0.14    0.17      1.13     6.42
FC5         2.43     2.60      0.03    0.48      0.36    17.78
Total      28.51    30.95                       21.73   135.49
Table 4: System Level Metrics

Category   Length (km)     VMT (in millions)   Average VOC
           STA    QDTA     STA     QDTA        STA     QDTA
FC2        129    546      1.77    7.56        1.15    1.26
FC3         37     86      0.18    0.42        1.18    1.27
FC4         14     55      0.04    0.15        1.19    1.20
FC5          5     23      0.01    0.06        1.17    1.23
Total      185    710      2.0     8.19
Table 5: Congested Network Metrics

The main distinction between the two models is how well each can replicate congestion on links. As seen in Table 5, for the links that are congested in STA, QDTA predicts even higher congestion. Because link performance functions are convex, a non-uniform demand distribution results in higher travel times on links than a uniform one. QDTA with varying demand is therefore expected to produce more congestion than STA with uniform demand. QDTA also captures congestion more dynamically than STA for all functional class categories. Fig. 10 shows the VOC over time by functional class (left) and the VOC comparison for the two models by functional class (right). For all classes of roads, QDTA predicts congestion dynamically over time, whereas STA either overestimates or underestimates the values irrespective of the demand dynamics. This difference is highly pronounced for FC2, where STA significantly underestimates the peak hour congestion.
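For readers who want to reproduce metrics of this kind from link-level output, the sketch below computes them using the usual textbook definitions: VMT as flow times length, VOC as flow over capacity, and VHD as flow times the excess of travel time over free-flow time. The paper does not spell out its exact formulas, so these definitions, the field names, and the sample values are assumptions.

```python
def system_metrics(links):
    """Return (VMT, average VOC, VHD) over links with positive flow.

    Assumes at least one link carries flow, and that flow and capacity
    refer to the same analysis period.
    """
    used = [link for link in links if link["flow"] > 0]
    vmt = sum(link["flow"] * link["length_mi"] for link in used)
    avg_voc = sum(link["flow"] / link["capacity"] for link in used) / len(used)
    vhd = sum(link["flow"] * (link["time_h"] - link["free_time_h"]) for link in used)
    return vmt, avg_voc, vhd


# Hypothetical link records.
links = [
    {"flow": 1000.0, "capacity": 800.0, "length_mi": 2.0, "time_h": 0.06, "free_time_h": 0.04},
    {"flow": 200.0, "capacity": 400.0, "length_mi": 1.0, "time_h": 0.03, "free_time_h": 0.03},
    {"flow": 0.0, "capacity": 400.0, "length_mi": 3.0, "time_h": 0.05, "free_time_h": 0.05},
]
vmt, avg_voc, vhd = system_metrics(links)
```

The third link carries no flow and is excluded, mirroring the rule that only links with positive flow enter the analysis.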
Figure 10: Congestion profiling for QDTA and STA. The figure on the left shows the average VOC ratio for all links over time, categorised by functional class. The figure on the right is a scatter plot showing the VOC ratio comparison for every link in the network for the two models at 7:45-8 am.

It can also be seen from Table 4 that the total system delay is significantly higher in QDTA than in STA. This is due to the dynamic demand distribution and the modelling capability in QDTA that allows interaction of demand between time intervals, thus allowing for a more realistic modelling of traffic. In QDTA, the residual demand from each time interval is carried over to the next (Fig. 11). Note that, due to the route truncation mechanism, not all portions of a trip are completed in the time period in which the trip departs, so portions of a trip are loaded onto the network in multiple time periods. In QDTA, each trip leg only contributes flow to the links it traverses within the time segment, whereas in STA every leg contributes flow to its entire route. For the time interval 7:45-8 AM, STA has 185 km of congested network in comparison to QDTA with 710 km of congested network. Fig. 12 shows the Bay Area network with congestion locations for the two modelling cases.
Figure 11: QDTA demand modelling with residuals (left). The length of network congested over time. A link is considered congested if the VOC ratio is greater than or equal to 1.

Figure 12: VOC ratios for congested links for QDTA (left) and STA (right) are shown for the time period 7:45-8 AM. The links that are predicted to be congested in STA have even higher congestion in QDTA. STA produces 185 km of congested network in comparison to QDTA with 710 km of congested network. Only links with VOC greater than or equal to 1 are shown.

6.2 Validation
Validation was performed for the QDTA results to test how effectively they represent the real-world traffic environment. We conducted validation for traffic volume, speed and system metrics using multiple data sources. Stage 1 of the validation procedure involves checking the traffic counts for eight link corridors. The traffic count for each link was compared against the field data for the entire day in 15-minute increments. The field data for city roads and highways were collected from the city of San Jose and the Caltrans PeMS website [68], respectively, for the year 2019. Each corridor provided information regarding traffic volumes and speeds by time of day and direction. An R of 0.7 is used as the criterion for a satisfactory link count check. Fig. 13 shows the R values for the eight corridors under consideration. The modeled corridors indicate a close match with the field data, with the lowest R value observed being 0.68 for Zanker Road.
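A count check of this kind can be sketched as follows, assuming that R denotes the Pearson correlation coefficient between modeled and observed counts. That interpretation is an assumption, and the count series below are made-up values for illustration only, not data from the study.

```python
import math


def pearson_r(model, observed):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(model)
    mean_m = sum(model) / n
    mean_o = sum(observed) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, observed))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_m * var_o)


# Hypothetical 15-minute counts for one corridor.
modeled = [120.0, 180.0, 260.0, 310.0, 240.0, 150.0]
observed = [100.0, 190.0, 250.0, 330.0, 220.0, 170.0]

r = pearson_r(modeled, observed)
passes = r >= 0.7   # threshold quoted in the text
```

In practice each corridor would contribute 96 pairs of values per day and direction rather than the six shown here.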
Figure 13: Validation of traffic counts in 15-minute increments for links in different functional classes. All links except one have a satisfactory R of greater than 0.7.

Stage 2 is a speed comparison with Uber Movement speed data for the San Francisco region for Q4 2019 [69]. Links from the Uber network were matched to our network for 139,495 links (20% of total). The speeds were compared for 8 am to 9 am for different speed limits. Figure 14 shows the speed distributions from the QDTA results and Uber on links with 60 mph and 70 mph speed limits. Figure 15 shows the average speeds from Mobiliti and Uber across all speed limits. We believe the observed discrepancies in the speed distributions may be reduced in future work with the addition of real-world traffic signal location and timing data and further refinement of the link flow congestion and timing models.
Figure 14: Kernel density plot comparing QDTA model and Uber speed distributions at 60 mph (left) and 70 mph (right).

Figure 15: Distribution of average Uber (left) and QDTA model (right) speeds across specific speed limits between 8 and 9 am.

The final stage of validation includes system level metrics comparisons, network validation, and error checking. Model visualization is used to check for unusual activities in traffic flows and odd roadway network attributes. Error checking and model verification consist of several smaller tasks such as checks for link geometry and connectivity, number of lanes, speeds, ramps and intersection geometry. Since our travel demand data was obtained from SFCTA, which conducts its own validation, we did not conduct additional behavior checks. We conducted system metric checks for VMT and total demand and validated them against the 2017 Environmental Impact report for the Bay Area [70] in Table 6.

Metric        QDTA          Field Data     Relative Error (%)
VMT           150,453,402   158,406,800    5
Daily Trips   19,167,301    21,227,800     10
Table 6: System level metrics validation

7 Conclusions
In this paper, we have presented a quasi-dynamic traffic assignment methodology to capture temporal dynamics in large-scale transportation networks and described its parallelization to run efficiently on distributed-memory high performance computing systems. Two key mechanisms, route truncation and residual demand, were implemented to provide more realistic demand profiles as the dynamic assignment interval is reduced and a greater percentage of vehicle trips span across multiple time segments. Our approach divides the simulated day into 15-minute intervals and utilizes a modified static traffic assignment within each interval to assign the active traffic present in the network. The QDTA assignment step differs from a traditional STA in that the assigned routes are truncated to fit in the active time interval in each Frank-Wolfe iteration so that the resulting solution only includes traffic that occurs within the current interval. Furthermore, residual demand for each time interval is calculated based on estimated travel time on the links and then carried over to the next time interval. The combination of these techniques resolves traffic that is resident in the network for a longer time horizon than a single static assignment period and captures dynamic network behavior across multiple time segments. The model can be interpreted as a time-expanded network approach where the expansion of the network represents the evolution with respect to time.
We have also described how the quasi-dynamic traffic assignment algorithm can be parallelized for efficient execution on high-performance distributed-memory computing platforms. The algorithm is parallelized through a combination of partitioning trip legs and network links across available compute threads to speed up the calculation of shortest paths, optimization cost functions, residual trip legs, and other program data. We described our parallelized line search algorithm, which enables the identification of the optimal step size for each gradient descent iteration using Newton's method, taking less than 100 milliseconds for a network of 1 million links. Using the optimal step size provides a reduction of more than 49 percent in total execution time compared to using the method of successive averages, due to a decrease in gradient descent iterations required for convergence. We demonstrated that a quasi-dynamic traffic assignment of the San Francisco Bay Area (19 million trip legs, 0.5 million nodes, and 1 million links) using 96 15-minute time segments runs in under 6 minutes on 1,024 cores of the Cori supercomputer at NERSC, corresponding to a speedup of 34x compared to single-core performance.
Finally, we presented an analysis of the traffic assignment results across functional classes, illustrating how the QDTA more accurately resolves the increased congestion patterns and dynamic behavior of the traffic system compared to a static traffic assignment approach, especially under peak congestion conditions. We presented a validation of the QDTA assignment counts and speeds compared to field data from Caltrans, San Jose, and Uber Movement, showing field count correlation values of 0.68 or greater. Future work includes evaluating a variety of optimization objective functions, including fuel optimization and system level optimizations for both travel time and fuel use. We hope to develop surrogate models using the results from the supercomputer implementation such that this capability can be made widely available.
Acknowledgements
This report and the work described were sponsored by the U.S. Department of Energy (DOE) Vehicle Technologies Office (VTO) under the Big Data Solutions for Mobility Program, an initiative of the Energy Efficient Mobility Systems (EEMS) Program. The following DOE Office of Energy Efficiency and Renewable Energy (EERE) managers played important roles in establishing the project concept, advancing implementation, and providing ongoing guidance: David Anderson and Prasad Gupte. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
References
 [1] Nikolaos Tsanakas, Joakim Ekström, and Johan Olstam. Estimating emissions from static traffic models: Problems and solutions. ISSN: 01976729 Pages: e5401792 Publisher: Hindawi Volume: 2020.
 [2] Michael G. H. Bell and Yasunori lida. Transportation Networks, chapter 2, pages 17–40. John Wiley and Sons, Ltd, 1997.
 [3] Stefan Flügel and Gunnar Flötteröd. Traffic assignment for strategic urban transport model systems.
 [4] Michiel Bliemer, Mark Raadsen, Erik de Romph, and Erik-Sander Smits. Requirements for traffic assignment models for strategic transport planning: A critical assessment. page 25.
 [5] D.K. Merchant and G.L. Nemhauser. A model and an algorithm for the dynamic traffic assignment problems. Transportation Science, 12(3):183–199, 1978.
 [6] D.K. Merchant and G.L. Nemhauser. Optimality conditions for a dynamic traffic assignment model. Transportation Science, 12(3):200–207, 1978.
 [7] Takamasa Iryo. Multiple equilibria in a dynamic traffic network. 45(6):867–879.
 [8] Genetics of traffic assignment models for strategic transport planning. 37:56–78, January 2017.
 [9] Michiel CJ Bliemer, Mark PH Raadsen, Erik-Sander Smits, Bojian Zhou, and Michael GH Bell. Quasi-dynamic traffic assignment with residual point queues incorporating a first order node model. Transportation Research Part B: Methodological, 68:363–384, 2014.
 [10] Hasti Tajtehranifard, Ashish Bhaskar, Neema Nassir, Md Mazharul Haque, and Edward Chung. A path marginal cost approximation algorithm for system optimal quasi-dynamic traffic assignment. Transportation Research Part C: Emerging Technologies, 88:91–106, 2018.
 [11] Gaetano Fusco, Chiara Colombaroni, Andrea Gemma, and Stefano Lo Sardo. A quasi-dynamic traffic assignment model for large congested urban road networks. International Journal of Mathematical Models and Methods in Applied Sciences, 7(4):341–349, 2013.
 [12] Shoichiro Nakayama and Richard Connors. A quasi-dynamic assignment model that guarantees unique network equilibrium. Transportmetrica A: Transport Science, 10(7):669–692, 2014.
 [13] Xing Zeng, Xuefeng Guan, Huayi Wu, and Heping Xiao. A data-driven quasi-dynamic traffic assignment model integrating multi-source traffic sensor data on the expressway network. ISPRS International Journal of Geo-Information, 10(3):113, 2021.
 [14] H.K. Chen. Dynamic travel choice models: a variational inequality approach. Springer Science & Business Media, 2012.
 [15] A. Nagurney and D. Zhang. Projected dynamical systems and variational inequalities with applications, volume 2. Springer Science & Business Media, 2012.
 [16] B. Ran and D.E. Boyce. Dynamic urban transportation network models: theory and implications for intelligent vehicle-highway systems, volume 417. Springer Science & Business Media, 2012.
 [17] B. Ran and T. Shimazaki. A general model and algorithm for the dynamic traffic assignment problems. In Transport Policy, Management & Technology Towards 2001: Selected Proceedings of the Fifth World Conference on Transport Research, volume 4, 1989.
 [18] B. Ran, D.E. Boyce, and L.J. LeBlanc. A new class of instantaneous dynamic user-optimal traffic assignment models. Operations Research, 41(1):192–202, 1993.
 [19] T.L. Friesz, J. Luque, R.L. Tobin, and B.W. Wie. Dynamic network traffic assignment considered as a continuous time optimal control problem. Operations Research, 37(6):893–901, 1989.
 [20] B. Ran and D.E. Boyce. A link-based variational inequality formulation of ideal dynamic user-optimal route choice problem. Transportation Research Part C: Emerging Technologies, 4(1):1–12, 1996.
 [21] D.E. Boyce, B. Ran, and L.J. LeBlanc. Solving an instantaneous dynamic user-optimal route choice model. Transportation Science, 29(2):128–142, 1995.
 [22] T.L. Friesz, D. Bernstein, T.E. Smith, R.L. Tobin, and B.W. Wie. A variational inequality formulation of the dynamic network user equilibrium problem. Operations Research, 41(1):179–191, 1993.
 [23] Alexandre Bayen, Alexander Keimer, Emily Porter, and Michele Spinola. Time-continuous instantaneous and past memory routing on traffic networks: A mathematical analysis on the basis of the link-delay model. SIAM Journal on Applied Dynamical Systems, 18(4):2143–2180, 2019.
 [24] S. Peeta and T.H. Yang. Stability issues for dynamic traffic assignment. Automatica, 39(1):21–34, 2003.
 [25] R. Ma, X.J. Ban, and J.S. Pang. Continuous-time dynamic system optimum for single-destination traffic networks with queue spillbacks. Transportation Research Part B: Methodological, 68:98–122, 2014.
 [26] X.J. Ban, J.S. Pang, H.X. Liu, and R. Ma. Modeling and solving continuous-time instantaneous dynamic user equilibria: A differential complementarity systems approach. Transportation Research Part B: Methodological, 46(3):389–408, 2012.
 [27] K. Han, T.L. Friesz, and T. Yao. A partial differential equation formulation of Vickrey’s bottleneck model, part I: Methodology and theoretical analysis. Transportation Research Part B: Methodological, 49:55–74, 2013.
 [28] K. Han, T.L. Friesz, and T. Yao. A partial differential equation formulation of Vickrey’s bottleneck model, part II: Numerical analysis and computation. Transportation Research Part B: Methodological, 49:75–93, 2013.
 [29] K. Han, T.L. Friesz, and T. Yao. Existence of simultaneous route and departure choice dynamic user equilibrium. Transportation Research Part B: Methodological, 53:17–30, 2013.
 [30] A. Bressan and K.T. Nguyen. Optima and equilibria for traffic flow on networks with backward propagating queues. NHM, 10(4):717–748, 2015.
 [31] A. Bressan and K.T. Nguyen. Conservation law models for traffic flow on a network of roads. NHM, 10(2):255–293, 2015.
 [32] M. Garavello, K. Han, and B. Piccoli. Models for vehicular traffic on networks, volume 9. American Institute of Mathematical Sciences (AIMS), Springfield, MO, 2016.
 [33] M. Garavello and B. Piccoli. Traffic flow on networks, volume 1. American Institute of Mathematical Sciences, Springfield, 2006.
 [34] H. Holden and N. Risebro. A mathematical model of traffic flow on a network of unidirectional roads. SIAM Journal on Mathematical Analysis, 26(4):999–1017, 1995.
 [35] G. Bretti, R. Natalini, and B. Piccoli. Numerical approximations of a traffic flow model on networks. NHM, 1(1):57–84, 2006.
 [36] MJ Lighthill and GB Whitham. On kinematic waves. I. Flood movement in long rivers. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences, 229(1178):281–316, 1955.
 [37] P. I. Richards. Shock waves on the highway. Operations Research, 4(1):42–51, 1956.
 [38] A. Keimer, N. Laurent-Brouty, F. Farokhi, H. Signargout, V. Cvetkovic, A.M. Bayen, and K.H. Johansson. Information patterns in the modeling and design of mobility management services. Proceedings of the IEEE, 106(4):554–576, 2018.
 [39] M. Gugat, A. Keimer, G. Leugering, and Z. Wang. Analysis of a system of nonlocal conservation laws for multi-commodity flow on networks. Networks and Heterogeneous Media, 10(4):749–785, 2015.
 [40] A. Keimer, L. Pflug, and M. Spinola. Nonlocal scalar conservation laws on bounded domains and applications in traffic flow. accepted in SIAM SIMA, 2018.
 [41] A. Aw and M. Rascle. Resurrection of "second order" models of traffic flow. SIAM Journal on Applied Mathematics, 60(3):916–938, 2000.
 [42] H.M. Zhang. A non-equilibrium traffic model devoid of gas-like behavior. Transportation Research Part B: Methodological, 36(3):275–290, 2002.
 [43] M.O. Ghali and M.J. Smith. A model for the dynamic system optimum traffic assignment problem. Transportation Research Part B: Methodological, 29(3):155–170, 1995.
 [44] S. Peeta and H.S. Mahmassani. System optimal and user equilibrium time-dependent traffic assignment in congested networks. Annals of Operations Research, 60(1):81–113, 1995.
 [45] M.J. Smith. A new dynamic traffic model and the existence and calculation of dynamic user equilibria on congested capacity-constrained road networks. Transportation Research Part B: Methodological, 27(1):49–63, 1993.
 [46] S.T. Waller and A.K. Ziliaskopoulos. A chance-constrained based stochastic dynamic traffic assignment model: Analysis, formulation and solution algorithms. Transportation Research Part C: Emerging Technologies, 14(6):418–427, 2006.
 [47] H.K. Chen and C.F. Hsueh. A model and an algorithm for the dynamic user-optimal route choice problem. Transportation Research Part B: Methodological, 32(3):219–234, 1998.
 [48] J. Long, W.Y. Szeto, H.J. Huang, and Z. Gao. An intersection-movement-based stochastic dynamic user optimal route choice model for assessing network performance. Transportation Research Part B: Methodological, 74:182–217, 2015.
 [49] B.N. Janson. Dynamic traffic assignment for urban road networks. Transportation Research Part B: Methodological, 25(2–3):143–161, 1991.
 [50] B.N. Janson. Convergent algorithm for dynamic traffic assignment. Transportation Research Record, (1328), 1991.
 [51] J.K. Ho. A successive linear optimization approach to the dynamic traffic assignment problem. Transportation Science, 14(4):295–305, 1980.
 [52] Ricardo Correa, Ines de Castro Dutra, Mario Fiallos, and Luiz Fernando Gomes da Silva. Models for Parallel and Distributed Computation: Theory, Algorithmic Techniques and Applications, volume 67. Springer Science & Business Media, 2013.
 [53] Transportation Simulation Systems. Aimsun. https://www.aimsun.com.
 [54] Institute of Transportation Systems SUMO, German Aerospace Center (DLR). http://www.sumo.dlr.de, 2018.
 [55] Sunil Thulasidasan, Shiva Kasiviswanathan, Stephan Eidenbenz, Emanuele Galli, Susan Mniszewski, and Phillip Romero. Designing systems for large-scale, discrete-event simulations: Experiences with the FastTrans parallel microsimulator. In 2009 International Conference on High Performance Computing (HiPC), pages 428–437. IEEE, 2009.
 [56] Colin Sheppard et al. Modeling plug-in electric vehicle charging demand with BEAM, the framework for behavior energy autonomy mobility. Technical report, May 2017.
 [57] Polaris.
 [58] Wasuwat Petprakob, Lalith Wijerathne, Takamasa Iryo, Junji Urata, Kazuki Fukuda, and Muneo Hori. On the implementation of high performance computing extension for day-to-day traffic assignment. Transportation Research Procedia, 34:267–274, 2018.
 [59] Willem Himpe, Romain Ginestou, and MJ Chris Tampère. High performance computing applied to dynamic traffic assignment. Procedia Computer Science, 151:409–416, 2019.
 [60] M. Patriksson. The traffic assignment problem: models and methods. Courier Dover Publications, 2015.
 [61] HERE Technologies. https://www.here.com/, 2019. [Online; accessed 06-Feb-2019].
 [62] SFCTA. SF-CHAMP 6.1: ConnectSF Needs Assessment 2015 Base Year Model Run. Technical report, San Francisco County Transportation Authority, February 2019.
 [63] Cy Chan, Bin Wang, John Bachan, and Jane Macfarlane. Mobiliti: scalable transportation simulation using high-performance parallel computing. In 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pages 634–641. IEEE, 2018.
 [64] Julian Dibbelt, Ben Strasser, and Dorothea Wagner. Customizable contraction hierarchies. In International Symposium on Experimental Algorithms, pages 271–282. Springer, 2014.
 [65] Hayssam Sbayti, Chung-Cheng Lu, and Hani S Mahmassani. Efficient implementation of method of successive averages in simulation-based dynamic traffic assignment models for large-scale network applications. Transportation Research Record, 2029(1):22–30, 2007.
 [66] John L. Gustafson. Amdahl’s Law, pages 53–60. Springer US, Boston, MA, 2011.
 [67] European Commission, Joint Research Centre (JRC); Columbia University, Center for International Earth Science Information Network - CIESIN. GHS population grid, derived from GPW4, multitemporal (2015). http://data.europa.eu/89h/jrc-ghsl-ghs_pop_gpw4_globe_r2015a, 2015. European Commission, Joint Research Centre (JRC) [Dataset].
 [68] Caltrans, State of California. https://dot.ca.gov/programs/traffic-operations/census/traffic-volumes, 2019. [Online; accessed 03-January-2021].
 [69] Uber Movement. https://movement.uber.com/cities/san_francisco/downloads/speeds, 2019. [Online; accessed 03-January-2021].
 [70] Environmental impact report, Plan Bay Area. https://www.planbayarea.org/2040-plan/environmental-impact-report, 2017. [Online; accessed 05-November-2020].
5 Simple Network Example
To help illustrate the algorithms presented in Section 4 and in the section on the HPC implementation of QDTA, we describe the application of the QDTA model to a simple network with 4 links and 5 nodes and compare it with STA. For simplicity, all links are connected in series, so the network involves no route choice decisions. We consider one hour of demand divided into four 15-minute time periods. The demand for each 15-minute period is given in Fig. 6. The link travel time function is of BPR form, as shown in Equation (1b).
In the QDTA model (Fig. 7), the demand in the first time period traverses link and reaches node 2, but it cannot reach its destination in the same time interval and is hence stored at the downstream node 3 as residual. In the next time period, the new demand and the residual from the previous time segment traverse link . All demand reaches its destination in the second time segment. The last two time periods have no demand, and hence all trips reach their destination in the first two time intervals. The total system travel time is: . The congested links in the network are during time periods and respectively.
The same demand is assigned using STA, as shown in Fig. 8. The demand is assigned as an average over the entire 1-hour duration. The total system travel time is: . None of the links is congested during any time interval in this assignment.
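The contrast between the two loadings can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the link parameters and the demand profile are hypothetical (they are not the values of Fig. 6), the BPR coefficients are the common defaults of 0.15 and 4, nodes are indexed from 0, and the truncation rule (a leg is loaded on every link it enters within the period and is then held at that link's downstream node) is an assumption made for this sketch.

```python
from dataclasses import dataclass

PERIOD_MIN = 15.0  # length of one assignment interval (minutes)


@dataclass
class Link:
    free_flow_min: float  # free-flow travel time in minutes (hypothetical)
    capacity_vph: float   # capacity in vehicles per hour (hypothetical)


def bpr_minutes(link, flow_vph, alpha=0.15, beta=4.0):
    """BPR travel time; alpha and beta are assumed default coefficients."""
    return link.free_flow_min * (1.0 + alpha * (flow_vph / link.capacity_vph) ** beta)


def qdta_serial(links, demand_per_period, max_periods=50):
    """Load a serial corridor period by period, carrying residual legs forward.

    Returns (total vehicle-minutes, vehicles arrived in each period).
    """
    residual = {}  # start node -> vehicles waiting to resume there
    total_veh_min = 0.0
    arrivals = []
    period = 0
    while period < len(demand_per_period) or residual:
        if period >= max_periods:
            raise RuntimeError("residual demand did not clear")
        starts = dict(residual)
        if period < len(demand_per_period) and demand_per_period[period] > 0:
            starts[0] = starts.get(0, 0.0) + demand_per_period[period]
        residual = {}
        active = []  # [vehicles, remaining time budget] for legs in motion
        arrived = 0.0
        for k, link in enumerate(links):
            if k in starts:
                active.append([starts[k], PERIOD_MIN])
            vehicles = sum(v for v, _ in active)
            if vehicles == 0:
                continue
            # Convert vehicles per period into an hourly flow rate.
            travel_time = bpr_minutes(link, vehicles * 60.0 / PERIOD_MIN)
            total_veh_min += vehicles * travel_time
            still_moving = []
            for v, budget in active:
                budget -= travel_time
                if k == len(links) - 1:
                    arrived += v  # reached the destination node
                elif budget <= 0:
                    # Route truncated: resume from the downstream node next period.
                    residual[k + 1] = residual.get(k + 1, 0.0) + v
                else:
                    still_moving.append([v, budget])
            active = still_moving
        arrivals.append(arrived)
        period += 1
    return total_veh_min, arrivals


def sta_serial(links, demand_per_period):
    """Static loading: the hourly average flow is placed on every link."""
    vehicles = sum(demand_per_period)
    hours = len(demand_per_period) * PERIOD_MIN / 60.0
    flow_vph = vehicles / hours
    return sum(vehicles * bpr_minutes(link, flow_vph) for link in links)


# Hypothetical corridor: 4 identical links, demand only in the first two periods.
links = [Link(free_flow_min=5.0, capacity_vph=2000.0) for _ in range(4)]
demand = [400.0, 300.0, 0.0, 0.0]
qdta_total, arrivals = qdta_serial(links, demand)
sta_total = sta_serial(links, demand)
print(f"QDTA: {qdta_total:.1f} veh-min, arrivals per period: {arrivals}")
print(f"STA:  {sta_total:.1f} veh-min")
```

With these hypothetical inputs, the period-by-period loading concentrates the demand into higher hourly flow rates than the one-hour average, so the QDTA total exceeds the STA total, which mirrors the qualitative behavior described above.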
6 Application to San Francisco Bay Area Network
6.1 Analysis of Traffic Assignment Results
Two models are considered for the San Francisco Bay Area. The baseline model is a static traffic assignment (STA) for the 7–10 am morning peak, and the QDTA model is a user equilibrium assignment with travel time as the optimization objective, solved with the Frank-Wolfe solver and run for the entire day in 15-minute time intervals. The travel demand for 24 hours of a typical day was obtained from the SFCTA CHAMP6 model [62]. Each trip was identified with an origin and destination micro-analysis zone and then assigned to specific network nodes by weighting with the population density obtained from the Global Human Settlement population grid [67]. Fig. 9 shows the demand profiles for both cases under consideration. While STA assumes a constant demand for the entire morning peak period, QDTA considers a variable demand for the same duration. A professional map from HERE Technologies [61] is a core part of the foundation for the Mobiliti platform. The map information is transformed into a different representation in order to integrate it into the algorithms discussed earlier. However, we maintain the definitions of functional class roads as defined by HERE Technologies [61]. Specifically, functional classes classify roads according to the speed, importance, and connectivity of the road. A road can be one of five functional classes, which are defined in Table 3. The analysis presented here uses these functional classes to help explore the results that compare QDTA to STA.

Table 3: Functional class definitions.

| Functional Class | Definition |
|---|---|
| 1 | Allowing for high volume, maximum speed traffic movement |
| 2 | Allowing for high volume, high speed traffic movement |
| 3 | Providing a high volume of traffic movement |
| 4 | Providing for a high volume of traffic movement at moderate speeds between neighbourhoods |
| 5 | Roads whose volume and traffic movement are below the level of any other functional class |
Table 4 compares the system-level metrics, namely vehicle miles travelled (VMT), average volume-over-capacity (VOC) ratio, and vehicle hours of delay (VHD), categorised by functional class. Only links with positive flow are included in the analysis. Additionally, a comparison was made between congested and non-congested links (Table 5). For the QDTA analysis, congestion is calculated for the 7:45–8:00 AM time period, when the demand is highest.
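
Table 4: System-level metrics by functional class.

| Category | VMT (millions), STA | VMT (millions), QDTA | Average VOC, STA | Average VOC, QDTA | VHD (thousands), STA | VHD (thousands), QDTA |
|---|---|---|---|---|---|---|
| FC2 | 16.44 | 18.09 | 0.52 | 0.69 | 17.57 | 96.84 |
| FC3 | 4.67 | 5.00 | 0.23 | 0.30 | 2.65 | 14.43 |
| FC4 | 4.96 | 5.24 | 0.14 | 0.17 | 1.13 | 6.42 |
| FC5 | 2.43 | 2.60 | 0.03 | 0.48 | 0.36 | 17.78 |
| Total | 28.51 | 30.95 | - | - | 21.73 | 135.49 |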
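
Table 5: Congested links by functional class.

| Category | Length (km), STA | Length (km), QDTA | VMT (millions), STA | VMT (millions), QDTA | Average VOC, STA | Average VOC, QDTA |
|---|---|---|---|---|---|---|
| FC2 | 129 | 546 | 1.77 | 7.56 | 1.15 | 1.26 |
| FC3 | 37 | 86 | 0.18 | 0.42 | 1.18 | 1.27 |
| FC4 | 14 | 55 | 0.04 | 0.15 | 1.19 | 1.20 |
| FC5 | 5 | 23 | 0.01 | 0.06 | 1.17 | 1.23 |
| Total | 185 | 710 | 2.0 | 8.19 | - | - |
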
The main distinction between the two models is how well each replicates congestion on links. As seen in Table 5, QDTA predicts even higher congestion on the links that are congested in STA. Because link performance functions are convex, a non-uniform demand distribution results in higher travel times on links than a uniform one; QDTA with time-varying demand is therefore expected to produce more congestion than STA with uniform demand. QDTA also captures how congestion evolves over time for all functional classes, which STA cannot. Fig. 10 shows the VOC over time by functional class (left) and a VOC comparison of the two models by functional class (right). For all classes of roads, QDTA predicts congestion dynamically over time, whereas STA either overestimates or underestimates the values irrespective of the demand dynamics. This difference is most pronounced for FC2, where STA significantly underestimates the peak-hour congestion.
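The convexity argument can be made concrete with Jensen's inequality. The following is a sketch that assumes the standard BPR form with the common default coefficients $\alpha = 0.15$ and $\beta = 4$; the volume-over-capacity values are hypothetical and chosen only for illustration.

```latex
% BPR link performance function (assumed standard form), convex in v for \beta \ge 1:
t(v) = t_0 \left( 1 + \alpha \left( \frac{v}{c} \right)^{\beta} \right)

% Jensen's inequality for flows v_1, \dots, v_n with mean \bar{v} = \frac{1}{n} \sum_i v_i:
\frac{1}{n} \sum_{i=1}^{n} t(v_i) \;\ge\; t(\bar{v})

% Hypothetical example with two intervals, v_1/c = 1.2 and v_2/c = 0.4, so \bar{v}/c = 0.8:
\frac{1}{2} \left( 1.2^{4} + 0.4^{4} \right) = \frac{2.0736 + 0.0256}{2} = 1.0496
\qquad \text{versus} \qquad 0.8^{4} = 0.4096

% Resulting travel time multipliers t / t_0:
1 + 0.15 \times 1.0496 = 1.15744 \quad \text{(time-varying demand)}
\qquad
1 + 0.15 \times 0.4096 = 1.06144 \quad \text{(averaged demand)}
```

In this example, averaging the demand before loading understates the mean link travel time by roughly nine percent, and the gap widens as the peak becomes sharper.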
It can also be seen from Table 4 that the total system delay is significantly higher in QDTA than in STA. This is due to the dynamic demand distribution and the modelling capability in QDTA that allows demand to interact between time intervals, which gives a more realistic model of traffic. In QDTA, the residual demand from each time interval is carried over to the next interval (Fig. 11). It should be noted that, because of the route truncation mechanism, a trip that starts in a particular time period is not necessarily completed in that time period, so portions of a trip are loaded onto the network in multiple time periods. In QDTA, each trip leg only contributes flow to the links it traverses within the time segment, whereas in STA every leg contributes flow to its entire route. For the time interval 7:45–8:00 AM, STA has 185 km of congested network compared to 710 km for QDTA (Table 5). Fig. 12 shows the Bay Area network with congestion locations for the two modelling cases.
6.2 Validation
Validation was performed on the QDTA results to test how effectively they represent the real-world traffic environment. We validated traffic volume, speed, and system metrics using multiple data sources. Stage 1 of the validation procedure checks the traffic counts for eight link corridors. The traffic count for each link was compared against field data for the entire day in 15-minute increments. The field data for city roads and highways were collected for the year 2019 from the city of San Jose and the Caltrans PeMS website [68], respectively. Each corridor provided traffic volumes and speeds by time of day and direction. An R value of 0.7 is used as the criterion for a satisfactory link count check. Fig. 13 shows the R values for the eight corridors under consideration. The modeled corridors closely match the field data; the lowest observed R value is 0.68, for Zanker Road, which falls just below this criterion.
Stage 2 compares speeds against Uber Movement speed data for the San Francisco region for Q4 2019 [69]. Links from the Uber network were matched to our network for 139,495 links (20% of total). The speeds were compared for 8 am to 9 am for different speed limits. Fig. 14 shows the speed distributions from the QDTA results and Uber on links with 60 mph and 70 mph speed limits. Fig. 15 shows the average speeds from Mobiliti and Uber across all speed limits. We believe the observed discrepancies in the speed distributions may be reduced in future work by adding real-world traffic signal location and timing data and by further refining the link flow congestion and timing models.
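A link count check of this kind can be sketched as follows. This is an illustration only: it assumes that R denotes the Pearson correlation coefficient between modeled and observed 15-minute counts, and the function names and the sample counts are hypothetical rather than taken from the validation data.

```python
import math


def pearson_r(modeled, observed):
    """Pearson correlation between two equally long count series."""
    if len(modeled) != len(observed) or len(modeled) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    n = len(modeled)
    mean_m = sum(modeled) / n
    mean_o = sum(observed) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(modeled, observed))
    var_m = sum((m - mean_m) ** 2 for m in modeled)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_m == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_m * var_o)


def passes_count_check(modeled, observed, threshold=0.7):
    """Apply the R >= 0.7 criterion described in the text to one corridor."""
    return pearson_r(modeled, observed) >= threshold


# Hypothetical 15-minute counts for one corridor (vehicles per interval).
modeled_counts = [100, 250, 400, 300]
observed_counts = [120, 240, 380, 330]
print(f"R = {pearson_r(modeled_counts, observed_counts):.3f}")
print("passes:", passes_count_check(modeled_counts, observed_counts))
```

Because R measures only how well the two series move together, a corridor can pass this check while its absolute volumes are biased; a separate comparison of totals is needed to detect that.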
The final stage of validation includes system-level metric comparisons, network validation, and error checking. Model visualization is used to check for unusual activity in traffic flows and odd roadway network attributes. Error checking and model verification consist of several smaller tasks, such as checks of link geometry and connectivity, number of lanes, speeds, ramps, and intersection geometry. Since our travel demand data was obtained from SFCTA, which conducts its own validation, we did not conduct additional behavior checks. We conducted system metric checks for VMT and total demand and validated them against the 2017 Environmental Impact Report for the Bay Area [70] in Table 6.

Table 6: System metric checks against field data.

| Metric | QDTA | Field Data | Relative Error (%) |
|---|---|---|---|
| VMT | 150,453,402 | 158,406,800 | 5 |
| Daily Trips | 19,167,301 | 21,227,800 | 10 |
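The relative errors in Table 6 can be reproduced from the two value columns. The sketch below assumes the error is measured relative to the field value, which is the convention that matches the rounded percentages in the table; the function name is hypothetical.

```python
def relative_error_pct(model_value, field_value):
    """Absolute difference as a percentage of the field value (assumed convention)."""
    return abs(model_value - field_value) / field_value * 100.0


# Values taken from Table 6.
vmt_error = relative_error_pct(150_453_402, 158_406_800)
trips_error = relative_error_pct(19_167_301, 21_227_800)
print(f"VMT relative error:         {vmt_error:.2f}%")
print(f"Daily trips relative error: {trips_error:.2f}%")
```

Both modeled values fall below the field values, so the model underestimates VMT and daily trips by about 5 and 10 percent, respectively.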
7 Conclusions
In this paper, we have presented a quasi-dynamic traffic assignment methodology to capture temporal dynamics in large-scale transportation networks and described its parallelization to run efficiently on distributed-memory high-performance computing systems. Two key mechanisms, route truncation and residual demand, were implemented to provide more realistic demand profiles as the dynamic assignment interval is reduced and a greater percentage of vehicle trips span multiple time segments. Our approach divides the simulated day into 15-minute intervals and uses a modified static traffic assignment within each interval to assign the active traffic present in the network. The QDTA assignment step differs from a traditional STA in that the assigned routes are truncated to fit in the active time interval in each Frank-Wolfe iteration, so that the resulting solution only includes traffic that occurs within the current interval. Furthermore, the residual demand for each time interval is calculated based on the estimated travel time on the links and then carried over to the next time interval. The combination of these techniques resolves traffic that is resident in the network for a longer time horizon than a single static assignment period and captures dynamic network behavior across multiple time segments. The model can be interpreted as a time-expanded network approach, where the expansion of the network represents its evolution with respect to time.
References
 [1] Nikolaos Tsanakas, Joakim Ekström, and Johan Olstam. Estimating emissions from static traffic models: Problems and solutions. ISSN: 01976729 Pages: e5401792 Publisher: Hindawi Volume: 2020.
 [2] Michael G. H. Bell and Yasunori lida. Transportation Networks, chapter 2, pages 17–40. John Wiley and Sons, Ltd, 1997.
 [3] Stefan Flügel and Gunnar Flötteröd. Traffic assignment for strategic urban transport model systems.
 [4] Michiel Bliemer, Mark Raadsen, Erik de Romph, and ErikSander Smits. Requirements for traffic assignment models for strategic transport planning: A critical assessment. page 25.
 [5] D.K. Merchant and G.L. Nemhauser. A model and an algorithm for the dynamic traffic assignment problems. Transportation science, 12(3):183–199, 1978.
 [6] D.K. Merchant and G.L. Nemhauser. Optimality conditions for a dynamic traffic assignment model. Transportation Science 12.3, 12(3):200–207, 1978.
 [7] Takamasa Iryo. Multiple equilibria in a dynamic traffic network. 45(6):867–879.
 [8] Genetics of traffic assignment models for strategic transport planning. 37:56–78, January 2017.
 [9] Michiel CJ Bliemer, Mark PH Raadsen, ErikSander Smits, Bojian Zhou, and Michael GH Bell. Quasidynamic traffic assignment with residual point queues incorporating a first order node model. Transportation Research Part B: Methodological, 68:363–384, 2014.
 [10] Hasti Tajtehranifard, Ashish Bhaskar, Neema Nassir, Md Mazharul Haque, and Edward Chung. A path marginal cost approximation algorithm for system optimal quasidynamic traffic assignment. Transportation Research Part C: Emerging Technologies, 88:91–106, 2018.
 [11] Gaetano Fusco, Chiara Colombaroni, Andrea Gemma, and Stefano Lo Sardo. A quasidynamic traffic assignment model for large congested urban road networks. International Journal of Mathematical Models and Methods in Applied Sciences, 7(4):341–349, 2013.
 [12] Shoichiro Nakayama and Richard Connors. A quasidynamic assignment model that guarantees unique network equilibrium. Transportmetrica A: Transport Science, 10(7):669–692, 2014.
 [13] Xing Zeng, Xuefeng Guan, Huayi Wu, and Heping Xiao. A datadriven quasidynamic traffic assignment model integrating multisource traffic sensor data on the expressway network. ISPRS International Journal of GeoInformation, 10(3):113, 2021.
 [14] H.K. Chen. Dynamic travel choice models: a variational inequality approach. Springer Science & Business Media, 2012.
 [15] A. Nagurney and D. Zhang. Projected dynamical systems and variational inequalities with applications, volume 2. Springer Science & Business Media, 2012.
 [16] B. Ran and D.E. Boyce. Dynamic urban transportation network models: theory and implications for intelligent vehiclehighway systems, volume 417. Springer Science & Business Media, 2012.
 [17] B. Ran and T. Shimazaki. A general model and algorithm for the dynamic traffic assignment problems. In Transport Policy, Management & Technology Towards 2001: Selected Proceedings of the Fifth World Conference on Transport Research, volume 4, 1989.
 [18] B. Ran, D.E. Boyce, and L.J. LeBlanc. A new class of instantaneous dynamic useroptimal traffic assignment models. Operations Research, 41(1):192–202, 1993.
 [19] T.L. Friesz, J. Luque, R.L. Tobin, and B.W. Wie. Dynamic network traffic assignment considered as a continuous time optimal control problem. Operations Research, 37(6):893–901, 1989.
 [20] B. Ran and D.E. Boyce. A linkbased variational inequality formulation of ideal dynamic useroptimal route choice problem. Transportation Research Part C: Emerging Technologies, 4(1):1–12, 1996.
 [21] D.E. Boyce, B. Ran, and L.J. Leblanc. Solving an instantaneous dynamic useroptimal route choice model. Transportation Science, 29(2):128–142, 1995.
 [22] T.L. Friesz, D. Bernstein, T.E. Smith, R.L. Tobin, and B.W. Wie. A variational inequality formulation of the dynamic network user equilibrium problem. Operations Research, 41(1):179–191, 1993.
 [23] Alexandre Bayen, Alexander Keimer, Emily Porter, and Michele Spinola. Timecontinuous instantaneous and past memory routing on traffic networks: A mathematical analysis on the basis of the linkdelay model. SIAM Journal on Applied Dynamical Systems, 18(4):2143–2180, 2019.
 [24] S. Peeta and T.H. Yang. Stability issues for dynamic traffic assignment. Automatica, 39(1):21–34, 2003.
 [25] R. Ma, X.J. Ban, and J.S. Pang. Continuoustime dynamic system optimum for singledestination traffic networks with queue spillbacks. Transportation Research Part B: Methodological, 68:98–122, 2014.
 [26] X.J. Ban, J.S. Pang, H.X. Liu, and R. Ma. Modeling and solving continuoustime instantaneous dynamic user equilibria: A differential complementarity systems approach. Transportation Research Part B: Methodological, 46(3):389–408, 2012.
 [27] K. Han, T.L. Friesz, and T. Yao. A partial differential equation formulation of vickrey’s bottleneck model, part i: Methodology and theoretical analysis. Transportation Research Part B: Methodological, 49:55 – 74, 2013.
 [28] K. Han, T.L. Friesz, and T. Yao. A partial differential equation formulation of vickrey’s bottleneck model, part ii: Numerical analysis and computation. Transportation Research Part B: Methodological, 49:75 – 93, 2013.
 [29] K. Han, T.L. Friesz, and T. Yao. Existence of simultaneous route and departure choice dynamic user equilibrium. Transportation Research Part B: Methodological, 53:17 – 30, 2013.
 [30] A. Bressan and K.T. Nguyen. Optima and equilibria for traffic flow on networks with backward propagating queues. NHM, 10(4):717–748, 2015.
 [31] A. Bressan and K.T. Nguyen. Conservation law models for traffic flow on a network of roads. NHM, 10(2):255–293, 2015.
 [32] M. Garavello, K. Han, and B. Piccoli. Models for vehicular traffic on networks, volume 9. American Institute of Mathematical Sciences (AIMS), Springfield, MO, 2016.
 [33] M. Garavello and B. Piccoli. Traffic flow on networks, volume 1. American institute of mathematical sciences Springfield, 2006.
 [34] H. Holden and N. Risebro. A mathematical model of traffic flow on a network of unidirectional roads. SIAM Journal on Mathematical Analysis, 26(4):999–1017, 1995.
 [35] G. Bretti, R. Natalini, and B. Piccoli. Numerical approximations of a traffic flow model on networks. NHM, 1(1):57–84, 2006.
 [36] MJ Lighthill and GB Whitham. On kinematic waves. i. flood movement in long rivers. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences, 229(1178):281–316, 1955.
 [37] P. I. Richards. Shock waves on the highway. Operations research, 4(1):42–51, 1956.
 [38] A. Keimer, N. LaurentBrouty, F. Farokhi, H. Signargout, V. Cvetkovic, A.M. Bayen, and K.H. Johansson. Information patterns in the modeling and design of mobility management services. Proceedings of the IEEE, 106(4):554–576, 2018.
 [39] M. Gugat, A. Keimer, G. Leugering, and Z. Wang. Analysis of a system of nonlocal conservation laws for multicommodity flow on networks. Networks and Heterogeneous Media, 10(4):749–785, 2015.
 [40] A. Keimer, L. Pflug, and M. Spinola. Nonlocal scalar conservation laws on bounded domains and applications in traffic flow. accepted in SIAM SIMA, 2018.
 [41] A. Aw and M. Rascle. Resurrection of "second order" models of traffic flow. SIAM Journal on Applied Mathematics, 60(3):916–938, 2000.
 [42] H.M. Zhang. A non-equilibrium traffic model devoid of gas-like behavior. Transportation Research Part B: Methodological, 36(3):275–290, 2002.
 [43] M.O. Ghali and M.J. Smith. A model for the dynamic system optimum traffic assignment problem. Transportation Research Part B: Methodological, 29(3):155–170, 1995.
 [44] S. Peeta and H.S. Mahmassani. System optimal and user equilibrium time-dependent traffic assignment in congested networks. Annals of Operations Research, 60(1):81–113, 1995.
 [45] M.J. Smith. A new dynamic traffic model and the existence and calculation of dynamic user equilibria on congested capacity-constrained road networks. Transportation Research Part B: Methodological, 27(1):49–63, 1993.
 [46] S.T. Waller and A.K. Ziliaskopoulos. A chance-constrained based stochastic dynamic traffic assignment model: Analysis, formulation and solution algorithms. Transportation Research Part C: Emerging Technologies, 14(6):418–427, 2006.
 [47] H.K. Chen and C.F. Hsueh. A model and an algorithm for the dynamic user-optimal route choice problem. Transportation Research Part B: Methodological, 32(3):219–234, 1998.
 [48] J. Long, W.Y. Szeto, H.J. Huang, and Z. Gao. An intersection-movement-based stochastic dynamic user optimal route choice model for assessing network performance. Transportation Research Part B: Methodological, 74:182–217, 2015.
 [49] B.N. Janson. Dynamic traffic assignment for urban road networks. Transportation Research Part B: Methodological, 25(2-3):143–161, 1991.
 [50] B.N. Janson. Convergent algorithm for dynamic traffic assignment. Transportation Research Record, (1328), 1991.
 [51] J.K. Ho. A successive linear optimization approach to the dynamic traffic assignment problem. Transportation Science, 14(4):295–305, 1980.
 [52] Ricardo Correa, Ines de Castro Dutra, Mario Fiallos, and Luiz Fernando Gomes da Silva. Models for Parallel and Distributed Computation: Theory, Algorithmic Techniques and Applications, volume 67. Springer Science & Business Media, 2013.
 [53] Transportation Simulation Systems. Aimsun. https://www.aimsun.com.
 [54] Institute of Transportation Systems SUMO, German Aerospace Center (DLR). http://www.sumo.dlr.de, 2018.
 [55] Sunil Thulasidasan, Shiva Kasiviswanathan, Stephan Eidenbenz, Emanuele Galli, Susan Mniszewski, and Phillip Romero. Designing systems for large-scale, discrete-event simulations: Experiences with the FastTrans parallel microsimulator. In 2009 International Conference on High Performance Computing (HiPC), pages 428–437. IEEE, 2009.
 [56] Colin Sheppard et al. Modeling plug-in electric vehicle charging demand with BEAM, the framework for behavior energy autonomy mobility. Technical report, 05/2017 2017.
 [57] Polaris.
 [58] Wasuwat Petprakob, Lalith Wijerathne, Takamasa Iryo, Junji Urata, Kazuki Fukuda, and Muneo Hori. On the implementation of high performance computing extension for day-to-day traffic assignment. Transportation Research Procedia, 34:267–274, 2018.
 [59] Willem Himpe, Romain Ginestou, and MJ Chris Tampère. High performance computing applied to dynamic traffic assignment. Procedia Computer Science, 151:409–416, 2019.
 [60] M. Patriksson. The traffic assignment problem: models and methods. Courier Dover Publications, 2015.
 [61] HERE Technologies. https://www.here.com/, 2019. [Online; accessed 06-Feb-2019].
 [62] SFCTA. SF-CHAMP 6.1: ConnectSF Needs Assessment 2015 Base Year Model Run. Technical report, San Francisco County Transportation Authority, February 2019.
 [63] Cy Chan, Bin Wang, John Bachan, and Jane Macfarlane. Mobiliti: scalable transportation simulation using high-performance parallel computing. In 2018 21st International Conference on Intelligent Transportation Systems (ITSC), pages 634–641. IEEE, 2018.
 [64] Julian Dibbelt, Ben Strasser, and Dorothea Wagner. Customizable contraction hierarchies. In International Symposium on Experimental Algorithms, pages 271–282. Springer, 2014.
 [65] Hayssam Sbayti, Chung-Cheng Lu, and Hani S. Mahmassani. Efficient implementation of method of successive averages in simulation-based dynamic traffic assignment models for large-scale network applications. Transportation Research Record, 2029(1):22–30, 2007.
 [66] John L. Gustafson. Amdahl’s Law, pages 53–60. Springer US, Boston, MA, 2011.
 [67] European Commission, Joint Research Centre (JRC); Columbia University, Center for International Earth Science Information Network  CIESIN. Ghs population grid, derived from gpw4, multitemporal (2015). http://data.europa.eu/89h/jrcghslghs_pop_gpw4_globe_r2015a, 2015. European Commission, Joint Research Centre (JRC) [Dataset].
 [68] Caltrans, State of California. https://dot.ca.gov/programs/trafficoperations/census/trafficvolumes, 2019. [Online; accessed 03-January-2021].
 [69] Uber Movement. https://movement.uber.com/cities/san_francisco/downloads/speeds, 2019. [Online; accessed 03-January-2021].
 [70] Environmental impact report, Plan Bay Area. https://www.planbayarea.org/2040plan/environmentalimpactreport, 2017. [Online; accessed 05-November-2020].
6 Application to San Francisco Bay Area Network
6.1 Analysis of Traffic Assignment Results
Two models are considered for the San Francisco Bay Area. The baseline model is static traffic assignment (STA) for the 7-10 AM morning peak. The QDTA model computes a user equilibrium with shortest-travel-time optimization using the Frank-Wolfe solver, run for the entire day in 15-minute time intervals. The travel demand was obtained from the SFCTA CHAMP6 model [62] for 24 hours of a typical day. Each trip was identified with an origin and a destination micro-analysis zone, each of which was then assigned to a specific network node by weighting with the population density obtained from the Global Human Settlement population grid [67]. Fig. 9 shows the demand profiles for both cases under consideration. While STA assumes a constant demand for the entire morning peak period, QDTA considers a variable demand for the same duration.

A professional map from HERE Technologies [61] is a core part of the foundation for the Mobiliti platform. The map information is transformed into a different representation so that it can be integrated into the algorithms discussed earlier. However, we maintain the functional class definitions of HERE Technologies [61]. Specifically, functional classes classify roads according to their speed, importance and connectivity. A road belongs to one of five functional classes, which are defined in Table 3. The analysis presented here uses these functional classes to explore the results that compare QDTA to STA.

Table 3: Functional class definitions.

| Functional Class | Definition |
|---|---|
| 1 | Allowing for high volume, maximum speed traffic movement |
| 2 | Allowing for high volume, high speed traffic movement |
| 3 | Providing a high volume of traffic movement |
| 4 | Providing for a high volume of traffic movement at moderate speeds between neighbourhoods |
| 5 | Roads whose volume and traffic movement are below the level of any other functional class |
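The population-weighted assignment of trip ends to network nodes described earlier in this subsection can be illustrated with a minimal sketch. The data layout (a list of node and weight pairs per zone), the function name `assign_trip_endpoint`, and the uniform fallback for zones without population data are all assumptions made for illustration; they are not taken from the Mobiliti implementation.

```python
import random


def assign_trip_endpoint(zone_nodes, rng):
    """Pick one node in a zone with probability proportional to its population weight.

    zone_nodes: list of (node_id, population_weight) pairs (hypothetical layout).
    rng: a random.Random instance, passed in so that results are reproducible.
    """
    node_ids = [node_id for node_id, _ in zone_nodes]
    weights = [weight for _, weight in zone_nodes]
    if sum(weights) <= 0:
        # Assumed fallback: no population information, so choose uniformly.
        return rng.choice(node_ids)
    return rng.choices(node_ids, weights=weights, k=1)[0]


# Hypothetical zone with three candidate nodes and their population weights.
zone = [("n1", 120.0), ("n2", 30.0), ("n3", 0.0)]
rng = random.Random(0)
draws = [assign_trip_endpoint(zone, rng) for _ in range(10_000)]
share_n1 = draws.count("n1") / len(draws)
```

In this toy zone, node `n1` carries 80% of the population weight and therefore receives roughly 80% of the trip ends, while the unpopulated node `n3` receives none.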
Table 4 compares system-level metrics, namely vehicle miles travelled (VMT), average volume-over-capacity (VOC) ratio, and vehicle hours of delay (VHD), categorised by functional class. Only links with positive flow are included in the analysis. Additionally, congested and non-congested links were compared (Table 5). For the QDTA analysis, congestion is calculated for the 7:45-8:00 AM time period, when the demand is highest.
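Table 4: System-level metrics by functional class for STA and QDTA.

| Category | VMT, STA (millions) | VMT, QDTA (millions) | Average VOC, STA | Average VOC, QDTA | VHD, STA (thousands) | VHD, QDTA (thousands) |
|---|---|---|---|---|---|---|
| FC2 | 16.44 | 18.09 | 0.52 | 0.69 | 17.57 | 96.84 |
| FC3 | 4.67 | 5.00 | 0.23 | 0.30 | 2.65 | 14.43 |
| FC4 | 4.96 | 5.24 | 0.14 | 0.17 | 1.13 | 6.42 |
| FC5 | 2.43 | 2.60 | 0.03 | 0.48 | 0.36 | 17.78 |
| Total | 28.51 | 30.95 | | | 21.73 | 135.49 |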
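Table 5: Congested links by functional class for STA and QDTA.

| Category | Length, STA (km) | Length, QDTA (km) | VMT, STA (millions) | VMT, QDTA (millions) | Average VOC, STA | Average VOC, QDTA |
|---|---|---|---|---|---|---|
| FC2 | 129 | 546 | 1.77 | 7.56 | 1.15 | 1.26 |
| FC3 | 37 | 86 | 0.18 | 0.42 | 1.18 | 1.27 |
| FC4 | 14 | 55 | 0.04 | 0.15 | 1.19 | 1.20 |
| FC5 | 5 | 23 | 0.01 | 0.06 | 1.17 | 1.23 |
| Total | 185 | 710 | 2.0 | 8.19 | | |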
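To make the metrics reported in the tables concrete, the following sketch shows one common way to compute VMT, average VOC and VHD per functional class from link-level results. The dictionary keys, the function names, and the congestion threshold of VOC > 1 are assumptions for illustration; the source does not state the exact formulas or threshold used.

```python
from collections import defaultdict


def system_metrics(links):
    """Aggregate VMT, average VOC and VHD per functional class.

    links: iterable of dicts with hypothetical keys
      fc, flow (vehicles), capacity (vehicles), length_mi,
      t_free_h and t_cong_h (free-flow and congested travel time, in hours).
    """
    acc = defaultdict(lambda: {"vmt": 0.0, "vhd": 0.0, "voc_sum": 0.0, "n": 0})
    for link in links:
        if link["flow"] <= 0:
            continue  # only links with positive flow are included
        a = acc[link["fc"]]
        a["vmt"] += link["flow"] * link["length_mi"]
        a["vhd"] += link["flow"] * (link["t_cong_h"] - link["t_free_h"])
        a["voc_sum"] += link["flow"] / link["capacity"]
        a["n"] += 1
    return {
        fc: {"vmt": a["vmt"], "avg_voc": a["voc_sum"] / a["n"], "vhd": a["vhd"]}
        for fc, a in acc.items()
    }


def is_congested(link):
    """Assumed definition: a link is congested when its VOC exceeds 1."""
    return link["flow"] / link["capacity"] > 1.0


# Hypothetical link results, not taken from the Bay Area runs.
links = [
    {"fc": 2, "flow": 1800, "capacity": 2000, "length_mi": 1.5,
     "t_free_h": 0.025, "t_cong_h": 0.030},
    {"fc": 2, "flow": 2400, "capacity": 2000, "length_mi": 0.5,
     "t_free_h": 0.010, "t_cong_h": 0.020},
    {"fc": 5, "flow": 0, "capacity": 400, "length_mi": 0.2,
     "t_free_h": 0.010, "t_cong_h": 0.010},
]
metrics = system_metrics(links)
congested_flags = [is_congested(link) for link in links]
```

The zero-flow FC5 link is dropped from the aggregation, mirroring the restriction to links with positive flow stated above.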
The main distinction between the two models is how well each can replicate congestion on links. As seen in Table 5, for the links that are congested in STA, QDTA predicts even higher congestion. This is consistent with the convexity of link performance functions: a non-uniform demand distribution results in higher travel times on links than a uniform distribution of the same total demand, so QDTA with varying demand is expected to produce more congestion than STA with uniform demand. QDTA also captures the time variation of congestion for all functional class categories, which STA cannot. Fig. 10 shows the VOC over time by functional class (left) and the VOC comparison for the two models by functional class (right). For all classes of roads, QDTA predicts congestion dynamically over time, whereas STA either overestimates or underestimates the values irrespective of the demand dynamics. This difference is most pronounced for FC2, where STA significantly underestimates the peak-hour congestion.
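The convexity argument can be checked numerically with a small sketch. It assumes a BPR-type link performance function with the usual parameters (alpha = 0.15, beta = 4), which is a common choice but is not confirmed by this section to be the function used in QDTA; the demand profile is also hypothetical.

```python
def bpr_time(volume, capacity, t_free=1.0, alpha=0.15, beta=4.0):
    """BPR-type link travel time (assumed form): t_free * (1 + alpha * (v/c)**beta)."""
    return t_free * (1.0 + alpha * (volume / capacity) ** beta)


capacity = 1000.0
# Hypothetical demand in four consecutive intervals; the mean equals the capacity.
profile = [600.0, 900.0, 1300.0, 1200.0]
mean_volume = sum(profile) / len(profile)

# Uniform demand: every interval carries the mean volume.
uniform_time = bpr_time(mean_volume, capacity)

# Varying demand: average the travel time over the actual profile.
varying_time = sum(bpr_time(v, capacity) for v in profile) / len(profile)

print(f"uniform: {uniform_time:.4f}, varying: {varying_time:.4f}")
```

Because the function is convex in volume, Jensen's inequality guarantees that the averaged travel time under the varying profile is at least the travel time at the mean volume, which is the effect the paragraph above attributes to QDTA.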
It can also be seen from Table 4 that the total system delay is significantly higher in QDTA than in STA. This is due to the dynamic demand distribution and to the modelling capability in QDTA that allows interaction of demand between time intervals, and thus a more realistic modelling of traffic. In QDTA, the residual demand from each time interval is carried over to the next interval (Fig. 11). Note that, because of the route truncation mechanism, not all portions of a trip that starts in a particular time period are completed in that period, so portions of a trip are loaded onto the network in multiple time periods. In QDTA, each trip leg only contributes flow to the links it traverses within the time segment, whereas in STA every leg contributes flow to the entire route. For the time interval 7:45-8:00 AM, STA has 185 km of congested network in comparison to QDTA with 710 km of congested network (Table 5). Fig. 12 shows the Bay Area network with congestion locations for the two modelling cases.
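The route truncation and carry-over mechanism described above can be sketched as follows. This is an illustrative simplification, not the QDTA implementation: the route layout, the function names, and the rule that a link counts toward the interval in which the vehicle enters it are all assumptions.

```python
def truncate_route(route, start_min, interval_end_min):
    """Split a route at a time-interval boundary.

    route: list of (link_id, travel_time_min) in traversal order (hypothetical layout).
    Returns (links_loaded_now, residual_route, residual_start_min).
    Assumption: a link counts toward the interval in which the vehicle enters it.
    """
    clock = start_min
    for i, (_, travel_time) in enumerate(route):
        if clock >= interval_end_min:
            return route[:i], route[i:], clock
        clock += travel_time
    return route, [], clock


def load_trip(route, depart_min, interval_min=15.0):
    """Return {interval_index: [link_ids]}, i.e. where each trip leg contributes flow."""
    loads = {}
    remaining, clock = route, depart_min
    while remaining:
        k = int(clock // interval_min)
        interval_end = (k + 1) * interval_min
        now, remaining, clock = truncate_route(remaining, clock, interval_end)
        loads[k] = [link_id for link_id, _ in now]
    return loads


# Hypothetical 22-minute trip departing at 7:45 AM (minute 465 of the day).
route = [("a", 6.0), ("b", 7.0), ("c", 5.0), ("d", 4.0)]
loads = load_trip(route, depart_min=465.0)
print(loads)  # {31: ['a', 'b', 'c'], 32: ['d']}
```

In this toy case the residual leg on link `d` is carried over to the next 15-minute interval, whereas an STA loading would place flow on all four links within the same period.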