Even though extensive road networks have been developed to satisfy the high demand for vehicular transportation, overoccupancy of roads still occurs on a daily basis, causing traffic jams which hurt the environment, the economy and the drivers’ moods. Finding a solution to traffic congestion is a challenging problem that has occupied many in the past century. After all, traffic dynamics are difficult to predict, due to complex fluctuations in traffic demand, both spatial and temporal. This makes it hard to devise a protocol for traffic flow redistribution that works well in varying conditions.
To date, various approaches have been proposed to alleviate congestion in some way (KumarShukla2015). However, these methods tend to be either static, data-independent protocols, micro-scale solutions (on the level of individual roads) or primarily driven by theoretical models. Our objective instead is to construct a dynamic, data-driven, macro-scale (road network level) approach to address traffic congestion. In this sense, dynamic means that a solution can be adapted to new traffic data with relative ease.
In this work, we propose a method for traffic redistribution fueled by metaheuristic optimisation, which we test on the case of the city centre of Tokyo. We seek to shift the traffic situation away from a state where each driver chooses the fastest, or shortest, route (thus causing congestion on roads that occur in many shortest routes), towards a system optimal equilibrium, as coined by Wardrop (Wardrop1952), where the total travel time for all drivers is minimised. By introducing externally imposed variable costs (e.g. tolls, or any other financial or non-financial method a supervisory institution might deploy) on each road, we aim to discourage drivers from all taking the same congested roads. This approach asserts that, on average, each driver is willing to take the cheapest route from their point of departure to the destination, where the total costs to drive a route depend both on the distance travelled, through a spatial cost, and the imposed variable costs encountered along the route.
In order to make predictions of traffic flow and occurrence of congestion, we infer traffic demand from a data set of urban movements. A number of public traffic data sets, such as the Dutch NDW (ndw), report the traffic flow or density at certain points in time; however, while this gives a detailed picture of a local situation, it provides no information as to what routes drivers are following. Hence, such data is of little use when we wish to redistribute traffic by encouraging sensible alternative routes. For this reason, we use a data set of urban movements, provided by Foursquare as part of the Future cities challenge, which allows for the inference of traffic level information needed for this research such as the origin and destination of movements (fcc).
2. Related work
The objective of combatting traffic congestion by altering road network setups has been addressed in a large body of work. The use of road pricing as a means to achieve this goal is a prevalent approach (Keeler2002; Walters1961; Yang2017; Ye2012). In this context, the marginal cost of congestion is a frequently employed measure to assess the optimal road pricing. A key difference between these papers and our work is that the previously proposed road pricing policies are fixed (e.g. charge a fee within a certain radius of the city centre) instead of dynamic, and do not follow from an optimisation procedure based on actual movement data. Approaches for optimising road networks and traffic flow from a different viewpoint, unrelated to road pricing policies, include metaheuristic optimisation of road improvements (Gallo2010), development of intelligent traffic light systems (Pallavi2018), optimisation of road graph architectures with evolutionary algorithms (Schweitzer1997), and prediction of optimal traffic flow through maximum-entropy methods (Liu2008). A more exhaustive list of methods is provided by Kumar Shukla and Agrawal (KumarShukla2015). In this work, we explore the use of optimisation algorithms in proposing a dynamic pricing mechanism using actual movement data.
3. Problem statement
The problem of optimising traffic flow through adaptive road pricing is twofold. First, we must estimate traffic flow and congestion in a road network; congestion is the underlying cause of high total travel time when all drivers follow the cheapest routes from their origins to their destinations. Combining the movement data, which describes the traffic demand in the network, with the spatial road network data allows traffic flow theory to yield these estimates. The demand data is a set $M$ consisting of movements between venue locations within the road network. Each element of this set (indexed as $k$) is a tuple $m_k = (o_k, d_k, f_k)$, where $o_k$ and $d_k$ are elements of a venue data set $\mathcal{V}$ containing spatial information about the venues, and $f_k$ is the recorded frequency of the specific movement from $o_k$ to $d_k$. Second, having found a method to express the total travel time as a function of the variable cost parameters, we aim to optimise these parameters for minimal total travel time. Our methods for addressing this optimisation problem are set out in detail in the next section.
4.1. Traffic flow estimation
4.1.1. Road network and routing model
The first step towards prediction and minimisation of congestion is to represent the physical road network as a planar graph that has road segments for edges, which may be traversed in order to travel from an origin to a destination. Specifically, the graph is a tuple $G = (N, E, L)$ with $N$ the node set, $E$ the edge set and $L$ the set of Haversine lengths $\ell_{ij}$ of all edges $(i, j) \in E$. The node set contains intersections in the road network, as well as nodes for the origin and destination locations from the movement set $M$.
We then introduce, for each road segment $(i, j)$ in the graph, a cost that a driver needs to pay to traverse this segment. The main part of this cost is a variable cost $c_{ij}$. All variable cost parameters collectively form the variable cost vector $\mathbf{c}$, which we seek to optimise for minimum congestion. Next, we assign to each segment a spatial cost, which is an immutable base cost for travelling from node $i$ to node $j$ that is linearly dependent on the length $\ell_{ij}$ of the segment by a tunable factor $\alpha$. Since the movement data is aggregated into frequency numbers, and is not provided on an individual level for anonymity reasons, we take $\alpha$ to be equal for all drivers. Put together, the total cost for a driver to travel via a connected route $R$ of segments from some origin to a destination is given by the sum of the individual segment costs occurring on the route:
$$C(R) = \sum_{(i,j) \in R} \left( c_{ij} + \alpha\, \ell_{ij} \right).$$
For the development of traffic flow, we assume that all drivers are selfish and seek to drive the route which incurs the lowest total cost. These routes can be found using a weighted shortest path algorithm. Note that if $c_{ij} = 0$ on all edges, each driver will drive the route of lowest spatial cost, which is exactly the shortest route. From the cheapest routes, which are connected sequences of segments, and the frequency numbers $f_k$, we can predict the vehicle count $n_{ij}$ on each segment, from which the degree of congestion is computed as set out in the following subsection.
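The routing step can be illustrated with a minimal sketch, which is not the implementation used in this work. It assumes non-negative variable costs (so that Dijkstra's algorithm applies), directed edges, and hypothetical names throughout (`cheapest_route`, `segment_counts`, the toy node labels). Each movement is assigned in full to its single cheapest route, and the movement frequencies are accumulated per segment.

```python
import heapq
from collections import defaultdict


def cheapest_route(edges, origin, destination, var_cost, alpha):
    """Dijkstra search on directed edges {(i, j): length}.

    The weight of an edge is its variable cost plus alpha times its length.
    Returns the route as a list of edges, or None if no route exists.
    """
    adjacency = defaultdict(list)
    for (i, j), length in edges.items():
        adjacency[i].append((j, var_cost.get((i, j), 0.0) + alpha * length))

    best = {origin: 0.0}          # lowest known cost to reach each node
    previous = {}                 # predecessor on the cheapest known route
    heap = [(0.0, origin)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue              # stale heap entry
        for neighbour, weight in adjacency[node]:
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(heap, (new_cost, neighbour))

    if destination not in best:
        return None
    route, node = [], destination
    while node != origin:
        route.append((previous[node], node))
        node = previous[node]
    return route[::-1]


def segment_counts(edges, movements, var_cost, alpha=1.0):
    """Sum movement frequencies over the segments of each cheapest route."""
    counts = defaultdict(float)
    for origin, destination, frequency in movements:
        route = cheapest_route(edges, origin, destination, var_cost, alpha)
        for edge in route or []:
            counts[edge] += frequency
    return dict(counts)


# Toy network (hypothetical): two routes from A to D, the one via B is shorter.
toy_edges = {("A", "B"): 1.0, ("B", "D"): 1.0, ("A", "C"): 1.5, ("C", "D"): 1.5}
toy_movements = [("A", "D", 10)]
print(segment_counts(toy_edges, toy_movements, {}))
print(segment_counts(toy_edges, toy_movements, {("A", "B"): 2.0}))
```

With zero variable costs all ten vehicles take the shorter route via B; a variable cost of 2.0 on the first segment of that route makes the route via C cheaper, so the traffic shifts there.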
4.1.2. Congestion model
In our congestion model, we assume that the flow of traffic on a road segment is fully described by a Greenshields fundamental traffic flow curve (Greenshields1934), which is a widely used theoretical model for predicting traffic dynamics on a road segment. The variables used are the flow $q$ (vehicles passing by per unit time) and the density $\rho$ (vehicles present per unit length). In this model, there exists a maximum density $\rho_{\max}$ that the road can support, beyond which the total flow is zero. Furthermore, there is a critical density $\rho_c$ at which the flow reaches its maximum value: $q(\rho_c) = q_{\max}$. Naturally, $q(0) = 0$, as no flow exists when no cars are present.
A basic curve that fits this description is a concave quadratic function, with zero flow at $\rho = 0$ and $\rho = \rho_{\max}$, which we define as
$$q(\rho) = \frac{4\, q_{\max}}{\rho_{\max}^2}\, \rho \left( \rho_{\max} - \rho \right),$$
where we note that, since $q$ is quadratic in $\rho$, $\rho_c = \rho_{\max} / 2$. The maximum flow is directly related to the critical density; assuming that the traffic is able to drive at the maximum allowed speed $v_{\max}$ when the density is at its critical point, we set
$$q_{\max} = v_{\max}\, \rho_c.$$
From this density-flow dependence, we can extract the space mean speed $v$, the average speed of all vehicles on the road segment, as (Greenshields1934)
$$v(\rho) = \min\left( v_{\max},\ \frac{q(\rho)}{\rho} \right),$$
where again the maximum speed enters the relation, this time as a bound on the space mean speed. Finally, the space mean travel time $t$ on the segment, taken to have length $\ell$, is computed as
$$t(\rho) = \frac{\ell}{v(\rho)}. \tag{4}$$
From the expected number of vehicles $n$ on each road segment, we obtain the segment density as $\rho = n / \ell$. For multi-lane roads, we divide this number by the number of lanes. By inserting the segment density into the density-time relation described above (eq. 4), we find the space mean time spent on the segment. The collection of segment travel times then leads to the definition of the objective function for optimisation by a metaheuristic algorithm.
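The chain from vehicle count to travel time can be sketched in a few lines. This is an illustration under stated assumptions rather than the code used in this work. It assumes a quadratic flow curve with zeros at zero and maximum density, a maximum flow equal to the maximum speed times the critical density, and a speed floor `v_floor` standing in for the arbitrarily low speed used on fully congested roads (its value here is made up). Function names and units (km, hours, vehicles per km) are likewise hypothetical.

```python
def flow(rho, rho_max, v_max):
    """Quadratic flow curve: zero at rho = 0 and rho = rho_max."""
    rho_c = rho_max / 2.0          # critical density of a symmetric parabola
    q_max = v_max * rho_c          # maximum flow, reached at rho_c
    return 4.0 * q_max * rho * (rho_max - rho) / rho_max**2


def mean_speed(rho, rho_max, v_max):
    """Space mean speed q / rho, bounded above by v_max and below by 0."""
    if rho <= 0.0:
        return v_max               # empty road: drive at the speed limit
    return min(v_max, max(flow(rho, rho_max, v_max) / rho, 0.0))


def travel_time(n_vehicles, length, lanes, rho_max, v_max, v_floor=1.0):
    """Space mean travel time on one segment.

    The density is the vehicle count per unit length, divided by the
    number of lanes. v_floor is an assumed stand-in speed for jammed
    segments, where the model speed would otherwise be zero.
    """
    rho = n_vehicles / (length * lanes)
    speed = max(mean_speed(rho, rho_max, v_max), v_floor)
    return length / speed


# Example: a 2 km single-lane segment, rho_max = 100 veh/km, v_max = 50 km/h.
print(travel_time(50, 2.0, 1, 100.0, 50.0))    # below critical density
print(travel_time(150, 2.0, 1, 100.0, 50.0))   # above critical density
```

Below the critical density the speed stays at the limit; above it the speed falls linearly to zero at the maximum density, so the travel time grows sharply as a segment fills up.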
4.2. Parameter optimisation
4.2.1. Objective function
The objective function computes the measure $T(\mathbf{c})$, which reflects the extent to which the system optimal equilibrium is reached by a variable cost configuration $\mathbf{c}$. This equilibrium occurs when the total travel time, summed over all drivers and their routes, is minimal. As such, we define the objective function as the total space mean travel time over all routes driven on the road network. This total travel time can be conveniently expressed using the segment vehicle counts $n_{ij}$, which are directly dependent on $\mathbf{c}$:
$$T(\mathbf{c}) = \sum_{(i,j) \in E} n_{ij}(\mathbf{c})\, t_{ij}(\mathbf{c}).$$
Note that each segment mean travel time $t_{ij}$ is also a function of $\mathbf{c}$, since it is dependent on the vehicle count $n_{ij}$. Algorithm 1 shows, in pseudocode, the routine to compute the objective function.
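As a rough illustration of how such an objective could be evaluated, the sketch below sums vehicle count times segment travel time over all segments. It is a simplified stand-in, not a transcription of Algorithm 1: candidate routes are enumerated by hand instead of being found by a shortest path search, every segment has one lane, and the speed is assumed to stay at the limit up to the critical density and then fall linearly to zero at the maximum density. All names and numbers are hypothetical.

```python
def travel_time(n, length, rho_max=100.0, v_max=50.0, v_floor=1.0):
    """Segment travel time (hours) for n vehicles on a single-lane segment."""
    rho = n / length
    speed = min(v_max, 2.0 * v_max * (1.0 - rho / rho_max))
    return length / max(speed, v_floor)   # v_floor: assumed jam speed


def route_cost(route, var_cost, lengths, alpha):
    """Total cost of a route: variable cost plus spatial cost per segment."""
    return sum(var_cost.get(e, 0.0) + alpha * lengths[e] for e in route)


def objective(var_cost, lengths, demand, alpha=1.0):
    """Total travel time in vehicle-hours for a variable cost configuration.

    demand is a list of (candidate_routes, frequency) pairs; each movement
    is assigned in full to its cheapest candidate route.
    """
    counts = {edge: 0.0 for edge in lengths}
    for routes, frequency in demand:
        chosen = min(routes, key=lambda r: route_cost(r, var_cost, lengths, alpha))
        for edge in chosen:
            counts[edge] += frequency
    return sum(n * travel_time(n, lengths[e]) for e, n in counts.items())


# Toy case: two movements share a 1 km bridge; only the second has a bypass.
lengths = {"bridge": 1.0, "bypass": 1.2}
demand = [([("bridge",)], 40), ([("bridge",), ("bypass",)], 40)]
print(objective({}, lengths, demand))               # everyone on the bridge
print(objective({"bridge": 0.5}, lengths, demand))  # second movement diverted
```

In this toy case the variable cost on the bridge moves the second movement to the slightly longer bypass, and the total travel time drops because the bridge is no longer above its critical density.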
4.2.2. Optimisation algorithms
The purpose of the optimisation algorithms is to find an optimal variable cost configuration such that the value of the objective function, the total travel time, is minimised. In principle, any black-box metaheuristic optimisation algorithm could be used to search for local optima of variable cost configurations which might approximate a system optimal equilibrium. That said, for these purposes, algorithms which are robust for high-dimensional problems are preferred, as the number of parameters increases proportionally to the number of edges in the graph.
For our proof-of-concept implementation we use simulated annealing (SA) (Kirkpatrick1983OptimizationBS), which is a variation of hill climbing where worse solutions can get accepted depending on the algorithm’s decreasing temperature value, and a genetic algorithm (GA) adapted for continuous optimisation. For both algorithms each iteration contains 40 objective function evaluations; after one iteration, the GA updates its population, whereas SA resets its temperature value. Both algorithms use mutations generated using a normal distribution with zero mean and unit variance, at a mutation rate of 0.2 per parameter.
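A generic simulated annealing loop with these mutation settings might look as follows. This is a sketch under assumptions, not the implementation used here: the geometric cooling schedule, the initial temperature, the Metropolis acceptance rule and the clipping of costs at zero are all choices made for the example, and the function names are hypothetical. Only the 40 evaluations per iteration, the unit-variance normal mutation and the 0.2 mutation rate are taken from the description above.

```python
import math
import random


def mutate(costs, rng, rate=0.2):
    """Add N(0, 1) noise to each parameter with probability `rate`.

    Clipping at zero (no negative costs) is an assumption of this sketch.
    """
    return [max(0.0, c + rng.gauss(0.0, 1.0)) if rng.random() < rate else c
            for c in costs]


def simulated_annealing(objective, initial, iterations=30, evals_per_iter=40,
                        t0=1.0, cooling=0.9, seed=0):
    """Minimise `objective` over a list of variable costs."""
    rng = random.Random(seed)
    current, f_current = list(initial), objective(initial)
    best, f_best = current, f_current
    temperature = t0
    for _ in range(iterations):
        for _ in range(evals_per_iter):
            candidate = mutate(current, rng)
            f_candidate = objective(candidate)
            # Always accept improvements; accept worse solutions with a
            # probability that shrinks as the temperature decreases.
            accept = f_candidate <= f_current or rng.random() < math.exp(
                (f_current - f_candidate) / temperature)
            if accept:
                current, f_current = candidate, f_candidate
                if f_current < f_best:
                    best, f_best = current, f_current
        temperature *= cooling   # assumed schedule: cool once per iteration
    return best, f_best


# Stand-in objective for demonstration only: squared distance to cost 3.
def toy_objective(costs):
    return sum((c - 3.0) ** 2 for c in costs)


solution, value = simulated_annealing(toy_objective, [0.0] * 5)
print(value <= toy_objective([0.0] * 5))
```

In the setting of this paper the objective passed in would be the total travel time, and the parameter list would hold one variable cost per edge.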
5. Case study: Tokyo city centre
In order to test our traffic flow optimisation method, we applied it to movements inside the city centre of Tokyo (i.e. excluding the Greater Tokyo area). We briefly discuss the movements and road network used for the case study, and present experimental results.
5.1. Movement data
The movement data was provided by Foursquare as part of the Future cities challenge (fcc); we selected only those parts related to the Tokyo city centre. The data contains a list of venues together with their GPS locations, forming the venue data set introduced in section 3, and a list of movements between the venues. The movement data contains movements of the same form as the tuples in the demand set of section 3, but with additional indications of the month during which the movement occurred, and the time of day (periods of 4 to 6 hours). We considered only movements made in the afternoon, and combined the frequencies for the same movement in different months into a single figure. Since the Foursquare dataset does not cover the entirety of vehicular movements inside Tokyo (and, in fact, also includes other types of movements such as subway, biking and walking trips), we viewed the frequencies as ratios rather than absolute numbers, relying on the law of large numbers for sufficient accuracy. We then normalised the frequencies to numbers that the road graph we constructed (see next subsection) could support.
As a last modification, we selected the venues occurring in the 100 most popular (i.e. most frequent) routes, and clustered all other venues together with their nearest neighbour (in terms of Haversine distance) from the set of most popular venues. The routes were clustered accordingly, going between venue clusters instead of individual venues. This was done in order to substantially reduce the number of routes and thereby the computational complexity of the problem.
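The clustering step can be sketched as a nearest-neighbour assignment under the Haversine distance. The code below is illustrative only: the function names, the Earth radius of 6371 km, the approximate toy coordinates and the choice to keep movements that end up inside a single cluster are assumptions of this example, not details taken from the case study.

```python
import math


def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * radius_km * math.asin(math.sqrt(h))


def cluster_venues(venues, popular_ids):
    """Map every venue id to the id of its nearest popular venue.

    venues is {venue_id: (lat, lon)}; a popular venue maps to itself.
    """
    return {
        venue_id: min(popular_ids,
                      key=lambda p: haversine_km(position, venues[p]))
        for venue_id, position in venues.items()
    }


def cluster_movements(movements, assignment):
    """Merge (origin, destination, frequency) tuples between clusters."""
    merged = {}
    for origin, destination, frequency in movements:
        key = (assignment[origin], assignment[destination])
        merged[key] = merged.get(key, 0) + frequency
    return merged


# Toy data with approximate coordinates (hypothetical venue ids).
venues = {
    "tokyo_station": (35.6812, 139.7671),
    "shinjuku": (35.6896, 139.7006),
    "cafe_x": (35.6800, 139.7600),
    "shop_y": (35.6900, 139.7000),
}
assignment = cluster_venues(venues, ["tokyo_station", "shinjuku"])
movements = [("cafe_x", "shop_y", 3), ("tokyo_station", "shinjuku", 2)]
print(cluster_movements(movements, assignment))
```

Both toy movements collapse onto the same pair of clusters, so their frequencies are added together, which is how the number of distinct routes shrinks.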
5.2. Road network data
The road network data used is based on the Asia shapefile provided by the Earthdata Global roads open access data set (eosdis). It contains information on the road networks of the entirety of Asia at a variable resolution; for the city of Tokyo, its resolution is well suited for the algorithm. The road data is translated into a graph representation by finding intersections between lines, and turning these into intersection nodes. The lines themselves are used to create edges between intersections. Venue nodes are created from the venue locations in the venue data set, and each venue node is connected to the nearest intersection node.
5.3. Experimental results
The improvement progress for both optimisation algorithms is shown in figures 3 and 4. For comparison, we ran the objective function once with all variable costs set to 0 to obtain the default flow of the road network. This configuration had an objective value of around 416 million hours spent on the road network. It should be noted that the objective function values are higher than what would be a realistic amount of time spent on roads. As the theoretical traffic flow models do not take relief methods into account for fully congested roads, the speed on those roads is considered to be zero. In these cases, we use an arbitrarily low value to represent the speed on the road, resulting in somewhat unrealistic amounts of hours.
Both the GA and SA were able to find solutions which substantially improved traffic flow over the network. After 30 iterations, the lowest fitness value of the GA was around 175 million hours (an improvement of 57.9%), and the lowest fitness value of SA was around 155 million hours (an improvement of 62.6%). Effectively, this means the solution found by SA was able to reduce the total amount of hours spent over a period of 6 hours in the afternoon by 261 million, due to having found a good distribution of variable costs for each road segment. Although the results of the GA were not as good as those of SA, the GA was still successful in improving the unoptimised configuration, which supports the view that any optimisation algorithm could be used for these purposes.
Interestingly, even the first iterations of the algorithms showed values which were an improvement compared to the default traffic flow. This is likely due to the nature of the shortest path algorithm when taking only distance into account, which sends many cars over the same roads unnecessarily. Though this type of behaviour was the basic premise allowing us to optimise, given the large variety of available, very similar roads in the network, any deviation from the over-use of single roads caused by the random initialisation of variable costs would result in better traffic flow. From such a randomly initialised state, both algorithms then improved the solutions such that the more well-suited alternatives were used.
The results show that the algorithms were effective in finding solutions improving traffic flow over the default setting of no variable costs. That said, there is no guarantee the optima the algorithms converged to were global optima, nor that the convergence was as fast as possible. Future work could include more thorough exploration of optimisation algorithms and their parameter settings.
We have shown that we can successfully address traffic congestion by redistributing traffic through imposed variable road segment costs, and by optimising this cost configuration using metaheuristic algorithms. The best variable cost configuration was found by a simulated annealing routine, improving upon the total travel time corresponding to a configuration with zero variable costs by 62.6%. Both simulated annealing and a genetic algorithm were effective at optimising solutions.
Though the practical implementation of the variable costs may be another non-trivial problem to address first, the positive results show that, at least conceptually, this method could result in improved traffic flow when applied in practice. Consequently, cities may enjoy shorter travel times, better accessibility, cleaner air and, not unimportantly, improved drivers’ moods.