
DuETA: Traffic Congestion Propagation Pattern Modeling via Efficient Graph Learning for ETA Prediction at Baidu Maps

08/15/2022
by   Jizhou Huang, et al.
Baidu, Inc.

Estimated time of arrival (ETA) prediction, also known as travel time estimation, is a fundamental task for a wide range of intelligent transportation applications, such as navigation, route planning, and ride-hailing services. To accurately predict the travel time of a route, it is essential to take into account both contextual and predictive factors, such as spatial-temporal interaction, driving behavior, and traffic congestion propagation inference. The ETA prediction models previously deployed at Baidu Maps have addressed the factors of spatial-temporal interaction (ConSTGAT) and driving behavior (SSML). In this work, we focus on modeling traffic congestion propagation patterns to improve ETA performance. Traffic congestion propagation pattern modeling is challenging, and it requires accounting for impact regions over time and cumulative effect of delay variations over time caused by traffic events on the road network. In this paper, we present a practical industrial-grade ETA prediction framework named DuETA. Specifically, we construct a congestion-sensitive graph based on the correlations of traffic patterns, and we develop a route-aware graph transformer to directly learn the long-distance correlations of the road segments. This design enables DuETA to capture the interactions between the road segment pairs that are spatially distant but highly correlated with traffic conditions. Extensive experiments are conducted on large-scale, real-world datasets collected from Baidu Maps. Experimental results show that ETA prediction can significantly benefit from the learned traffic congestion propagation patterns. In addition, DuETA has already been deployed in production at Baidu Maps, serving billions of requests every day. This demonstrates that DuETA is an industrial-grade and robust solution for large-scale ETA prediction services.


1. Introduction

Figure 1. A screenshot of live traffic service at Baidu Maps.

Estimated time of arrival (ETA) prediction (a.k.a. travel time estimation) aims to predict the travel time for a given route and departure time, which helps users make informed decisions about traffic conditions and plan their travel in advance. ETA prediction is a fundamental task for a wide range of intelligent transportation applications, such as navigation, route planning, and ride-hailing services. As one of the largest web mapping applications, Baidu Maps serves tens of billions of ETA requests per day, benefiting tens of millions of users. To help traffic participants make more informed decisions on route selection and congestion avoidance, it is important to provide accurate and reliable travel time estimations.

Figure 2. A representative case of associations between the road segments that are not directly connected. An arrow indicates a road segment. The road segments with gray background constitute the route selected by a user. The live traffic condition of each road segment is marked with a distinctive color, i.e., “congestion”/“slow”/“fast” are in red/yellow/green, respectively.

ETA prediction is a challenging task, as it needs to take into account both contextual and predictive factors, such as spatial-temporal interaction, driving behavior, and traffic congestion propagation inference. The ETA prediction models previously deployed at Baidu Maps have addressed the factors of spatial-temporal interaction (ConSTGAT (Fang et al., 2020)) and driving behavior (SSML (Fang et al., 2021)). In our efforts toward developing a more powerful ETA prediction model, we observed that a propagation of ETA errors arises from the sharp inconsistency between the predicted traffic condition in the future and ground truth. As such, besides the current traffic conditions on the road network, it is also important to accurately infer the traffic conditions unfolding in the future. Motivated by this observation, we focus on modeling traffic congestion propagation patterns to improve ETA performance in this work. Figure 1 shows an example of live traffic conditions at Baidu Maps. As illustrated in it, the impact regions and cumulative delays over time caused by traffic congestion (the road segments in red) would inevitably affect all the interdependent segments on the road network.

Although there is a growing interest in incorporating traffic conditions into ETA prediction models, e.g., (Zhang et al., 2018b; Li et al., 2018; Yu et al., 2018; Guo et al., 2019; Yu et al., 2019; Fang et al., 2020), the inferring of traffic conditions unfolding in the future still remains a bottleneck in industrial ETA prediction models, particularly for modeling impact regions and cumulative delays over time caused by traffic congestion. To illustrate the importance of modeling traffic congestion propagation patterns in ETA prediction models, we consider the representative case presented in Figure 2.

As can be seen from Figure 2, road segments that are spatially distant and only indirectly connected can still influence each other's traffic conditions, which demonstrates the importance of modeling the impact regions of traffic congestion over time. Although existing studies have applied spatial-temporal graph neural networks (STGNNs) (Zhang et al., 2018b; Li et al., 2018; Yu et al., 2018; Guo et al., 2019; Yu et al., 2019; Fang et al., 2020) to model traffic conditions, they focus mainly on directly connected road segments, which leads to two main limitations. (1) The long-distance correlations of indirectly connected road segments are not explicitly modeled, so information is inevitably lost during multi-step message passing. (2) Traffic conditions are not sufficiently transmitted between two spatially distant road segments, because these models typically execute only a few steps of message passing (one step in most cases) due to the computational complexity of STGNNs.

In this paper, we present our efforts toward designing and developing DuETA, which models traffic congestion propagation patterns via efficient graph learning. Specifically, instead of directly using the road network as a graph, we construct a congestion-sensitive graph based on the correlations of traffic patterns. In addition, we develop a route-aware graph transformer to directly learn the long-distance correlations of the road segments. These designs enable DuETA to capture the interactions between pairs of road segments that are spatially distant but highly correlated in their traffic conditions.

Extensive experiments are conducted on large-scale, real-world datasets collected from Baidu Maps. Experimental results show that DuETA is able to effectively learn traffic congestion patterns and predict subsequent propagation of traffic events, which significantly benefits the performance of ETA prediction.

Our main contributions to this problem are as follows:

  • Potential impact: We propose a practical and robust framework, named DuETA, as an industrial-grade solution to the task of ETA prediction. We hope that it will be of interest to practitioners working on such problems.

  • Novelty: The design of DuETA is driven by the novel ideas that directly capture the long-distance correlations through a congestion-sensitive graph, and that model traffic congestion propagation patterns via a route-aware graph transformer.

  • Technical quality: Extensive experiments show that ETA prediction can significantly benefit from the learned traffic congestion propagation patterns, which demonstrates the effectiveness and practical applicability of DuETA. The successful deployment of DuETA at Baidu Maps further shows that it is an industrial-grade and robust solution for large-scale ETA prediction services.

2. DuETA

We first formalize the task, then detail the architecture of DuETA.

2.1. Problem Formulation

When the ETA prediction service receives a request, it will provide the estimated arrival time on the basis of the received request, road network, road conditions, and other contextual information.

Road network: The road network is an essential component of ETA. In this study, the road network is defined as a directed graph $G = (V, E)$, where $V$ is a link set and $E$ is an edge set. Link $l_i \in V$ represents a road segment. For the sake of convenience, “road segment” is referred to as “link” hereafter. Edge $e_{ij} \in E$ denotes the edge connecting link $l_i$ and link $l_j$, if $l_i$ and $l_j$ share the same junction.

Route: A route is defined as a link sequence $r = [l_1, l_2, \dots, l_m]$, where $m$ is the number of links in the route. Usually, a route contains hundreds of consecutive links. A navigation service produces several candidate routes based on the corresponding road network.

Request: A request is represented by a pair $req = (r, s)$, where $r$ is the route, and $s$ is the departure time. The objective of ETA is to estimate the travel time of the given request $req$.

Dataset: A dataset is defined as $D = \{(req^{(j)}, y^{(j)})\}_{j=1}^{N}$, where $y^{(j)}$ is the ground-truth travel time of route $r^{(j)}$ in $req^{(j)}$, and $N$ is the number of requests in the dataset. For $req^{(j)}$, the travel time of the $i$-th link in route $r^{(j)}$ is denoted as $y_i^{(j)}$, and the travel time of the entire route is computed by $y^{(j)} = \sum_{i=1}^{m} y_i^{(j)}$.
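To make the formulation concrete, the following is a minimal Python sketch of the route, request, and dataset entries described above. The class and field names (`Request`, `Sample`, `link_times`) are illustrative assumptions and do not come from the paper; the only property taken from the text is that the route-level label is the sum of the per-link travel times.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """A request: a route (ordered link IDs) plus a departure time."""
    route: List[int]         # link sequence l_1 ... l_m (IDs are hypothetical)
    departure_time: float    # departure time, e.g. seconds since midnight


@dataclass
class Sample:
    """One dataset entry: a request and its ground-truth per-link travel times."""
    request: Request
    link_times: List[float]  # ground-truth travel time of each link, in seconds

    @property
    def route_time(self) -> float:
        # The travel time of the entire route is the sum over its links.
        return sum(self.link_times)


sample = Sample(
    request=Request(route=[101, 102, 103], departure_time=0.0),
    link_times=[30.0, 45.5, 20.0],
)
```

Keeping the per-link times alongside the route-level total is what later allows both a link-level and a route-level loss to be computed from the same sample.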

2.2. Feature Preparation

Two types of features are prepared for ETA prediction: static features and dynamic features. The static features refer to features that do not change over time, consisting of the information of the road network of the corresponding city (e.g., the link ID, the length, the width, the number of lanes, the type of road, the speed limit, the type of crossing, and the kind of traffic light), as well as the contextual information (e.g., departure time and user profile).

On the contrary, the dynamic features change over time, such as traffic conditions. We aim to infer the future traffic conditions from the recent traffic conditions. When estimating the time of arrival, the traffic conditions of the past hour are collected as features, which are divided into 12 time slots (5 minutes per time slot). For each link at each time slot, we calculate the median speed, max speed, min speed, mean speed, and record count as features. These statistics of a link $l_i$ at time slot $t$ are derived from the traffic records in the dataset. Then, these features are flattened and mapped into the same shape as the static features by a linear transformation. Static and dynamic features are taken as the features of the links. For link $l_i$, we denote its feature vector as $h_i$.
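The dynamic-feature computation described above can be sketched as follows. This is an illustrative sketch only: the record format (timestamp, speed), the function names, and the zero-filling of empty slots are assumptions, not details given in the paper.

```python
from statistics import mean, median
from typing import List, Tuple

SLOT_SECONDS = 5 * 60  # 5-minute time slots, as stated in the text
NUM_SLOTS = 12         # one hour of history


def slot_features(records: List[Tuple[float, float]], now: float) -> List[List[float]]:
    """Compute per-slot statistics for ONE link.

    records: (timestamp_in_seconds, speed) pairs (hypothetical record format).
    Returns 12 rows of [median, max, min, mean, count], oldest slot first.
    """
    buckets: List[List[float]] = [[] for _ in range(NUM_SLOTS)]
    start = now - NUM_SLOTS * SLOT_SECONDS
    for ts, speed in records:
        if start <= ts < now:
            buckets[int((ts - start) // SLOT_SECONDS)].append(speed)
    rows = []
    for speeds in buckets:
        if speeds:
            rows.append([median(speeds), max(speeds), min(speeds),
                         mean(speeds), float(len(speeds))])
        else:
            # Assumption: slots without any record are zero-filled.
            rows.append([0.0, 0.0, 0.0, 0.0, 0.0])
    return rows


def flatten(rows: List[List[float]]) -> List[float]:
    """Flatten the 12 x 5 statistics into one vector before the linear mapping."""
    return [value for row in rows for value in row]
```

For example, `flatten(slot_features(records, now))` yields a 60-dimensional vector per link, which would then be projected to the size of the static features by a learned linear layer.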

2.3. Congestion-sensitive Graph

Instead of directly using the road network as the graph for traffic congestion pattern modeling, we design a congestion-sensitive graph $G_c = (V, E_f, E_h)$, where $V$ is the link set. For each link $l_i \in V$, we take advantage of its first-order neighbor links, as well as the high-order neighbor links whose traffic patterns are highly correlated with that of link $l_i$. $E_f$ is a set of edge sets that describes the first-order neighbors, while $E_h$ is an edge set that describes the high-order neighbors. We detail them as follows.

2.3.1. First-order Neighbors

An edge $e_{ij}$ of the road network describes the relation between a link $l_i$ and its first-order neighbor $l_j$ that is directly connected to $l_i$. Our previous work (Fang et al., 2020) has demonstrated that different types of neighbor links play varying roles in the future traffic of a given link. For example, traffic congestion is more likely to propagate from downstream links to upstream links. In order to effectively handle such correlations, the following refinements are performed. First, we define multiple types of link relations and incorporate these relations into the construction of the congestion-sensitive graph. Second, we use an attention mechanism separately for each relation to capture the impact of neighbor links, which is detailed in Sections 2.4.1 and 2.4.2.

Figure 3. Demonstration of extracting first-order and high-order neighbors to construct a congestion-sensitive graph.

An edge describes the relation between two links, and all the edges in the edge set $E$ are divided into five types according to the connection relationships between the links. For example, as shown at the top of Figure 3, given a link, the second type of first-order neighbor is its upstream link, and the third type of first-order neighbor is its downstream link. Although the links of the remaining three types are not included in the travel route, the traffic conditions of these links also affect that of the target link. For example, the vehicles on these links may block the traffic at the intersection, thus holding up the traffic of the target link. The set for each type of edge is denoted by $E_k$, with $k \in \{1, \dots, 5\}$ denoting the index of the type. That means $E_f = \{E_1, E_2, E_3, E_4, E_5\}$.

Figure 4. Architecture of the route-aware graph transformer.

2.3.2. High-order Neighbors

Existing studies only focus on the relationships between directly connected road segments. Besides the associations between directly connected links, the long-distance associations between indirectly connected links are also crucial for ETA prediction. We aim to model the interactions between link pairs that are spatially distant but highly correlated. Given a link, its indirectly connected links are called its high-order neighbors. Theoretically, all the indirectly connected links in the road network can be regarded as its high-order neighbors, and there are a great number of links in the road network. Considering the computational cost of STGNNs, we extract only the most correlated high-order neighbors for each link to construct the congestion-sensitive graph.

The correlated high-order neighbors are extracted from tens of millions of historical travel routes of Baidu Maps. The idea of high-order neighbor construction is demonstrated in Figure 3. We pay attention to each link’s 2-hop to 5-hop neighborhood to extract the correlated high-order neighbors, since the traffic correlation between two links that are too far apart is usually low. More concretely, for a link $l_i$ in a historical travel route $r$, we first extract its 2-hop to 5-hop neighbors in the route as the candidate high-order neighbors of link $l_i$. We denote those candidate high-order neighbors w.r.t. route $r$ as $C_r(l_i)$. Then, we calculate the Pearson correlation (see Footnote 1) between the traffic patterns of link $l_i$ and each candidate $l_j$ in $C_r(l_i)$. The route-based correlation score between two links in the historical travel route is denoted as $\rho_r(l_i, l_j)$. We make the assumption that the higher the correlation score, the higher the probability that the corresponding links will impact each other. Finally, accounting for the popularity of co-occurrence of two links, for links $l_i$ and $l_j$, we sum up the route-based correlation scores over all the historical travel routes as the final correlation score $S(l_i, l_j) = \sum_{r} \rho_r(l_i, l_j)$. The most influential high-order neighbors of link $l_i$ are defined as those links with the highest correlation scores $S(l_i, \cdot)$. For each link, we select the top-5 influential high-order neighbors. A high-order edge is defined as an edge that connects a link and one of its high-order neighbor links. All the high-order edges compose a high-order edge set, i.e., $E_h$. Note that $E_h$ is regarded as a supplemental edge set to capture the associations between the links that are spatially distant but highly correlated, and $E_h \cap E = \emptyset$.

Footnote 1: To calculate the Pearson correlation between two links $l_i$ and $l_j$, we first compute the average travel time every five minutes over the last two hours as $x_i$ and $x_j$. Then, the Pearson correlation is computed by $\rho(x_i, x_j) = \mathrm{cov}(x_i, x_j) / (\sigma_{x_i} \sigma_{x_j})$.
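The high-order neighbor extraction described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: each link has a single fixed traffic-pattern series (the paper computes it per route from the last two hours), neighbors are taken in both directions along the route, and the function and variable names are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long series (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y) if std_x > 0 and std_y > 0 else 0.0


def high_order_neighbors(
    routes: List[List[int]],
    pattern: Dict[int, List[float]],
    top_k: int = 5,
    min_hop: int = 2,
    max_hop: int = 5,
) -> Dict[int, List[int]]:
    """Select the top-k most correlated 2-hop to 5-hop route neighbors per link."""
    scores: Dict[int, Dict[int, float]] = defaultdict(lambda: defaultdict(float))
    for route in routes:
        for pos, link in enumerate(route):
            for hop in range(min_hop, max_hop + 1):
                # Assumption: look both upstream and downstream along the route.
                for other_pos in (pos - hop, pos + hop):
                    if 0 <= other_pos < len(route) and route[other_pos] != link:
                        other = route[other_pos]
                        # Summing over routes rewards frequently co-occurring pairs.
                        scores[link][other] += pearson(pattern[link], pattern[other])
    return {
        link: [o for o, _ in sorted(cands.items(), key=lambda kv: kv[1], reverse=True)[:top_k]]
        for link, cands in scores.items()
    }
```

Summing the per-route correlations, rather than averaging them, means that a pair of links that appears together on many routes accumulates a higher score, which matches the stated intent of accounting for co-occurrence.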

For the sake of convenience, we will use $G_c$ to denote the congestion-sensitive graph hereafter, where the high-order edges are considered as the sixth type of link relations.

2.4. Route-aware Graph Transformer

We design a route-aware graph transformer, which aims at efficiently and effectively aggregating traffic information through the congestion-sensitive graph. Figure 4 illustrates the architecture of the route-aware graph transformer. It takes the congestion-sensitive graph and the prepared features as input and generates the representation vector of each link in the route.

2.4.1. Graph Transformer

In order to assign different weights to different links in a neighborhood, we use the graph transformer (Shi et al., 2021) as the backbone structure to aggregate the information from our congestion-sensitive graph. The reasons for choosing the graph transformer are two-fold. (1) It adopts the multi-head attention mechanism (Vaswani et al., 2017) to learn edge weights. (2) It addresses the over-smoothing problem in vanilla GNNs by residual connections.

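Given a graph $G_c$ and the corresponding link features $\{h_i\}$ of route $r$, the graph transformer performs multi-head attention for each edge $e_{ij}$:

$q_{c,i} = W_{c,q} h_i + b_{c,q}$ (1)
$k_{c,j} = W_{c,k} h_j + b_{c,k}$ (2)
$v_{c,j} = W_{c,v} h_j + b_{c,v}$ (3)
$\alpha_{c,ij} = \frac{\langle q_{c,i}, k_{c,j} \rangle}{\sum_{u \in N(i)} \langle q_{c,i}, k_{c,u} \rangle}$ (4)

where $q$, $k$, and $v$ denote the query, key, and value, respectively, of the attention mechanism. $c$ represents the index of the head. For the $c$-th head attention, we transform the feature vector of link $l_i$, i.e., $h_i$, into a query vector $q_{c,i}$. The feature vector of link $l_j$ (a neighbor of link $l_i$), i.e., $h_j$, is converted into a key vector $k_{c,j}$ and a value vector $v_{c,j}$. $W_{c,q}$, $b_{c,q}$, $W_{c,k}$, $b_{c,k}$, $W_{c,v}$, and $b_{c,v}$ are learnable parameters. $\langle q, k \rangle = \exp(q^{\top} k / \sqrt{d})$ is an exponential scale dot-product function, and $d$ is the hidden size of each head. $N(i)$ denotes the neighbor links of link $l_i$ according to the edge set of $G_c$. With the graph multi-head attention $\alpha_{c,ij}$, we perform weighted aggregation from all the neighbors of link $l_i$: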

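$\hat{h}_i = h_i + \frac{1}{C} \sum_{c=1}^{C} \sum_{j \in N(i)} \alpha_{c,ij} v_{c,j}$ (5)

where $C$ is the number of attention heads, $\alpha_{c,ij}$ is the attention weight of the $c$-th head on the edge between links $l_i$ and $l_j$, and $v_{c,j}$ is the corresponding value vector. To tackle the over-smoothing issue (Chen et al., 2020; Li et al., 2019), we simply average the results from different attention heads to keep the same shape as the residual connection term $h_i$. For the sake of convenience, function $\mathrm{GT}(\cdot)$ is introduced to denote the graph transformer.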
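As an illustration of the attention-based aggregation described above, the following pure-Python sketch computes the head-averaged neighbor aggregation for a single link. It is a sketch under assumptions, not the deployed implementation: the per-head projections are assumed to keep the input dimension so that the raw feature vector can serve as the residual term, and the parameter dictionary layout (`Wq`, `bq`, and so on) is hypothetical.

```python
from math import exp, sqrt
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def affine(W: Matrix, b: Vector, x: Vector) -> Vector:
    """Compute W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def graph_attention(h_i: Vector, neighbors: List[Vector], heads: List[Dict]) -> Vector:
    """Head-averaged attention over the neighbors of one link, plus a residual.

    heads: one dict per head with keys 'Wq', 'bq', 'Wk', 'bk', 'Wv', 'bv'.
    """
    if not neighbors:
        return list(h_i)  # nothing to aggregate, keep the residual only
    agg = [0.0] * len(h_i)
    for p in heads:
        q = affine(p["Wq"], p["bq"], h_i)
        keys = [affine(p["Wk"], p["bk"], h_j) for h_j in neighbors]
        values = [affine(p["Wv"], p["bv"], h_j) for h_j in neighbors]
        d = len(q)
        logits = [sum(a * b for a, b in zip(q, k)) / sqrt(d) for k in keys]
        top = max(logits)  # subtract the max for numerical stability
        weights = [exp(l - top) for l in logits]
        total = sum(weights)
        for w, v in zip(weights, values):
            for t in range(len(agg)):
                agg[t] += (w / total) * v[t]
    # Average over heads (instead of concatenating) so shapes match the residual.
    return [h + a / len(heads) for h, a in zip(h_i, agg)]
```

Normalizing exponentiated scaled dot products over the neighborhood is equivalent to a softmax restricted to the neighbors of the link, which is why the weights of each head sum to one.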

2.4.2. Route-aware Graph Transformer

In order to better capture the correlations between different links, we decompose the graph $G_c$ into six relational graphs, i.e., $G_c^{(k)}$, where the relation index $k$ is from 1 to 6. We apply the graph transformer to each relational graph to obtain the representations of links. Then, for each link $l_i$, we sum its representations that are learned through different relational graphs to produce a new hidden vector, which incorporates the six kinds of relations. Equation 5 is rewritten as:

$\hat{h}_i = h_i + \sum_{k=1}^{6} \frac{1}{C} \sum_{c=1}^{C} \sum_{j \in N_k(i)} \alpha^{(k)}_{c,ij} v^{(k)}_{c,j}$ (6)

where $N_k(i)$ is the set of neighbors for link $l_i$ w.r.t. the $k$-th relation.

We believe that the route-level information is also crucial for ETA prediction, and we integrate it into link representations. However, it is more challenging to estimate the traffic condition of a link that is far from the origin of a route, because the traffic condition when a user arrives at this link is likely different from that when the user requests the ETA. In addition, the graph transformer is unable to identify whether a link belongs to a given route or not, making it difficult to generate distinct representations of the same link w.r.t. different routes, as showcased by Figure 5(a). To address these issues, we introduce two route-aware structural encoding methods, position encoding and route identifier, to improve the sensitivity to route information, as showcased by Figure 5(b).

(a) Graph transformer is unable to distinguish the two routes in case 1 and case 2, because it generates the same representations of link a in two cases.
(b) Route identifier is added to distinguish whether a link belongs to a route or not, which facilitates generating distinct representations of link a in two cases.
Figure 5. The necessity of introducing a route identifier.

Specifically, the position encoding is designed to encode the order information of a link. Given a route, we calculate the shortest hop count between the origin and a link in the road network as the link’s position encoding. In this way, the position encoding of a link can be regarded as a gate that controls how strongly the prediction depends on the traffic condition at the time the user requests the ETA. In addition, a route identifier is added to distinguish whether a link belongs to a route or not, which enables the graph transformer to generate contextualized link representations.
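The two route-aware encodings can be sketched as follows. The adjacency-list representation, the function names, and the encoding of the route identifier as 0/1 are illustrative assumptions; the text only specifies that the position encoding is the shortest hop count from the origin and that the identifier marks route membership.

```python
from collections import deque
from typing import Dict, List, Tuple


def hop_distances(adj: Dict[int, List[int]], origin: int) -> Dict[int, int]:
    """Breadth-first search: shortest hop count from the origin link to each reachable link."""
    dist = {origin: 0}
    queue = deque([origin])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def route_aware_encodings(adj: Dict[int, List[int]], route: List[int]) -> Dict[int, Tuple[int, int]]:
    """Return {link: (position_encoding, route_identifier)} for links reachable from the origin."""
    dist = hop_distances(adj, route[0])
    on_route = set(route)
    # Route identifier: 1 if the link belongs to the requested route, else 0.
    return {link: (hops, 1 if link in on_route else 0) for link, hops in dist.items()}
```

In a model, both integers would typically be mapped to learned embeddings and added to the link features, so that the same link receives different representations under different routes.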

2.4.3. Integration

To capture the local dependencies between links in the requested route, we use a 1-D convolution layer (Conv1D) with a window size of 3 (Krizhevsky et al., 2012) and take the link representation vectors of the links in the route as input. Then, a multilayer perceptron (MLP) with ReLU (Agarap, 2018) as the activation function is employed, which takes the smoothed output from Conv1D as input and estimates the travel time of each link:

$\hat{y}_i = \mathrm{MLP}(\mathrm{Conv1D}([\hat{h}_{i-1}, \hat{h}_i, \hat{h}_{i+1}]))$ (7)

where $\hat{y}_i$ is the estimated travel time of the $i$-th link in the route.

The estimated travel times of all the links in the route are summed up as the estimated travel time of the entire route $r$:

$\hat{y} = \sum_{i=1}^{m} \hat{y}_i$ (8)

where $\hat{y}$ denotes the estimated travel time of route $r$.

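Multi-task learning is adopted to optimize the model parameters at both the link level and the route level. On the one hand, Huber loss (Huber, 1992) is used as the link-level loss function, which is defined as:

$L_{link}(y_i, \hat{y}_i) = \begin{cases} \frac{1}{2}(y_i - \hat{y}_i)^2, & |y_i - \hat{y}_i| \le \delta \\ \delta |y_i - \hat{y}_i| - \frac{1}{2}\delta^2, & \text{otherwise} \end{cases}$ (9)

where $\delta$ is a hyper-parameter to alleviate the impact of the outliers. On the other hand, absolute percentage error (APE) is used as the loss function of the entire route $r$, which is defined as:

$L_{route}(y, \hat{y}) = \frac{|y - \hat{y}|}{y}$ (10)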

We combine the link-level loss and the route-level loss to obtain the overall loss function $L$:

(11)

We minimize $L$ to optimize the parameters of the model.
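The two loss terms can be sketched in a few lines of Python. The Huber loss and APE follow their standard definitions. How the two terms are weighted in the combined loss is not recoverable from the text, so the unweighted combination in `combined_loss` (mean link-level loss plus route-level loss) is purely an assumption for illustration, as are the function names.

```python
from typing import List


def huber(y: float, y_hat: float, delta: float = 1.0) -> float:
    """Standard Huber loss: quadratic near zero, linear for large errors."""
    err = abs(y - y_hat)
    if err <= delta:
        return 0.5 * err * err
    return delta * err - 0.5 * delta * delta


def ape(y: float, y_hat: float) -> float:
    """Absolute percentage error of a route-level prediction (as a fraction)."""
    return abs(y - y_hat) / y


def combined_loss(link_true: List[float], link_pred: List[float], delta: float = 1.0) -> float:
    """ASSUMPTION: mean link-level Huber loss plus route-level APE, unweighted."""
    link_loss = sum(huber(y, p, delta) for y, p in zip(link_true, link_pred)) / len(link_true)
    route_loss = ape(sum(link_true), sum(link_pred))
    return link_loss + route_loss
```

The linear tail of the Huber loss limits the gradient contributed by links with abnormal travel times, while the percentage-based route loss keeps short and long routes on a comparable scale.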

3. Experiments

3.1. Experimental Setup

Dataset    #Links      #Training routes   #Test routes   Average links per route
Beijing    2,435,719   20,067,736         4,153,587      96.87
Shanghai   2,427,225   23,844,299         5,280,030      87.98
Tianjin    1,643,454   7,760,443          1,815,334      94.25
Table 1. Statistics of the real-world datasets.
             Beijing                             Shanghai                            Tianjin
Method       MAE (sec)  RMSE (sec)  MAPE (%)     MAE (sec)  RMSE (sec)  MAPE (%)     MAE (sec)  RMSE (sec)  MAPE (%)
AVG          367.37     750.68      41.25        312.22     575.82      39.74        316.47     567.34      35.94
STANN        206.18     450.57      24.37        184.46     369.88      24.32        189.01     395.59      21.48
DCRNN        204.65     444.10      24.81        183.52     367.08      24.27        188.28     395.15      21.40
DeepTravel   183.87     410.92      21.67        170.09     328.35      25.10        178.66     386.33      19.62
ConSTGAT     181.91     401.21      22.03        158.33     319.62      21.05        176.61     381.49      19.59
DuETA        178.39     399.12      21.22        155.85     315.82      20.83        171.42     370.59      19.50
Table 2. Performance of DuETA and the baseline methods for ETA prediction on three real-world datasets.

We evaluate DuETA against several competitive baseline methods on three large-scale, real-world datasets: Beijing, Shanghai, and Tianjin. These metropolises are among the largest cities in China, and each contains millions of links. The datasets are sampled from Baidu Maps and consist of tens of millions of routes and their corresponding travel times, ranging from Oct. 10th to Nov. 20th, 2021. The selected period minimizes the impact of the COVID-19 pandemic on traffic (Huang et al., 2020b). The data of the first four weeks are used for training, while the data of the last week are used for evaluation. The statistics of the datasets are shown in Table 1.

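Three widely used metrics for evaluating ETA prediction, namely mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE), are used to evaluate the methods. They are defined as:

$\mathrm{MAE} = \frac{1}{N} \sum_{j=1}^{N} |y^{(j)} - \hat{y}^{(j)}|$ (12)
$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{j=1}^{N} (y^{(j)} - \hat{y}^{(j)})^2}$ (13)
$\mathrm{MAPE} = \frac{1}{N} \sum_{j=1}^{N} \frac{|y^{(j)} - \hat{y}^{(j)}|}{y^{(j)}}$ (14)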
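For reference, the three metrics can be computed as in the short sketch below, which uses their standard definitions. The function names are illustrative, and MAPE is returned as a percentage to match the unit used in Table 2.

```python
from math import sqrt
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error, in the unit of the inputs (seconds here)."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error, in the unit of the inputs."""
    return sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent."""
    return 100.0 * sum(abs(y - p) / y for y, p in zip(y_true, y_pred)) / len(y_true)
```

RMSE penalizes large errors more heavily than MAE, while MAPE normalizes by the true travel time, so short routes with small absolute errors can still contribute large percentage errors.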

3.2. Baselines

We compare DuETA against the following five baseline methods.

  • AVG. For each dataset, we calculate the average travel time of each link at each time slot according to the records in the training set. Given a request $req = (r, s)$, we sum up the statistical travel times of all the links in route $r$ at the departure time $s$ to estimate the travel time of $r$.

  • STANN (He et al., 2018). STANN is an STGNN, which encodes the spatial and temporal traffic information by attention mechanism and LSTM. The standard STANN assumes that all links in the road network are connected, making it computationally impractical for large-scale datasets. Therefore, we revise it to consider only the spatially connected links.

  • DCRNN (Li et al., 2018). DCRNN is also an STGNN, which first captures the spatial information by graph convolution network and then captures temporal information by LSTM.

  • DeepTravel (Zhang et al., 2018a). DeepTravel is an end-to-end method, where the spatial and temporal features are extracted and taken as the input of a bidirectional LSTM to estimate the travel time. We re-implement DeepTravel based on the road network instead of the original spatial grids.

  • ConSTGAT (Fang et al., 2020). ConSTGAT is our previously deployed end-to-end ETA prediction model at Baidu Maps, which models the joint relations of spatial and temporal information as well as the contextual information of a route.
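The simplest of the baselines above, AVG, can be sketched as follows. The record format, the use of time-of-day slots of 5 minutes, and the fallback value for unseen (link, slot) pairs are assumptions made for illustration; the text only states that per-link, per-slot averages from the training set are summed along the route using the departure time.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

SLOT_SECONDS = 5 * 60      # assumption: 5-minute time-of-day slots
SECONDS_PER_DAY = 24 * 3600


def time_slot(timestamp: float) -> int:
    """Map a timestamp (in seconds) to its time-of-day slot index."""
    return int(timestamp % SECONDS_PER_DAY) // SLOT_SECONDS


def fit_avg(records: List[Tuple[int, float, float]]) -> Dict[Tuple[int, int], float]:
    """records: (link_id, timestamp, travel_time) triples from the training set."""
    sums: Dict[Tuple[int, int], List[float]] = defaultdict(lambda: [0.0, 0.0])
    for link, ts, travel_time in records:
        acc = sums[(link, time_slot(ts))]
        acc[0] += travel_time
        acc[1] += 1.0
    return {key: total / count for key, (total, count) in sums.items()}


def predict_avg(table: Dict[Tuple[int, int], float], route: List[int],
                departure: float, default: float = 0.0) -> float:
    """Sum the historical average of every link at the departure-time slot."""
    slot = time_slot(departure)
    # Assumption: unseen (link, slot) pairs fall back to `default`.
    return sum(table.get((link, slot), default) for link in route)
```

Because every link is looked up at the departure-time slot, this baseline ignores how conditions change while the trip is in progress, which is consistent with it performing worst in Table 2.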

All methods are implemented using PaddlePaddle, an open-source deep-learning platform maintained by Baidu. We set the embedding size and the hidden size of DuETA to 32. The number of attention heads is set to 8. The same settings are applied to the baseline methods for a fair comparison. The model parameters are optimized by the Adam optimizer (Kingma and Ba, 2015) with a learning rate of

. The model hyperparameters are tuned according to the validation performance, online serving latency, and model capacity.

Figure 6. The RMSE of ablation versions of DuETA.
(a) Complete DuETA in Beijing.
(b) DuETA w/o route identifier in Beijing.
(c) Complete DuETA in Shanghai.
(d) DuETA w/o route identifier in Shanghai.
(e) Complete DuETA in Tianjin.
(f) DuETA w/o route identifier in Tianjin.
Figure 7. Visualization of the attention weight distributions of the complete DuETA and the ablative DuETA without a route identifier. The vertical dotted line denotes the mean value of the attention weights.

3.3. Results and Analysis

3.3.1. Overall Performance

We conducted offline tests to compare DuETA with multiple strong baselines to demonstrate the superiority of DuETA. Table 2 shows the experimental results. Boldface indicates the best score w.r.t. each metric. From the results, we have the following observations.

(1) The performance of the naive baseline AVG is the worst, as it makes a simple assumption that the traffic condition of each link at each time slot is constant.

(2) STANN and DCRNN outperform AVG by a large margin in terms of all metrics on all three datasets. The main reason is that they both take into account the spatial and temporal information of the traffic conditions.

(3) DeepTravel and ConSTGAT significantly improve over both STANN and DCRNN on three datasets. The improvements are mainly due to two reasons: End-to-end methods are more effective than the segment-based methods, and the correlations of spatial and temporal information are jointly modeled.

(4) Results show that DuETA outperforms all baselines on every metric and dataset (e.g., reducing the MAE in Beijing from 181.91 for ConSTGAT to 178.39), which demonstrates that modeling traffic congestion propagation patterns benefits the performance of ETA prediction. The reasons are two-fold. On the one hand, DuETA is more sensitive to long-distance traffic congestion, because the introduction of high-order neighbors enables DuETA to bridge the gap between links that are not directly connected yet are highly correlated. On the other hand, the cumulative effect of delay variations over time caused by traffic events on the road network can be alleviated by the high efficiency of traffic congestion pattern modeling.

3.3.2. Ablation Studies

We perform ablation experiments to understand the relative importance of different components of DuETA.

First, we examine the impact of different components of DuETA, including the route-aware graph transformer and the congestion-sensitive graph. The ablation results in Figure 6 show that removing both components hurts performance significantly in all three cities. This demonstrates the significance of both components in improving ETA prediction performance. Moreover, removing the high-order neighbors (DuETA w/o high-order neighbors) leads to significant drops in performance on all datasets. This confirms the effectiveness of the long-distance associations between links for ETA prediction.

Second, we study the effect of the route-aware graph transformer. To learn the long-distance associations between links, DuETA adopts a route-aware graph transformer that encodes route identifier and position information. To obtain an understanding of the effect of the route identifier, we visualize the distributions of the attention weights in Figure 7. The sub-figures at the top of Figure 7 show the distributions of the attention weights of the complete DuETA, while the sub-figures at the bottom show those of the ablative DuETA without a route identifier (abbr. “DuETA w/o RI”). From the results, we have the following observations. For the links that are in the travel routes (abbr. “On route” in Figure 7), the averaged attention weights of the complete DuETA are larger than those of “DuETA w/o RI” in all three cities. By contrast, for the links that are not in the travel routes (abbr. “Off route” in Figure 7), the averaged attention weights of the complete DuETA are smaller than those of “DuETA w/o RI”. This demonstrates that the introduction of a route identifier enables DuETA to pay more attention to the links that are in the travel routes.

Figure 8. Averaged Pearson correlation coefficients of traffic patterns between the links and their neighbors.

Third, we study the effect of the congestion-sensitive graph. To enable DuETA to capture the interactions between links that are spatially distant but highly correlated in their traffic conditions, we construct a congestion-sensitive graph based on the correlations of traffic patterns. Figure 8 presents the Pearson correlation coefficients of the traffic patterns between the links and their neighbors. It can be seen that the correlations between the links and their first-order neighbors are the strongest among all kinds of neighbors. This is intuitive, because the traffic flows on spatially adjacent and closely connected links can easily influence each other. Although the correlations between the links and their second-order and third-order neighbors are lower than those with first-order neighbors, they can also contribute to improving the performance of ETA prediction. Moreover, the average Pearson correlation coefficient of our selected high-order neighbors is much higher than those of the second-order and third-order neighbors, which demonstrates the significant role of high-order neighbors in improving ETA prediction performance.

To further investigate the impact of high-order neighbors on traffic congestion pattern modeling, we examine the relative improvements brought by high-order neighbors in cases of traffic congestion (see Footnote 2) and normal traffic. Table 3 shows the relative improvements under these traffic conditions. Results show that the improvements achieved by DuETA in the case of traffic congestion are much higher than those under normal traffic. This further verifies the effectiveness of high-order neighbors in improving ETA prediction.

Footnote 2: Here, traffic congestion is defined as a speed below 10 km/h.

City      Congestion  Normal
Beijing   12.2%       6.6%
Shanghai  12.0%       6.8%
Tianjin   17.0%       5.9%
Table 3. The relative improvements (MAPE) of high-order neighbors in different traffic conditions.
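For reference, a relative improvement in MAPE is commonly computed as the reduction in MAPE divided by the baseline MAPE. The sketch below assumes that definition; the exact formula behind Table 3 is not spelled out here, and the travel times used are made-up values.

```python
def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be non-zero."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

def relative_improvement(baseline_mape, new_mape):
    """Fraction by which new_mape is lower than baseline_mape (assumed definition)."""
    return (baseline_mape - new_mape) / baseline_mape

# Hypothetical travel times in seconds.
actual = [100, 200]
baseline_pred = [120, 160]   # e.g. a variant without high-order neighbors
improved_pred = [110, 180]   # e.g. a variant with high-order neighbors

baseline = mape(actual, baseline_pred)
improved = mape(actual, improved_pred)
gain = relative_improvement(baseline, improved)
print(f"baseline MAPE={baseline:.2f}, improved MAPE={improved:.2f}, gain={gain:.1%}")
```

Computing this gain separately on congested and non-congested samples yields a breakdown of the same shape as Table 3.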

3.4. Practical Applicability

Before deploying DuETA in production, we conducted a case study to validate the practicability of DuETA, as well as to verify the superiority of DuETA in traffic congestion propagation modeling over the previously deployed model, ConSTGAT. Specifically, given a requested travel route that consists of the origin link S, the destination link E, and the links in between S and E, we compare the future traffic conditions of the links in the route predicted by ConSTGAT and DuETA when congestion occurs in a distant link P. Figure 9 shows two cases of future traffic conditions predicted by ConSTGAT and DuETA. From the results, we make the following observations. First, the sub-figures at the left present the results predicted by ConSTGAT and DuETA when the traffic condition of link P is normal. We can see that the future traffic conditions of the links in the route predicted by both models are mostly normal without any congestion. Second, the sub-figures at the right illustrate the results predicted by ConSTGAT and DuETA when severe congestion occurs in link P. The results in Figure 9(b) clearly show that, according to DuETA, the congestion in the distant link P propagates to most links in the route, while the traffic conditions of these links predicted by ConSTGAT remain normal, as shown in Figure 9(a). This demonstrates that DuETA is more sensitive to distant congestion than ConSTGAT, and it confirms the superiority of DuETA for the problem of traffic congestion propagation modeling.

(a) Future traffic conditions predicted by ConSTGAT.
(b) Future traffic conditions predicted by DuETA.
Figure 9. Comparison of DuETA and ConSTGAT in modeling traffic congestion propagation. The estimated travel speed (km/h) of each link is marked with a distinctive color.

3.5. Online Evaluation

Before launching a new ETA prediction model in production, we routinely deploy it online and let it serve a random sample of about 25% of ETA prediction requests. During the A/B testing period, we monitored the performance of DuETA and compared it with that of the previously deployed model. This period conventionally lasts one week; for DuETA, it ran from Apr. 12th to Apr. 18th, 2022 in Beijing, China. Figure 10 shows the experimental results. From the results, we make the following observations.

First, for the overall performance comparison, Figure 10(a) shows that the RMSE scores of DuETA (blue line) are lower than those of the previously deployed model (orange line). This demonstrates the superiority of DuETA over the previously deployed model.

Second, we further investigate the contribution of DuETA to the travel time estimations of long travel routes (above the 3 km threshold) and short travel routes (below it). By comparing Figure 10(b) and Figure 10(c), we can observe that DuETA achieves greater improvement on the long travel routes than on the short travel routes. This observation is in line with our expectations, since DuETA focuses on resolving the issue of traffic congestion propagation, especially the cumulative effect of long-distance congestion propagation.

Third, we analyze the evaluation results of DuETA on the ETA prediction requests in non-rush hours, morning rush hours, and evening rush hours, as shown in Figure 10(d), Figure 10(e), and Figure 10(f). DuETA consistently surpasses the previously deployed model in all the scenarios across the whole week. The improvement is more significant in the morning and evening rush hours, because DuETA can respond quickly to changes in the real-time traffic conditions.

Fourth, we observe that the averaged RMSE scores in the online evaluation of DuETA are higher than those in the offline evaluation. The main reason is that the real-time traffic data processing procedures (e.g., data collection, data identification, and data cleansing) in online settings typically introduce noisier data, which inevitably inflates the variance of the data distribution between online and offline evaluations. This results in relatively higher RMSE scores in the online evaluation. Similar performance gaps between online and offline evaluations are also reported in our previous studies (Fang et al., 2020; Huang et al., 2020a) at Baidu Maps.

(a) Overall performance.
(b) Short travel routes (under 3 km).
(c) Long travel routes (over 3 km).
(d) Non-rush hours.
(e) Morning rush hours (7:00-9:00AM).
(f) Evening rush hours (5:00-7:00PM).
Figure 10. Online evaluation of DuETA and the previously deployed model during April 12th - 18th, 2022, in Beijing.
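The per-scenario comparison in Section 3.5 amounts to grouping requests by time of day and computing RMSE within each group. The sketch below illustrates this under stated assumptions: the rush-hour windows follow the captions of Figure 10 (7:00-9:00 AM and 5:00-7:00 PM), while the half-open boundary handling, the record layout, and the sample values are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def time_bucket(hour):
    """Map an hour of day (0-23) to a scenario; end hours are treated as exclusive."""
    if 7 <= hour < 9:
        return "morning_rush"
    if 17 <= hour < 19:
        return "evening_rush"
    return "non_rush"

def rmse_by_bucket(records):
    """records: iterable of (hour, actual_seconds, predicted_seconds) tuples."""
    squared_errors = defaultdict(list)
    for hour, actual, predicted in records:
        squared_errors[time_bucket(hour)].append((actual - predicted) ** 2)
    return {
        bucket: sqrt(sum(errors) / len(errors))
        for bucket, errors in squared_errors.items()
    }

# Hypothetical logged requests: (departure hour, actual travel time, predicted travel time).
records = [
    (8, 600, 630),
    (8, 900, 860),
    (18, 1200, 1200),
    (13, 300, 310),
]

scores = rmse_by_bucket(records)
for bucket, value in sorted(scores.items()):
    print(f"{bucket}: RMSE={value:.1f}s")
```

Applying the same grouping to the logs of two models served in parallel gives per-scenario RMSE curves that can be compared day by day.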

4. Related Work

Here we briefly review closely related work in the fields of estimated time of arrival (ETA) prediction and traffic prediction with graph neural networks (GNNs).

4.1. ETA Prediction

The mainstream methods for ETA prediction can be categorized into two groups: segment-based methods and end-to-end methods. The segment-based methods (Amirian et al., 2016; Wang et al., 2019, 2014) are widely used in most navigation services. They first estimate the travel time of each road segment independently. Then, the travel time of the entire route is obtained by simply summing up the estimated travel times of all road segments in the route. The segment-based methods are computationally efficient and scalable, since the travel times of road segments can be estimated in parallel. Although they are efficient, they do not account for the information of the travel route. The route information, such as the connections of road segments and traffic lights, is necessary for ETA prediction.
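The summation step of a segment-based method can be sketched in a few lines. The segment identifiers and per-segment estimates below are hypothetical; the point of the example is that the result depends only on which segments are in the route, not on how they connect.

```python
def segment_based_eta(route, segment_time):
    """Sum independently estimated per-segment travel times (in seconds)."""
    return sum(segment_time[segment] for segment in route)

# Hypothetical per-segment estimates in seconds, produced independently of any route.
segment_time = {"s1": 30, "s2": 45, "s3": 120, "s4": 60}

route = ["s1", "s2", "s3"]
shuffled_route = ["s3", "s1", "s2"]

eta = segment_based_eta(route, segment_time)
eta_shuffled = segment_based_eta(shuffled_route, segment_time)
print(eta, eta_shuffled)
```

Both calls return the same value even though the second ordering describes a different (and possibly impossible) traversal, which illustrates why route context such as turns and traffic lights is invisible to a pure summation.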

By contrast, the end-to-end methods (Wang et al., 2018a, b; Zhang et al., 2018a; Fang et al., 2020) take a route as input and directly estimate the travel time of the entire route. Compared with the segment-based methods, the end-to-end methods have achieved further improvements because they take into account the contextual information of a route (Fang et al., 2020). For example, Wang et al. (2018a) applied convolution to the travel sequence for spatial representation learning and stacked LSTM for temporal modeling. Wang et al. (2018b) proposed a Wide-Deep-Recurrent model using a Wide&Deep network for feature extraction and an LSTM for trajectory information. However, the step-by-step message-passing techniques used by most existing methods are inefficient for modeling the traffic congestion propagation patterns along the route. In this work, we propose a more efficient method to model traffic congestion propagation patterns, which has been shown to significantly improve ETA prediction performance.

4.2. Traffic Prediction with GNNs

GNNs (Kipf and Welling, 2016; Veličković et al., 2017) have been proven to be powerful structural modeling approaches. Recent studies have proposed different variants of spatial-temporal GNNs (STGNNs) to tackle the tasks of traffic prediction (Xia et al., 2022b; Guo et al., 2019; He et al., 2018; Li et al., 2018; Yu et al., 2019; Jiang and Luo, 2021; Yin et al., 2021; Lee et al., 2022) and ETA prediction (Fang et al., 2020; Hong et al., 2020; Derrow-Pinion et al., 2021). STGNNs usually process spatial-temporal signals using a graph convolution network (GCN) (Kipf and Welling, 2016) for geographic information and a recurrent model for temporal dynamics. The major drawback of GNNs is their relatively weak scalability on real-world industrial datasets, since increasing the depth of a GNN often means exponential expansion of the neighbor scope. To alleviate this problem, recent studies of GNNs (Ying et al., 2018; Zeng et al., 2021) suggested that a properly extracted subgraph, consisting of a small number of critical neighbors while excluding irrelevant ones, can achieve significant accuracy improvement with orders of magnitude reduction in computation and hardware cost. Inspired by this observation, we propose a congestion-sensitive graph for traffic prediction in this paper.
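The growth of the neighbor scope with depth can be illustrated on a toy graph. The sketch below counts the nodes reachable within k hops using breadth-first search on a synthetic tree with a fixed branching factor; the tree, the branching factor of 3, and the helper names are assumptions chosen only to make the growth visible, and real road networks have different degree distributions.

```python
from collections import deque

def build_tree(branching, depth):
    """Build an undirected tree as an adjacency dict; node 0 is the root."""
    adj = {0: []}
    level, next_id = [0], 1
    for _ in range(depth):
        new_level = []
        for parent in level:
            for _ in range(branching):
                adj[parent].append(next_id)
                adj[next_id] = [parent]
                new_level.append(next_id)
                next_id += 1
        level = new_level
    return adj

def k_hop_scope(adj, source, k):
    """Number of distinct nodes within k hops of source, excluding source itself."""
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue  # do not expand beyond k hops
        for neighbor in adj.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return len(seen) - 1

tree = build_tree(branching=3, depth=4)
scope_sizes = [k_hop_scope(tree, 0, k) for k in (1, 2, 3)]
print(scope_sizes)
```

Each additional hop multiplies the number of newly reached nodes by the branching factor, which is why restricting message passing to a small set of critical neighbors can reduce cost substantially.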

5. Conclusions

Traffic congestion propagation pattern modeling is of great importance for ETA prediction. To address this, we develop a novel and practical ETA framework named DuETA. DuETA can efficiently learn the traffic propagation patterns through an elaborately designed, congestion-sensitive graph and a route-aware graph transformer. Experiments show that DuETA is a practical and robust solution for large-scale ETA prediction services.

In the future, we plan to address the following open problems. First, given the observation that roads are successively constructed and upgraded (Yang et al., 2022; Xia et al., 2022a), we plan to investigate the transferability of our model to deal with unseen road segments or regions. Second, we observe that the travel times of some routes have a considerable correlation with the POIs distributed along the roads. For example, the roads that pass schools, hospitals, and markets tend to be congested at specific times, which could potentially impact the ETA prediction performance. To address this issue, we plan to utilize the POI retrieval system (Huang et al., 2020a; Fan et al., 2021; Huang et al., 2021) as an auxiliary tool to forecast which POIs would be densely populated and how extensively they would affect the ETA prediction.

References

  • A. F. Agarap (2018) Deep learning using rectified linear units (relu). CoRR abs/1803.08375. Cited by: §2.4.3.
  • P. Amirian, A. Basiri, and J. Morley (2016) Predictive analytics for enhancing travel time estimation in navigation apps of apple, google, and microsoft. In Proceedings of the 9th ACM SIGSPATIAL International Workshop on Computational Transportation Science, pp. 31–36. External Links: ISBN 9781450345774 Cited by: §4.1.
  • M. Chen, Z. Wei, Z. Huang, B. Ding, and Y. Li (2020) Simple and deep graph convolutional networks. In Proceedings of the 37th International Conference on Machine Learning, pp. 1725–1735. Cited by: §2.4.1.
  • A. Derrow-Pinion, J. She, D. Wong, O. Lange, T. Hester, L. Perez, M. Nunkesser, S. Lee, X. Guo, B. Wiltshire, P. W. Battaglia, V. Gupta, A. Li, Z. Xu, A. Sanchez-Gonzalez, Y. Li, and P. Velickovic (2021) ETA prediction with graph neural networks in google maps. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, pp. 3767–3776. Cited by: §4.2.
  • M. Fan, Y. Sun, J. Huang, H. Wang, and Y. Li (2021) Meta-learned spatial-temporal poi auto-completion for the search engine at baidu maps. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pp. 2822–2830. Cited by: §5.
  • X. Fang, J. Huang, F. Wang, L. Liu, Y. Sun, and H. Wang (2021) SSML: self-supervised meta-learner for en route travel time estimation at baidu maps. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pp. 2840–2848. Cited by: DuETA: Traffic Congestion Propagation Pattern Modeling via Efficient Graph Learning for ETA Prediction at Baidu Maps, §1.
  • X. Fang, J. Huang, F. Wang, L. Zeng, H. Liang, and H. Wang (2020) ConSTGAT: contextual spatial-temporal graph attention network for travel time estimation at baidu maps. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2697–2705. Cited by: DuETA: Traffic Congestion Propagation Pattern Modeling via Efficient Graph Learning for ETA Prediction at Baidu Maps, §1, §1, §1, §2.3.1, 5th item, §3.5, §4.1, §4.2.
  • S. Guo, Y. Lin, N. Feng, C. Song, and H. Wan (2019) Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. In The Thirty-Third AAAI Conference on Artificial Intelligence, pp. 922–929. Cited by: §1, §1, §4.2.
  • Z. He, C. Chow, and J. Zhang (2018) STANN: a spatio–temporal attentive neural network for traffic prediction. IEEE Access 7, pp. 4795–4806. Cited by: 2nd item, §4.2.
  • H. Hong, Y. Lin, X. Yang, Z. Li, K. Fu, Z. Wang, X. Qie, and J. Ye (2020) HetETA: heterogeneous information network embedding for estimating time of arrival. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2444–2454. Cited by: §4.2.
  • J. Huang, H. Wang, M. Fan, A. Zhuo, and Y. Li (2020a) Personalized prefix embedding for poi auto-completion in the search engine of baidu maps. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2677–2685. Cited by: §3.5, §5.
  • J. Huang, H. Wang, M. Fan, A. Zhuo, Y. Sun, and Y. Li (2020b) Understanding the impact of the covid-19 pandemic on transportation-related behaviors with human mobility data. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 3443–3450. Cited by: §3.1.
  • J. Huang, H. Wang, Y. Sun, M. Fan, Z. Huang, C. Yuan, and Y. Li (2021) HGAMN: heterogeneous graph attention matching network for multilingual poi retrieval at baidu maps. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pp. 3032–3040. Cited by: §5.
  • P. J. Huber (1992) Robust estimation of a location parameter. In Breakthroughs in statistics, pp. 492–518. Cited by: §2.4.3.
  • W. Jiang and J. Luo (2021) Graph neural network for traffic forecasting: a survey. arXiv preprint arXiv:2101.11174. Cited by: §4.2.
  • D. P. Kingma and J. Ba (2015) Adam: a method for stochastic optimization. In International Conference on Learning Representations, Cited by: §3.2.
  • T. N. Kipf and M. Welling (2016) Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907. Cited by: §4.2.
  • A. Krizhevsky, I. Sutskever, and G. E. Hinton (2012) Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pp. 1097–1105. Cited by: §2.4.3.
  • H. Lee, S. Jin, H. Chu, H. Lim, and S. Ko (2022) Learning to remember patterns: pattern matching memory networks for traffic forecasting. In International Conference on Learning Representations, Cited by: §4.2.
  • G. Li, M. Muller, A. Thabet, and B. Ghanem (2019) Deepgcns: can gcns go as deep as cnns?. In Proceedings of the IEEE/CVF international conference on computer vision, pp. 9267–9276. Cited by: §2.4.1.
  • Y. Li, R. Yu, C. Shahabi, and Y. Liu (2018) Diffusion convolutional recurrent neural network: data-driven traffic forecasting. In International Conference on Learning Representations, Cited by: §1, §1, 3rd item, §4.2.
  • Y. Shi, Z. Huang, S. Feng, H. Zhong, W. Wang, and Y. Sun (2021) Masked label prediction: unified message passing model for semi-supervised classification. In International Joint Conferences on Artificial Intelligence Organization, pp. 1548–1554. Cited by: §2.4.1.
  • A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin (2017) Attention is all you need. In Advances in Neural Information Processing Systems, pp. 5998–6008. Cited by: §2.4.1.
  • P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Lio, and Y. Bengio (2017) Graph attention networks. arXiv preprint arXiv:1710.10903. Cited by: §4.2.
  • D. Wang, J. Zhang, W. Cao, J. Li, and Y. Zheng (2018a) When will you arrive? Estimating travel time based on deep neural networks. In Thirty-Second AAAI Conference on Artificial Intelligence, pp. 2500–2507. Cited by: §4.1.
  • H. Wang, X. Tang, Y. Kuo, D. Kifer, and Z. Li (2019) A simple baseline for travel time estimation using large-scale trip data. ACM Transactions on Intelligent Systems and Technology (TIST) 10 (2), pp. 1–22. Cited by: §4.1.
  • Y. Wang, Y. Zheng, and Y. Xue (2014) Travel time estimation of a path using sparse trajectories. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 25–34. Cited by: §4.1.
  • Z. Wang, K. Fu, and J. Ye (2018b) Learning to estimate the travel time. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 858–866. Cited by: §4.1.
  • D. Xia, J. Huang, J. Yang, X. Liu, and H. Wang (2022a) DuARUS: automatic geo-object change detection with street view imagery for updating road database at baidu maps. In Proceedings of the 31st ACM International Conference on Information and Knowledge Management, Cited by: §5.
  • D. Xia, X. Liu, W. Zhang, H. Zhao, C. Li, W. Zhang, J. Huang, and H. Wang (2022b) DuTraffic: live traffic condition prediction with trajectory data and street views at baidu maps. In Proceedings of the 31st ACM International Conference on Information and Knowledge Management, Cited by: §4.2.
  • J. Yang, X. Ye, B. Wu, Y. Gu, Z. Wang, D. Xia, and J. Huang (2022) DuARE: automatic road extraction with aerial images and trajectory data at baidu maps. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 4321–4331. Cited by: §5.
  • X. Yin, G. Wu, J. Wei, Y. Shen, H. Qi, and B. Yin (2021) Deep learning on traffic prediction: methods, analysis and future directions. IEEE Transactions on Intelligent Transportation Systems (), pp. 1–17. Cited by: §4.2.
  • R. Ying, R. He, K. Chen, P. Eksombatchai, W. L. Hamilton, and J. Leskovec (2018) Graph convolutional neural networks for web-scale recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 974–983. Cited by: §4.2.
  • B. Yu, M. Li, J. Zhang, and Z. Zhu (2019) 3D graph convolutional networks with temporal graphs: a spatial information free framework for traffic forecasting. arXiv preprint arXiv:1903.00919. Cited by: §1, §1, §4.2.
  • B. Yu, H. Yin, and Z. Zhu (2018) Spatio-temporal graph convolutional networks: a deep learning framework for traffic forecasting. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, pp. 3634–3640. Cited by: §1, §1.
  • H. Zeng, M. Zhang, Y. Xia, A. Srivastava, A. Malevich, R. Kannan, V. Prasanna, L. Jin, and R. Chen (2021) Decoupling the depth and scope of graph neural networks. In Advances in Neural Information Processing Systems, pp. 19665–19679. Cited by: §4.2.
  • H. Zhang, H. Wu, W. Sun, and B. Zheng (2018a) DeepTravel: a neural network based travel time estimation model with auxiliary supervision. In International Joint Conferences on Artificial Intelligence Organization, pp. 3655–3661. Cited by: 4th item, §4.1.
  • J. Zhang, X. Shi, J. Xie, H. Ma, I. King, and D. Yeung (2018b) GaAN: gated attention networks for learning on large and spatiotemporal graphs. In Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, pp. 339–349. Cited by: §1, §1.