I Introduction
Vehicular transportation reflects the pulse of a city. It not only affects people's daily lives but also plays an essential role in many businesses as well as in society as a whole [1, 2]. With the recent deployment of sensing technologies and continued digitization, large amounts of vehicle trajectory data are collected, providing a solid data foundation for improving the quality of a wide variety of transportation services, such as vehicle routing, traffic prediction, and urban planning.
A fundamental functionality in vehicular transportation is routing. Given a source and a destination, classic routing algorithms, e.g., Dijkstra's algorithm, identify an optimal path connecting the source and the destination, where the optimal path is often the path with the least travel cost, e.g., the shortest path or the fastest path. However, a routing service quality study [3] shows that local drivers often choose paths that are neither shortest nor fastest, rendering classic routing algorithms impractical in many real-world routing scenarios.
To contend with this challenge, a wide variety of advanced routing algorithms, e.g., skyline routing [4] and k-shortest path routing [5], have been proposed to identify a set of optimal paths, where optimality is defined based on, e.g., Pareto optimality or the top-k least costs, thus providing drivers with multiple candidate paths. In addition, commercial navigation systems, such as Google Maps and TomTom, often follow a similar strategy by suggesting multiple candidate paths to drivers, although the criteria for selecting the candidate paths are often confidential.
Under this context, ranking the candidate paths is essential for ensuring high routing quality. Existing solutions often rely on simple heuristics, e.g., ranking paths w.r.t. their travel times. However, travel times may not always be the most important factor when drivers choose paths, as demonstrated in the routing quality study [3], where drivers often do not choose the fastest paths. In addition, existing solutions often provide the same ranking to all users, ignoring the distinct preferences that different drivers may have.
In this paper, we propose a data-driven ranking framework, PathRank, which ranks candidate paths by taking into account the paths used by local drivers in their historical trajectories. More specifically, PathRank models ranking candidate paths as a "regression" problem—for each candidate path, PathRank estimates a ranking score.
The intuition behind PathRank is that if a driver used path P from source s to destination d, the driver considered P the "best" path among all possible paths from s to d. Then, a candidate path that is similar to P should rank higher than a candidate path that is dissimilar to P.
Based on the above intuition, for each historical trajectory, we identify the path P used by the trajectory along with the trajectory's source s and destination d. We consider path P as the ground truth path. Next, we identify a set CP of paths that connect s and d. For each candidate path Pi in CP, we associate a similarity score yi = f(Pi, P) that measures how similar the path Pi is to the ground truth path P. Here, a number of path similarity functions [6] can be applied as function f, e.g., weighted Jaccard similarity [7].
In the training phase, the set of (Pi, yi) pairs is used to train a regression model, where path Pi is a training instance and yi is its label, i.e., the ranking score. After training, we obtain a regression model. Then, in the testing phase, given a set of candidate paths returned by advanced routing algorithms or Google Maps, the regression model estimates a ranking score for each candidate path. Finally, we rank the candidate paths w.r.t. their ranking scores.
We may train PathRank on historical trajectories from a specific driver and thus provide a personalized ranking for that driver. Alternatively, we can train PathRank on historical trajectories from many drivers and thus provide a generic ranking, which, for example, can be used for different drivers, especially new drivers who have few or no historical trajectories. We include an empirical study on this in Section VI-D.
Enabling PathRank is non-trivial, as we face two major challenges. First, constructing an appropriate training path set CP is non-trivial. Since there may exist a large number of paths from a source to a destination, it is prohibitive to include all such paths in CP, while selecting a small subset of such paths may adversely affect the training effectiveness. Thus, it is challenging to select a small, representative subset of paths to be included in CP such that training is both efficient and effective, providing accurate ranking while maintaining efficiency.
Second, effective regression models often rely on meaningful feature representations of input data. In our setting, the input is a path and no existing methods are available to represent paths in a meaningful feature space to enable ranking. Here, the meaningful feature space should take into account both the topology of the underlying road network and the spatial properties, such as distances and travel times, of the road network.
To contend with the first challenge, we propose an effective method to generate a compact training path set CP. We consider different travel costs that drivers may care about, e.g., distance, travel time, and fuel consumption. For each travel cost, we identify a set of diversified top-k least-cost paths. Here, two paths are diversified if the path similarity between them is smaller than a threshold, e.g., 0.8, where a number of different path similarity functions can be applied as well [6]. As an example, the diversified top-3 shortest paths consist of three paths where the path similarity of every pair of paths is smaller than a threshold and there does not exist another set of three mutually diversified paths whose total distance is shorter. Considering diversity avoids including top-3 shortest paths that differ only slightly, e.g., by one or two edges. This method makes sure that the candidate path set (i) considers multiple travel costs that a driver may consider when making routing decisions; and (ii) includes paths that are dissimilar to each other, which in turn cover a large feature space of the underlying road network.
Next, to solve the second challenge, we propose a deep learning framework that learns meaningful feature representations of paths and thus enables effective ranking. Recall that the input is a path, which is represented as a sequence of vertices in a road network graph. To capture the graph topology, we utilize unsupervised graph embedding, e.g., node2vec [8], to transform each vertex into a feature vector. Since a path is a sequence of vertices, we employ a recurrent neural network (RNN) to model the sequence of the corresponding feature vectors of the vertices. So far, thanks to the graph embedding, the framework considers the topology of the underlying road network, but we also need to consider the spatial properties of the road network, which are not captured by classic graph embedding. To this end, we let the RNN not only estimate the similarity w.r.t. the ground truth path but also reconstruct the path's spatial properties, such as the length, travel time, and fuel consumption of the path. This makes the framework a multi-task learning framework, where the main task is to estimate the similarity used for the final ranking, and the auxiliary tasks enforce the graph embedding to also capture the spatial properties of the underlying road network, which eventually improves the accuracy of the main task.
To the best of our knowledge, this is the first data-driven, end-to-end solution for ranking paths in spatial networks. Specifically, we make four contributions. First, we propose a method to generate a compact set of training paths that enables effective and efficient learning. Second, we propose a multi-task learning framework to enable spatial network embedding that enhances classic graph embedding by incorporating spatial properties. Third, we integrate the spatial network embedding with similarity regression to provide an end-to-end solution for ranking paths. Fourth, we conduct extensive experiments using a large real-world trajectory set to offer insight into the design properties of the proposed framework and to demonstrate that the framework is effective.
Paper Outline: Section II covers related work. Section III covers preliminaries. Section IV discusses how to generate the training data. Section V proposes PathRank, including a basic framework and an advanced framework. Section VI reports on empirical evaluations. Section VII concludes.
II Related Work
We review related studies on learning to rank in the context of information retrieval, graph representation learning, and trajectory learning.
Learning to Rank. Learning to rank plays an important role in ranking in the context of information retrieval (IR), where the primary goal is to learn how to rank documents or web pages w.r.t. queries, which are all represented as feature vectors. Fig. 1 shows the typical learning to rank framework.
Learning to rank methods in IR can be categorized into pointwise, pairwise, and listwise methods. Pointwise methods estimate a ranking score for each individual document; the documents can then be ranked based on the ranking scores [9]. Pairwise methods focus on, for a given pair of documents, making a binary decision on which document is better, i.e., a relative order. Here, although we do not know the ranking scores of individual documents, we are still able to rank documents based on the estimated relative orders [10, 11]. Listwise methods take into account a set of documents and estimate the ranking for the whole set [12]. Recently, deep learning has been applied to learning to rank in IR with a focus on learning semantically meaningful representations of both queries and documents, e.g., DSSM [13], CDSSM [14], and DeepRank [15].
Although learning to rank techniques have been applied widely and successfully in IR, they only consider textual documents and queries and cannot be directly applied to ranking paths in spatial networks, since both graph topology and spatial properties, the two most important factors in spatial networks, are ignored. We follow the idea of the pointwise learning to rank techniques in IR and propose PathRank to rank paths in spatial networks while considering both graph topology and spatial properties.
Network Representation Learning. Network representation learning, a.k.a. graph embedding, aims to learn low-dimensional feature vectors for vertices while preserving the network topology such that vertices with similar feature vectors share similar structural properties [8, 16, 17, 18, 20, 25]. We distinguish two categories of methods: random walk based methods and deep learning based methods.
A representative method in the first category is DeepWalk [17]. DeepWalk first samples sequences of vertices based on truncated random walks, where the sampled vertex sequences capture the connections between vertices in the graph. Then, the skip-gram model [19] is used to learn low-dimensional feature vectors based on the sampled vertex sequences. Node2vec [8] considers higher-order proximity between vertices by maximizing the probability of occurrences of subsequent vertices in fixed-length random walks. A key difference from DeepWalk is that node2vec employs biased random walks that provide a trade-off between breadth-first and depth-first searches, and hence achieves higher-quality and more informative embeddings than DeepWalk does.
To overcome the weaknesses of random walk based methods, e.g., the difficulty of determining the random walk length and the number of random walks, deep learning based methods utilize the random surfing model to capture contextual relatedness between each pair of vertices and preserve it in low-dimensional feature vectors for vertices [20]. Deep learning based methods are also able to capture complex non-linear relations. GraphGAN [25] learns vertex representations by modeling the connectivity behavior through an adversarial learning framework using a minimax game.
LINE [18] does not fall into either of the above two categories. Instead of exploiting random walks to capture network structure, LINE proposes a model with a carefully designed objective function that preserves both first-order and second-order proximities.
However, all existing graph embedding methods consider non-spatial networks such as social networks, citation networks, and biological networks. They ignore spatial properties, e.g., distances and travel times, which are crucial features in spatial networks such as road networks. In this paper, we propose a multi-task learning framework that extends existing graph embedding to incorporate important spatial properties. Experimental results show that the graph embedding that considers spatial properties gives the best performance when ranking paths in spatial networks.
Trajectory Learning. Machine learning has also been applied to trajectories to support different applications [43, 48, 47]. A multi-task learning framework [35] is proposed to distinguish trajectories from different drivers. When considering trajectories as time series, recurrent autoencoders [41, 46] and recurrent autoencoder ensembles [36] are proposed to identify outliers. However, these studies do not take the underlying road network structure into consideration.
In addition, different approaches have been proposed to learn personalized driving preferences from trajectories [21, 32, 42], which enable personalized routing. In this paper, however, we consider an orthogonal approach where we rank candidate paths, which can be obtained from well-known navigation services, rather than proposing yet another personalized routing approach that may not easily integrate with existing navigation services.
Finally, trajectories have been used to extract high-resolution travel costs [4, 34, 33, 42, 22], such as travel time and fuel consumption [24, 23]. In particular, time-varying and uncertain travel costs can be learned from trajectories. Extending PathRank to consider time-varying and uncertain traffic conditions is of interest as future work.
III Preliminaries
III-A Basic Concepts
A road network is modeled as a weighted, directed graph G = (V, E). Vertex set V represents road intersections and road ends; edge set E represents road segments. Functions l, t, and f maintain the travel costs of the edges in graph G. Specifically, function l maps each edge to its length. Functions t and f have similar signatures and map edges to their travel times and fuel consumption, respectively.
A path P = ⟨v1, v2, …, vn⟩ is a sequence of vertices, where n > 1 and each pair of adjacent vertices in P must be connected by an edge in E.
A trajectory is a sequence of GPS records pertaining to a trip, where each GPS record represents the location of a vehicle at a particular timestamp. The GPS records are ordered according to their corresponding timestamps, where record ri occurs before record rj if i < j.
Map matching [26] is able to map a GPS record to a specific location on an edge in the underlying road network, thus aligning a trajectory with a path in the underlying road network. We call such paths trajectory paths. In addition, a trajectory is also associated with a driver identifier, indicating who made the trajectory.
Multiple similarity functions [1, 6, 7, 27] are available to calculate the similarity between two paths, where the most popular functions belong to the Jaccard similarity function family, in particular, the weighted Jaccard similarity [1, 7]. In this paper, we use the weighted Jaccard similarity (see Equation 1) to evaluate the similarity between two paths. However, other similarity functions can easily be incorporated into the proposed framework.
sim(Pi, Pj) = (Σe∈EI l(e)) / (Σe∈EU l(e))  (1)
Here, we use EI = Pi ∩ Pj and EU = Pi ∪ Pj to represent two edge sets: edge set EI consists of the edges that appear in both Pi and Pj, and edge set EU consists of the edges that appear in either Pi or Pj. Recall that function l(e) returns the length of edge e. The intuition of the weighted Jaccard similarity is then twofold: first, the more edges the two paths share, the more similar they are; second, the longer the shared edges are, the more similar the two paths are.
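The weighted Jaccard similarity of Equation 1 can be sketched in a few lines; the paths, edge identifiers, and lengths below are toy placeholders, not data from the paper.

```python
# A minimal sketch of the weighted Jaccard similarity from Equation 1.
# Each path is a list of edges, and `length` stands in for function l,
# mapping an edge to its length.

def weighted_jaccard(path_i, path_j, length):
    """Similarity = total length of shared edges / total length of all edges."""
    edges_i, edges_j = set(path_i), set(path_j)
    shared = sum(length[e] for e in edges_i & edges_j)
    union = sum(length[e] for e in edges_i | edges_j)
    return shared / union

# Toy road network: edges identified by vertex pairs, lengths in meters.
length = {("a", "b"): 100, ("b", "c"): 200, ("b", "d"): 150, ("d", "c"): 80}
p1 = [("a", "b"), ("b", "c")]               # a -> b -> c directly
p2 = [("a", "b"), ("b", "d"), ("d", "c")]   # a -> b -> d -> c detour

print(weighted_jaccard(p1, p2, length))  # shared: 100; union: 530
```

A path compared with itself yields similarity 1, and two edge-disjoint paths yield 0, matching the intuition above.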
III-B PathRank Overview
Fig. 2 shows an overview of the proposed PathRank.
Given a set of historical trajectories, we first map match them to obtain their corresponding trajectory paths. In the training phase, the trajectory paths are fed into the Training Data Generation module. For each trajectory path P, the module generates a compact set CP of competitive paths such that each competitive path Pi in CP connects the same source and destination as the trajectory path P. Next, we consider the trajectory path P as the ground truth path and compute a similarity score yi for each competitive path Pi. The training data generation module iterates over each trajectory path and generates competitive paths along with similarity scores. The output of the module is a set of "competitive path" and "similarity score" pairs, denoted as {(Pi, yi)}, which is used as the input for PathRank.
In the training phase, for each training instance (Pi, yi), the Spatial Network Embedding module embeds each vertex in competitive path Pi into a feature vector. This transforms path Pi into a sequence of feature vectors, which is then fed into a recurrent neural network (RNN). The RNN estimates the similarity between ground truth trajectory path P and path Pi. An objective function measures the discrepancy between the estimated similarity ŷi and the ground truth similarity yi, and the whole training process aims to minimize this objective function.
In the testing phase, we use the trained PathRank to rank candidate paths. Given a source and a destination, advanced routing algorithms or commercial navigation systems are able to provide multiple candidate paths, which are used as testing instances. Next, PathRank takes as input each testing path and returns an estimated ranking score. Finally, we are able to rank the testing paths according to their estimated ranking scores.
IV Training Data Generation
We proceed to elaborate how to generate a compact set of training paths for a trajectory path.
IV-A Intuitions
Ranking paths is similar to ranking products in online shops. If a user clicks a specific product, it provides evidence that the user is more interested in that product than in other similar products. Similarly, a trajectory path P from a source s to a destination d provides evidence that a driver prefers path P over other paths that connect s to d.
The main difference is that, in online shops, the other similar products, i.e., competitor products, can be obtained explicitly, e.g., the products that are shown to the user on the same web page but are not clicked. Based on such positive and negative training data, i.e., the products that are clicked and not clicked by the user, effective learning mechanisms, e.g., learning to rank [9, 10, 11, 12, 13, 14, 15], are available to learn an appropriate ranking function.
However, in our setting, the other candidate paths are often unknown and implicit, because we do not know which other paths the driver had in mind when deciding to take path P. Thus, the main target of the training data generation module is to generate a set of paths that may include the alternatives the driver considered when deciding to use trajectory path P. We call this set the competitive path set CP.
A naive way to generate the competitive path set is to simply include all paths from s to d. This is infeasible in real-world settings since the competitive path set may contain a huge number of paths in a city-level road network graph, which in turn makes training prohibitively inefficient. Thus, we aim to identify a compact competitive path set that includes only a small number of paths, e.g., fewer than 10.
IV-B Top-k Shortest Paths
The first strategy is to employ a classic top-k shortest path algorithm, e.g., Yen's algorithm [28], to include the top-k shortest paths from s to d in the competitive path set CP.
This strategy is simple and efficient, since a wide variety of efficient algorithms are available for generating top-k shortest paths [28, 29, 30, 31]. However, a serious issue with this strategy is that the top-k shortest paths are often highly similar. Thus, their similarities w.r.t. the ground truth trajectory path P are also similar, which adversely affects the effectiveness of the subsequent ranking score regression.
For example, we choose four trajectory paths with different sources and destinations. For each trajectory path, we generate the top-9 shortest, fastest, and most fuel-efficient paths connecting the same source and destination as the competitive paths. Then, we compute the competitive paths' similarities w.r.t. the trajectory path. Figures 2(a), 2(b), and 2(c) show box plots of the similarities per trajectory path. We observe that the similarities often spread over only a very small range. For example, for the first trajectory path, its corresponding top-9 shortest paths have similarities spreading from 0.65 to 0.75.
If the similarities of competitive paths only spread over a small range, they only provide training instances for estimating ranking scores in that small range, which may render the trained model unable to make accurate estimations for ranking scores outside the range. Thus, an ideal strategy should provide a set of competitive paths whose similarities cover a large range. To this end, we propose a second strategy using diversified top-k shortest paths.
IV-C Diversified Top-k Shortest Paths
Diversified top-k shortest path finding aims at identifying k shortest paths that are mutually dissimilar, or diverse.
Algorithm 1 details the procedure for finding diversified top-k shortest paths. First, we always include the shortest path in the diversified top-k shortest path set DP. Next, we enter a loop where we keep checking the next shortest path until we have included k paths in DP or we have checked all paths connecting the source and destination. When checking the next shortest path P′, we include P′ in DP if the similarity between P′ and each existing path in DP is smaller than a threshold δ. This ensures that P′ is sufficiently dissimilar to the paths in DP, thus making sure that DP is a diverse top-k shortest path set. The smaller the threshold δ is, the more diverse the paths in DP are. However, if the threshold is too small, it may happen that fewer than k diversified shortest paths, or even only the shortest path, are included in DP.
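The greedy selection described above can be sketched as follows. This is our own minimal re-implementation, not the paper's Algorithm 1 verbatim: it enumerates all simple paths of a toy graph up front instead of calling an incremental top-k algorithm such as Yen's, which is only practical for small examples.

```python
# Greedy diversification sketch: scan candidate paths in increasing cost
# order and keep a path only if its weighted Jaccard similarity to every
# already-kept path is below threshold delta.

def all_paths(graph, src, dst, path=None):
    """Enumerate all simple paths in a small adjacency-dict graph."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph.get(src, {}):
        if nxt not in path:
            yield from all_paths(graph, nxt, dst, path)

def path_cost(graph, path):
    return sum(graph[u][v] for u, v in zip(path, path[1:]))

def weighted_jaccard(graph, p1, p2):
    e1, e2 = set(zip(p1, p1[1:])), set(zip(p2, p2[1:]))
    return (sum(graph[u][v] for u, v in e1 & e2) /
            sum(graph[u][v] for u, v in e1 | e2))

def diversified_top_k(graph, src, dst, k, delta):
    ranked = sorted(all_paths(graph, src, dst), key=lambda p: path_cost(graph, p))
    kept = []
    for p in ranked:  # the shortest path is always kept first
        if all(weighted_jaccard(graph, p, q) < delta for q in kept):
            kept.append(p)
        if len(kept) == k:
            break
    return kept

# Toy graph: adjacency dict mapping vertex -> {neighbor: edge length}.
g = {"s": {"a": 1, "b": 4}, "a": {"b": 1, "d": 5}, "b": {"d": 1}, "d": {}}
print(diversified_top_k(g, "s", "d", k=2, delta=0.8))
```

With δ = 0.8, the second-kept path s→b→d shares only one short edge with the shortest path s→a→b→d, so both survive the diversity check.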
Figures 2(d), 2(e), and 2(f) show box plots of the similarities for the same four trajectory paths when using diversified top-k shortest, fastest, and most fuel-efficient paths with threshold δ. We observe that the similarities spread over larger ranges compared to Figures 2(a), 2(b), and 2(c), which use classic top-k shortest paths.
IV-D Considering Multiple Travel Costs
Recent studies on personalized routing [1, 7] suggest that a driver may consider different travel costs, e.g., travel time, distance, and fuel consumption, when making routing decisions. This motivates us to consider multiple travel costs, not only distance, when generating competitive path sets. A first option is to use skyline routing [4], which identifies a set of Pareto-optimal paths, a.k.a. skyline paths, when considering multiple travel costs. However, skyline routing also suffers from the high-similarity problem of classic top-k shortest paths: it often happens that the skyline paths are mutually similar, which may adversely affect the training effectiveness.
Instead, we propose a simple yet effective approach. We run the diversified top-k shortest path finding multiple times, each time considering a specific travel cost. Then, we use the union of the diversified paths as the final competitive path set CP. For example, when considering three travel costs, i.e., distance, travel time, and fuel consumption, we identify the diversified top-k shortest, fastest, and most fuel-efficient paths, respectively. The union of these paths is then used as the final competitive path set CP.
Since we run the diversified top-k shortest path finding multiple times for different travel costs, we can use a small k for each run. For example, when we set k = 3 and consider three travel costs, CP consists of up to 9 paths, including the diversified top-3 shortest, fastest, and most fuel-efficient paths.
To summarize, we use multi-cost, diversified top-k least-cost paths as the compact competitive path set for each trajectory path. Next, we combine the competitive path sets from all trajectory paths to obtain a set of "competitive path" and "similarity score" pairs, denoted as {(Pi, yi)}. Here, competitive path Pi is the input instance and similarity score yi is the corresponding label. This set is used as the training data for PathRank.
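The multi-cost union step of Section IV-D reduces to a set union over the per-cost results. The per-cost path lists below are hypothetical placeholders standing in for three calls to a diversified top-k routine, one per travel cost.

```python
# Sketch of the multi-cost union: run the diversified top-k search once
# per travel cost and take the union of the results as the final
# competitive path set CP. Paths are tuples of vertices (toy data).

per_cost_paths = {
    "distance": [("s", "a", "d"), ("s", "b", "d")],
    "time":     [("s", "a", "d"), ("s", "c", "d")],
    "fuel":     [("s", "b", "d"), ("s", "c", "d")],
}

competitive_set = set()
for cost, paths in per_cost_paths.items():
    competitive_set.update(paths)  # duplicates across costs collapse

print(sorted(competitive_set))
```

Although each cost contributes two paths here, the union holds only three distinct paths, illustrating why CP contains "up to" k paths per cost rather than exactly that many.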
V PathRank
We propose an end-to-end deep learning framework to estimate similarity scores for paths. We first propose a basic framework that consists of a vertex embedding network and a recurrent neural network. Next, we extend the vertex embedding network to capture both the topology and the spatial properties of a road network graph, which improves the learning accuracy.
V-A Basic Framework
Recall that the input for PathRank is a competitive path Pi and that the label of the input is its similarity score yi. To use deep learning to solve the similarity score regression problem, a prerequisite is to represent the input path in an appropriate feature space. To this end, we use a vertex embedding network to transform each vertex in the input path into a feature vector; after vertex embedding, the path becomes a sequence of feature vectors. Since RNNs are capable of capturing dependencies in sequential data, we employ an RNN to model the sequence of feature vectors. The RNN finally outputs an estimated similarity score, which is compared against the ground truth similarity yi. This results in the basic framework of PathRank, which consists of two neural networks, a vertex embedding network and a recurrent neural network (RNN), as shown in Figure 4.
V-A1 Vertex Embedding
We represent a vertex vi in road network graph G as a one-hot vector xi of size |V|, where |V| represents the number of vertices in G. Specifically, the i-th vertex in graph G is represented as a vector where the i-th bit is 1 and all other bits are 0.
Vertex embedding employs an embedding matrix M to transfer a vertex's one-hot vector xi into a new feature vector ei = xi M. The feature vector lies in a smaller space of dimensionality d, where d ≪ |V|.
Given a competitive path Pi, we apply the same embedding matrix M to transfer each vertex in Pi to a feature vector. Thus, the competitive path is represented as a sequence of feature vectors ⟨e1, e2, …, en⟩, where ej = xj M and n is the number of vertices in the path.
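The embedding lookup above amounts to selecting rows of M. A minimal numpy sketch, with illustrative (not paper-specified) sizes and a random matrix standing in for the learned M:

```python
import numpy as np

# One-hot vector times embedding matrix M selects one row of M, so a path
# of n vertices becomes an n x d sequence of feature vectors.

num_vertices, dim = 6, 3                   # |V| = 6, embedding size d = 3
rng = np.random.default_rng(0)
M = rng.normal(size=(num_vertices, dim))   # stands in for the learned matrix

def one_hot(i, n):
    x = np.zeros(n)
    x[i] = 1.0
    return x

path = [0, 2, 5]                           # vertex indices along a competitive path
features = np.stack([one_hot(v, num_vertices) @ M for v in path])

print(features.shape)                      # (3, 3): n vertices, d features each
```

In practice the matrix product is never materialized; frameworks implement it as a direct row lookup, which the assertion `features == M[path]` makes explicit.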
V-A2 RNN
The feature sequence represents the flow of travel on path Pi, and we would like to capture this flow. To this end, we feed the feature sequence into a recurrent neural network, which is known to be effective for modeling sequences. Specifically, we employ a bidirectional gated recurrent neural network (BDGRU) to capture the sequential dependencies in both the direction of the travel flow and the opposite direction.
We consider the direction of the travel flow first, i.e., from left to right. A GRU unit learns sequential correlations by maintaining a hidden state hi at position i, which can be regarded as accumulated information from the positions to the left of position i. Specifically, hi = GRU(ei, hi−1), where ei is the input feature vector at position i and hi−1 is the hidden state at position i−1, i.e., the hidden state of the left position. More specifically, the GRU unit is composed of the computations shown in Equations 2, 3, 4, and 5. First, the GRU unit computes an update gate zi and a reset gate ri, respectively.
Both gates control how much information from the left hidden states should be considered in order to make the final similarity score estimation accurate. This makes it possible to remember the left hidden states that are relevant for the final similarity score estimation and to forget those that are irrelevant.
zi = σ(Wz ei + Uz hi−1)  (2)
ri = σ(Wr ei + Ur hi−1)  (3)
h̃i = tanh(Wh ei + Uh (ri ⊙ hi−1))  (4)
hi = (1 − zi) ⊙ hi−1 + zi ⊙ h̃i  (5)
where σ is the logistic sigmoid function, ⊙ denotes the Hadamard product, and tanh is the hyperbolic tangent function. ei and hi−1 are the feature vector at position i and the hidden state at position i−1, respectively. Wz, Wr, Wh, Uz, Ur, and Uh are parameters to be learned.
For the opposite direction of the travel flow, i.e., from right to left, we apply another GRU to generate hidden state h′i. Here, the input consists of the feature vector at position i and the hidden state at position i+1, i.e., the right hidden state.
The final hidden state at position i is the concatenation of the hidden states from both GRUs, i.e., hiBD = hi ⊕ h′i, where ⊕ indicates the concatenation operation.
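One GRU step and the bidirectional concatenation can be sketched directly from Equations 2–5. The weight shapes and random initialization below are illustrative stand-ins for the trained parameters, and biases are omitted for brevity.

```python
import numpy as np

# A numpy sketch of a GRU step (Equations 2-5) and a two-step BDGRU pass.

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(e, h_prev, W_z, U_z, W_r, U_r, W_h, U_h):
    z = sigmoid(W_z @ e + U_z @ h_prev)              # update gate, Eq. 2
    r = sigmoid(W_r @ e + U_r @ h_prev)              # reset gate, Eq. 3
    h_tilde = np.tanh(W_h @ e + U_h @ (r * h_prev))  # candidate state, Eq. 4
    return (1 - z) * h_prev + z * h_tilde            # new hidden state, Eq. 5

d, hdim = 3, 4
rng = np.random.default_rng(1)
params = [rng.normal(scale=0.1, size=s)
          for s in [(hdim, d), (hdim, hdim)] * 3]    # Wz,Uz,Wr,Ur,Wh,Uh

# Run a 2-step feature sequence forward and backward, then concatenate
# the hidden states per position, mimicking the BDGRU.
es = [rng.normal(size=d) for _ in range(2)]
h, fwd = np.zeros(hdim), []
for e in es:
    h = gru_step(e, h, *params)
    fwd.append(h)
h, bwd = np.zeros(hdim), []
for e in reversed(es):
    h = gru_step(e, h, *params)
    bwd.append(h)
bwd.reverse()
bd_states = [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]
print(bd_states[0].shape)  # (8,): forward and backward states concatenated
```

Since each new state is a convex combination (gated by zi) of the previous state and a tanh output, every hidden coordinate stays within [−1, 1].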
V-A3 Fully Connected Layer
We stack all outputs from the BDGRU units into a long feature vector H = h1BD ⊕ h2BD ⊕ … ⊕ hnBD, where ⊕ indicates the concatenation operation. Then, we apply a fully connected layer with weight vector WFC to produce a single value ŷi = WFC · H as the estimated similarity for the competitive path Pi.
V-A4 Loss Function
The first term of the loss function measures the discrepancy between the estimated similarity ŷi and the ground truth similarity yi. We use the average squared error (1/N) Σi (ŷi − yi)² to measure the discrepancy, where N is the total number of competitive paths used for training.
The second term of the loss function is an L2 regularizer on all learnable parameters in the model, including the embedding matrix M, the matrices used in the BDGRU, and the weight vector WFC in the final fully connected layer. A hyperparameter λ controls the relative importance of the second term w.r.t. the first term. The basic training pipeline is outlined in Algorithm 2.
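The two-term loss can be sketched in a few lines; the similarity values, parameter arrays, and λ below are toy numbers for illustration, not values from the paper.

```python
import numpy as np

# Sketch of the basic loss: mean squared error between estimated and
# ground truth similarities plus an L2 penalty on the learnable
# parameters, weighted by the hyperparameter lam (the paper's lambda).

def loss(y_hat, y, params, lam):
    mse = np.mean((y_hat - y) ** 2)
    l2 = sum(np.sum(p ** 2) for p in params)
    return mse + lam * l2

y_hat = np.array([0.9, 0.4, 0.7])   # estimated similarities
y = np.array([1.0, 0.5, 0.5])       # ground truth similarities
params = [np.array([[0.1, -0.2]]), np.array([0.3])]

print(loss(y_hat, y, params, lam=0.01))  # 0.02 + 0.01 * 0.14 = 0.0214
```

Setting lam to 0 recovers plain similarity regression; larger values trade fit for smaller parameter norms.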
V-B Advanced Framework
To further improve the learning accuracy, we pay particular attention to the vertex embedding network, since so far it is "graph-blind": it only employs an embedding matrix M and does not take into account any information from the underlying road network graph. To improve this, we design an advanced framework that extends the basic framework with multi-task learning, such that the embedding network takes into account both the topology of the underlying road network graph and the associated spatial properties, such as distances, travel times, and fuel consumption. The advanced PathRank framework is shown in Figure 5.
V-B1 Capturing Graph Topology with Graph Embedding
Graph embedding, e.g., DeepWalk [17], node2vec [8], LINE [18], and GraphGAN [25], aims at learning low-dimensional, latent representations of vertices in a graph by taking into account the graph topology.
A typical way to enable graph embedding is to mimic the way words are embedded for natural languages [8, 17]. In particular, multiple vertex sequences can be generated using random walks, which may either consider or ignore edge weights. Next, vertices are treated as words and the generated vertex sequences as sentences, which enables the use of word embedding techniques to generate embeddings for vertices. Since the vertex sequences are generated by applying random walks on the graph, the obtained vertex embedding already takes into account the graph topology.
The learned vertex embeddings are used as feature vectors, which enables a wide variety of learning tasks on graphs such as classification [17, 18], link prediction [37], clustering [18, 38], recommendation [39], and visualization [40, 44].
We propose two different strategies to incorporate graph embedding. First, we can simply apply an existing graph embedding method, e.g., DeepWalk or node2vec, to embed the one-hot representation of a vertex into a low-dimensional feature vector. Then, we use the feature vector as the input to the BDGRU. In this case, PathRank only includes an RNN module, whose inputs are sequences of feature vectors, and the vertex embedding module is disabled.
Second, inspired by the well-known practice of unsupervised pre-training [45], we use the embedding matrix obtained from an existing graph embedding method to initialize the embedding matrix in the vertex embedding module of PathRank. This allows PathRank to update the embedding matrix during training such that it not only captures the graph topology but also better fits the similarity regression.
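A small sketch contrasting the two strategies, using NumPy; the shapes, the names, and the mock update step are illustrative assumptions, not the paper's implementation (in a framework such as PyTorch, the second strategy corresponds to initializing a trainable `nn.Embedding` from the pretrained matrix with `freeze=False`).

```python
import numpy as np

rng = np.random.default_rng(0)
num_vertices, dim = 1000, 128

# Stand-in for a pretrained node2vec embedding matrix (random for illustration).
pretrained = rng.standard_normal((num_vertices, dim))

# Strategy 1 (PRA1-style): the pretrained vectors are fixed input features;
# the vertex embedding module is effectively a constant lookup table.
fixed_features = pretrained.copy()

# Strategy 2 (PRA2-style): the pretrained matrix only initializes a trainable
# embedding; training updates then refine it (mocked here by a perturbation).
trainable = pretrained.copy()
trainable += 0.01 * rng.standard_normal((num_vertices, dim))

# Either matrix maps a path (a vertex-id sequence) to a feature-vector sequence.
path = [3, 17, 42]
features = trainable[path]
```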
V-B2 Capturing Spatial Properties with Multi-Task Learning
Although many vertex embedding algorithms exist, they are only able to capture graph topology, because they focus on graphs representing, e.g., social networks and citation networks. In other words, they do not consider graphs representing spatial networks such as road networks. However, in road network graphs, many spatial attributes, in addition to topology, are also very important. For example, distances between two vertices are crucial features of spatial networks. To let the graph embedding also maintain the spatial properties, we design a multi-task learning framework using pre-trained graph embedding.
We first employ an existing graph embedding algorithm to initialize the vertex embedding matrix in the vertex embedding module of PathRank. This pre-trained embedding matrix captures the graph topology. Next, we update it during training such that it also captures relevant spatial properties. To this end, we employ multi-task learning principles, where the main task is to estimate similarity and the auxiliary tasks are to reconstruct the travel costs of competitive paths, which helps learn an appropriate embedding matrix that also considers the spatial properties of the underlying road network.
To enable the multi-task learning framework, in the final fully connected layer, we let PathRank not only estimate a similarity score but also estimate, or reconstruct, the spatial properties of the corresponding competitive path, such as its distance, travel time, and fuel consumption. We also extend the loss function to include terms that consider the discrepancies between the actual and estimated distances, the actual and estimated travel times, and the actual and estimated fuel consumption. The loss function for the multi-task learning framework is defined in Equation 7.
L = (1 − λ) · (s − ŝ)² + λ · Σ_{j=1}^{N} (c_j − ĉ_j)²   (7)
where λ is a hyper-parameter that controls the trade-off between the main task and the auxiliary tasks; ĉ_j and c_j denote the estimated cost and the ground truth cost of the j-th auxiliary task, respectively. For example, when considering distance, travel time, and fuel consumption, we set N to 3; c_j and ĉ_j then represent the ground truth and estimated distance, travel time, or fuel consumption of the corresponding competitive path. The basic training pipeline is outlined in Algorithm 3.
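As a numeric illustration of the multi-task objective described above, the sketch below combines the main similarity-regression error with λ-weighted auxiliary reconstruction errors; the exact squared-error form and the (1 − λ) weighting are assumptions for illustration, not taken verbatim from the paper.

```python
def multitask_loss(s_true, s_pred, costs_true, costs_pred, lam):
    """Main task: similarity-score regression. Auxiliary tasks: reconstruct
    spatial costs (e.g., distance, travel time, fuel consumption).
    Squared error is assumed for every term; lam trades the tasks off."""
    main = (s_true - s_pred) ** 2
    aux = sum((c - c_hat) ** 2 for c, c_hat in zip(costs_true, costs_pred))
    return (1 - lam) * main + lam * aux

# lam = 0 ignores the auxiliary tasks, reducing the model to the main task only.
loss = multitask_loss(0.8, 0.7, [2.0, 5.0, 1.0], [2.1, 4.8, 1.1], lam=0.4)
```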
VI Experiments
We conduct a comprehensive empirical study to investigate the effectiveness of the proposed PathRank framework.
VI-A Experiments Setup
VI-A1 Road Network and Trajectories
We consider the road network in North Jutland, Denmark. We obtain the road network graph from OpenStreetMap, which consists of 8,893 vertices and 10,045 edges.
We use a substantial GPS data set collected on this road network, which consists of 180 million GPS records covering a two-year period from 183 vehicles. The sampling rate of the GPS data is 1 Hz (i.e., one GPS record per second). We split the GPS records into 22,612 trajectories representing different trips. A well-known map matching method [26] is used to map match the GPS trajectories such that, for each trajectory, we obtain its corresponding trajectory path.
VI-A2 Ground Truth Data
We split the trajectories into three sets—70% for training, 10% for validation, and 20% for testing. The distributions of the cardinalities of the trajectory paths in the training and testing sets are shown in Figure 6. The distribution on the validation set is similar and thus is omitted due to the space limitation.
For each trajectory, we obtain its source, its destination, and its trajectory path. Then, we employ seven different strategies to generate seven sets of competitive paths according to the source-destination pairs: top-k shortest paths (TkDI), top-k fastest paths (TkTT), top-k most fuel-efficient paths (TkFC), diversified top-k shortest paths (DTkDI), diversified top-k fastest paths (DTkTT), diversified top-k most fuel-efficient paths (DTkFC), and diversified, multi-cost top-k paths (DTkM). For each competitive path, we use its weighted Jaccard similarity with the trajectory path as its ground truth ranking score.
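A sketch of the weighted Jaccard ground truth score; treating a path as a set of edges and using edge lengths as the weights are assumptions made for illustration.

```python
def weighted_jaccard(path_a, path_b, weight):
    """Weighted Jaccard similarity between two paths, viewed as edge sets.

    weight maps an edge (u, v) to, e.g., its length; the score is the
    total weight of shared edges over the total weight of all edges.
    """
    edges_a, edges_b = set(path_a), set(path_b)
    inter = sum(weight[e] for e in edges_a & edges_b)
    union = sum(weight[e] for e in edges_a | edges_b)
    return inter / union if union else 0.0

# Toy example: trajectory path vs. a competitive path sharing one edge.
weight = {(1, 2): 2.0, (2, 3): 1.0, (2, 4): 1.0, (4, 3): 2.0}
trajectory = [(1, 2), (2, 3)]
candidate = [(1, 2), (2, 4), (4, 3)]
score = weighted_jaccard(trajectory, candidate, weight)
```

A candidate path identical to the trajectory path scores 1.0; a disjoint one scores 0.0.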
During training and validation, we use the competitive path set generated by a specific training data generation strategy to train a PathRank model. Thus, we are able to train seven different PathRank models using the same set of training and validation trajectories, but seven different sets of competitive paths.
When testing, to make the comparison among different PathRank models fair, for each testing trajectory we consider all competitive path sets generated by the seven different strategies. This ensures that (1) PathRank models that are trained on different training data sets are tested against the same set of competitive paths; and (2) a PathRank model that is trained using a specific strategy is tested against competitive path sets that are generated by all strategies.
VI-A3 PathRank Frameworks
We consider different variations of PathRank. First, we consider the basic framework PRB, where the vertex embedding just employs an embedding matrix, which ignores the graph topology.
Second, we consider the advanced framework where the vertex embedding employs graph embedding. Recall that we have two strategies for using the advanced framework—keeping the graph embedding static (PRA1) vs. updating the embedding together with the rest of PathRank (PRA2).
Finally, we consider the multi-task learning method which considers spatial properties, where we use PRA2M to indicate a PathRank model that uses an objective function considering spatial properties, i.e., auxiliary tasks.
For the advanced frameworks, i.e., PRA1, PRA2, and PRA2M, we choose node2vec [8] as the graph embedding method. Node2vec is a general random walk based graph embedding method, which outperforms alternative methods such as DeepWalk [17] and LINE [18]. When new, better unsupervised graph embedding methods become available, they can easily be integrated into PathRank to replace node2vec.
VI-A4 Parameters
When generating diversified top-k paths, we consider two different similarity thresholds δ—0.6 and 0.8. A smaller threshold enforces more diversified paths. However, it is then also more likely that we cannot identify paths that are sufficiently diversified, especially when k is large. Recall that the vertex embedding utilizes a |V| × d embedding matrix to embed each vertex into a d-dimensional feature vector, where |V| is the number of vertices. We consider two settings of d, namely 64 and 128.
We consider 250 GRU units in total in the bidirectional GRU module, based on the cardinalities of the trajectory paths shown in Figure 6, where the largest cardinality is 250. If a competitive path consists of fewer than 250 vertices, we use zero padding to fill in.
For the multi-task learning framework, we vary λ over 0, 0.2, 0.4, 0.6, and 0.8 to study the effect of learning additional spatial properties.
We summarize the different parameter settings in Table I, where the default values are shown in bold.
Parameters  Values
Similarity threshold δ  0.6, 0.8
Embedding feature size d  64, 128
Multi-task learning parameter λ  0, 0.2, 0.4, 0.6, 0.8
VI-A5 Evaluation Metrics
We evaluate the accuracy of the proposed PathRank framework based on two categories of metrics. The first category includes metrics that measure how accurate the estimated ranking scores are w.r.t. the ground truth ranking scores. This category includes Mean Absolute Error (MAE) and Mean Absolute Relative Error (MARE). Smaller MAE and MARE values indicate higher accuracy. Specifically, we have
MAE = (1/N) Σ_{i=1}^{N} |s_i − ŝ_i|,  MARE = (Σ_{i=1}^{N} |s_i − ŝ_i|) / (Σ_{i=1}^{N} |s_i|)   (8)
where s_i and ŝ_i represent the ground truth ranking score and the estimated ranking score of the i-th estimation, respectively; and N is the total number of estimations.
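The two error metrics can be computed directly from the score pairs; taking the MARE denominator as the sum of the absolute ground truth scores is an assumption consistent with the usual definition.

```python
def mae_mare(truth, est):
    """Mean absolute error and mean absolute relative error of estimated
    ranking scores against ground truth scores."""
    n = len(truth)
    abs_err = [abs(s - s_hat) for s, s_hat in zip(truth, est)]
    mae = sum(abs_err) / n
    mare = sum(abs_err) / sum(abs(s) for s in truth)
    return mae, mare

# Three estimations with ground truth scores 0.5, 1.0, 0.5.
mae, mare = mae_mare([0.5, 1.0, 0.5], [0.4, 0.9, 0.7])
```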
The second category includes the Kendall rank correlation coefficient (denoted by τ) and Spearman's rank correlation coefficient (denoted by ρ), which measure the similarity, or consistency, between a ranking based on the estimated ranking scores and a ranking based on the ground truth ranking scores. Sometimes, although the estimated ranking scores deviate from the ground truth ranking scores, the two rankings derived from the scores can still be consistent. In this case, we consider the estimated ranking scores accurate as well, since we eventually care about the final rankings of the candidate paths and not the specific ranking scores of individual candidate paths. Both τ and ρ measure how consistent the two rankings are. The higher the values, the more consistent the two rankings. If the two rankings are identical, both τ and ρ are 1. Specifically, we have
τ = (N_c − N_d) / (n(n − 1)/2),  ρ = 1 − (6 Σ_{i=1}^{n} d_i²) / (n(n² − 1))   (9)
Assume that we have a set of candidate paths {P_1, P_2, P_3}, the ground truth ranking is ⟨P_1, P_2, P_3⟩, and the estimated ranking is ⟨P_2, P_3, P_1⟩.
In τ, N_c and N_d represent the numbers of path pairs that are consistent and inconsistent in the two rankings, respectively, and n is the number of candidate paths. We have N_c = 1 since, in both rankings, P_2 appears before P_3. We have N_d = 2 since P_1 appears before P_2 in the ground truth ranking, while P_2 appears before P_1 in the estimated ranking; similarly, the orderings between P_1 and P_3 are also inconsistent in the two rankings.
In ρ, d_i represents the rank difference of the i-th competitive path between the two rankings. Following the running example, we have d_1 = 2 because path P_1 has rank 1 and rank 3 in the two rankings, respectively.
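The two coefficients can be checked with a from-scratch sketch (no SciPy); the three-path example below, with ground truth ranks ⟨1, 2, 3⟩ and estimated ranks ⟨3, 1, 2⟩, is illustrative.

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """rank_a[i] and rank_b[i] are the ranks of path i in the two rankings."""
    n = len(rank_a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if s > 0:
            concordant += 1   # pair ordered consistently in both rankings
        elif s < 0:
            discordant += 1   # pair ordered inconsistently
    return (concordant - discordant) / (n * (n - 1) / 2)

def spearman_rho(rank_a, rank_b):
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Three paths: ground truth ranks vs. estimated ranks.
gt = [1, 2, 3]   # ranks of P1, P2, P3 in the ground truth ranking
est = [3, 1, 2]  # P1 demoted from rank 1 to rank 3
tau = kendall_tau(gt, est)
rho = spearman_rho(gt, est)
```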
VI-B Experimental Results
VI-B1 Effects of Training Data Generation Strategies
We investigate how the different training data generation strategies affect the accuracy of PathRank. We first consider PRA1, where we only use the graph embedding method node2vec to initialize the vertex embedding matrix and do not update it during training.
Table II shows the results, where we categorize the training data generation strategies into three categories based on top-k paths, diversified top-k paths, and multi-cost, diversified top-k paths. For each category, the best results are underlined, and the best overall results are highlighted in bold. We also show results when the embedding feature size d is 64 and 128, respectively.
The results show that (1) when using the diversified top-k paths for training, we obtain higher accuracy (i.e., lower MAE and MARE and larger τ and ρ) than when using top-k paths; (2) using multi-cost, diversified top-k paths achieves better accuracy than single-cost, diversified top-k paths, thus achieving the best results; and (3) a larger embedding feature size achieves better results.
Strategies  d  MAE  MARE  τ  ρ
TkDI  64  0.1433  0.2300  0.6638  0.7044 
128  0.1168  0.1875  0.6913  0.7330  
TkTT  64  0.1302  0.2090  0.6642  0.7046 
128  0.1181  0.1896  0.6818  0.7208  
TkFC  64  0.1208  0.1940  0.6692  0.7131 
128  0.1257  0.2019  0.6699  0.7110  
DTkDI  64  0.1140  0.1830  0.6959  0.7346 
128  0.0955  0.1533  0.7077  0.7492  
DTkTT  64  0.1050  0.1686  0.7124  0.7554 
128  0.0974  0.1564  0.7271  0.7714  
DTkFC  64  0.1045  0.1678  0.7100  0.7544 
128  0.0900  0.1445  0.7238  0.7685  
DTkM  64  0.1077  0.1729  0.7261  0.7679 
128  0.0792  0.1271  0.7478  0.7876 
Next, we consider PRA2, where the graph embedding matrix is also updated during training to better fit the ranking score regression problem. Table III shows the results. The three observations from Table II also hold for Table III. In addition, PRA2 achieves better accuracy than PRA1, meaning that updating the embedding matrix is useful.
Strategies  d  MAE  MARE  τ  ρ
TkDI  64  0.1163  0.1868  0.6835  0.7256 
128  0.1130  0.1814  0.7082  0.7481  
TkTT  64  0.1218  0.1956  0.6858  0.7282 
128  0.1161  0.1864  0.7026  0.7446  
TkFC  64  0.1216  0.1952  0.6911  0.7321 
128  0.1082  0.1737  0.7070  0.7477  
DTkDI  64  0.0940  0.1509  0.7144  0.7532 
128  0.0855  0.1373  0.7339  0.7731  
DTkTT  64  0.1010  0.1622  0.7283  0.7693 
128  0.0997  0.1600  0.7169  0.7596  
DTkFC  64  0.0938  0.1506  0.7318  0.7743 
128  0.0809  0.1299  0.7386  0.7811  
DTkM  64  0.0966  0.1551  0.7393  0.7771 
128  0.0725  0.1164  0.7528  0.7905 
The above experiments indicate that the multi-cost, diversified top-k strategy DTkM is the most promising strategy. We therefore further investigate the effect of the similarity threshold δ used in the diversified top-k path finding. Specifically, we consider the two threshold values 0.6 and 0.8, and the results are shown in Table IV. When the smaller threshold is used, i.e., with higher diversity in the top-k paths, the accuracy is improved.
Model  δ  d  MAE  MARE  τ  ρ
PRA1  0.6  64  0.1006  0.1615  0.7321  0.7733 
128  0.0770  0.1237  0.7496  0.7874  
0.8  64  0.1077  0.1729  0.7261  0.7679  
128  0.0792  0.1271  0.7478  0.7876  
PRA2  0.6  64  0.0817  0.1311  0.7404  0.7792 
128  0.0710  0.1140  0.7751  0.8109  
0.8  64  0.0966  0.1551  0.7393  0.7771  
128  0.0725  0.1164  0.7528  0.7905 
Since we have identified that DTkM gives the best accuracy, we only consider DTkM with similarity threshold 0.8 in the diversified top-k paths as the training data generation strategy in the following experiments.
VI-B2 Effects of Vertex Embedding
We investigate the effects of different vertex embedding strategies. We consider PRB, where we just use a randomly initialized embedding matrix, which totally ignores the graph topology. For both PRA1 and PRA2, we use node2vec to embed vertices. Here, we use node2vec to embed the weighted and the unweighted graph, respectively. When embedding the weighted graph, we simply use distances as edge weights.
Based on the results in Table V, we observe the following. First, PRB gives the worst accuracy: the estimated ranking scores have the largest errors in terms of both MAE and MARE, and the ranking based on the estimated ranking scores deviates the most from the ground truth ranking in terms of both τ and ρ. This suggests that ignoring the graph topology when embedding vertices is not a good choice.
Second, when embedding vertices using node2vec, whether or not edge weights are considered does not significantly change the accuracy. Thus, it is not a significant design choice.
Third, PRA2 achieves the best accuracy in terms of both the errors of the estimated ranking scores and the consistency between the two rankings. This suggests that considering graph topology improves accuracy, and that updating the embedding matrix according to the loss function on ranking scores makes the embedding matrix better fit the ranking score regression problem. It also suggests that, by including spatial properties in the loss function, the embedding matrix could be tuned to capture spatial properties, which in turn should improve the ranking score regression. This is verified in the following experiments on the multi-task framework.
Embedding  Graph  MAE  MARE  τ  ρ
PRB  —  0.1159  0.1816  0.7233  0.7611 
PRA1  unweighted  0.0878  0.1410  0.7453  0.7852 
weighted  0.0792  0.1271  0.7478  0.7876  
PRA2  unweighted  0.0734  0.1178  0.7640  0.8012 
weighted  0.0725  0.1164  0.7528  0.7905 
VI-B3 Effects of Multi-Task Learning
In the following set of experiments, we study the effects of the proposed multi-task learning framework. In particular, we investigate how much we are able to improve the accuracy by incorporating different spatial properties in the loss function, such that the vertex embedding also considers spatial properties, which may contribute to better ranking score regression.
We start with PRA2M1, which considers only one auxiliary task, on reconstructing distances. This means that PathRank not only estimates the ranking score of a competitive path but also tries to reconstruct the distance of the competitive path. Table VI shows the results with varying λ values. When λ = 0, the auxiliary task is ignored, which turns PRA2M1 into PRA2, i.e., its corresponding model with only the main task of estimating ranking scores. When λ > 0, i.e., the auxiliary task on distances is considered while learning, we observe that the estimated ranking scores are improved. In particular, the setting λ = 0.6 gives the best results in terms of both τ and ρ, indicating that the ranking w.r.t. the estimated ranking scores is most consistent with the ground truth ranking, and the setting λ = 0.8 achieves the smallest MAE and MARE. Both settings suggest that considering the additional auxiliary task on reconstructing distances helps improve the final ranking.
PRA2M2 includes two auxiliary tasks, on reconstructing both distances and travel times, and PRA2M3 includes three auxiliary tasks, on reconstructing distances, travel times, and fuel consumption. All three multi-task models show that considering spatial properties improves the final ranking. In particular, considering all three spatial properties gives the best final ranking in terms of τ and ρ, i.e., it achieves the most consistent ranking w.r.t. the ground truth ranking.
Model  λ  MAE  MARE  τ  ρ
PRA2  0  0.0725  0.1164  0.7528  0.7905 
PRA2M1  0.2  0.0756  0.1214  0.7713  0.8057 
0.4  0.0704  0.1129  0.7765  0.8110  
0.6  0.0693  0.1113  0.7783  0.8141  
0.8  0.0680  0.1029  0.7712  0.8057  
PRA2M2  0.2  0.0653  0.1048  0.7727  0.8089 
0.4  0.0701  0.1125  0.7869  0.8235  
0.6  0.0777  0.1247  0.7752  0.8100  
0.8  0.0807  0.1296  0.7616  0.7973  
PRA2M3  0.2  0.0724  0.1162  0.7732  0.8092 
0.4  0.0740  0.1188  0.7711  0.8090  
0.6  0.0662  0.1063  0.7923  0.8261  
0.8  0.0695  0.1116  0.7842  0.8177 
VI-C Comparison with Baseline Ranking Heuristics
We consider three baseline ranking heuristics, i.e., ranking the candidate paths according to their distances, travel times, and fuel consumption, respectively. Using each heuristic, we obtain a ranking. Then, we compare the ranking with the ground truth ranking to compute the corresponding τ and ρ.
Table VII shows the comparison, where we categorize the testing cases based on the lengths of their corresponding trajectory paths into three categories: (0, 5], (5, 10], and (10, 15] km. The results show that the ranking obtained by PathRank, more specifically by PRA2M3, is clearly the best in all categories, suggesting that PathRank outperforms the baseline heuristics. In the longest distance category, PathRank becomes less accurate, since most of the training paths fall in the short distance categories, as shown in Figure 6.
Method  (0, 5] km: τ, ρ  (5, 10] km: τ, ρ  (10, 15] km: τ, ρ
Distance  0.7569  0.7886  0.6562  0.6912  0.4745  0.4361 
Travel Time  0.6760  0.7066  0.6406  0.6784  0.4714  0.5435 
Fuel  0.6890  0.7229  0.3919  0.4099  0.2591  0.2291 
PathRank  0.7985  0.8334  0.7649  0.8055  0.6097  0.6702 
VI-D Comparison with Driver-Specific PathRank
We investigate whether driver-specific PathRank models are able to provide more accurate, personalized rankings. We select the two drivers with the largest amounts of training trajectories: Driver 1 has 2,068 trajectories and Driver 2 has 1,457 trajectories. We train three PRA2M3 models, denoted as PRDr1, PRDr2, and PRAll, using the training trajectories from Driver 1, Driver 2, and all drivers, respectively.
We test the three models using the testing trajectories from Driver 1 and Driver 2, respectively. Table VIII shows that (1) for the testing trajectories from Driver 1, PRDr1 outperforms PRDr2; and for the testing trajectories from Driver 2, PRDr2 outperforms PRDr1; (2) for both testing cases, PRAll performs the best.

Testing  Model  MAE  MARE  τ  ρ
Driver1  PRDr1  0.1154  0.1878  0.7868  0.8162  
PRDr2  0.1464  0.2289  0.7753  0.7983  
PRAll  0.0614  0.1000  0.8269  0.8560  
Driver2  PRDr1  0.2431  0.3957  0.6628  0.6547  
PRDr2  0.1066  0.1666  0.7825  0.8037  
PRAll  0.0633  0.0989  0.8430  0.8610 
Next, we report statistics on a case-by-case comparison, where Table IX shows the percentages of the cases in which a driver-specific PathRank outperforms PRAll. Specifically, PRDr1 outperforms PRAll in ca. 21% of the testing cases from Driver 1, and PRDr2 outperforms PRAll in ca. 17% of the testing cases from Driver 2.
Model  τ  ρ
PRDr1  21.57%  20.10%
PRDr2  17.24%  17.24%
The results from the above two tables suggest that user-specific PathRank models have the potential to achieve personalized rankings, which may outperform the PathRank model trained on all trajectories, i.e., PRAll. However, the number of training trajectories of an individual driver is often very limited, making it difficult to cover a large feature space. Thus, it is often difficult to outperform PRAll on average.
VI-E Online Efficiency
Since ranking candidate paths is conducted online, we report the runtime. Table X reports the runtime for estimating the ranking score of a path when using different PathRank models. It shows that the non-multi-task learning models, i.e., PRB, PRA1, and PRA2, have similar runtimes. The multi-task learning models take longer, and the more auxiliary tasks a model includes, the longer it takes. PRA2M3 takes the longest, on average 45.1 ms per path. Suppose that an advanced routing algorithm or a commercial navigation system returns 10 candidate paths; PRA2M3 is then able to return a ranking in 451 ms, which is within a reasonable response time.
Model  PRB  PRA1  PRA2  PRA2M1  PRA2M2  PRA2M3
Runtime (ms)  11.4  11.3  11.5  22.8  34.4  45.1
VI-F Scalability
We conduct this experiment to investigate the performance when varying the size of the training data. Specifically, we use 25%, 50%, 75%, and 100% of the total training data to train PathRank, respectively. Based on the results shown in Table XI, more training data yields better accuracy.
Percentage  MAE  MARE  τ  ρ
25%  0.1260  0.2023  0.7100  0.7535 
50%  0.1001  0.1607  0.7286  0.7686 
75%  0.0830  0.1333  0.7395  0.7795 
100%  0.0725  0.1164  0.7528  0.7905 
VII Conclusion and Future Work
We propose PathRank, a learning-to-rank technique for ranking paths in spatial networks. We propose an effective way to generate a compact set of competitive paths to enable effective and efficient learning. Then, we propose a multi-task learning framework to enable graph embedding that takes into account spatial properties. A recurrent neural network, together with the learned graph embedding, is employed to estimate the ranking scores that eventually enable ranking the paths. Empirical studies conducted on a large real-world trajectory set demonstrate that PathRank is effective and efficient for practical use. As future work, it is of interest to exploit an attention mechanism on path lengths to further improve the ranking quality of PathRank.
References
 [1] C. Guo, B. Yang, J. Hu, and C. S. Jensen, "Learning to route with sparse trajectory sets," in ICDE, 2018, pp. 1073-1084.
 [2] Z. Ding, B. Yang, Y. Chi, and L. Guo, "Enabling smart transportation systems: A parallel spatio-temporal database approach," IEEE Trans. Computers, vol. 65, no. 5, pp. 1377-1391, 2016.
 [3] V. Ceikute and C. S. Jensen, "Routing service quality - local driver behavior versus routing services," in MDM, 2013, pp. 97-106.
 [4] B. Yang, C. Guo, C. S. Jensen, M. Kaul and S. Shang, "Stochastic skyline route planning under time-varying uncertainty," in ICDE, 2014, pp. 136-147.
 [5] J. Y. Yen, "Finding the k shortest loopless paths in a network," Management Science, vol. 17, no. 11, pp. 712-716, 1971.
 [6] H. Liu, C. Jin, B. Yang and A. Zhou, "Finding top-k shortest paths with diversity," IEEE Trans. Knowl. Data Eng., vol. 30, no. 3, pp. 488-502, 2018.
 [7] B. Yang, C. Guo, Y. Ma and C. S. Jensen, "Toward personalized, context-aware routing," The VLDB Journal, vol. 24, no. 2, pp. 297-318, 2015.
 [8] A. Grover and J. Leskovec, "node2vec: Scalable feature learning for networks," in SIGKDD, 2016, pp. 855-864.

 [9] F. C. Gey, "Inferring probability of relevance using the method of logistic regression," in SIGIR, 1994, pp. 222-231.
 [10] L. Rigutini, T. Papini, M. Maggini and F. Scarselli, "Learning to rank by a neural-based sorting algorithm," in SIGIR, 2008.
 [11] T. Joachims, "Optimizing search engines using clickthrough data," in SIGKDD, 2002, pp. 133-142.
 [12] Z. Cao, T. Qin, T. Liu, M. Tsai and H. Li, "Learning to rank: from pairwise approach to listwise approach," in ICML, 2007, pp. 129-136.
 [13] P. Huang, X. He, J. Gao, L. Deng, A. Acero and L. Heck, "Learning deep structured semantic models for web search using clickthrough data," in CIKM, 2013, pp. 2333-2338.
 [14] Y. Shen, X. He, J. Gao, L. Deng and G. Mesnil, "Learning semantic representations using convolutional neural networks for web search," in WWW, 2014, pp. 373-374.
 [15] L. Pang, Y. Lan, J. Guo, J. Xu, J. Xu and X. Cheng, "DeepRank: A new deep architecture for relevance ranking in information retrieval," in CIKM, 2017, pp. 257-266.
 [16] P. Cui, X. Wang, J. Pei, and W. Zhu, "A survey on network embedding," IEEE Trans. Knowl. Data Eng., 2018, DOI: 10.1109/TKDE.2018.2849727.
 [17] B. Perozzi, R. Al-Rfou and S. Skiena, "DeepWalk: Online learning of social representations," in SIGKDD, 2014, pp. 701-710.
 [18] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan and Q. Mei, "LINE: Large-scale information network embedding," in WWW, 2015, pp. 1067-1077.
 [19] T. Mikolov, K. Chen, G. Corrado and J. Dean, "Efficient estimation of word representations in vector space," arXiv preprint arXiv:1301.3781, 2013.
 [20] S. Cao, W. Lu, and Q. Xu, "Deep neural networks for learning graph representation," in AAAI, 2016.
 [21] J. Dai, B. Yang, C. Guo, and Z. Ding, "Personalized route recommendation using big trajectory data," in ICDE, 2015, pp. 543-554.
 [22] J. Dai, B. Yang, C. Guo, C. S. Jensen, and J. Hu, "Path cost distribution estimation using trajectory data," PVLDB, vol. 10, no. 3, pp. 85-96, 2016.
 [23] C. Guo, Y. Ma, B. Yang, C. S. Jensen, and M. Kaul, "EcoMark: evaluating models of vehicular environmental impact," in SIGSPATIAL, 2012, pp. 269-278.
 [24] C. Guo, B. Yang, O. Andersen, C. S. Jensen, and K. Torp, "EcoMark 2.0: empowering eco-routing with vehicular environmental models and actual vehicle fuel consumption data," GeoInformatica, vol. 19, no. 3, pp. 567-599, 2015.
 [25] H. Wang, J. Wang, J. Wang, M. Zhao, W. Zhang, F. Zhang, X. Xie, and M. Guo, "GraphGAN: Graph representation learning with generative adversarial nets," in AAAI, 2018, pp. 2508-2515.
 [26] P. Newson and J. Krumm, "Hidden Markov map matching through noise and sparseness," in SIGSPATIAL, 2009, pp. 336-343.
 [27] X. Li, K. Zhao, G. Cong, C. S. Jensen and W. Wei, "Deep representation learning for trajectory similarity computation," in ICDE, 2018, pp. 617-628.
 [28] J. Y. Yen, "Finding the k shortest loopless paths in a network," Management Science, vol. 17, no. 11, pp. 712-716, 1971.
 [29] J. Hershberger, M. Maxel and S. Suri, "Finding the k shortest simple paths: A new algorithm and its implementation," ACM Trans. Algorithms, vol. 3, no. 4, Art. 45, 2007.
 [30] N. Katoh, T. Ibaraki and H. Mine, "An efficient algorithm for k shortest simple paths," Networks, vol. 12, no. 4, pp. 411-427, 1982.
 [31] D. Eppstein, "Finding the k shortest paths," SIAM J. Comput., vol. 28, no. 2, pp. 652-673, 1998.
 [32] C. Guo, B. Yang, J. Hu, and C. S. Jensen, "Learning to route with sparse trajectory sets," in ICDE, 2018, pp. 1073-1084.
 [33] J. Hu, B. Yang, C. Guo, and C. S. Jensen, "Risk-aware path selection with time-varying, uncertain travel costs: a time series approach," VLDB J., vol. 27, no. 2, pp. 179-200, 2018.
 [34] J. Hu, B. Yang, C. S. Jensen, and Y. Ma, "Enabling time-dependent uncertain eco-weights for road networks," GeoInformatica, vol. 21, no. 1, pp. 57-88, 2017.
 [35] T. Kieu, B. Yang, C. Guo, and C. S. Jensen, "Distinguishing trajectories from different drivers using incompletely labeled trajectories," in CIKM, 2018, pp. 863-872.
 [36] T. Kieu, B. Yang, C. Guo, and C. S. Jensen, "Outlier detection for time series with recurrent autoencoder ensembles," in IJCAI, 2019.
 [37] D. Liben-Nowell and J. Kleinberg, "The link-prediction problem for social networks," Journal of the American Society for Information Science and Technology, vol. 58, no. 7, pp. 1019-1031, 2007.
 [38] X. Wang, P. Cui, J. Wang, J. Pei, W. Zhu and S. Yang, "Community preserving network embedding," in AAAI, 2017.
 [39] X. Yu, X. Ren, Y. Sun, Q. Gu, B. Sturt, U. Khandelwal, B. Norick and J. Han, "Personalized entity recommendation: A heterogeneous information network approach," in WSDM, 2014, pp. 283-292.
 [40] D. Wang, P. Cui and W. Zhu, "Structural deep network embedding," in SIGKDD, 2016, pp. 1225-1234.
 [41] T. Kieu, B. Yang, and C. S. Jensen, "Outlier detection for multidimensional time series using deep neural networks," in MDM, 2018, pp. 125-134.
 [42] B. Yang, J. Dai, C. Guo, C. S. Jensen, and J. Hu, "PACE: a path-centric paradigm for stochastic path finding," VLDB J., vol. 27, no. 2, pp. 153-178, 2018.

 [43] B. Yang, C. Guo, and C. S. Jensen, "Travel cost inference from sparse, spatio-temporally correlated time series using Markov models," PVLDB, vol. 6, no. 9, pp. 769-780, 2013.
 [44] L. van der Maaten and G. Hinton, "Visualizing data using t-SNE," Journal of Machine Learning Research, vol. 9, pp. 2579-2605, 2008.
 [45] D. Erhan, Y. Bengio, A. Courville, P. Manzagol, P. Vincent and S. Bengio, "Why does unsupervised pre-training help deep learning?" Journal of Machine Learning Research, vol. 11, pp. 625-660, 2010.
 [46] R. Cirstea, D. Micu, G. Muresan, C. Guo, and B. Yang, "Correlated time series forecasting using multi-task deep neural networks," in CIKM, 2018, pp. 1527-1530.
 [47] B. Yang, M. Kaul, and C. S. Jensen, "Using incomplete information for complete weight annotation of road networks," IEEE Trans. Knowl. Data Eng., vol. 26, no. 5, pp. 1267-1279, 2014.
 [48] J. Hu, C. Guo, B. Yang, and C. S. Jensen, "Stochastic weight completion for road networks using graph convolutional networks," in ICDE, 2019, pp. 1274-1285.