Location prediction is a central theme of mobile computing. It has found wide application in domains such as providing smooth handoffs between wireless communication cells, offering cognitive assistance, and supplying additional information to in-car navigation systems [9, 13, 18, 6, 2]. The problem can be defined as follows: given the starting location and the current location of a partial trip already traveled by a user, we seek to find the destination of the whole journey.
Our destination prediction scheme consists of the following components (Figure 1). First, two novel data constructs, Efficient Transition Probability (ETP) and Transition Probability with Detours (TPD), are employed to efficiently train the offline prediction model. ETP and TPD pinpoint the minimum indispensable computation. Then we locate the Obligatory Transit Point (OTP) and the Transition Affected Area (TAA) to efficiently update the preceding offline model (recomputing only the altered transition probabilities). OTP and TAA further impose constraints on the minimum region of interest (hotspot). Subsequently we use a semi-lazy method to identify the most probable future location with regard to the recent route choice of a user. This strategy is applied in conjunction with Bayesian-theory-based online prediction to improve the prediction results.
(1) They cannot deal with the data sparsity problem very well. Some locations are never covered in the historical data, and this lack of data reduces the prediction accuracy of these methods.
(2) They underutilize the available historical data. Most previous work builds a model from the data, and very often this model is not a lossless representation of the original data.
(3) A rather notable efficiency improvement can be gained through our optimization. We note that the matrix multiplication involved in our baseline can be simplified through our dynamic-programming-like approach. Moreover, frequent updates of the model become feasible once the computation is restrained to the minimum amount by this mechanism.
The main contributions of our work are summarized as follows.
(1) We incorporate the semi-lazy framework into our prediction model to adequately consider the route choice between the starting location and the current location, a problem overlooked by most earlier work [8, 9, 17, 19, 24].
(2) We propose an efficient dynamic-programming-like algorithm, built on two flexible data constructs (ETP and TPD), that vastly improves the training efficiency.
(3) We devise effective mechanisms, OTP and TAA, to handle alterations of transition probabilities that are confined to only a few regions (around 5% of the total). We then exploit the efficiency gain to update our model more frequently and thereby improve prediction accuracy.
The rest of the paper is organized as follows. We introduce the related work in this domain and give an overview of our approach in Section 2. After that, the adaptation of semi-lazy prediction is laid out in Section 3, the optimization of matrix multiplication and the harnessing of detour distances in Section 4, and the efficient frequent update of the prediction model in Section 5. In Section 6, we present the experiments. Finally, we conclude our work in Section 7.
2 Related Work
Previous work pertaining to this subject tends to discover patterns termed "popular" for subsequent decision-making that optimizes a certain goal, be it the best routes connecting two endpoints [5, 10, 15, 16, 21, 22], the next most probable stops, or the regions that appear to be of interest to drivers [1, 7, 11, 12, 20].
Krumm and Horvitz  incorporated multiple features into their approach, including driving efficiency, ground cover, and trip times, and employed an open-world model to capture the probabilities of users leaving for places that have never been visited in the past. Ziebart et al.  employed a sophisticated context-aware behavior model, PROCAB, to infer the intersection decisions, routes, and destinations of drivers. Gao et al.  demonstrated the breach of privacy induced by insurance schemes by predicting destinations from only the speed data of a vehicle. The aim of our work differs from all the preceding research in that we concentrate on destination prediction using only the historical trajectories.
Jeung et al.  proposed two query processing techniques that obtain future movement predictions through a novel access index. Knowledge of the possible end points of a journey also facilitates opportunistic routing, conceived by Horvitz et al. , which recommends sensible diversions along a route to a primary destination. Monreale et al.  designed a T-pattern Tree that is learnt from Trajectory Patterns. Future trajectory prediction  has even been applied in the Decision Support Systems (DSS) of Air Traffic Management (ATM). Do et al.  developed an ensemble method that builds a contextual variable into a probabilistic framework for human mobility prediction. These studies are directed at the prediction of subsequent movements, next places, or future trajectories, none of which addresses the problem of destination prediction as our work does.
A majority of the aforementioned research focuses on one or several geo-locations that matter most, either the current position or some statistically significant places, and then perceives them as discrete states of a Markov model or an HMM [1, 6, 9, 11, 17, 18].
Our approach prioritizes the most recent movements. This is an apparent advantage over our baselines , which essentially consider only the starting location and the current location. In fact, Xue et al.  exploit this trait for privacy protection against their SubSyn algorithm by removing the two endpoints (i.e., the origin and the current position).
Our work relies solely on the historical trajectory dataset to predict destinations, which is notably different from most previous work [1, 6, 9, 15, 24] (i.e. no other information such as time or user profiles is included). This general setting allows us to analyze user movement when such knowledge is not available, which is often the case.
Table 1: Notations.
|Confidence threshold that balances the length of the predicted path against prediction accuracy|
|Total trip distance|
|Distance traveled so far|
|Length of the predicted path|
|Subinterval boundary point (distance)|
|Trajectory traveled so far|
|Most probable future location|
|Length of the predicted path in the semi-lazy prediction framework|
|Total transition probability for a trip from one location to another|
|Transition probability for a trip starting at one location and ending at another|
|Multi-step Markov transition matrix|
|L1 distance between two locations|
|Trajectories starting at a given location|
|Trajectories starting at a given location and ending at another|
|Length of detour|
3 Adaptation of the Semi-Lazy Prediction Framework
In this section, we discuss the adaptation and incorporation of the semi-lazy framework into our prediction model during the online phase. We first determine the predicted length of the ongoing trajectory and then identify the location that is most likely to be traveled to in the future. After that, our model produces the prediction results given the knowledge of this most probable future location. The notations used in this paper are given in Table 1.
3.1 The Workflow
The basic workflow (Figure 2) of our adaptation is as follows. First we employ the semi-lazy path prediction framework  to generate a path that connects the current location to the most probable future location. We specify the desirable predicted length by applying a logarithmic decay to the estimated total trip distance at the current timestamp. Then the end point of this path (i.e., the most probable future location) replaces the current location, since it is more likely to be closer to the final destination and thus gives better prediction results.
3.2 The Predicted Path
Trip Estimation. To decide on a proper value for the length of a predicted path, we first create a frequency diagram depicting the distribution of total trip distances in our historical data. Specifically, we let each distance measurement fall into one of the subintervals separated by the boundary points. The expected value of the total travel distance is calculated by
Then we iteratively estimate the total trip distance at a particular instant of time as
where should satisfy
Equation 2 provides the expected value of the total trip distance given the portion of the trip that has been traveled so far. It offers a ballpark figure of the journey distance at a specific time, which can be used to determine the predicted length. The predicted length imposes an upper bound on the length of trajectories generated by the semi-lazy framework, which requires us to specify a proper confidence threshold.
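The estimator above can be sketched in a few lines. This is a hedged illustration rather than the paper's exact formulation: we assume the frequency diagram is summarized by subinterval midpoints with counts, and that the iterative estimate is the conditional expectation of the total distance given that at least the distance traveled so far has been covered. All names are illustrative.

```python
# Hypothetical sketch of the trip-distance estimator. `hist` summarizes the
# frequency diagram of historical total trip distances as (midpoint, count)
# pairs; the estimate conditions on trips at least as long as `traveled`.

def estimate_total_distance(hist, traveled):
    """Return E[total distance | total distance >= traveled]."""
    # Only trips at least as long as the distance already traveled are possible.
    feasible = [(d, c) for d, c in hist if d >= traveled]
    total = sum(c for _, c in feasible)
    if total == 0:          # no historical trip is long enough
        return traveled     # fall back to the distance traveled so far
    return sum(d * c for d, c in feasible) / total

hist = [(2.0, 10), (5.0, 30), (8.0, 40), (12.0, 20)]
print(estimate_total_distance(hist, 0.0))  # unconditional expectation
print(estimate_total_distance(hist, 6.0))  # conditioned on >= 6 units traveled
```

As the trip progresses, shorter historical trips drop out of the feasible set, so the estimate only grows over time, matching the intuition behind Equation 2.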
Logarithmic Decay. As mentioned in the overview of this section, we seek to identify the most probable future location, and thus only a certain proportion of the estimated total trip distance is taken to achieve this aim. Moreover, the rationale for the reduction of the predicted path is that it should approach the current location as the trip gradually comes to its end. Hence we employ a logarithmic function to perform this task, given as
where the argument of the logarithm quantifies the estimated trip completion percentage based on our preceding estimate. The base, the decay factor, indicates how fast the predicted percentage should decline. We repeatedly alter the decay factor in our experiments and find that setting it to 0.004 works best.
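Since the equation itself did not survive extraction, the following is only one plausible reading of the decay: we assume the predicted fraction is the logarithm of the completion percentage taken to the base of the decay factor, clipped to [0, 1]. The function name and the clipping are our assumptions; only the decay factor 0.004 comes from the text.

```python
import math

# Hedged sketch of the logarithmic decay. Assumption: the fraction of the
# estimated trip distance used as predicted length is log_base(completion),
# with the decay factor (0.004 in the paper) as the logarithm base.

def predicted_fraction(completion, decay=0.004):
    """completion: estimated trip completion percentage in (0, 1]."""
    frac = math.log(completion) / math.log(decay)  # log to base `decay`
    return max(0.0, min(1.0, frac))                # clip to a valid fraction

for c in (0.1, 0.5, 0.9):
    print(c, predicted_fraction(c))
```

With a base below 1, this fraction declines monotonically toward 0 as the completion percentage approaches 1, which matches the stated rationale that the predicted path should shrink toward the current location near the end of the trip.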
Translation Between Predicted Path and Confidence Threshold. Once we have selected a proper value for the predicted length, the corresponding confidence value can be determined; a longer path produces a lower confidence value.
The semi-lazy framework compares the confidence value against the confidence threshold to determine the length of the predicted path. We modify the semi-lazy path prediction algorithm to suit our needs by incrementally comparing the length of its inferred path against our predicted length. The pseudo code of our approach is presented in Algorithm 1.
denotes the total transition probability for a journey from location , the most probable future location, to a presumed destination . Likewise, represents a trip which starts at . Note that is actually an element of an -step Markov transition matrix which we obtain through multiplying the single-step matrix times. reflects the proportion of trajectories that begin at the same origin (denominator ) but end up at different locations (numerator ).
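The role these quantities play in the online step can be sketched as follows. This is a hypothetical illustration assuming a SubSyn-style Bayesian formulation: the posterior of a presumed destination is proportional to the probability of a trip from the origin to the most probable future location times the probability of continuing from there to the destination. The dictionary `p_trans` and the location labels are made up for the example.

```python
# Hedged sketch of the Bayes-based destination posterior. `p_trans[(a, b)]`
# stands for the total transition probability of a trip from a to b.

def destination_posterior(p_trans, origin, future_loc, candidates):
    """Posterior over candidate destinations, normalised over `candidates`."""
    scores = {d: p_trans[(origin, future_loc)] * p_trans[(future_loc, d)]
              for d in candidates}
    z = sum(scores.values())
    return {d: s / z for d, s in scores.items()} if z > 0 else scores

p = {("o", "l"): 0.5, ("l", "d1"): 0.3, ("l", "d2"): 0.1}
post = destination_posterior(p, "o", "l", ["d1", "d2"])
print(post)  # d1 should be three times as likely as d2
```

Note that replacing the current location by the most probable future location only changes the second factor; the training phase still supplies all the total transition probabilities.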
4 Optimizing the Markov Transition Matrix Multiplication
4.1 The Motivation for Optimization
Markov transition matrix multiplication remains the major hurdle to performance improvement in our offline training. In our case, matrix multiplication can be a computationally formidable task, particularly when the size of the Markov transition matrix and the number of transition steps required are large. According to Xue et al. , the offline training for SubSyn takes over an hour for a map with a medium grid-granularity setting on a commodity machine.
4.2 Efficient Transition Probability
First let us introduce some key concepts.
Definition 1 (ETP – Efficient Transition Probability) Given two locations, the ETP is the probability of the transition taking the most efficient route, whose length corresponds to the L1 distance between them.
Definition 2 (Relative Adjacent Pair - RAP) Given two locations, the relative adjacent pair comprises precisely the two cells that are immediately adjacent to the destination with regard to the origin in the L1-metric sense. These two adjoining cells lie on the efficient route that links the origin with the destination.
A special case arises when the two cells are in the same row/column. In this case the RAP comprises solely one element, namely the adjoining cell of the destination with regard to the origin.
Simply put, setting off from the origin, one must pass through either of the two cells of the RAP to reach the destination (see the example in Figure 3). This notion is essential to the order-of-magnitude cut in computational cost achieved by our solution, since it restricts attention to the efficient routes actually taken and circumvents the extra computation brought about by sparse matrix multiplication over cells that are impossible to reach.
The Computation of ETP. Next we show how to compute the ETP (Efficient Transition Probability) through a dynamic-programming-like recursion. The relationship between the n-step transition and the (n-1)-step transition can be found by
Here the Single Step Transition Probability measures the frequency of transition between adjacent cells. Equation 4 draws on exactly the two components of the RAP (Relative Adjacent Pair) to recursively obtain the efficient transition probabilities. The strength of this technique compared with sparse matrix multiplication is its ability to calculate each necessary transition probability only once and save it for later computation. It is akin to the divide-and-conquer tactic in that every problem (reaching a location) can be worked out by dealing with its subproblems (reaching the RAP of that location).
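The recursion can be sketched as follows. This is an illustrative stand-in, not the paper's implementation: `rap` derives the Relative Adjacent Pair geometrically, and the single-step transition probability `sstp` is a toy uniform function. We assume Equation 4 combines the ETPs of the two RAP cells with one further single step toward the destination.

```python
# Hedged sketch of the ETP recursion (Equation 4) on an unbounded grid.
# Cells are (x, y) tuples; sstp(a, b) is a single-step transition probability.

def rap(s, d):
    """Relative Adjacent Pair of d with regard to s (L1 sense)."""
    (sx, sy), (dx, dy) = s, d
    pair = []
    if dx != sx:  # step back toward s along the x axis
        pair.append((dx - (1 if dx > sx else -1), dy))
    if dy != sy:  # step back toward s along the y axis
        pair.append((dx, dy - (1 if dy > sy else -1)))
    return pair

def etp(s, d, sstp):
    """Efficient Transition Probability from s to d via the RAP recursion."""
    if s == d:
        return 1.0
    # reach a RAP cell efficiently, then take one single step into d
    return sum(etp(s, a, sstp) * sstp(a, d) for a in rap(s, d))

uniform = lambda a, b: 0.25  # toy SSTP: uniform over the 4 neighbours
print(etp((0, 0), (2, 1), uniform))  # -> 0.046875
```

For the uniform toy SSTP this reproduces the count of monotone lattice paths (here 3 paths of 3 steps, each with probability 0.25^3), and each intermediate ETP is computed from strictly closer subproblems, which is what makes the dynamic-programming formulation valid.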
4.3 Detour Distances
Although the L1 distance takes into consideration the case where the shortest Euclidean route is infeasible, scenarios may emerge where drivers intentionally opt for a slightly longer itinerary. Despite the fact that Xue et al.  excluded detour distances from their approach, claiming that the exclusion leads to greater simplicity and little degradation in prediction accuracy, we report an increase of around 7% in prediction accuracy when detour distances are considered.
Definition 4 (TPD - Transition Probabilities with Detours) Given two locations, the TPD gauges the probability of the transition taking a route with a detour of a given length.
The definition of TPD resembles that of ETP; when no detour is involved, the two are identical. TPD can be obtained recursively by
where the sum ranges over the 4 cells that are immediately adjacent to the destination. This data construct can accommodate detours of different lengths without suffering the performance degradation that matrix multiplication does. Shown in Figure 3 are the locations surrounding the starting point (cell 56), alternating between two states, either reachable or unreachable.
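A sketch of the TPD recursion follows. The accounting of detour budget is our assumption: stepping into the destination from a neighbour that is closer to the origin consumes no detour, while stepping in from a neighbour that is farther consumes 2 units (a detour must go out and come back, hence detours are always even). Names and the uniform SSTP are illustrative.

```python
# Hedged sketch of the TPD recursion: probability of a route of length
# L1(s, d) + k from s to d, where k is the (even) detour length.

def l1(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def tpd(s, d, k, sstp):
    """Transition Probability with a Detour of length k (k even, k >= 0)."""
    if k < 0:
        return 0.0
    if s == d and k == 0:
        return 1.0
    total = 0.0
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        a = (d[0] + dx, d[1] + dy)
        # stepping a -> d either approaches s (no detour used) or departs (2 used)
        used = 0 if l1(s, a) == l1(s, d) - 1 else 2
        total += tpd(s, a, k - used, sstp) * sstp(a, d)
    return total

uniform = lambda a, b: 0.25  # toy SSTP: uniform over the 4 neighbours
print(tpd((0, 0), (1, 0), 2, uniform))  # -> 0.140625
```

With k = 0 the recursion collapses to the ETP case (only the RAP neighbours contribute), which mirrors the statement that TPD and ETP coincide when no detour is involved.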
4.4 The Upper Bound
It is apparent that during each step of transition one can travel to only half of the locations in the neighboring region of the starting point. This implies that at least 50% of the transition probabilities calculated by matrix multiplication are destined to be zero. However, the techniques employed by [17, 18] require updating every element of the transition matrix during its multiplication, be it dense  or sparse , which essentially incurs more than double the necessary cost in both computation and storage. By claiming "more than double", we are referring to the fact that at least 50% of the entries of a transition matrix are zero, which can indeed be stated as the theorem below.
Theorem 1. Given a multi-step Markov transition matrix, the number of its non-zero entries is at most half the total number of entries.
Proof. From any location in a map one can travel to the 4 directly adjacent cells. The ensuing move must land on one of the 4 cells adjoining the previous position. To reach the subsequent destination, one must first leave the previous starting location, which rules out arriving at these starting points during this turn of transition (self-transitions excluded). Furthermore, since none of the 4 immediately adjacent cells is reachable after the previous step of transition, these starting cells cannot be reached from the other locations during this turn either (non-self-transitions now also excluded). This reasoning applies to every step of transition, rendering at least half of the total locations (the starting points) unreachable and at most the other half reachable (the destinations). Hence the non-zero elements constitute no more than 50% of a transition matrix. A more mathematically rigorous treatment of this issue is offered in the Appendix.
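The parity argument behind the proof can be checked empirically: after n steps on a 4-neighbour grid, only cells whose coordinate-sum parity equals the start's parity plus n (mod 2) are reachable, i.e. at most half of all cells, like the squares of one colour on a checkerboard.

```python
# Empirical check of Theorem 1's parity argument on a bounded grid.

def reachable(start, n, size):
    """Cells reachable in exactly n single steps from `start` on a size x size grid."""
    cells = {start}
    for _ in range(n):
        cells = {(x + dx, y + dy)
                 for x, y in cells
                 for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= x + dx < size and 0 <= y + dy < size}
    return cells

size, start = 9, (4, 4)
for n in (1, 2, 3):
    r = reachable(start, n, size)
    # every reachable cell has the checkerboard parity dictated by n
    assert all((x + y) % 2 == (sum(start) + n) % 2 for x, y in r)
    print(n, len(r), "of", size * size)
```

Since at most one of the two parity classes is ever populated after a given number of steps, at least half of any n-step transition matrix's entries must be zero, exactly the waste that ETP and TPD avoid computing.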
5 Frequent update of the model
5.1 The Justification for Frequent Update
Traffic conditions change at every moment, so it is rational to capture their latest trends by frequently updating our model. We notice that only a portion of all the transition probabilities varies during a short period of time, for instance at three-minute intervals. To factor in such changes of road traffic, previous approaches [17, 18] have to perform matrix multiplications for all of the cells residing in a map. However, according to our analysis, since changes occur in merely a part of the whole map, redundant computations are carried out by this solution. Our approach differs from its predecessors in that it breaks down the structure of the Markov transition matrix and is directed at the items that are integral to the transitions between cells in a map (ETP and TPD). Consequently, we can adjust the proportion of transition probabilities to update so as to meet the requirements of the constantly changing traffic conditions while keeping the incidental cost of these alterations as low as possible.
5.2 Transit Points and Affected Areas
First we present some definitions concerning the frequent update of the model.
Definition 5 (OTP - Obligatory Transit Point) One must travel past the obligatory transit point on the route from the origin to the destination.
Definition 6 (TAA – Transition Affected Area) Departing from a location and passing the intermediate transit point, one may reach any of the cells residing in the transition affected area through the ensuing moves, which may include a detour adding to the distance of the whole trip. Moreover, the intermediate transit point is exactly the OTP of every constituent of the TAA.
Consider the following example: one starts from location 56 and makes a stop at location 62, which we also perceive as the OTP of the trip. For some reason the changing traffic results in the alteration of the transition probabilities of location 62 with regard to its four neighboring locations. Then it is evident that the 0-detour TAA is composed of the cells forming the rectangle whose diagonal spans from location 62 to location 90 (Figure 4).
The TAA with a detour expands beyond the area of its 0-detour counterpart, further occupying the adjoining cells of its top and right borders (Figure 4). Such expansion occurs as the length of the detour increases, enabling us to identify TAAs in a recursive fashion (i.e., to compute the values of ETP in each TAA). Notice that the increment of the third item of the TAA tuple, the detour, should always be 2 according to Theorem 1.
5.3 The Training Phase Algorithms
The training algorithm for Efficient Destination Prediction can be broken down into two parts: the first phase initializes the total transition probabilities for all of the origin-destination pairs (Algorithm 2), and the second continuously enhances our prediction by frequently updating the model (Algorithm 3). After the initialization of the total transition probabilities, we can proceed to continuously improve our model in a timely manner.
The Initial EDP Training (Algorithm 2) first obtains the ETP regarding two locations, whose route length corresponds to their L1 distance. Then we can yield the TPD pertaining to a detour in addition to the L1 distance associated with the two locations. Note that when no detour is involved, the TPD and ETP with respect to the two locations are essentially identical. Hence the ETP is calculated first in order to find the TPD. We call this strategy 'Efficient First, Detours Later', which enables us to obtain TPD in an efficient, iterative, dynamic-programming-like manner.
When we compute the values of TPD, the increment of the detour distance is always two. This is because, of any two TPDs whose detour lengths differ by 1, one must be 0 according to Theorem 1. Subsequently, we store the sum of the TPDs in the corresponding total transition probability.
Similarly, when we update the EDP model (Algorithm 3), we first compute the values of the ETPs residing within a TAA. We then move on to find their corresponding TPDs and keep track of their sum in the total transition probability. The step size of this loop is still 2, in accordance with Theorem 1.
The TAA of interest can be found by first identifying the initial rectangular area with respect to a cell and its OTP (line 3, Algorithm 4). This rectangle essentially comprises all the cells whose detours are 0 and can be obtained as follows: first we draw a vertical line and then a horizontal one across the OTP; the whole map is now partitioned into 4 regions; the rectangle lying diagonally opposite the region containing the starting cell is the desired one. Once we have initialized this 0-detour TAA, we can then move on to find TAAs with longer detours by gradually extending their smaller counterparts through taking in their border neighbors, as shown in Figure 4 (line 6, Algorithm 4).
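The construction can be sketched roughly as follows. This is a heavily hedged illustration of the geometry only: we assume the 0-detour TAA is the axis-aligned rectangle spanned by the OTP toward the map side away from the start cell, and that each +2 of detour grows the area by one ring of border neighbours on the sides facing the start. Which borders grow in Figure 4 depends on the grid orientation, so all names and directions here are assumptions.

```python
# Hedged sketch of TAA identification (in the spirit of Algorithm 4).
# Cells are (x, y); s is the start cell, t its OTP, `size` the map width.

def taa(s, t, detour, size):
    """Cells whose routes from s through OTP t are affected (detour even)."""
    assert detour % 2 == 0
    sx, sy = s
    tx, ty = t
    grow = detour // 2  # each +2 of detour adds one ring of border neighbours
    # the rectangle extends from t away from s; `grow` widens it back toward s
    xs = range(max(0, tx - grow), size) if tx >= sx else range(0, min(size, tx + grow + 1))
    ys = range(max(0, ty - grow), size) if ty >= sy else range(0, min(size, ty + grow + 1))
    return {(x, y) for x in xs for y in ys}

base = taa((0, 0), (2, 2), 0, 6)  # 0-detour rectangle spanned by (2,2)..(5,5)
print(len(base))                  # -> 16
```

The nesting property (each 0-detour TAA is contained in its 2-detour counterpart, and so on) is what allows the update algorithm to grow TAAs incrementally instead of recomputing them from scratch.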
6 Experimental Evaluation
We assess our algorithm and its competitors in this section. The dataset and the evaluation criteria are first described. Then we present their running time and the effectiveness of responding to the queries. Specifically, the study on run-time efficiency concerns both the time for model training and query answering. All the experiments are conducted on a desktop computer with 4GB of memory and a quad core 2.7GHz CPU.
Shown in Figure 5 (on the left) is an example demonstrating the benefit of our approach which predicts an area that is closer to the final destination.
Starting from location 1, the trajectory ends at location 21. Assume that the driver is now at location 6. Our baseline predicts that the destination is somewhere around location 11 (cell 778). Differing from the SubSyn series of algorithms, which consider only two discrete locations (location 1 and location 6 in this case), our approach draws on the route information concerning the two places and discovers the most probable future one to be location 13. This inference leads to a prediction that is closer to the true destination. Also notice that the route (length = 24) taken by the driver is longer than the L1 distance (length = 22) between the origin-destination endpoints, which confirms the effectiveness of our incorporation of detour distances.
The distinction between two nearby cells becomes increasingly blurred as the granularity grows. In the example shown in Figure 5 (on the right), it is apparent that the cell (light purple) residing in the 40x40 map (top) is roughly identical to the combined region of the two cells (light green) in the 60x60 map (bottom). The two regions mutually cover a large area of one another. This substantial degree of overlap demonstrates why the 2nd-order Markov model (SubSynEA) tends to bring only limited improvement over the baselines SubSyn and SubSynE.
Baseline Algorithms. SubSynE and SubSynEA are used in our experiments to illustrate two facets of our algorithm – the capacity to efficiently train a model that considers detour distances (EDP vs. SubSynE) and the benefits accruing from both the constant updating of the model and the integration of detours (EDP vs. SubSynEA).
Datasets. We test all the algorithms on a real-world dataset that is openly available and on a synthetic one.
Real-World Data: This dataset  encompasses the GPS location records pertaining to the real-time whereabouts of 514 taxis running in the San Francisco Bay Area over a span of 30 days. 10,000 trajectories are randomly selected as the queries submitted by users, while the remaining portion serves as the training set. This dataset is used to evaluate both the prediction accuracy and the training efficiency of all the methods.
Synthetic Data: We generate one synthetic dataset in the form of a single-step Markov matrix filled with transition probabilities. This dataset is solely for the assessment of training efficiency. Its size corresponds to the respective granularity of the real-world dataset.
The Effect of Decay Factor. The decay factor determines the speed of decline of the predicted percentage. Figure 6 plots the deviation against the decay factor. The bars at the bottom of this chart, grouped in threes by their respective completion points, indicate the resulting difference between each decay factor and the one that yields the least deviation.
Our experiment shows that, at the earlier part of a trip, a larger decay factor is preferred. As the trip gradually draws to its end, a smaller makes a more favorable choice. We set this parameter to 0.004 to strike the right balance.
6.1 Efficiency of Training Algorithm
As shown in Figure 7, our approach is clearly superior to SubSynE and SubSynEA in terms of training time as the granularity rises. This is particularly true once the map becomes more fine-grained. Our approach is 16.1 times and 348.5 times faster than SubSynE and SubSynEA respectively when we set the granularity to 50. The training time of both SubSynE and SubSynEA soars even more dramatically after the granularity exceeds 50. The zero entries typically constitute far more than 50% of the elements of the transition matrix for SubSynE and SubSynEA. These methods repeatedly go through the process of retrieving data from main memory (the cache simply cannot hold it owing to the enormous amount), performing double-precision floating-point multiplications, and storing the results back. This process imposes an onerous yet unnecessary burden on the overall efficiency. After all that computation is done, it often just yields another zero that contributes nothing to the computation of non-zero transition probabilities yet actively participates in another vicious cycle of this sort. Furthermore, the inadvertent inclusion of detour distances dictated by matrix multiplication significantly degrades the performance of SubSynEA (Figure 7).
6.2 Evaluation of Prediction Algorithm
6.2.1 Efficiency Evaluation
In the phase of responding to user queries, our solution needs the incorporation of semi-lazy trajectory preprocessing to locate the most likely future position, so it takes extra time for our approach to factor in this determinant. This trade-off is favorable since it incurs merely a fraction of a second of additional latency (around an extra 65 ms in most cases) but vastly improves the accuracy. The succeeding step of destination prediction can be performed very fast (around 0.05 ms), as it simply retrieves the numeric values of the transition probabilities for further calculation and comparison. The predominant factor for these algorithms is the training time, which sets our algorithm apart from its competitors.
6.2.2 Accuracy Evaluation
First we discuss the measures we use to gauge performance. Two specific points in the course of a trip – the 30% and 70% completion points – particularly draw our attention, as they indicate how well an algorithm fares soon after a traveler begins a trip or shortly before arrival at the destination. Moreover, the impact of grid granularity is of concern since it correlates strongly with the effectiveness of our approach. Besides, we also alter the completion percentage of the trip and the ratio of identical trajectories (shown in Figure 9). Here identical trajectories refer to those in the testing dataset that are perceived as the same as their counterparts in the training dataset. Judged against the yardstick of this ratio, all the algorithms can be examined from a more practical perspective that reflects their capability of dealing with recurring historical data (i.e., exact matches) as well as generalizing to completely novel scenarios (i.e., new routes that emerge for the first time).
30% - 70% Completion Points: We compute the mean of the deviation distances of the top three destinations given by the algorithms. Our solution consistently outperforms SubSynE and SubSynEA in terms of prediction accuracy quantified by the average deviation from the ground truth (Figure 8). The granularity of a map plays an essential role in improving the accuracy, though this effect gradually dwindles as the map becomes more fine-grained. Moreover, SubSynEA (the second-order model) produces much better prediction results than SubSynE (the first-order model) in the more coarse-grained settings. This distinction slowly fades away as the granularity increases, which is in agreement with our earlier analysis that the second-order model is prone to degradation since the rising number of cells in a map obscures the distinction between two geographically close locations (Figure 5 on the right).
Different Completion Points: The ability to pinpoint the cause of transition variations (OTP and TAA) and to address this problem by re-computing only the affected transition probabilities makes our approach very efficient and more accurate. The synergy of these two mechanisms gives our approach a definite edge over its competitors in terms of prediction accuracy, which is particularly evident between the 25% and 85% completion points, the primary stage for location prediction (Figure 8). Potential opportunities for tasks such as POI (point of interest) recommendation and advertising abound especially in this stage.
6.3 Analysis of Accuracy Improvement
The second-order Markov model underlying SubSynEA enhances the prediction accuracy at the expense of a substantial increase in transition states, which is offset to some extent by sparse matrix multiplication. Moreover, SubSynEA excludes the consideration of detour distances, favoring the simplification of the model. Rather than employing a high-order Markov model, we stick with the first-order model and apply the semi-lazy path prediction algorithm first to discover the most probable future location. We find that our solution outperforms its competitors because the route picked by a user is best described by a model focusing on the trajectory itself as opposed to several discrete Markov states.
The lack of a proper way of handling the user-chosen route undermines the chances of correct prediction by SubSyn. SubSynEA attempts to remedy this problem by additionally considering only the nearest historical location. However, a similar issue arises with this technique: the itinerary traveled so far is still only partially represented, by merely three locations. The effectiveness of this strategy gradually diminishes as the grid map becomes less coarse. The distinction between the two states associated with the neighboring region of the current location gets increasingly blurred, implying that SubSynEA has an inherent propensity to fall back into SubSyn in fine-grained settings. This may well account for its mediocre performance under such conditions in our experiments.
In this paper we propose an efficient scheme for destination prediction that runs an order of magnitude faster and gains an increase of over 30% in accuracy, compared with the state-of-the-art approach. Our solution mainly involves the inclusion of semi-lazy prediction, the optimization of the Markov transition matrix multiplication and a feasible frequent update method for our model. Experimental results with respect to the preceding two dimensions of our work have demonstrated its efficiency and effectiveness.
Acknowledgments. The authors would like to thank Prof. Xifeng Yan for his valuable comments.
-  T. M. T. Do and D. Gatica-Perez. Contextual conditional models for smartphone-based human mobility prediction. In UbiComp, pages 163–172, 2012.
-  K. Evensen, A. Petlund, H. Riiser, P. Vigmostad, D. Kaspar, C. Griwodz, and P. Halvorsen. Mobile video streaming using location-based network prediction and transparent handover. In Proceedings of the 21st international workshop on Network and operating systems support for digital audio and video, pages 21–26, 2011.
-  X. Gao, B. Firner, S. Sugrim, V. Kaiser-Pendergrast, Y. Yang, and J. Lindqvist. Elastic pathing: Your speed is enough to track you. In UbiComp, pages 975–986, 2014.
-  Y. Ge, H. Xiong, A. Tuzhilin, K. Xiao, M. Gruteser, and M. Pazzani. An energy-efficient mobile recommender system. In KDD, pages 899–908, 2010.
-  H. Gonzalez, J. Han, X. Li, M. Myslinska, and J. P. Sondag. Adaptive fastest path computation on a road network: a traffic mining approach. In PVLDB, pages 794–805, 2007.
-  E. Horvitz and J. Krumm. Some help on the way: Opportunistic routing under uncertainty. In UbiComp, pages 371–380, 2012.
-  H. Jeung, Q. Liu, H. T. Shen, and X. Zhou. A hybrid prediction model for moving objects. In ICDE, pages 70–79, 2008.
-  J. Krumm. Real time destination prediction based on efficient routes. In Society of Automotive Engineers (SAE) 2006 World Congress, volume 7, 2006.
-  J. Krumm and E. Horvitz. Predestination: Inferring destinations from partial trajectories. In UbiComp, pages 243–260. 2006.
-  W. Luo, H. Tan, L. Chen, and L. M. Ni. Finding time period-based most frequent path in big trajectory data. In SIGMOD, pages 713–724, 2013.
-  W. Mathew, R. Raposo, and B. Martins. Predicting future locations with hidden markov models. In UbiComp, pages 911–918, 2012.
-  A. Monreale, F. Pinelli, R. Trasarti, and F. Giannotti. Wherenext: a location predictor on trajectory pattern mining. In SIGKDD, pages 637–646, 2009.
-  D. J. Patterson, L. Liao, K. Gajos, M. Collier, N. Livic, K. Olson, S. Wang, D. Fox, and H. Kautz. Opportunity knocks: A system to provide cognitive assistance with transportation services. In UbiComp, pages 433–450. 2004.
-  S. Ayhan and H. Samet. Aircraft trajectory prediction made easy with predictive analytics. In KDD, 2016.
-  H. Su, K. Zheng, J. Huang, H. Jeung, L. Chen, and X. Zhou. Crowdplanner: A crowd-based route recommendation system. In ICDE, pages 1144–1155, 2014.
-  L.-Y. Wei, Y. Zheng, and W.-C. Peng. Constructing popular routes from uncertain trajectories. In SIGKDD, pages 195–203, 2012.
-  A. Y. Xue, J. Qi, X. Xie, R. Zhang, J. Huang, and Y. Li. Solving the data sparsity problem in destination prediction. The VLDB Journal, 24(2):219–243, 2015.
-  A. Y. Xue, R. Zhang, Y. Zheng, X. Xie, J. Huang, and Z. Xu. Destination prediction by sub-trajectory synthesis and privacy protection against such prediction. In ICDE, pages 254–265, 2013.
-  A. Y. Xue, R. Zhang, Y. Zheng, X. Xie, J. Yu, and Y. Tang. Desteller: A system for destination prediction based on trajectories with privacy protection. PVLDB, 6(12):1198–1201, 2013.
-  J. Yuan, Y. Zheng, and X. Xie. Discovering regions of different functions in a city using human mobility and pois. In SIGKDD, pages 186–194, 2012.
-  Z. Chen, H. T. Shen, and X. Zhou. Discovering popular routes from trajectories. In ICDE, pages 900–911, 2011.
-  J. Zheng and L. M. Ni. Modeling heterogeneous routing decisions in trajectories for driving experience learning. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pages 951–961, 2014.
-  J. Zhou, A. K. Tung, W. Wu, and W. S. Ng. A “semi-lazy” approach to probabilistic path prediction in dynamic environments. In SIGKDD, pages 748–756, 2013.
-  B. D. Ziebart, A. L. Maas, A. K. Dey, and J. A. Bagnell. Navigate like a cabbie: Probabilistic reasoning from observed context-aware behavior. In UbiComp, pages 322–331, 2008.