1 Introduction
Deciding on suitable itineraries is often challenging for tourists in an unfamiliar city. Due to the popularity of location-based social networks, an unprecedented volume of historical trips and itineraries, each represented as an ordered sequence of Points-of-Interest (POIs) visited, has become available. This opens up a new avenue to learn popular and suitable itineraries from the historical trip data. (Trips, routes, and itineraries can be used interchangeably; however, hereafter we use trips/routes to denote historical paths of users and itineraries to refer to the recommended paths.) In the past few years, many techniques [23, 4, 11, 29] have been proposed that learn from historical trips in the city and recommend the most popular itinerary from a given source to a given destination. Intuitively, such a popular itinerary system returns a sequence of POIs which has been most frequently adopted by past users while traveling from the source to the destination.
However, recommending a single itinerary is often too restrictive and may not meet a user’s needs. Therefore, it is preferable to recommend multiple alternative itineraries. Quality alternative itineraries must not only be popular, but also dissimilar (or diverse) to each other. We use the terms diversity and dissimilarity interchangeably, as both refer to alternative itineraries with minimum overlap. In this paper, we propose learning-based techniques to report alternative itineraries that are popular and also dissimilar to each other. To the best of our knowledge, we are the first to learn multiple quality alternative itineraries from historical routes that are both popular and mutually diverse.
Motivating Example: Figure 1 shows an example where, for a source (Rome Central Station) and a destination (a hotel), five different historical routes ( to ) are shown in solid black colored lines that pass through nine POIs in total ( to ). Assume that the routes in descending order of popularity are given as , , , and . Existing systems that return the most popular itinerary would learn to return . If the user wants the top popular itineraries, and would be recommended which are both very similar to each other as and . This may not be desirable for a user looking for alternative itineraries to choose from. Also, if a system attempts to return diverse itineraries without considering their popularity, it may return and . This may also be undesirable, as itinerary is not well supported by the historical trips. In this case, a better solution is to return two popular yet diverse itineraries such as and . These two will be considered as quality alternatives by the querying user. Note that (shown in red dotted lines) does not exist in the historical routes. Our learning-based algorithms are able to discover popular and diverse itineraries that do not necessarily exist in the historical trips (e.g., ).
Limitations of Existing Works: All existing learning-based itinerary recommendation systems [23, 4, 11, 29] are designed to recommend the most popular itinerary and, to the best of our knowledge, there does not exist any learning-based technique to recommend multiple alternative itineraries.
There exist some search-based techniques [20, 32, 8, 25] that return a set of diverse itineraries based on predefined popularity and/or diversity objective functions. However, these techniques are not applicable to the problem studied in this paper because the notion of diversity used in these techniques is different from ours. For example, [20] aims to return routes such that the POIs within a route have diverse features. In contrast, we consider two itineraries to be more diverse if they have a smaller overlap (i.e., have fewer common POIs), so that they would be considered as alternatives with respect to each other. Also, [32] defines diversity to be the minimum Euclidean distance of any two POIs in two different itineraries. Thus, two itineraries that have even one common POI are considered to have zero diversity even if all other POIs are very different and far from each other.
Furthermore, these search-based techniques suffer from a number of limitations. First, most of these works only consider optimizing either the popularity or the diversity, and do not take both into account at the same time while recommending a set of itineraries. However, as we explain in our motivating example, attempting to optimize only diversity or only popularity without considering the other would not lead to quality alternative itineraries. Second, as noted in [19], it is not trivial to define quantitative measures to evaluate the quality of alternative itineraries, and there is no agreed definition of what constitutes a set of high-quality alternative itineraries. Yet these search-based techniques typically require explicit modeling of popularity and/or diversity. Users may not have any prior knowledge to define and tune such metrics and, more importantly, these techniques may not be able to recommend suitable itineraries if the user fails to do so. Third, since these systems do not learn from the historical trips, they cannot incorporate the semantics of the sequence of visits in their solution. Last but not least, these algorithms are unable to handle user-defined constraints, which limits their applicability.
Our Contributions: We address the above limitations and propose two novel deep-learning-based algorithms, called DeepAltTrip-LSTM and DeepAltTrip-Samp, that learn from the historical trips and recommend alternative itineraries that are both popular and diverse. A key component of these algorithms is the Itinerary Net (ITRNet), which uses two LSTMs to estimate the likelihood of different POIs being in an itinerary.
Also, both of our algorithms are metric-agnostic in the sense that they do not require or rely on any specific diversity or popularity metrics, i.e., the users do not need to worry about defining suitable popularity or diversity metrics. Nevertheless, we extensively evaluate the algorithms using some widely used popularity and diversity metrics on real-world datasets, and these experimental studies show that both algorithms recommend high-quality alternatives.
In many real-world applications, users may want to impose certain constraints on the alternative itineraries recommended by the system. For example, the total cost to visit the POIs in each recommended itinerary (including traveling cost, entry tickets, etc.) must not exceed a user-defined budget, or each recommended itinerary must pass through some specific “must-see” POIs chosen by the user. Existing learning-based algorithms cannot trivially handle such constraints. We propose a novel sampling algorithm, DeepAltTrip-Samp, that can seamlessly handle a wide variety of such user-defined constraints. It employs an enhanced Markov Chain Monte Carlo (MCMC) algorithm, a variation of Gibbs sampling [12], which facilitates pruning the candidate itineraries that do not meet the user-defined criteria.
Our contributions in this paper are summarized below.

We are the first to present learning-based algorithms, called DeepAltTrip-LSTM and DeepAltTrip-Samp, to recommend alternative itineraries using historical trips without requiring any explicit popularity or diversity modeling.

A unique advantage of our algorithm DeepAltTrip-Samp is that it can seamlessly support additional constraints on the generated itineraries.

We conduct an extensive experimental study using eight real-world datasets drawn from two domains and evaluate the algorithms on several widely used popularity and diversity metrics. The results demonstrate that our metric-agnostic algorithms produce high-quality results and significantly outperform the competitors.
2 Related Works
To our knowledge, there is no work that directly solves the problem of recommending multiple alternative itineraries through learning from historical trips. However, there are three realms of work relevant to our problem: traditional trip recommendation systems that recommend a single itinerary, either through heuristics or through learning; search-based techniques that attempt to return the top itineraries by solving an optimization problem that maximizes an explicitly defined objective function; and POI recommendation methods that are mostly focused on recommending individual POIs rather than itineraries.
2.1 Trip Recommendation Systems
Earlier works model tour recommendation as an orienteering problem, where the goal is to find an itinerary that maximizes a certain objective function (e.g., popularity) satisfying given constraints (e.g., budget). [9] constructs itineraries based on user visits by first constructing a POI graph and then generating an itinerary based on this graph that maximizes the total POI popularity within the user budget. [6] argues that popular routes cannot be properly inferred only through counting from historical routes, and proposes a heuristic solution by first obtaining a transfer network and then inferring itineraries based upon an absorbing Markov Chain model built on the network. [2] recommends itineraries based on user budget, time limitations and past historical data. [3] uses a two-phase approach where it first interacts with the user to learn her venue specifications and then uses crowdsourced data to generate personalized POIs for the user. [23, 24] provide personalized user recommendation by modeling the problem as an integer programming problem given the budget constraints. [4] is the first to learn POI preferences and optimize the itinerary based on historical trip data and various features such as POI category, distance, and visiting time. Several other variations of the trip recommendation problem have been studied [28, 21, 22]. A comprehensive survey on this group of works is presented in [22]. Since human mobility is correlated with the location and category of POIs, [27] and [11] proposed an adversarial model to generate an itinerary for a user query. [29] provides a personalized itinerary through a Neuralized A* search, using LSTM and self-attention to estimate an observable cost and an MLP leveraging graph attention models to estimate the heuristic cost. All of the above methods provide a single itinerary for a given source and destination, and cannot be trivially extended for providing diverse and popular itineraries.
2.2 Search Based Techniques for Itineraries
This realm of work adopts search-based techniques to provide itineraries. Liang et al. [20] provide top itineraries through searching, where they use a submodular function to specify the diversity requirement. Xu et al. [32] compute itineraries maintaining a minimum spatial distance covering a set of POI categories, where the objective is to maximize the popularity satisfying the diversity constraints. Wang et al. [31] leverage POI semantics information to develop an efficient algorithm for providing itineraries with the least cost. [30] provides top trajectories based on user-suggested location and category keyword preferences. The major focus in all these works is to reduce query processing time for search algorithms, where they gather statistical POI data and require explicit diversity constraints to guide the search process.
Another group of works attempts to determine alternative routes for shortest path queries through searching in a road network graph. [5, 7] leverage penalty-based techniques that increase the edge weights of previously used paths to obtain k shortest paths which maintain some diversity. [15] creates separate shortest-path trees from the source and destination nodes. The connecting branches, called plateaus, are considered for alternative routes, as longer plateaus tend to have higher dissimilarity. [8, 25] define a dissimilarity function and then attempt to find alternative paths that exceed a predefined user dissimilarity threshold. In contrast to these works, the focus of our work is to develop a learning-based approach that finds popular and diverse itineraries based on the historical trips.
2.3 POI Recommendations
3 Problem Formulation
Let be the list of POIs in a city. Each POI is represented as a pair, where is the location of represented by its latitude and longitude coordinates and is the category type of . Let be a multiset containing all historical routes of the past users visiting POIs in . Each route is an ordered sequence of POI visits, where is the POI at position in the route and is the total number of POIs in the route.
Given historical routes and a query with starting location , ending location and an integer , our aim is to recommend a set of alternative itineraries that are both diverse and popular, where each itinerary is an ordered sequence of POIs with and .
Intuitively, by popular itineraries we mean that the sequence of POIs visited has been frequently adopted by past users going from the source to the destination, and by diversity or dissimilarity we mean that the recommended itineraries have minimal overlap with one another. Specific metrics to evaluate popularity and diversity are given in Section 5.3.
Note that unlike in a search based procedure a recommended itinerary is not necessarily a historical route, i.e., it is possible that . Also note that we attempt to learn to recommend alternative itineraries rather than attempting to maximize any popularity or diversity metric.
4 Our Approach
DeepAltTrip consists of two main components: (i) the Itinerary Net (ITRNet), which estimates the probabilities of POIs at a particular position of a given itinerary by using two (forward and backward) LSTMs, and (ii) an itinerary generation algorithm, which generates alternative itineraries passing through prominent POIs obtained using the ITRNet. For itinerary generation we propose two variants of DeepAltTrip: the first is an LSTM-based itinerary generation technique, and the second is a sampling-based technique that provides flexibility to accommodate user constraints.
Notation  Meaning 

Dataset consisting of historical routes  
List of POIs in  
User query with source POI , destination POI and no. of itineraries  
Ground truth set of routes with unique pair  
Recommended set of itineraries for query  
No. of itineraries in  
length of recommended itineraries  
POI in POI list  
Input source and destination POI to the model  
Route with length  
Shorthand for route where  
Position of source and destination POI  
POI at position  
Prominent POI  
Itinerary generated at iteration  
POI at position on the itinerary  
route in or  
No. of occurrences of POI 
4.1 ITRNet Model
The ITRNet consists of two LSTMs, namely a forward and a backward LSTM. The forward LSTM takes a partial route sequence from the starting POI and estimates the probability of a POI being the next POI in the route sequence. The backward LSTM takes a route sequence in reverse order, starting from the destination POI, to estimate the next POI in the reverse route sequence. Both LSTMs also take into consideration the actual source and destination POIs of the route being generated.
Let be a route, and are the source and destination POIs respectively. Formally, given the forward partial sequence , the source and destination , the probability of POI replacing ,, can be computed as:
(1) 
Similarly, given the backward partial sequence , the source and destination , the probability of POI replacing can be computed as:
(2) 
We compute these probabilities for all POIs, which yields two probability vectors with one entry per POI, where each entry is the conditional probability defined in Equations 1 and 2, respectively. The forward LSTM computes the former vector, while the backward LSTM computes the latter.
To develop the ITRNet, we first compute POI embeddings, which are later used to train the ITRNet forward and backward LSTM models.
4.1.1 POI Embedding
To capture the spatial and semantic information of POIs in the ITRNet, we use two graph autoencoders [17] to get the embeddings of POIs. These embeddings enable the LSTM models to accurately compute the probability of POIs even if the historical visits to the POIs are sparse.
We first define two graphs and for POI categories and spatial distance, respectively and the nodes of both graphs are the POIs. The weight of an edge between two nodes in and is related to the categorical similarity and the distance of the two corresponding POIs, respectively.
The adjacency matrix for graph is defined as follows.
This adjacency matrix captures the categorical similarity between POIs, where POIs with the same category have a connection. On the other hand, the adjacency matrix of is defined as follows.
Here is the distance between POI and , and and are the maximum and minimum distances between any two POIs, respectively.
We use the Euclidean distance between two POIs calculated using their latitude and longitude, which is a reasonable approximation of the road network distance [14]. Without loss of generality, Euclidean distances can be replaced with road network distances if underlying road network is available. We use exponential terms to amplify the difference between the normalized weight values. The matrix gives a larger weight to edges between POIs that are nearer to each other. For the lowest distance between any two nodes and , the edge weight between them will be 1. On the other hand the two nodes with the maximum distance between them will have an edge weight of 0.
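As a rough illustration of the two graphs, the sketch below builds both adjacency matrices for a toy POI list. The category matrix follows the description above (an edge between POIs sharing a category; leaving the diagonal at zero is an assumption). The distance weighting is also only an assumption: `distance_weight` is a hypothetical exponential normalization chosen so that the closest pair gets weight 1 and the farthest pair gets weight 0, and it is not necessarily the exact formula used in this paper.

```python
import math

# Toy POIs: (latitude, longitude, category). Values are illustrative only.
pois = [
    (0.0, 0.0, "museum"),
    (0.0, 1.0, "park"),
    (3.0, 4.0, "museum"),
]

def category_adjacency(pois):
    """1 if two distinct POIs share a category, else 0 (diagonal left at 0)."""
    n = len(pois)
    return [[1 if i != j and pois[i][2] == pois[j][2] else 0
             for j in range(n)] for i in range(n)]

def euclidean(a, b):
    """Euclidean distance on the raw latitude/longitude coordinates."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def distance_weight(d, d_min, d_max):
    """Hypothetical weight: 1 at d_min, 0 at d_max, exponential in between."""
    if d_max == d_min:
        return 1.0
    closeness = (d_max - d) / (d_max - d_min)  # normalized to [0, 1]
    return (math.exp(closeness) - 1.0) / (math.e - 1.0)

def distance_adjacency(pois):
    """Weighted adjacency in which nearer POI pairs get larger weights."""
    n = len(pois)
    dists = {(i, j): euclidean(pois[i], pois[j])
             for i in range(n) for j in range(n) if i != j}
    d_min, d_max = min(dists.values()), max(dists.values())
    return [[distance_weight(dists[i, j], d_min, d_max) if i != j else 0.0
             for j in range(n)] for i in range(n)]

A_cat = category_adjacency(pois)
A_dist = distance_adjacency(pois)
```

Any monotone normalization with the same end points would serve the same illustrative purpose here.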
We obtain two embeddings and from the two graph autoencoders respectively, where and are the embedding dimensions. We concatenate these two embeddings to obtain the final POI embeddings, , where ,‘’ is the concatenation operation and row of corresponds to the embedding of POI .
Following the graph autoencoder formulation of [17], the reconstructed adjacency matrix is calculated as $\hat{A} = \sigma(Z Z^{\top})$ for both autoencoders, where $Z = \mathrm{GCN}(X, A)$. Here, $A$ is the POI ground truth adjacency matrix, $X$ is the featureless identity matrix input, and GCN is a graph convolution operation [16]. The learned node embeddings thus capture the categorical and distance information between POIs, which assists the downstream LSTM models in predicting the next POI in a route sequence more accurately. We use MSE and cross-entropy loss functions for the autoencoders corresponding to and , respectively.
4.1.2 Forward and Backward LSTM Models
This is the main component of ITRNet, which uses two LSTMs to generate two conditional probabilities based on a known partial route sequence, a given source POI and a given destination POI. It also uses the POI embeddings obtained from the previous step.
To estimate the forward conditional probability, we first obtain the encoding of the given partial route sequence. The encoding of a subroute up to any position is obtained using a forward LSTM model as follows.
where, , , is the embedding of POI at step , source POI and destination POI respectively, is the hidden state LSTM vector at , and is the hidden state vector at step .
After obtaining the encoding of the observed subroute, we calculate the probability of a POI for position as follows.
where
Here, the inputs are the embedding of the candidate POI and the encoding of the observed subroute, and the scoring function is a two-layer perceptron network that outputs a score for each candidate POI. Passing these scores through a softmax layer gives the final forward conditional probability estimation vector. The network thus computes the probability of a POI being the next POI in the route, given the encoding of the partial route up to the current step and the embedding of the candidate POI.
Similarly, we develop the backward LSTM model to estimate the backward conditional probability. It takes the backward subroute in reverse order, along with the source POI and destination POI. Essentially, we generate the encoding at each step as follows.
Using the above encoding and the earlier POI embeddings we estimate . The procedure is similar to the procedure to obtain , so we do not repeat it here. Thus the forward LSTM predicts the next POI given a forward partial route sequence, whereas the backward LSTM predicts the next POI in the reverse partial route sequence, i.e. the immediate previous POI given a sequence of POIs visited after the predicted POI.
During training, we adopt the binary cross-entropy loss for both LSTM models.
4.2 Generating Itineraries Using ITRNet
In this phase, we generate alternative itinerary recommendations using the ITRNet backward and forward LSTM models. Given a query , we first compute a relevancy score of all POIs for a given source POI and destination POI using the ITRNet. Then, at each iteration we extract a prominent POI based on the computed relevancy scores and generate alternative itineraries each going through a different prominent POI. We first describe how we compute the POI relevancy, after that we describe how an itinerary is generated. Finally we describe how alternative itineraries are obtained in an iterative manner.
4.2.1 Computing POI Relevancy
By using the ITRNet, we define a relevancy function that outputs a relevancy score for every POI for given query source POI and destination POI .
Consider the route , where is the variable POI. We define the function as follows:
This function provides higher scores to POIs which are more relevant in the context of the query source and destination POIs.
4.2.2 Generating an Itinerary
Based on the relevancy function, we obtain a prominent POI, , the POI with the maximum relevancy score. After obtaining a prominent POI, we generate an itinerary containing that prominent POI. DeepAltTrip uses the forward and backward LSTM models of ITRNet to generate the partial itinerary from the prominent POI to the source POI and the partial itinerary from the prominent POI to the destination POI. We call the partial itinerary from the source POI to the prominent POI the first half itinerary and the partial itinerary from the prominent POI to the destination POI the second half itinerary.
We generate the half itineraries starting from the prominent POI , as then the corresponding LSTM that would be used will output probabilities with the knowledge that the prominent POI is present in the itinerary being generated.
There are two ways to develop the full itinerary through generating the first and second half itineraries starting from the prominent POI:

Generate the first half itinerary in reverse order using the backward LSTM model of ITRNet. Then given the first half itinerary as partial sequence, generate the second half itinerary using the forward LSTM model.

Generate the second half itinerary using the forward LSTM model. Then given the second half itinerary as a partial sequence, generate the first half itinerary in reverse order using the backward LSTM model.
Note that in all cases the source and destination POI input to the LSTM models are the query source and destination POIs and , respectively. We assume a maximum allowable length of a half itinerary . To obtain the first half itinerary (following the first way), we use the backward LSTM of ITRNet. We place the prominent POI at position and generate POI probabilities for positions to . We determine the position of the source POI in the reverse first half itinerary as:
We place the source POI in position . For all other positions from down to we choose the POI in the sequence through:
Note that during this choice we do not select a POI that is already in the generated partial sequence, which avoids loops in the recommended itinerary. We also do not select the given source and destination POIs. Finally, we adopt the partial itinerary sequence from position to as our first half itinerary.
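The loop-avoiding choice described above can be sketched as a masked arg-max. Here `probs` is a hypothetical stand-in for the probability vector returned by the ITRNet LSTM at one position, and POIs are represented by integer indices; both are assumptions made only for illustration.

```python
def pick_next_poi(probs, used, source, destination):
    """Return the most probable POI index that is not already in the partial
    itinerary and is neither the query source nor the query destination."""
    blocked = set(used) | {source, destination}
    candidates = [p for p in range(len(probs)) if p not in blocked]
    if not candidates:
        return None  # every POI is blocked; the caller must stop extending
    return max(candidates, key=lambda p: probs[p])

# Toy probability vector over five POIs (index 0 = source, 4 = destination).
probs = [0.05, 0.40, 0.30, 0.20, 0.05]
choice = pick_next_poi(probs, used=[1], source=0, destination=4)
```

In this toy call POI 1 has the highest probability but is already in the partial sequence, so the next best unblocked POI is returned.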
After generating the first half itinerary, we generate the second half itinerary from position to given the first half itinerary, source POI and destination POI . We determine the position of the destination POI in the second half itinerary as:
We place the destination POI at . For all other positions from to we choose the POI in the sequence using the forward LSTM model through:
Finally we put at position . The partial itinerary sequence from to make up our second half itinerary. Thus the first and second half itineraries make up our desired itinerary from POI to through the given prominent POI .
Similarly we can generate the itinerary using the second way as mentioned above. We place the prominent POI at position . Among the two generated itineraries, we choose the one with the lowest perplexity or negative log likelihood, calculated using the forward LSTM model:
(3) 
Where is the itinerary of length with and .
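To make the selection between the two candidate itineraries concrete, the following sketch scores each candidate by its negative log-likelihood. It assumes that the forward model has already produced one probability per generated position (`step_probs` is a hypothetical list of those values); whether Equation 3 additionally normalizes by the itinerary length is not assumed here, so both the raw sum and a length-normalized perplexity are shown.

```python
import math

def negative_log_likelihood(step_probs):
    """Sum of -log p over the per-position probabilities of an itinerary."""
    return -sum(math.log(p) for p in step_probs)

def perplexity(step_probs):
    """Length-normalized variant: exp of the mean negative log-likelihood."""
    return math.exp(negative_log_likelihood(step_probs) / len(step_probs))

# Per-position probabilities for two hypothetical candidate itineraries.
cand_a = [0.5, 0.5, 0.5]
cand_b = [0.9, 0.2, 0.5]

# Keep the candidate that the model finds more likely (lower score).
best = min([cand_a, cand_b], key=negative_log_likelihood)
```

Because both candidates here have the same length, the raw sum and the normalized perplexity rank them identically.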
4.2.3 Generating Alternative Itineraries
To obtain alternative itineraries as specified in a given query , we generate itineraries iteratively through determining a prominent POI at each iteration. We keep track of the total number of occurrences for all POIs in the itineraries generated until the current iteration. Let be the number of occurrences of POI in all the itineraries generated up to the current iteration. At each iteration , we obtain the set of POIs with minimum occurrence . Then we obtain the prominent POI at any iteration as:
(4) 
We then take and generate an itinerary as described in Section 4.2.2 using as the prominent POI. After obtaining an itinerary we update the values of for each in the itinerary obtained in this iteration.
We repeat this process once for each requested itinerary and obtain our desired set of alternative itineraries. An overview of the whole DeepAltTrip system is given in Algorithm 1.
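The iterative procedure of this subsection can be sketched as follows. The sketch assumes that the prominent POI of an iteration is the highest-relevancy POI among those with the minimum occurrence count, which is one reading of the description above. `relevancy` (a dictionary of scores) and `generate_itinerary` (a callable standing in for Section 4.2.2) are hypothetical placeholders.

```python
from collections import Counter

def recommend_alternatives(relevancy, k, generate_itinerary):
    """Generate k itineraries, each through a different prominent POI."""
    counts = Counter({poi: 0 for poi in relevancy})
    itineraries = []
    for _ in range(k):
        # POIs that appear least often in the itineraries generated so far.
        least = min(counts.values())
        pool = [p for p in relevancy if counts[p] == least]
        # Assumed selection rule: most relevant POI within that pool.
        prominent = max(pool, key=lambda p: relevancy[p])
        itinerary = generate_itinerary(prominent)
        itineraries.append(itinerary)
        # Update occurrence counts with the POIs of the new itinerary.
        for poi in itinerary:
            if poi in counts:
                counts[poi] += 1
    return itineraries

# Toy stand-ins: "s" and "d" play the role of the query source/destination.
relevancy = {"a": 0.9, "b": 0.8, "c": 0.1}
toy_routes = {"a": ["s", "a", "b", "d"], "b": ["s", "b", "d"], "c": ["s", "c", "d"]}
result = recommend_alternatives(relevancy, 2, lambda p: toy_routes[p])
```

In the toy run the first itinerary already covers POIs "a" and "b", so the second iteration is pushed toward the unused POI "c" even though its relevancy is lower.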
4.3 Generating Itineraries Through Sampling
In a real-world scenario, a user may want to impose some constraints on the generated alternative itineraries, such as setting a fixed budget or time limit, or specifying must-see POIs. It is not possible to support such constraints in our proposed LSTM-based trip generation technique described in Section 4.2.2. To overcome this limitation, in this section we propose an alternative sampling algorithm to generate an itinerary starting from the source POI, passing through the prominent POI, and ending at the destination POI. Our sampling-based approach is as follows.
We iteratively generate candidate itineraries. At iteration of the sampling method, we start with an initial itinerary . In the iteration process, at iteration , suppose we have itinerary of length . We generate a sample at iteration , i.e., of length by modifying sample .
For modification, we randomly select a POI at position of the itinerary , where . Then we perform one of the following four operations at , namely at POI of itinerary :
Insertion: We first assign . We then insert a POI between and . We now define the following conditional probability distribution using the ITRNet:
The above equation gives us the probability of a POI at a given position given both the partial sequence from the source POI to the POI before at position and the partial sequence starting after from position to the destination POI at position . We can compute this for all POIs in , which can be denoted as an dimensional vector . We compute as follows:
(5) 
where, . Intuitively, we give more weight to the model that has seen a longer subroute and thus has greater contextual information.
We obtain at position , where the newly inserted POI is located in the itinerary. We sample a POI from and assign the obtained sample as . The rest of the itinerary remains unchanged.
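A minimal sketch of this step is given below. It assumes, purely for illustration, that the forward and backward distributions are mixed with weights proportional to the lengths of the subroutes each model has seen; the exact weights of Equation 5 are not reproduced here. The names `combine` and `sample_poi` are hypothetical.

```python
import random

def combine(forward_probs, backward_probs, n_before, n_after):
    """Mix two distributions, favoring the model with the longer context.
    Assumes n_before + n_after > 0."""
    w_f = n_before / (n_before + n_after)
    w_b = 1.0 - w_f
    mixed = [w_f * f + w_b * b for f, b in zip(forward_probs, backward_probs)]
    total = sum(mixed)
    return [m / total for m in mixed]

def sample_poi(probs, excluded, rng):
    """Sample a POI index from probs, skipping POIs already in the itinerary."""
    candidates = [i for i in range(len(probs)) if i not in excluded]
    weights = [probs[i] for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

fwd = [0.1, 0.6, 0.3]   # hypothetical forward-LSTM output
bwd = [0.2, 0.2, 0.6]   # hypothetical backward-LSTM output
mixed = combine(fwd, bwd, n_before=3, n_after=1)
poi = sample_poi(mixed, excluded={0}, rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, which is convenient when comparing runs.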
Deletion: We delete at position of the itinerary and keep the rest of the itinerary unchanged.
Replacement: We obtain the conditional probability distribution as in Equation 5 using the ITRNet. We then take a sample from this distribution to obtain a POI and get replacing POI .
Swap and Replace: We randomly select a position , between to , and swap the position of POI at position and POI at position . If is not a prominent POI, we also perform the Replacement operation (as described in the previous paragraph) at position after the swap.
To perform an operation at any iteration , we choose any one of the operations with equal probability assigned to all allowed operations. If the selected POI is the prominent POI, we do not perform the deletion or replacement operation on that POI. Also, to avoid loops, we omit inserting or putting through replacement a POI that is already present in except for POI .
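The rule for choosing an operation can be sketched as follows. The operation names and the `protected` set (the prominent POI, plus any other POIs that must not be removed) are hypothetical labels introduced only for this illustration.

```python
import random

OPERATIONS = ("insertion", "deletion", "replacement", "swap_and_replace")

def allowed_operations(poi, protected):
    """Deletion and replacement are not allowed on a protected POI."""
    if poi in protected:
        return [op for op in OPERATIONS if op not in ("deletion", "replacement")]
    return list(OPERATIONS)

def choose_operation(poi, protected, rng):
    """Pick one allowed operation uniformly at random."""
    return rng.choice(allowed_operations(poi, protected))

op = choose_operation("p", protected={"p"}, rng=random.Random(1))
```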
At each iteration we check the following two conditions:

All the predefined user constraints are satisfied (if any)

The perplexity of as defined in Equation 3 is lower than itinerary at iteration , or no new itinerary has been accepted for previous two iterations.
If both conditions are satisfied, we adopt the modified itinerary as the starting point of the next iteration. Otherwise, we retain the current itinerary and use it in the next iteration, meaning that we reject the modification operation performed at the current iteration. The sampling runs for a fixed number of iterations. The accepted itinerary with the minimum perplexity is returned as the desired itinerary. Note that if any user-defined constraints are given, we first have to build an initial itinerary satisfying all the given constraints. Any itinerary that satisfies all the given conditions will suffice as the initial itinerary.
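The acceptance logic can be sketched as a short loop. The callables `propose` (applies one random modification), `score` (the perplexity of Equation 3) and `satisfies` (the user-constraint check) are hypothetical placeholders, and the toy values below exist only to exercise the control flow.

```python
def run_sampler(initial, propose, score, satisfies, n_iters):
    """Accept a candidate if it meets the constraints and either lowers the
    score or no candidate was accepted in the previous two iterations."""
    current = initial
    best, best_score = initial, score(initial)
    rejected_in_a_row = 0
    for _ in range(n_iters):
        candidate = propose(current)
        improves = score(candidate) < score(current)
        stalled = rejected_in_a_row >= 2
        if satisfies(candidate) and (improves or stalled):
            current = candidate
            rejected_in_a_row = 0
            if score(current) < best_score:
                best, best_score = current, score(current)
        else:
            rejected_in_a_row += 1
    return best  # accepted itinerary with the lowest score

# Toy setup: itineraries are tuples, the "perplexity" is simply their sum,
# and the only constraint is a maximum length of four POIs.
proposals = iter([(0, 5, 9), (0, 1, 9), (0, 1, 2, 3, 4, 9), (0, 2, 9)])
best = run_sampler(
    initial=(0, 3, 9),
    propose=lambda current: next(proposals),
    score=sum,
    satisfies=lambda it: len(it) <= 4,
    n_iters=4,
)
```

Tracking the best accepted itinerary separately matters because the stall rule may accept a worse candidate later in the run.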
4.3.1 Satisfying User Constraints
Our sampling algorithm makes it possible to generate alternative itineraries that can satisfy a variety of user constraints. Examples include:

A given fixed budget: If the cost of traveling from one POI to another is given, users may want itineraries that they can complete within a fixed budget. We may omit a candidate itinerary generated at an iteration if the itinerary exceeds the budget.

Must-see POIs: Users may want itineraries that include one or more specific POIs. In such cases, we keep those POIs in the initial sequence and treat them like the prominent POI, i.e., we do not delete or replace those POIs.

Time constraints for POIs and itineraries: POIs often have opening and closing hours. Given a start time along with the source and destination POIs, the average staying time at a POI and the average travel times between POIs, we can check whether all the constraints are met while generating itineraries in different iterations. During insertion or replacement, we can also restrict sampling to those POIs that would satisfy the time constraints. Users may also want itineraries that they can complete within a fixed time limit. This can be satisfied as well, by omitting itineraries generated in an iteration when the time budget is not satisfied.
Note that the aforementioned constraints cannot be trivially satisfied in a traditional deep learning algorithm. Thus the itinerary generation technique of DeepAltTrip-Samp is effective in many practical scenarios for generating itineraries in a constrained setting.
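As an illustration of how such constraints could be plugged into the sampler, the sketch below expresses the budget and must-see constraints as simple predicates over a candidate itinerary. The cost tables (`travel_cost`, `entry_fee`) and all function names are hypothetical; POIs are represented by integer indices.

```python
def within_budget(itinerary, travel_cost, entry_fee, budget):
    """Total cost = travel between consecutive POIs + entry fee of each POI."""
    legs = sum(travel_cost[a][b] for a, b in zip(itinerary, itinerary[1:]))
    fees = sum(entry_fee[p] for p in itinerary)
    return legs + fees <= budget

def includes_must_see(itinerary, must_see):
    """True if every must-see POI appears in the itinerary."""
    return set(must_see) <= set(itinerary)

def satisfies_all(itinerary, checks):
    """A candidate is kept only if every constraint predicate holds."""
    return all(check(itinerary) for check in checks)

# Toy cost tables for three POIs (illustrative values only).
travel_cost = [[0, 2, 4], [2, 0, 1], [4, 1, 0]]
entry_fee = [0, 5, 3]
checks = [
    lambda it: within_budget(it, travel_cost, entry_fee, budget=12),
    lambda it: includes_must_see(it, must_see={1}),
]
ok = satisfies_all([0, 1, 2], checks)
```

Keeping each constraint as an independent predicate means that new constraint types, such as opening hours, can be added without changing the sampling loop.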
5 Experiments
In this section, we present the experimental evaluation of DeepAltTrip for recommending alternative itineraries for a given source and destination POI. Depending on the itinerary generation strategy, we have two versions of DeepAltTrip: (i) DeepAltTrip-LSTM, which uses LSTMs for generating an itinerary (Section 4.2.2), and (ii) DeepAltTrip-Samp, which adopts a flexible sampling-based approach for generating an itinerary (Section 4.3).
5.1 Baselines
We are the first to learn alternative itineraries from historical routes. As there are no prior works in the literature that directly solve our problem, we adapt two state-of-the-art trip recommendation techniques that learn from historical routes, and modify them to recommend alternative itineraries. Our two baselines are as follows.

Markov+DBS: We extend the Rank+Markov of [4] to generate alternative itineraries. We first compute score ranks for POIs based on their features for a given query. Then a POItoPOI transition matrix is computed from featuretofeature transition probabilities. From the computed POI scores and transition probabilities, we use Viterbi algorithm to generate a route of a specific length for a given source and destination. We incorporate the diversified beam search measure given by [18] to maintain paths at each step of the algorithm.

NASR+DBS: We adopt NASR [29], which uses a self-attention-based LSTM to estimate the conditional probability (similar to our forward LSTM model). Again we run the diversified beam search [18] on top of this model and return the top itineraries ending at the destination POI with the highest probability scores.
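To make the role of diversified beam search in both baselines concrete, the sketch below applies the sibling-rank penalty of [18] to a toy first-order transition model. It is only an illustration under stated assumptions: the function name `diverse_beam_search`, the penalty weight `gamma`, the toy probabilities, and the choice to carry the unpenalized score forward between steps are hypothetical, and neither Rank+Markov nor NASR is reproduced here.

```python
import math
from typing import Dict, List, Tuple

LogTrans = Dict[str, Dict[str, float]]


def diverse_beam_search(
    log_trans: LogTrans,
    source: str,
    destination: str,
    length: int,
    beam_width: int,
    gamma: float,
) -> List[Tuple[float, List[str]]]:
    """Return up to beam_width loop-free paths with `length` POIs from source to destination."""
    beams: List[Tuple[float, List[str]]] = [(0.0, [source])]
    for step in range(1, length):
        is_last = step == length - 1
        candidates = []
        for score, path in beams:
            expansions = []
            for nxt, log_p in log_trans.get(path[-1], {}).items():
                if nxt in path:
                    continue  # avoid loops
                if (nxt == destination) != is_last:
                    continue  # destination is allowed only at the final position
                expansions.append((score + log_p, nxt))
            # Rank siblings of the same parent; lower-ranked siblings are penalized more.
            expansions.sort(key=lambda e: e[0], reverse=True)
            for rank, (new_score, nxt) in enumerate(expansions, start=1):
                candidates.append((new_score - gamma * rank, new_score, path + [nxt]))
        # Select by penalized score, but keep the true log-probability for later steps.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = [(true_score, path) for _, true_score, path in candidates[:beam_width]]
    return beams


# Hypothetical transition probabilities for illustration only.
probs = {
    "s": {"a": 0.9, "b": 0.1},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"x": 0.5, "y": 0.5},
    "x": {"d": 1.0},
    "y": {"d": 1.0},
}
toy = {u: {v: math.log(p) for v, p in nbrs.items()} for u, nbrs in probs.items()}
plain = [p for _, p in diverse_beam_search(toy, "s", "d", 4, 2, gamma=0.0)]
diverse = [p for _, p in diverse_beam_search(toy, "s", "d", 4, 2, gamma=3.0)]
```

With `gamma=0.0` both returned paths share the prefix `s, a`, whereas with the penalty one beam slot goes to a path through `b`, which is the behavior the diversification is meant to encourage.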
We evaluate the effect of a number of parameters. Specifically, the number of alternative itineraries recommended, , is varied from to with a default value of . Also, to ensure a fair comparison, we fix the length of each itinerary (i.e., the number of POIs in it including and ) recommended by different algorithms. is varied from to with the default value being . We use 5-fold cross validation: one fold is kept for testing and the other four folds are used to train a model. The average performance metrics over all five folds are reported.
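The 5-fold protocol can be sketched as below. The helper names `k_fold_splits` and `cross_validate`, the random shuffling with a fixed seed, and the choice of what constitutes one item (for example, one query with its ground truth routes) are assumptions, since the text does not specify them.

```python
import random
from typing import Callable, Iterator, List, Sequence, Tuple


def k_fold_splits(
    items: Sequence, n_folds: int = 5, seed: int = 0
) -> Iterator[Tuple[List, List]]:
    """Yield (train, test) lists; each fold is used exactly once for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


def cross_validate(
    items: Sequence,
    train_and_score: Callable[[List, List], float],
    n_folds: int = 5,
    seed: int = 0,
) -> float:
    """Average a metric over all folds."""
    scores = [
        train_and_score(train, test)
        for train, test in k_fold_splits(items, n_folds, seed)
    ]
    return sum(scores) / len(scores)


splits = list(k_fold_splits(list(range(10))))
# Dummy scorer for illustration: fraction of items held out in each fold.
avg = cross_validate(list(range(10)), lambda tr, te: len(te) / (len(tr) + len(te)))
```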
5.2 Datasets
We use eight popular real-world datasets drawn from two different domains. As a first group of datasets, we take geotagged Flickr traces of three touristic cities: Edinburgh, Toronto and Melbourne [23, 4]. In the second group of datasets, we consider trips in five different theme parks: California Adventure, Hollywood, Disneyland, Disney Epcot and Magic Kingdom [21].
Along with the trips involving different POIs, the datasets also contain the location and the category of each POI. The trajectories given in these datasets are generated from user check-ins, where the time between visits to two consecutive POIs in a trajectory is no more than 8 hours. We filter out multiple occurrences of POIs (if any) from these trajectories to avoid loops. We also only consider trajectories having at least three POIs. Table II shows the details of each dataset, including the number of POIs, the number of routes having at least three POIs, and the number of ground truth sets of routes generated with unique (, ) pairs.
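| Place | # POIs | # routes | # unique (source, destination) pairs |
|---|---|---|---|
| Edinburgh | 28 | 634 | 267 |
| Toronto | 29 | 335 | 163 |
| Melbourne | 88 | 442 | 373 |
| California Adventure | 25 | 1475 | 404 |
| Hollywood | 13 | 901 | 134 |
| Disneyland | 31 | 2792 | 618 |
| Disney Epcot | 17 | 1248 | 207 |
| Magic Kingdom | 27 | 2218 | 508 |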
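The preprocessing described for these datasets (removing repeated POIs, keeping trajectories with at least three POIs, and grouping routes by their source and destination) can be sketched as follows. The function names are hypothetical, and two details are assumptions not stated in the text: the first occurrence of a repeated POI is the one kept, and the length filter is applied after duplicates are removed.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def dedupe_pois(trajectory: Sequence[Hashable]) -> List[Hashable]:
    """Keep only the first occurrence of each POI so the route has no loops."""
    seen = set()
    route = []
    for poi in trajectory:
        if poi not in seen:
            seen.add(poi)
            route.append(poi)
    return route


def build_ground_truth(
    trajectories: Sequence[Sequence[Hashable]], min_len: int = 3
) -> Dict[Tuple[Hashable, Hashable], List[List[Hashable]]]:
    """Group loop-free routes of at least min_len POIs by (source, destination)."""
    ground_truth = defaultdict(list)
    for trajectory in trajectories:
        route = dedupe_pois(trajectory)
        if len(route) >= min_len:
            ground_truth[(route[0], route[-1])].append(route)
    return dict(ground_truth)


# Hypothetical POI ids for illustration only.
raw = [[1, 2, 3, 2, 4], [1, 2], [1, 5, 4], [3, 3, 3, 6, 7]]
gt = build_ground_truth(raw)
```

Each key of `gt` corresponds to one unique source and destination pair, and its value is the ground truth set of routes for that query.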


5.3 Evaluation Metrics
Given a query , we use to denote the set of recommended itineraries by an algorithm and to denote the ground truth routes which consists of all the historical routes that start at and end at . Next, we describe the metrics that we use to measure the quality of itineraries returned by an algorithm. In particular, we measure the quality of our alternate itineraries by using traditional popularity and diversity metrics independently as well as by another metric that considers both popularity and diversity at the same time.
5.3.1 Popularity
We use the widely used F1 score and pairsF1 score [23, 4] to measure the popularity of a set of recommended itineraries .
Suppose $r$ is a ground truth route from the source to the destination and $I$ is a recommended itinerary. Also, let $P_r$ and $P_I$ be the sets of POIs in the ground truth route and the recommended itinerary, respectively. The precision of the recommended itinerary is calculated as $\text{Precision} = \frac{|P_r \cap P_I|}{|P_I|}$ and the recall is calculated as $\text{Recall} = \frac{|P_r \cap P_I|}{|P_r|}$. The F1 score is the harmonic mean of precision and recall, i.e., $F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$.
In contrast to F1, the pairsF1 score considers the order of POIs in the routes. Specifically, precision is the number of ordered POI pairs present in both $r$ and $I$ divided by the total number of ordered POI pairs in $I$. The recall is the number of ordered POI pairs present in both $r$ and $I$ divided by the total number of ordered POI pairs in $r$. The pairsF1 score is the harmonic mean of this precision and recall.
We compute F1 score (resp. pairsF1 score) for each pair in and report the average value as the popularity of the recommended itinerary set . The popularity scores indicate the average quality of the individual recommended itineraries with respect to the historical trips adopted by past users. Note that the F1 or pairsF1 scores are always between 0 and 1.
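The two popularity scores can be sketched in a few lines. The function names are hypothetical, and one detail is an assumption: an ordered pair is taken to be any two POIs of a route in visiting order, whether or not they are adjacent, which is the usual reading of pairsF1.

```python
from typing import Callable, Hashable, List, Sequence, Set, Tuple

Route = Sequence[Hashable]


def _harmonic(overlap: int, n_rec: int, n_truth: int) -> float:
    """Harmonic mean of precision (overlap/n_rec) and recall (overlap/n_truth)."""
    if overlap == 0:
        return 0.0
    precision = overlap / n_rec
    recall = overlap / n_truth
    return 2 * precision * recall / (precision + recall)


def f1_score(truth: Route, rec: Route) -> float:
    """F1 over the sets of POIs; visiting order is ignored."""
    t, r = set(truth), set(rec)
    return _harmonic(len(t & r), len(r), len(t))


def ordered_pairs(route: Route) -> Set[Tuple[Hashable, Hashable]]:
    """All (earlier, later) POI pairs of a route, adjacent or not."""
    return {
        (route[i], route[j])
        for i in range(len(route))
        for j in range(i + 1, len(route))
    }


def pairs_f1(truth: Route, rec: Route) -> float:
    """F1 over ordered POI pairs, so visiting order matters."""
    t, r = ordered_pairs(truth), ordered_pairs(rec)
    return _harmonic(len(t & r), len(r), len(t))


def popularity(
    recommended: List[Route],
    ground_truth: List[Route],
    score: Callable[[Route, Route], float] = f1_score,
) -> float:
    """Average score over every (recommended itinerary, ground truth route) pair."""
    values = [score(t, r) for r in recommended for t in ground_truth]
    return sum(values) / len(values)


truth = ["s", "a", "b", "d"]
swapped = ["s", "b", "a", "d"]   # same POIs, different order
shorter = ["s", "c", "d"]
```

Here `f1_score(truth, swapped)` equals 1 because the POI sets match, while `pairs_f1(truth, swapped)` is lower because one ordered pair is reversed.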
5.3.2 Diversity
We adapt the diversity metric used in [1], originally defined to measure the diversity among POIs in a set. To measure the diversity of a set $R$ containing $k$ recommended itineraries, we first measure a similarity value $\text{sim}(I_i, I_j)$ for a pair of itineraries $I_i$ and $I_j$. We then define the diversity value of a recommended set of itineraries of size $k$ as
$$\text{div}(R) = \frac{2}{k(k-1)} \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \left(1 - \text{sim}(I_i, I_j)\right) \qquad (6)$$
In Equation 6, we calculate the average diversity between all pairs in the recommended itinerary set $R$. For each pair, the diversity value is the dissimilarity value between the two itineraries, i.e., $1 - \text{sim}(I_i, I_j)$. As long as the similarity measure is between 0 and 1, the dissimilarity measure and thus $\text{div}(R)$ will also remain between 0 and 1, with a higher value indicating higher diversity between the recommended itineraries. In our experiments, we adopt the F1 score between a pair of itineraries as the similarity measurement, ignoring the source and destination POIs.
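The diversity measure, i.e., the average over all itinerary pairs of one minus their similarity, can be sketched as below. The function names are hypothetical. The similarity is the F1 score over intermediate POIs as stated in the text; for two sets, the harmonic mean of precision and recall simplifies to twice the overlap divided by the sum of the set sizes, which is the form used here. Returning 0 for a set with fewer than two itineraries is an assumption.

```python
from itertools import combinations
from typing import Hashable, List, Sequence

Route = Sequence[Hashable]


def intermediate_f1(a: Route, b: Route) -> float:
    """F1 similarity over POI sets, ignoring the first (source) and last (destination) POI."""
    sa, sb = set(a[1:-1]), set(b[1:-1])
    overlap = len(sa & sb)
    if overlap == 0:
        return 0.0
    # Equivalent to 2PR/(P+R) with P = overlap/|sb| and R = overlap/|sa|.
    return 2 * overlap / (len(sa) + len(sb))


def diversity(itineraries: List[Route]) -> float:
    """Average pairwise dissimilarity (1 - similarity) over all itinerary pairs."""
    pairs = list(combinations(itineraries, 2))
    if not pairs:
        return 0.0  # assumption: diversity is undefined for fewer than two itineraries
    return sum(1.0 - intermediate_f1(a, b) for a, b in pairs) / len(pairs)


recs = [["s", "a", "b", "d"], ["s", "a", "c", "d"], ["s", "e", "f", "d"]]
div = diversity(recs)
```

For the three hypothetical itineraries above, the pairwise dissimilarities are 0.5, 1 and 1, so `div` is their average.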
Also observe that the notions of popularity and diversity are somewhat conflicting. For example, suppose a model is to recommend 2 alternative itineraries. If the model recommends the most popular itinerary twice, it will achieve the maximum average popularity score, but the diversity score of the recommended itinerary set would be 0. If it attempts to diverge from this most popular itinerary, it would achieve a higher diversity score. But unless the second itinerary aligns with another popular route, its popularity will be low and the average popularity score will drop. Thus, to achieve higher popularity and diversity scores at the same time, a model must recommend quality alternative itineraries with respect to the historical trips.
5.3.3 Combination of Popularity and Diversity
Since our goal is to recommend quality alternative itineraries that are both popular and diverse, we need a measure that combines the popularity and diversity of $R$. We employ the widely used weighted sum to define the combined metric as
$$\text{comb}(R) = \alpha \cdot \text{pop}(R) + (1 - \alpha) \cdot \text{div}(R)$$
where $\text{pop}(R)$ is the average popularity score of the recommended itineraries, which is computed using the F1 score as discussed earlier in Section 5.3.1. The parameter $\alpha$ specifies the relative importance of popularity and diversity. For example, a value of $\alpha$ close to 0 gives very little importance to the popularity of the itineraries, and a value close to 1 gives very little importance to the diversity of the itineraries. Thus we consider only intermediate values of $\alpha$ during our evaluation. To give equal importance to both popularity and diversity, we set $\alpha = 0.5$ as the default value.
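As a small worked example of the weighted sum, the sketch below combines a popularity and a diversity score. The function name `combined_score` and the parameter name `weight` are hypothetical; the weight is assumed to multiply popularity, which matches the later remark that a value of 0.9 gives popularity much higher importance than diversity. The input numbers are made up for illustration and are not results from the paper.

```python
def combined_score(popularity: float, diversity: float, weight: float = 0.5) -> float:
    """Weighted sum: weight on popularity, (1 - weight) on diversity."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * popularity + (1.0 - weight) * diversity


# Hypothetical scores for illustration only.
equal = combined_score(0.6, 0.8)              # equal importance
pop_heavy = combined_score(0.6, 0.8, 0.9)     # popularity dominates
```

With equal weights the result is the plain average of the two scores, while with a weight of 0.9 the result moves close to the popularity score.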
It is important to note that our proposed approaches, both DeepAltTripLSTM and DeepAltTripSamp, are agnostic to the above metrics, and our motivation is to learn popular alternative routes without any such explicit modeling of popularity and/or diversity. Yet, we show that our proposed learning-based approaches outperform the baselines significantly w.r.t. these traditional metrics.


5.4 Hyperparameter Tuning
We now describe the hyperparameter values used in DeepAltTripLSTM and DeepAltTripSamp. Recall that our algorithm first obtains POI graph embeddings through two separate graph autoencoders. It then trains two LSTM models. Finally separate itineraries are generated through different prominent POIs, using either an LSTM based technique (DeepAltTripLSTM) or a sampling based technique (DeepAltTripSamp).
Graph Autoencoders and ITRNet: For the graph autoencoder, the embedding dimensions and were 12 and 24, respectively. The autoencoders were trained with learning rates of 0.05 and 0.01, respectively, using the Adam optimizer. The hidden layer size for both the forward and backward LSTM models was 32. The dimension of the MLP layer was 30. Here, the learning rate was set to 0.001 for the whole model, using the Adam optimizer as before. Both LSTM models were trained with a batch size of 32.
DeepAltTripLSTM: To generate fixed length itineraries containing POIs, we first generate the half itinerary as prescribed in the algorithm setting to , and then the other half itinerary is generated such that the length of the total itinerary is . Also is set to the length of the longest itinerary found in the training dataset.
DeepAltTripSamp: We start with an initial itinerary consisting of POIs by placing the intermediate prominent POI at a random position between to . We ignore the insert and delete operations here, as those operations would change the length. The Replace and the Swap and Replace operations are each performed with probability 0.5. For the prominent POI, we do not use the Replace operation and only apply the Swap and Replace operation with probability 0.5. The sampling algorithm is run for iterations in total. This ensures that when the search space is larger (i.e., larger ), the algorithm runs more iterations to achieve good quality.
5.5 Performance Comparison
We now discuss performance in terms of the evaluation metrics considered. We first consider the average popularity and diversity of the recommended itineraries independently. After that, we consider the combined score to assess how well each system returns multiple alternative itineraries. Next, we evaluate the effect on performance of varying the length of the recommended itineraries and the number of itineraries to be recommended. We also compare the running times of the two variants of DeepAltTrip with each other and with the baselines.
5.5.1 Considering Popularity and Diversity Independently
Table III shows the popularity (using both F1 score and pairsF1 score) and the diversity of the itineraries recommended by each approach. We see that the average F1 and pairsF1 scores of both of our approaches are similar to those of the competitors, which are the state of the art for returning the most popular itineraries. On the other hand, the average diversity of the recommended itineraries provided by our approaches far exceeds that of the competitors in all datasets. For example, in the Edinburgh dataset, DeepAltTripLSTM and DeepAltTripSamp provide 19.13% and 12.60% higher average diversity, respectively, than the nearest competing baseline. This shows that our approaches provide much more diverse itineraries while keeping the popularity of the recommended itineraries on par with the baselines. In other words, these baselines primarily focus on popularity, and the competitive F1 and pairsF1 scores show that our approaches generate diverse itineraries without compromising on the popularity of the recommended itineraries, thus providing quality alternative itineraries.




5.5.2 Considering Combined Popularity and Diversity Score
We vary in the combined metric from to and show the results for each value in Table V. Due to space constraints, we only present the results for two datasets, one from each group. Results on the other datasets show similar trends. Again the length of the recommended itineraries is set to 3 and the number of itineraries recommended is set to 5.
We observe that both variants of DeepAltTrip outperform the competitors even when is set to 0.9, i.e., the popularity is given a much higher importance than diversity. When both are given equal importance i.e., , we see that in the Epcot dataset DeepAltTripLSTM and DeepAltTripSamp outperform the nearest competing baseline by 29.24% and 25.34%, respectively.
5.5.3 Effect of Varying Length of Recommended Itineraries
We vary the length of the recommended itineraries as 3, 5, 7, and 9. Table IV shows the results. We show the results for the Edinburgh and Epcot datasets (other datasets follow a similar trend). Note that, for all values of , the average F1 and pairsF1 scores remain similar to those of the competitors that primarily focus on providing popular itineraries. However, the diversity of the itineraries recommended by DeepAltTripLSTM and DeepAltTripSamp is significantly higher than that of these competitors.
We observe that, as the value of increases, both DeepAltTripLSTM and DeepAltTripSamp outperform the nearest competing baseline by a greater margin. For example, for the Edinburgh dataset, DeepAltTripLSTM provides a 19.13% increase in diversity and a 10.33% increase in the combined score for , whereas it provides a 136% increase in diversity and a 44.38% increase in the combined score for . Similarly, for DeepAltTripSamp in the Edinburgh dataset, we see a 12.60% increase in diversity and a 6.72% increase in the combined score for , whereas we see a 107% increase in diversity and a 34.55% increase in the combined score for . This is primarily because the average diversity provided by the baselines drops significantly for larger . Note that when the itinerary length is 3 (including the source and destination), the diversity for each approach is maximum (i.e., 1), because the only intermediate POI in each itinerary differs from those of the other recommended itineraries.
5.5.4 Effect of k
Here we set the length of the recommended itineraries to 5. Again, due to space constraints, we show the results for two datasets, one from each of the two domains. Other datasets show similar trends. The results are shown in Table VI.
Our proposed approaches consistently achieve higher diversity even for larger . The average F1 score of the recommended itineraries drops slightly for all approaches; however, our approaches remain comparable to the baselines. Consequently, we see that both DeepAltTripLSTM and DeepAltTripSamp provide significantly higher combined scores. For example, in the Edinburgh dataset, DeepAltTripLSTM provides 11.74%, 10.32%, 10.69% and 11.72% higher combined score with for and , respectively. Also DeepAltTripSamp provides 9.53%, 6.72%, 7.57% and 8.42% higher combined scores for and , respectively, in the same dataset.
5.6 Running Time Comparison
We run all the algorithms on the same machine, equipped with an Intel Core i7-8565U CPU and 16GB RAM. We record the average time per query (in seconds) for each of the five folds of the dataset and report the average across the five folds.
We vary the length of the recommended itineraries, , as 3, 5, 7 and 9 and keep the number of recommended itineraries at 3. As the Melbourne dataset has the maximum number of POIs and the Disneyland dataset has the maximum number of trips, we show the results for these two datasets to depict the scalability of the algorithms. The results are shown in Table VII.
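The timing protocol (average time per query within each fold, then averaged across folds) can be sketched as follows. The function name `average_query_time`, the `run_query` callable, and the use of `time.perf_counter` are assumptions; the text does not state how the time was measured.

```python
import time
from typing import Callable, Hashable, List, Sequence


def average_query_time(
    run_query: Callable[[Hashable], object],
    folds: Sequence[Sequence[Hashable]],
) -> float:
    """Mean over folds of the mean wall-clock seconds per query in that fold."""
    per_fold: List[float] = []
    for queries in folds:
        start = time.perf_counter()
        for query in queries:
            run_query(query)
        elapsed = time.perf_counter() - start
        per_fold.append(elapsed / len(queries))
    return sum(per_fold) / len(per_fold)


# Dummy recommender for illustration only: it does no real work.
seconds = average_query_time(lambda q: None, [[("s1", "d1"), ("s2", "d2")], [("s3", "d3")]])
```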


We observe that DeepAltTripLSTM and NASR+DBS have similar execution times in both datasets, whereas DeepAltTripSamp takes more time than the other approaches. We also see that the execution time increases with the increase of . However, as trip recommendation systems generally do not recommend excessively long routes to users [6], this increasing trend in query execution time is acceptable. We observe that although the Melbourne dataset has almost three times as many POIs as the Disneyland dataset, the average execution time per query remains similar for NASR+DBS, DeepAltTripLSTM and DeepAltTripSamp. Hence, the running times of these three algorithms are not significantly influenced by the number of POIs, whereas the execution time of Markov+DBS increases substantially as the number of POIs increases.
6 Conclusion
This paper proposed two deep-learning-based approaches that learn to recommend top alternative itineraries for a given source and destination. We first developed Itinerary Net (ITRNet), which estimates the likelihood of POIs on an itinerary by using graph autoencoders and two (forward and backward) LSTMs. Based on ITRNet, we developed two variants, DeepAltTripLSTM and DeepAltTripSamp, to recommend alternative itineraries using historical trips without requiring any explicit popularity or diversity modeling. Our DeepAltTripSamp solution can also trivially incorporate various user-defined constraints. Extensive experiments using real-world datasets show that DeepAltTripLSTM and DeepAltTripSamp outperform the best performing baselines by up to 29.24% and 25.34%, respectively, for the default settings w.r.t. the combined popularity and diversity measure. In the future, we plan to incorporate user personalization by leveraging information on the historical routes visited by the querying user.
References
 [1] (2016) A package recommendation framework for trip planning activities. In Proceedings of the 10th ACM Conference on Recommender Systems, pp. 203–206. Cited by: §2.3, §5.3.2.
 [2] (2013) Where shall we go today? planning touristic tours with tripbuilder. In The Conference on Information and Knowledge Management (CIKM), pp. 757–762. Cited by: §2.1.
 [3] (2014) TripPlanner: personalized trip planning leveraging heterogeneous crowdsourced digital footprints. IEEE Transactions on Intelligent Transportation Systems 16 (3), pp. 1259–1273. Cited by: §2.1.
 [4] (2016) Learning points and routes to recommend trajectories. In The Conference on Information and Knowledge Management (CIKM), pp. 2227–2232. Cited by: §1, §1, §2.1, 1st item, §5.2, §5.3.1.
 [5] (2007) Reliable pretrip multipath planning and dynamic adaptation for a centralized road navigation system. IEEE Transactions on Intelligent Transportation Systems 8 (1), pp. 14–20. Cited by: §2.2.
 [6] (2011) Discovering popular routes from trajectories. In 2011 IEEE 27th International Conference on Data Engineering, pp. 900–911. Cited by: §2.1, §5.6.
 [7] (2019) Shortestpath diversification through network penalization: a washington dc area case study. In Proceedings of the 12th ACM SIGSPATIAL International Workshop on Computational Transportation Science, pp. 1–10. Cited by: §2.2.
 [8] (2018) Finding kdissimilar paths with minimum collective length. In Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 404–407. Cited by: §1, §2.2.
 [9] (2010) Automatic construction of travel itineraries using social breadcrumbs. In Proceedings of the 25th ACM conference on Hypertext and social media, pp. 35–44. Cited by: §2.1.
 [10] (2018) Deepmove: predicting human mobility with attentional recurrent networks. In Proceedings of the 2018 World Wide Web conference, pp. 1459–1468. Cited by: §2.3.
 [11] (2019) DeepTrip: adversarially understanding human mobility for trip recommendation. In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 444–447. Cited by: §1, §1, §2.1.
 [12] (1984) Stochastic relaxation, gibbs distributions, and the bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-6 (6), pp. 721–741. Cited by: §1.
 [13] (2017) Geographical diversification in poi recommendation: toward improved coverage on interested areas. In Proceedings of the 11th ACM Conference on Recommender Systems, pp. 224–228. Cited by: §2.3.
 [14] (2018) Is euclidean distance really that bad with road networks?. In Proceedings of the 11th ACM SIGSPATIAL International Workshop on Computational Transportation Science, pp. 11–20. Cited by: §4.1.1.
 [15] (2012, August 21) Method of and apparatus for generating routes. Google Patents. Note: US Patent 8,249,810 Cited by: §2.2.
 [16] (2016) Semisupervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907. Cited by: §4.1.1.
 [17] (2016) Variational graph autoencoders. arXiv preprint arXiv:1611.07308. Cited by: §4.1.1.
 [18] (2016) A simple, fast diverse decoding algorithm for neural generation. arXiv preprint arXiv:1611.08562. Cited by: 1st item, 2nd item.
 [19] (2021) Comparing alternative route planning techniques: a comparative user study on melbourne, dhaka and copenhagen road networks. IEEE Transactions on Knowledge and Data Engineering. Cited by: §1.
 [20] (2018) Topk route search through submodularity modeling of recurrent poi features. In Proceedings of the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 545–554. Cited by: §1, §2.2.
 [21] (2017) Personalized itinerary recommendation with queuing time awareness. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 325–334. Cited by: §2.1, §5.2.
 [22] (2019) Tour recommendation and trip planning using locationbased social media: a survey. Knowledge and Information Systems, pp. 1–29. Cited by: §2.1.

 [23] (2015) Personalized tour recommendation based on user interests and points of interest visit durations. In The 24th International Joint Conference on Artificial Intelligence (IJCAI), Cited by: §1, §1, §2.1, §5.2, §5.3.1.
 [24] (2018) Personalized trip recommendation for tourists based on user interests, points of interest visit durations and visit recency. Knowledge and Information Systems 54 (2), pp. 375–406. Cited by: §2.1.
 [25] (2017) Finding topk shortest paths with diversity. IEEE Transactions on Knowledge and Data Engineering 30 (3), pp. 488–502. Cited by: §1, §2.2.
 [26] (2016) Predicting the next location: a recurrent model with spatial and temporal contexts. In Thirtieth AAAI conference on artificial intelligence, Cited by: §2.3.
 [27] (2018) A nonparametric generative model for human trajectories. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI), pp. 3812–3817. Cited by: §2.1.
 [28] (2014) The shortest path to happiness: recommending beautiful, quiet, and happy routes in the city. In Proceedings of the 25th ACM conference on Hypertext and social media, pp. 116–125. Cited by: §2.1.

 [29] (2019) Empowering a* search algorithms with neural networks for personalized route recommendation. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 539–547. Cited by: §1, §1, §2.1, 2nd item.
 [30] (2017) Answering topk exemplar trajectory queries. In The IEEE 33rd International Conference on Data Engineering (ICDE), pp. 597–608. Cited by: §2.2.
 [31] (2019) Semanticaware topk multirequest optimal route. Complexity 2019. Cited by: §2.2.
 [32] (2019) Diversifying topk routes with spatial constraints. Journal of Computer Science and Technology 34 (4), pp. 818–838. Cited by: §1, §2.2.
 [33] (2020) Where to go next: a spatiotemporal gated network for next poi recommendation. IEEE Transactions on Knowledge and Data Engineering. Cited by: §2.3.