1. Introduction
With the growing popularity of Location-Based Social Networks (LBSNs) such as Foursquare and Yelp, users share their check-in records and the experiences they had while visiting different points of interest (POIs), such as restaurants, shopping malls, and museums. The availability of such a large volume of data has opened various research opportunities in POI recommendation (Liu et al., 2016; Aliannejadi and Crestani, 2018; Liu et al., 2017). For instance, much work has focused on modeling sequential check-in information. Since sequential check-ins reveal a wealth of latent information about POIs and user preferences before or after visiting specific POIs, relevant studies argue that modeling the sequential check-in information is crucial for POI recommendation (Liu et al., 2016; Feng et al., 2017).
Recent work has shown the effectiveness of pretrained POI embeddings in improving recommendation performance (Erhan et al., 2010; He et al., 2017). The main idea behind this approach is to learn a representation of POIs from a large amount of check-in data and use the pretrained embeddings as initial values of the latent representations of POIs in conventional recommendation models (Chang et al., 2018; Zhao et al., 2017b). Another line of research points out the importance of POI categories, as they convey useful information regarding users’ interests and habits (Aliannejadi and Crestani, 2018; Zhang and Chow, 2015; Aliannejadi et al., 2018). Therefore, it is critical to incorporate this information while learning the POI embeddings. However, no effort has yet been made in this direction.
In this paper, we aim to study the effect of categorical information on the performance of our proposed POI embedding model. To this end, we propose a two-phase embedding model that captures the sequential check-in patterns of users together with the categorical information available in LBSN data. Our model, called Category-Aware POI Embedding (CATAPE), learns a high-dimensional representation of POIs based on two data modalities, i.e., check-in sequences and POI categories. More specifically, the contributions of this paper can be summarized as follows:
We show that the characteristics of POIs are important in POI embeddings;
We propose a novel category-aware POI embedding model that utilizes a user’s check-in sequence information as well as POI categorical information;
We evaluate our proposed method, CATAPE, on two large-scale datasets, comparing the performance of our model with state-of-the-art approaches. The experimental results demonstrate the effectiveness of our POI embedding, which significantly outperforms state-of-the-art POI recommendation models. Furthermore, we show that incorporating the categorical information into the embedding enables CATAPE to capture users’ interests and POIs’ characteristics more accurately.
2. Related Work
In this section, we review previous POI recommendation models. Modeling contextual information, such as geographical, temporal, categorical, and social information, has proven to be a necessary step to improve the quality of recommendation results in POI recommendation systems (Cheng et al., 2012; Yuan et al., 2013). To develop context-aware applications, Li et al. (2015) modeled the task of recommending POIs as a pairwise ranking problem and used the geographical information to propose a ranking-based geographical factorization method. Ye et al. (2011) argued that users’ check-in behavior is affected by the spatial influence of locations and proposed a unified location recommender system incorporating spatial and social influence to address the data sparsity problem. However, this method does not model the spatial information for each individual user, but rather models it based on the check-in distribution of all users. Cheng et al. (2012) proposed a multi-center Gaussian model to capture users’ movement patterns, assuming that users’ movements happen around several centers.
With the increasing interest in modeling sequential patterns in other fields such as NLP, and the successful use of these approaches to capture the relations between successively visited locations, much attention has recently been drawn towards modeling sequential patterns in POI recommendation (Liu et al., 2016; Chang et al., 2018; Feng et al., 2017; Zhao et al., 2017b). Liu et al. (2016) adopted the word2vec framework to model check-in sequences and capture sequential check-in patterns. Zhao et al. (2017b) proposed a POI recommendation model that accounts for the fact that check-in sequences depend on the day of the week, for instance, work on weekdays and entertainment on weekends. Moreover, Feng et al. (2017) presented a latent representation model that is able to incorporate the geographical influence. However, they do not incorporate the contextual information in their model and do not consider the characteristics of POIs. More recently, Chang et al. (2018) proposed a content-aware POI embedding model that incorporates the textual content of POIs into the embedding model. Our work, in contrast, focuses on incorporating categorical information into the POI embedding model to model the characteristics of POIs.
3. Proposed Method
In this section, we propose a category-aware POI embedding model called CATAPE, which captures both the geographical and categorical information of POIs. Our model generates a high-dimensional representation of POIs, which is then plugged into a POI recommendation model to produce a recommendation list. In the following, we first give an overview of our embedding model and then explain how it is implemented and how the generated POI embeddings are incorporated into a recommender system.
Formally, we consider a set of POIs and a set of categories, where each category label can be associated with any POI in the set. Moreover, for each check-in to a POI, we consider the sequence of check-ins that occurred before and after it.
CATAPE estimates the probability of a binary random variable indicating whether a POI with a given category should be predicted as part of the POI sequence. As mentioned earlier, the surrounding check-in sequence provides the sequential context of a check-in. The POI recommendation probability is estimated as follows:
(1)
Eq. (1) is built from three components: the Check-in module, the Category module, and a POI recommender component that takes the POI embeddings learned by the two modules and generates a POI classification probability.
Check-in module. POIs that a user visits consecutively are geographically influenced by each other. The check-in module, proposed by Liu et al. (2016), employs a Skip-gram-based POI embedding model to capture users’ check-in sequences. For every target POI, the module generates its context from the check-in sequence: the POIs visited before and after the target within a predefined window size. The check-in module trains the POI embedding vectors by maximizing the objective function of the Skip-gram model (Mikolov et al., 2013) in the following way:
(2)
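The context extraction described above can be sketched in a few lines. This is a minimal illustration rather than the authors’ code: the function name `context_pairs`, the toy POI identifiers, and the symmetric window are assumptions made for the example.

```python
from typing import Hashable, List, Sequence, Tuple


def context_pairs(
    checkins: Sequence[Hashable], window: int
) -> List[Tuple[Hashable, Hashable]]:
    """Return (target, context) pairs for one user's check-in sequence.

    For each target POI, the context is every POI visited up to `window`
    positions before or after it (hypothetical helper, for illustration).
    """
    pairs = []
    for i, target in enumerate(checkins):
        lo = max(0, i - window)
        hi = min(len(checkins), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a check-in is not its own context
                pairs.append((target, checkins[j]))
    return pairs


# Toy chronological sequence of POI ids (made up for the example).
sequence = ["cafe", "office", "gym", "bar"]
pairs = context_pairs(sequence, window=1)
```

With a window of 1, each POI is paired only with its immediate neighbors in the sequence; a larger window adds more distant check-ins to the context.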
For a given target POI, the probability of a context POI is computed using the softmax function as follows:
(3)
where the softmax is defined over the latent vectors of the target and context POIs, whose dimensionality is that of the latent space, and is normalized over all POIs in the set.
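For reference, the standard Skip-gram formulation of Mikolov et al. (2013), on which the check-in module builds, can be written as below. The notation is assumed for illustration and may differ from the authors’: $S$ is a check-in sequence, $p_t$ a target POI, $\mathcal{C}(p_t)$ its context window, $\mathbf{v}_{p}, \mathbf{u}_{p} \in \mathbb{R}^{d}$ the target and context vectors of POI $p$, and $\mathcal{P}$ the set of POIs.

```latex
\begin{align*}
  &\max \sum_{p_t \in S} \; \sum_{p_c \in \mathcal{C}(p_t)} \log \Pr(p_c \mid p_t), \\
  &\Pr(p_c \mid p_t) =
    \frac{\exp\!\left(\mathbf{u}_{p_c}^{\top} \mathbf{v}_{p_t}\right)}
         {\sum_{p \in \mathcal{P}} \exp\!\left(\mathbf{u}_{p}^{\top} \mathbf{v}_{p_t}\right)}.
\end{align*}
```

The denominator sums over every POI in $\mathcal{P}$, which is why evaluating the softmax exactly becomes expensive when the POI set is large.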
As the POI set in Eq. (2) is typically very large, we adopt a negative sampling technique (Mikolov et al., 2013) to speed up the optimization. Thus, the loss function (i.e., the negative log-likelihood) for the check-in module is defined as follows:
(4)
where the loss uses the sigmoid function and a set of negative POI samples, i.e., the POIs that do not appear in the context window of the target POI.

Category module. We propose the category module inspired by the idea of word2vec (Mikolov et al., 2013), where we consider the categories of POIs visited by a user as a “sentence” and every single category as a “word.” Therefore, the category module is estimated as follows:
(5) 
where the inputs are the category information of the POI, the target category, and the categories visited before and after the target category within a predefined window size. The probability function is estimated using the softmax function as follows:
(6)
where the softmax uses the concatenation function and is normalized over the number of categories in the set. Similar to Eq. (4), for efficiency, we formulate the loss function of the category module using the negative sampling technique as follows:
(7)
where the loss uses the sigmoid function and a set of negatively sampled categories. Finally, we combine the Check-in and Category modules. The final objective function of our category-aware POI embedding model CATAPE is:
(8)
By maximizing this final objective, CATAPE simultaneously learns the geographical and categorical influence of POIs.
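The negative-sampling losses used by both modules follow the usual Skip-gram pattern, which the following sketch illustrates for a single (target, context) pair. It assumes the standard form of the loss from the word2vec literature; the function names, the plain-list vectors, and the toy values are hypothetical, and the sketch does not reproduce how the category module concatenates its inputs.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def negative_sampling_loss(
    target: Sequence[float],
    context: Sequence[float],
    negatives: Sequence[Sequence[float]],
) -> float:
    """Standard Skip-gram negative-sampling loss for one training pair.

    loss = -log sigma(u_c . v_t) - sum_n log sigma(-u_n . v_t)
    """
    loss = -math.log(sigmoid(dot(context, target)))
    for neg in negatives:
        # Negative samples are pushed away from the target vector.
        loss -= math.log(sigmoid(-dot(neg, target)))
    return loss


# Toy 2-dimensional vectors (made up for the example).
v_target = [1.0, 0.0]
u_context = [2.0, 0.0]
u_negatives = [[-2.0, 0.0]]
loss = negative_sampling_loss(v_target, u_context, u_negatives)
```

The loss shrinks as the context vector aligns with the target vector and as the sampled negatives point away from it; only the sampled vectors are touched per update, instead of all POIs or categories.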
System overview. As discussed earlier, the POI embeddings are learned as part of a classification model that determines a missing POI in a sequence of visited POIs (i.e., the Skip-gram model). This is not an actual POI recommendation setting, which would consider users and POIs at the same time. Therefore, in the next step, we extract the learned POI embeddings from the trained network and feed them to a POI recommendation model. Figure 1 illustrates our proposed recommendation workflow using pretrained POI embeddings. As seen, based on the check-in data in the training set, CATAPE learns the high-dimensional POI embeddings. The learned POI embeddings are then fed to a recommender model that is able to utilize this information to provide accurate recommendations. In this work, we use Metric Factorization (Zhang et al., 2018) as the recommender model, since it is a state-of-the-art recommender model that is able to use pretrained POI embeddings. This model takes our pretrained POI embeddings and learns user embeddings to produce a top-k recommendation list.
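To make the last step of this workflow concrete, the sketch below ranks unvisited POIs for one user by Euclidean distance to fixed POI embeddings. It only illustrates the scoring idea; it is not the Metric Factorization implementation, it omits how the user embedding is learned, and the names `recommend`, `poi_vecs`, and the toy vectors are assumptions.

```python
import math
from typing import Dict, List, Sequence, Set


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recommend(
    user_vec: Sequence[float],
    poi_vecs: Dict[str, Sequence[float]],
    visited: Set[str],
    k: int,
) -> List[str]:
    """Return the k unvisited POIs closest to the user in embedding space."""
    candidates = [poi for poi in poi_vecs if poi not in visited]
    candidates.sort(key=lambda poi: euclidean(user_vec, poi_vecs[poi]))
    return candidates[:k]


# Pretrained POI embeddings would come from the embedding model;
# these 2-dimensional values are invented for the example.
poi_vecs = {
    "cafe": [0.0, 0.0],
    "gym": [1.0, 0.0],
    "museum": [3.0, 4.0],
    "bar": [0.5, 0.5],
}
user_vec = [0.4, 0.2]  # assumed to be already learned
top_2 = recommend(user_vec, poi_vecs, visited={"cafe"}, k=2)
```

A smaller distance means a stronger predicted preference, so the list is sorted in ascending order of distance.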
4. Experiments
In this section, we evaluate the performance of CATAPE in comparison with a set of state-of-the-art POI recommendation models.
4.1. Experimental Setup
Dataset. We evaluate the performance of CATAPE on two real-world datasets: Yelp (Liu et al., 2017) (access date: Feb. 2016) and Gowalla (Yuan et al., 2013), collected between Feb. 2009 and Oct. 2010. Each dataset consists of check-ins made by users at POIs, and every POI is labeled with categories. For every user, we take the earliest portion of the check-ins in chronological order as the training set and the remaining, most recent check-ins as the test set.
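A per-user chronological split of this kind can be sketched as follows. The function name, the `(timestamp, poi_id)` record layout, and in particular the ratio of 0.8 are assumptions chosen for the example; they are not values taken from the experiments.

```python
from typing import List, Tuple

CheckIn = Tuple[int, str]  # (timestamp, poi_id), a hypothetical record layout


def chronological_split(
    checkins: List[CheckIn], train_ratio: float
) -> Tuple[List[CheckIn], List[CheckIn]]:
    """Split one user's check-ins into an earlier and a later part."""
    ordered = sorted(checkins, key=lambda c: c[0])  # oldest first
    cut = int(len(ordered) * train_ratio)
    return ordered[:cut], ordered[cut:]


# One user's check-ins, deliberately out of order.
user_checkins = [(30, "gym"), (10, "cafe"), (50, "bar"), (20, "office"), (40, "cafe")]
train, test = chronological_split(user_checkins, train_ratio=0.8)  # ratio is illustrative
```

Splitting by time rather than at random keeps future check-ins out of the training data, so the evaluation reflects predicting later visits from earlier ones.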
Evaluation metrics. We measure the effectiveness of the recommendation task using two standard recommendation evaluation metrics: Precision@k and Recall@k, both computed over the top k recommended POIs. In Table 1, statistical significance is assessed with a two-tailed paired t-test.

Compared methods. The main focus of our work is capturing POI check-in sequences and category context. (In our experiments, we set the embedding dimension and predefined window size to 100 and 4, respectively.) This context information can be derived from a pure check-in dataset without any additional data modality (e.g., user comments or user profiles). Accordingly, to be fair, we have chosen baselines that also operate on check-in datasets: USG (Ye et al., 2011), MGMPFM (Cheng et al., 2012), RankGeoFM (Li et al., 2015), and HGMF (Zhao et al., 2017a), which can capture geographical, social, and categorical context. Among previous works that learn embeddings during a pretraining phase, the one proposed by Chang et al. (2018) relies on the textual context provided in user reviews to incorporate the characteristics of POIs. Therefore, comparison with this model is out of our scope; our model can be considered complementary to such models. We compare the performance of CATAPE with the following models:


USG (Ye et al., 2011) takes advantage of three modules: user-based CF, social influence, and geographical information.

MGMPFM (Cheng et al., 2012) combines geographical influence with the Probabilistic Factorization Model (PFM), assuming a Multi-Center Gaussian Model (MGM) of the probability of a user’s check-in behavior.

BPRMF (Rendle et al., 2009) adopts a Bayesian criterion to directly optimize for personalized rankings based on users’ implicit feedback.

RankGeoFM (Li et al., 2015) is a state-of-the-art ranking-based geographical factorization method. It incorporates the geographical information in a latent ranking model.

HGMF (Zhao et al., 2017a) is a state-of-the-art hierarchical geographical matrix factorization model that utilizes the hierarchical structures of both users and POIs with categorical information for POI recommendation.

Metric Factorization (Zhang et al., 2018) places users and POIs in a low-dimensional space and measures their explicit similarity using Euclidean distance.

CATAPENoCat is a variation of CATAPE in which we remove the category module from the model. Therefore, it is trained using only the check-in module.

Table 1. Precision@k (P@k) and Recall@k (R@k) of all methods on Yelp and Gowalla.

Method                Yelp                                            Gowalla
                      P@5     P@10    P@20    R@5     R@10    R@20    P@5     P@10    P@20    R@5     R@10    R@20
USG                   0.0282  0.0244  0.0197  0.0281  0.0523  0.0753  0.0502  0.0471  0.0413  0.0517  0.0568  0.0625
MGMPFM                0.0197  0.0173  0.0136  0.0211  0.0293  0.0493  0.0281  0.0215  0.0197  0.0263  0.0291  0.0319
BPRMF                 0.0285  0.0221  0.0185  0.0296  0.0361  0.0599  0.0493  0.0443  0.0342  0.0497  0.0529  0.0581
RankGeoFM             0.0421  0.0362  0.0292  0.0392  0.0673  0.0838  0.0567  0.0501  0.0492  0.0591  0.0642  0.0718
HGMF                  0.0532  0.0491  0.0401  0.0478  0.0702  0.0915  0.0798  0.0711  0.0683  0.0715  0.0773  0.0819
Metric Factorization  0.0593  0.0552  0.0481  0.0533  0.0782  0.0974  0.0821  0.0782  0.0717  0.0784  0.0814  0.0862
CATAPENoCat           0.0641  0.0613  0.0568  0.0589  0.0831  0.1013  0.0892  0.0828  0.0784  0.0815  0.0898  0.0979
CATAPE                0.0702  0.0692  0.0631  0.0621  0.0881  0.1121  0.0924  0.0894  0.0813  0.0872  0.0953  0.1283

4.2. Results and Discussion
In this section, we report the performance of CATAPE and analyze its effectiveness compared with other methods.
Performance comparison. Table 1 lists the performance of CATAPE, as well as the compared methods, in terms of Precision@k and Recall@k. As seen, among the baseline methods, MGMPFM has the lowest performance in terms of all metrics. USG outperforms MGMPFM in terms of Rec@10 on Yelp (0.0523 vs. 0.0293). The results show that HGMF consistently outperforms USG, MGMPFM, BPRMF, and RankGeoFM, as it considers the hierarchical structure and incorporates categorical information, one of the most effective contextual signals, in the hierarchical model. However, HGMF uses the dot product of users and POIs to measure their similarity. Among the baselines, Metric Factorization outperforms HGMF and the other methods, suggesting that Euclidean distance is a more precise measure of similarity than the dot product.
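For reference, Precision@k and Recall@k for a single user can be computed as in the sketch below, using the common definitions of the two metrics. The function name and the toy lists are assumptions; the values reported in Table 1 would correspond to such per-user scores averaged over all test users.

```python
from typing import List, Set, Tuple


def precision_recall_at_k(
    recommended: List[str], relevant: Set[str], k: int
) -> Tuple[float, float]:
    """Precision@k and Recall@k for one user.

    `recommended` is the ranked list produced by the model and `relevant`
    is the set of POIs the user actually visited in the test set.
    """
    top_k = recommended[:k]
    hits = sum(1 for poi in top_k if poi in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy ranking and ground truth (made up for the example).
ranked = ["a", "b", "c", "d", "e"]
visited_in_test = {"b", "e", "z"}
p_at_5, r_at_5 = precision_recall_at_k(ranked, visited_in_test, k=5)
```

Here two of the five recommended POIs are relevant, and two of the three relevant POIs are retrieved, so precision and recall differ even though the number of hits is the same.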
As seen in Table 1, CATAPE significantly outperforms all of the baseline methods in terms of all evaluation metrics. This indicates that the check-in module is able to learn POI latent representations by modeling the context of users’ visited POIs and the sequence of POIs. Furthermore, the results suggest that incorporating category information enables CATAPE to model the characteristics of the POIs more effectively. It is worth noting that our proposed POI embedding model can be pretrained on a large dataset of check-ins and then used in various POI recommendation models. Also, CATAPENoCat outperforms all the baselines significantly, indicating that learning POI embeddings based only on check-in information is able to capture complex sequential relations between POIs.
Impact of category module. To show the effect of category information on performance, we compare CATAPE with CATAPENoCat. The results in Table 1 show that the performance of the model drops significantly when we remove category information, indicating that the category information enables the model to capture the similarities between POIs more accurately. As mentioned in the literature (Aliannejadi and Crestani, 2018), category information is crucial for capturing users’ regular habits. For instance, a user may stop by a drive-thru coffee shop every morning, just before going to their workplace. Despite the performance drop, CATAPENoCat still outperforms all the baseline methods significantly in terms of all evaluation metrics. More specifically, CATAPENoCat outperforms Metric Factorization in terms of Pre@20 on Yelp (0.0568 vs. 0.0481) and in terms of Rec@20 on Gowalla (0.0979 vs. 0.0862). Finally, note that a similar experiment that removes the check-in module would not be possible, because the category module requires the latent vectors of POIs computed by the check-in module.
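The size of these gaps can be read directly off Table 1. The following sketch computes the relative improvement of CATAPENoCat over Metric Factorization from the tabulated values; the helper name `relative_improvement` is an assumption, and the percentages are derived here from the table rather than quoted from the text.

```python
def relative_improvement(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Values copied from Table 1.
yelp_p20 = relative_improvement(new=0.0568, old=0.0481)     # P@20 on Yelp
gowalla_r20 = relative_improvement(new=0.0979, old=0.0862)  # R@20 on Gowalla

print(f"P@20 on Yelp:    {yelp_p20:.1f}%")
print(f"R@20 on Gowalla: {gowalla_r20:.1f}%")
```

With the tabulated numbers this gives roughly 18.1% on Yelp and 13.6% on Gowalla.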
5. Conclusions
In this paper, we introduced a novel POI embedding model and demonstrated the importance of the characteristics of POIs in POI embedding. Our model captures the sequential influence of POIs from users’ check-in sequences, as well as the characteristics of POIs, using category information. The experimental results showed that our model improves POI recommendation performance.
References
 Aliannejadi and Crestani (2018) Mohammad Aliannejadi and Fabio Crestani. 2018. Personalized Context-Aware Point of Interest Recommendation. ACM Trans. Inf. Syst. 36, 4 (2018), 45:1–45:28.
 Aliannejadi et al. (2018) Mohammad Aliannejadi, Dimitrios Rafailidis, and Fabio Crestani. 2018. A Collaborative Ranking Model with Multiple Location-based Similarities for Venue Suggestion. In ICTIR. ACM, 19–26.
 Chang et al. (2018) Buru Chang, Yonggyu Park, Donghyeon Park, Seongsoon Kim, and Jaewoo Kang. 2018. Content-Aware Hierarchical Point-of-Interest Embedding Model for Successive POI Recommendation. In IJCAI. 3301–3307.
 Cheng et al. (2012) Chen Cheng, Haiqin Yang, Irwin King, and Michael R Lyu. 2012. Fused Matrix Factorization with Geographical and Social Influence in Location-Based Social Networks. In AAAI, Vol. 12. 17–23.
 Erhan et al. (2010) Dumitru Erhan, Yoshua Bengio, Aaron Courville, Pierre-Antoine Manzagol, Pascal Vincent, and Samy Bengio. 2010. Why does unsupervised pre-training help deep learning? Journal of Machine Learning Research 11, Feb (2010), 625–660.
 Feng et al. (2017) Shanshan Feng, Gao Cong, Bo An, and Yeow Meng Chee. 2017. POI2Vec: Geographical Latent Representation for Predicting Future Visitors. In AAAI. 102–108.
 He et al. (2017) Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, and Tat-Seng Chua. 2017. Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 173–182.
 Li et al. (2015) Xutao Li, Gao Cong, Xiao-Li Li, Tuan-Anh Nguyen Pham, and Shonali Krishnaswamy. 2015. Rankgeofm: A ranking based geographical factorization method for point of interest recommendation. In SIGIR. 433–442.
 Liu et al. (2016) Xin Liu, Yong Liu, and Xiaoli Li. 2016. Exploring the Context of Locations for Personalized Location Recommendations. In IJCAI. 1188–1194.
 Liu et al. (2017) Yiding Liu, Tuan-Anh Nguyen Pham, Gao Cong, and Quan Yuan. 2017. An experimental evaluation of point-of-interest recommendation in location-based social networks. VLDB 10 (2017), 1010–1021.
 Mikolov et al. (2013) Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781 (2013).
 Rendle et al. (2009) Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, and Lars Schmidt-Thieme. 2009. BPR: Bayesian personalized ranking from implicit feedback. In UAI. 452–461.
 Ye et al. (2011) Mao Ye, Peifeng Yin, Wang-Chien Lee, and Dik-Lun Lee. 2011. Exploiting geographical influence for collaborative point-of-interest recommendation. In SIGIR. 325–334.
 Yuan et al. (2013) Quan Yuan, Gao Cong, Zongyang Ma, Aixin Sun, and Nadia Magnenat Thalmann. 2013. Time-aware point-of-interest recommendation. In Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval. ACM, 363–372.
 Zhang and Chow (2015) Jia-Dong Zhang and Chi-Yin Chow. 2015. GeoSoCa: Exploiting Geographical, Social and Categorical Correlations for Point-of-Interest Recommendations. In SIGIR. 443–452.
 Zhang et al. (2018) Shuai Zhang, Lina Yao, Yi Tay, Xiwei Xu, Xiang Zhang, and Liming Zhu. 2018. Metric Factorization: Recommendation beyond Matrix Factorization. arXiv preprint arXiv:1802.04606 (2018).
 Zhao et al. (2017a) Pengpeng Zhao, Xiefeng Xu, Yanchi Liu, Ziting Zhou, Kai Zheng, Victor S Sheng, and Hui Xiong. 2017a. Exploiting hierarchical structures for POI recommendation. In 2017 IEEE International Conference on Data Mining (ICDM). IEEE, 655–664.
 Zhao et al. (2017b) Shenglin Zhao, Tong Zhao, Irwin King, and Michael R Lyu. 2017b. Geo-Teaser: Geo-temporal sequential embedding rank for point-of-interest recommendation. In WWW. 153–162.