
Where to Go Next: A Spatio-temporal LSTM model for Next POI Recommendation

Next Point-of-Interest (POI) recommendation is of great value for both location-based service providers and users. Recently, Recurrent Neural Networks (RNNs) have proven effective on sequential recommendation tasks. However, existing RNN solutions rarely consider the spatio-temporal intervals between neighbor check-ins, which are essential for modeling user check-in behaviors in next POI recommendation. In this paper, we propose a new variant of LSTM, named ST-LSTM, which adds time gates and distance gates to LSTM to capture the spatio-temporal relation between successive check-ins. Specifically, one time gate and one distance gate are designed to control the short-term interest update, and another time gate and distance gate are designed to control the long-term interest update. Furthermore, to reduce the number of parameters and improve efficiency, we integrate coupled input and forget gates into the proposed model. Finally, we evaluate the proposed model using four real-world datasets from various location-based social networks. Our experimental results show that our model significantly outperforms the state-of-the-art approaches for next POI recommendation.



1 Introduction

Recent years have witnessed the rapid growth of location-based social network services, such as Foursquare, Facebook Places, and Yelp. These services have attracted many users to share their locations and experiences, and massive amounts of geo-tagged data have accumulated, e.g., 55 million users had generated more than 10 billion check-ins on Foursquare as of December 2017. These online footprints (or check-ins) provide an excellent opportunity to understand users’ mobile behaviors. For example, we can analyze and predict where a user will go next based on historical footprints. Moreover, such analysis can help POI holders predict customer arrivals in the next time period.

Figure 1: Sequential inputs in (a) language modeling, (b) next-basket RS, and (c) next POI RS. In (a), each input is a word of the sequence. In (b), each input is an item together with the time interval between two neighbor items. In (c), each input further carries the distance interval between two successive check-ins.

In the literature, approaches like latent factor models and Markov chains have been widely applied for sequential data analysis and recommendation. [Rendle et al.2010] proposed Factorizing Personalized Markov Chain (FPMC), which bridges matrix factorization and Markov chains together, for next-basket recommendation. [Cheng et al.2013] extended FPMC to embed personalized Markov chain and user movement constraint for next POI recommendation. [He et al.2016] proposed a unified tensor-based latent model to capture the successive check-in behavior by exploring the latent pattern-level preference for each user. Recently, Recurrent Neural Networks (RNNs) have been successfully employed for modeling sequential data and have become state-of-the-art methods. [Hidasi et al.2015] focused on RNN solutions for the session-based recommendation task, where no user id exists, and recommendations are made only on short session data. [Zhu et al.2017] proposed a variant of the Long Short-Term Memory network (LSTM), called Time-LSTM, which equips LSTM with time gates to model time intervals for next item recommendation.

However, none of the above recommendation methods considers both time intervals and geographical distances between neighbor items, which makes next POI recommendation different from other sequential tasks such as language modeling and next-basket recommender system (RS). As shown in Figure 1, there is no spatio-temporal interval between neighbor words in language modeling, and there is no distance interval between neighbor items in next-basket RS, while there are time and distance intervals between neighbor check-ins in next POI recommendation. Traditional RNN and its variants, e.g., LSTM and GRU, do well in modeling the order information of sequential data with constant intervals, but cannot model dynamic time and distance intervals as shown in Figure 1(c). A recent work ST-RNN [Liu et al.2016a] tried to extend RNN to model the temporal and spatial context for next location prediction. In order to model temporal context, ST-RNN models multi-check-ins in a time window in each RNN cell. Meanwhile, ST-RNN employs time-specific and distance-specific transition matrices to characterize dynamic time intervals and geographical distances, respectively. Thus, ST-RNN can obtain improvement in the spatio-temporal sequential recommendation. However, there exist some challenges preventing ST-RNN from becoming the best solution for next POI recommendation.

First of all, ST-RNN may fail to model spatial and temporal relations of neighbor check-ins properly. ST-RNN adopts time-specific and distance-specific transition matrices between cell hidden states within RNN. Due to data sparsity, ST-RNN cannot learn a transition matrix for every possible continuous time interval and geographical distance, and instead partitions them into discrete bins. Secondly, ST-RNN is designed for short-term interests and is not well suited to long-term interests. [Jannach et al.2015] reported that users’ short-term and long-term interests are both significant for achieving the best performance. The short-term interest here means that recommended POIs should depend on recently visited POIs, and the long-term interest means that recommended POIs should depend on all historical visited POIs. Thirdly, it is hard to select the proper width of the time window for different applications in ST-RNN, since it models not one element in each layer but multiple elements in a fixed time period.

To this end, in this paper, we propose a new recurrent neural network model, named ST-LSTM, to model users’ sequential visiting behaviors. Time intervals and distance intervals of neighbor check-ins are modeled by time gates and distance gates, respectively. Note that there are two time gates and two distance gates in the ST-LSTM model. One pair of time gate and distance gate is designed to exploit time and distance intervals to capture the short-term interest, and the other pair is designed to memorize time and distance intervals to model the long-term interest. Furthermore, inspired by [Greff et al.2017], we use coupled input and forget gates to reduce the number of parameters, making our model more efficient. Experimental results on four real-world datasets show that ST-LSTM significantly improves next POI recommendation performance.

To summarize, our contributions are listed as follows.

  • To the best of our knowledge, this is the first work that models spatio-temporal intervals between check-ins under LSTM architecture to learn user’s visiting behavior for the next POI recommendation.

  • An ST-LSTM model is proposed that incorporates carefully designed time gates and distance gates to capture the spatio-temporal interval information between check-ins. As a result, ST-LSTM models the user’s short-term and long-term interests simultaneously.

  • Experiments on four large-scale real-world datasets are conducted to evaluate the performance of our proposed model. Our experimental results show that our method outperforms state-of-the-art methods.

2 Related Work

In this section, we discuss related work from two aspects, which are POI recommendation and leveraging neural networks for recommendation.

2.1 POI Recommendation

Different from traditional recommendations (e.g., movie recommendation, music recommendation), POI recommendation is characterized by geographic information and no explicit rating information [Ye et al.2011, Lian et al.2014]. Moreover, additional information, such as social influence, temporal information, review information, and transition between POIs, has been leveraged for POI recommendation. [Ye et al.2011] integrated the social influence with a user-based Collaborative Filtering (CF) model and modeled the geographical influence by a Bayesian model. [Yuan et al.2013] utilized the temporal preference to enhance the efficiency and effectiveness of the solution. [Kurashima et al.2013] proposed a topic model, in which a POI is sampled based on its topics and the distance to historical visited POIs of a target user. [Liu et al.2016b] exploited users’ interests and their evolving sequential preferences with temporal interval assessment to recommend POI in a specified time period.

Next POI recommendation, as a natural extension of general POI recommendation, was recently proposed and has attracted great research interest. Research has shown that the sequential influence between successive check-ins plays a crucial role in next POI recommendation since human movement exhibits sequential patterns. A tensor-based model, named FPMC-LR, was proposed by integrating the first-order Markov chain of POI transitions and distance constraints for next POI recommendation [Cheng et al.2013]. [He et al.2016] further proposed a tensor-based latent model considering the influence of user’s latent behavior patterns, which are determined by the contextual temporal and categorical information. [Feng et al.2015] proposed a personalized ranking metric embedding method (PRME) to model personalized check-in sequences for next POI recommendation. [Xie et al.2016] proposed a graph-based embedding learning approach, named GE, which utilizes bipartite graphs to model context factors in a unified optimization framework.

2.2 Neural Networks for Recommendation

Neural networks are not only naturally used for feature learning to model various features of users or items, but also explored as a core recommendation model to simulate nonlinear, complex interactions between users and items [Wang and Wang2014, Zhang et al.2016]. [Zheng et al.2016] further improved it with an autoregressive method. [Yang et al.2017a] proposed a deep neural architecture named PACE for POI recommendation, which utilizes the smoothness of semi-supervised learning to alleviate the sparsity of collaborative filtering. [Yang et al.2017b] jointly modeled a social network structure and users’ trajectory behaviors with a neural network model named JNTM. [Zhang et al.2017] tried to learn user’s next movement intention and incorporated different contextual factors to improve next POI recommendation. [Zhu et al.2017] proposed a Time-LSTM model and two variants, which equip LSTM with time gates to model time intervals for next item recommendation.

A recent work closely related to ours proposed a model named ST-RNN, which considers spatial and temporal contexts to model user behavior for next location prediction [Liu et al.2016a]. However, our proposed ST-LSTM model differs significantly from ST-RNN in two aspects. First, ST-LSTM equips the LSTM model with time and distance gates, while ST-RNN adds spatio-temporal transition matrices to the RNN model. Second, ST-LSTM models time and distance intervals between neighbor check-ins to extract long-term and short-term interests, whereas ST-RNN recommends the next POI depending only on POIs in the nearest time window, which makes it hard to distinguish short-term and long-term interests.

3 Preliminaries

In this section, we first give the formal problem definition of next POI recommendation, and then briefly introduce LSTM.

3.1 Problem Formulation

Let $U = \{u_1, u_2, \dots, u_M\}$ be the set of $M$ users and $V = \{v_1, v_2, \dots, v_N\}$ be the set of $N$ POIs. For user $u$, she has a sequence of historical POI visits up to time $t_{i-1}$ represented as $H^u_i = \{v^u_{t_1}, v^u_{t_2}, \dots, v^u_{t_{i-1}}\}$, where $v^u_{t_i}$ means user $u$ visits POI $v$ at time $t_i$. The goal of next POI recommendation is to recommend a list of unvisited POIs for a user to visit next at time point $t_i$. Specifically, a higher prediction score of a user $u$ to an unvisited POI $v_j$ indicates a higher probability that the user $u$ would like to visit the POI $v_j$ at time $t_i$. According to prediction scores, we can recommend top-$K$ POIs to user $u$.
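The top-K selection described above can be sketched as follows. This is a minimal illustration, assuming prediction scores are already available as a vector indexed by POI id; the function name `recommend_top_k` and the toy scores are hypothetical and not part of the original method.

```python
import numpy as np

def recommend_top_k(scores, visited, k):
    """Return the ids of the k highest-scoring POIs the user has not visited."""
    scores = np.asarray(scores, dtype=float).copy()
    for poi in visited:
        scores[poi] = -np.inf  # exclude already visited POIs from the ranking
    order = np.argsort(-scores, kind="stable")  # descending score order
    n_candidates = len(scores) - len(set(visited))
    return order[:min(k, n_candidates)].tolist()

# Toy example: five POIs, the user has already visited POI 1.
scores = [0.10, 0.40, 0.05, 0.30, 0.15]
top2 = recommend_top_k(scores, visited={1}, k=2)
```

Masking visited POIs before sorting keeps the ranking logic independent of how the scores were produced.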

3.2 LSTM

LSTM [Hochreiter and Schmidhuber1997], a variant of RNN, is capable of learning short and long-term dependencies. LSTM has become an effective and scalable model for sequential prediction problems, and many improvements have been made to the original LSTM architecture. We use the basic LSTM model in our approach because it is concise and general, and it is easy to extend to other variants of LSTM. The basic update equations of LSTM are as follows:

$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i) \tag{1}$$
$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f) \tag{2}$$
$$\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c) \tag{3}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{4}$$
$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o) \tag{5}$$
$$h_t = o_t \odot \tanh(c_t) \tag{6}$$

where $i_t$, $f_t$, $o_t$ represent the input, forget and output gates of the $t$-th object, deciding what information to store, forget and output, respectively. $c_t$ is the cell activation vector representing the cell state, which is the key to LSTM. $x_t$ and $h_t$ represent the input feature vector and the hidden output vector, respectively. $\sigma$ represents a sigmoid layer that maps values to between 0 and 1, where 1 represents “completely keep this” while 0 represents “completely get rid of this”. $W_i$, $W_f$, $W_o$ and $W_c$ are the weights of the gates. $b_i$, $b_f$, $b_o$ and $b_c$ are the corresponding biases. $\odot$ denotes the element-wise (Hadamard) product. The update of the cell state $c_t$ has two parts. The former part is the previous cell state $c_{t-1}$, which is controlled by the forget gate $f_t$, and the latter part is the new candidate value $\tilde{c}_t$, scaled by the input gate $i_t$, which decides how much of the candidate to add to the state.
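For readers who prefer code, one step of the basic LSTM update can be sketched in NumPy as below. This is an illustrative sketch of the standard LSTM cell, not the authors' implementation; the parameter dictionary layout, the helper `init_params`, and the random initialization are assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(input_dim, hidden_dim, seed=0):
    """Hypothetical initializer: one weight matrix and bias per gate."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in "ifco":
        params[f"W_{gate}"] = rng.normal(0.0, 0.1, size=(hidden_dim, hidden_dim + input_dim))
        params[f"b_{gate}"] = np.zeros(hidden_dim)
    return params

def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a basic LSTM cell on the concatenated vector [h_{t-1}, x_t]."""
    z = np.concatenate([h_prev, x_t])
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])      # input gate
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])      # forget gate
    c_tilde = np.tanh(params["W_c"] @ z + params["b_c"])  # candidate value
    c_t = f_t * c_prev + i_t * c_tilde                    # cell state update
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])      # output gate
    h_t = o_t * np.tanh(c_t)                              # hidden output
    return h_t, c_t

# Run the cell over a toy sequence of three one-hot inputs.
params = init_params(input_dim=3, hidden_dim=4)
h, c = np.zeros(4), np.zeros(4)
for x in np.eye(3):
    h, c = lstm_step(x, h, c, params)
```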

4 Our Approach

In this section, we first propose a spatio-temporal LSTM model, ST-LSTM, which utilizes time and distance intervals to model user’s short-term interest and long-term interest simultaneously. Then, we improve ST-LSTM with coupled input and forget gates for efficiency.

4.1 Spatio-temporal LSTM

When using LSTM for next POI recommendation, $x_t$ represents the user’s last visited POI, which can be exploited to learn the user’s short-term interest, while $c_{t-1}$ contains the information of the user’s historical visited POIs, which reflects the user’s long-term interest. However, how much the short-term interest determines where to go next heavily depends on the time interval and the geographical distance between the last POI and the next POI. Intuitively, a POI visited a long time ago and a long distance away has little influence on the next POI, and vice versa. In our proposed ST-LSTM model, we use a time gate and a distance gate to control the influence of the last visited POI on next POI recommendation. Furthermore, the time gate and the distance gate can also help to store time and distance intervals in the cell state $c_t$, which memorizes the user’s long-term interest. In this way, we utilize time and distance intervals to model the user’s short-term interest and long-term interest simultaneously.

Figure 2: ST-LSTM has two time gates and two distance gates, i.e., $T1_t$, $T2_t$, $D1_t$ and $D2_t$. $T1_t$ and $D1_t$ are designed to model time and distance intervals for short-term interest, while $T2_t$ and $D2_t$ are designed to model time and distance intervals for long-term interest.

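As shown in the two dotted red rectangles in Figure 2, we add two time gates and two distance gates to LSTM, denoted as $T1_t$, $T2_t$, $D1_t$ and $D2_t$ respectively. $T1_t$ and $D1_t$ are used to control the influence of the latest visited POI on the next POI, and $T2_t$ and $D2_t$ are used to capture time and distance intervals to model the user’s long-term interest. Based on LSTM, we add equations for time gates and distance gates as follows:

$$T1_t = \sigma(W_{xt_1} x_t + \sigma(W_{t_1} \Delta t_t) + b_{t_1}), \quad \text{s.t. } W_{t_1} \le 0 \tag{7}$$
$$T2_t = \sigma(W_{xt_2} x_t + \sigma(W_{t_2} \Delta t_t) + b_{t_2}) \tag{8}$$
$$D1_t = \sigma(W_{xd_1} x_t + \sigma(W_{d_1} \Delta d_t) + b_{d_1}), \quad \text{s.t. } W_{d_1} \le 0 \tag{9}$$
$$D2_t = \sigma(W_{xd_2} x_t + \sigma(W_{d_2} \Delta d_t) + b_{d_2}) \tag{10}$$

We then modify Eq. (4)-(6) to:

$$\hat{c}_t = f_t \odot c_{t-1} + i_t \odot T1_t \odot D1_t \odot \tilde{c}_t \tag{11}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot T2_t \odot D2_t \odot \tilde{c}_t \tag{12}$$
$$o_t = \sigma(W_o [h_{t-1}, x_t] + W_{to} \Delta t_t + W_{do} \Delta d_t + b_o) \tag{13}$$
$$h_t = o_t \odot \tanh(\hat{c}_t) \tag{14}$$

where $\Delta t_t$ is the time interval and $\Delta d_t$ is the distance interval. Besides the input gate $i_t$, $T1_t$ can be regarded as an input information filter considering the time interval, and $D1_t$ can be regarded as another input information filter considering the distance interval. We add a new cell state $\hat{c}_t$ to store the result, which is then transferred to the hidden state $h_t$ and finally influences the next recommendation. Along this line, $x_t$ is filtered by the time gate $T1_t$ and the distance gate $D1_t$ as well as the input gate $i_t$ on current recommendations.

The cell state $c_t$ is used to memorize the user’s general interest, i.e., long-term interest. We designed a time gate and a distance gate to control the update of the cell state $c_t$. $T2_t$ first memorizes $\Delta t_t$ and then transfers it to $c_t$, and further to $c_{t+1}, c_{t+2}, \dots$. So $T2_t$ helps store $\Delta t_t$ to model the user’s long-term interest. In a similar way, $D2_t$ memorizes $\Delta d_t$ and transfers it to the cell state $c_t$ to help model the user’s long-term interest. In this way, $c_t$ captures the user’s long-term interest by memorizing not only the order of the user’s historical visited POIs, but also the time and distance interval information between neighbor POIs. Modeling distance intervals can help capture the user’s general spatial interest, while modeling time intervals helps capture the user’s periodical visiting behavior.

Normally, a more recently visited POI at a shorter distance should have a larger influence on choosing the next POI. To incorporate this knowledge in the designed gates, we add the constraints $W_{t_1} \le 0$ and $W_{d_1} \le 0$ in Eq. (7) and Eq. (9). Accordingly, if $\Delta t_t$ is smaller, $T1_t$ would be larger according to Eq. (7). In a similar way, if $\Delta d_t$ is shorter, $D1_t$ would be larger according to Eq. (9). For example, if the time and distance intervals between $x_t$ and the next POI are smaller, then $x_t$ better indicates the short-term interest, thus its influence should be increased. If $\Delta t_t$ or $\Delta d_t$ is larger, $x_t$ would have a smaller influence on the new cell state $\hat{c}_t$. In this case, the short-term interest is uncertain, so we should depend more on the long-term interest. This is why we set two time gates and two distance gates to distinguish the short-term and long-term interest updates.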
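The gating scheme of this section can be sketched in NumPy as follows. This is a hedged illustration rather than the authors' code: the gate form (a sigmoid over an input term plus a squashed interval term), the output gate taking the raw intervals as extra inputs, the per-dimension interval weight vectors `w_t1`, `w_d1`, etc., and the initializer `init_st_lstm` are all assumptions chosen to match the description of Eq. (7)-(14). The non-positivity constraint on the short-term interval weights is imposed at initialization here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_st_lstm(input_dim, hidden_dim, seed=0):
    """Hypothetical parameter layout for one ST-LSTM cell."""
    rng = np.random.default_rng(seed)
    H, D = hidden_dim, input_dim
    p = {}
    for g in ("i", "f", "c", "o"):
        p["W_" + g] = rng.normal(0.0, 0.1, (H, H + D))
        p["b_" + g] = np.zeros(H)
    for g in ("t1", "t2", "d1", "d2"):
        p["W_x" + g] = rng.normal(0.0, 0.1, (H, D))  # input weights of the gate
        p["w_" + g] = rng.normal(0.0, 0.1, H)        # interval weights of the gate
        p["b_" + g] = np.zeros(H)
    # Constraint on the short-term gates: interval weights must be <= 0.
    p["w_t1"] = -np.abs(p["w_t1"])
    p["w_d1"] = -np.abs(p["w_d1"])
    p["w_to"] = rng.normal(0.0, 0.1, H)
    p["w_do"] = rng.normal(0.0, 0.1, H)
    return p

def interval_gate(x_t, delta, p, name):
    """Time or distance gate driven by the input and a scalar interval."""
    return sigmoid(p["W_x" + name] @ x_t + sigmoid(delta * p["w_" + name]) + p["b_" + name])

def st_lstm_step(x_t, dt, dd, h_prev, c_prev, p):
    z = np.concatenate([h_prev, x_t])
    i_t = sigmoid(p["W_i"] @ z + p["b_i"])
    f_t = sigmoid(p["W_f"] @ z + p["b_f"])
    c_tilde = np.tanh(p["W_c"] @ z + p["b_c"])
    T1, T2 = interval_gate(x_t, dt, p, "t1"), interval_gate(x_t, dt, p, "t2")
    D1, D2 = interval_gate(x_t, dd, p, "d1"), interval_gate(x_t, dd, p, "d2")
    c_hat = f_t * c_prev + i_t * T1 * D1 * c_tilde  # short-term cell state
    c_t = f_t * c_prev + i_t * T2 * D2 * c_tilde    # long-term cell state
    o_t = sigmoid(p["W_o"] @ z + dt * p["w_to"] + dd * p["w_do"] + p["b_o"])
    h_t = o_t * np.tanh(c_hat)
    return h_t, c_t

p = init_st_lstm(input_dim=3, hidden_dim=4)
x = np.array([1.0, 0.0, 0.0])
h, c = st_lstm_step(x, dt=0.5, dd=2.0, h_prev=np.zeros(4), c_prev=np.zeros(4), p=p)
near = interval_gate(x, 0.1, p, "t1")   # short time interval
far = interval_gate(x, 10.0, p, "t1")   # long time interval
```

With non-positive interval weights, `near` is at least as large as `far` in every dimension, which is the monotonic behaviour the constraints are meant to enforce.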

4.2 Variation of coupled input and forget gates

Figure 3: A variant of ST-LSTM using coupled input and forget gates.

Inspired by [Greff et al.2017], we propose another version of ST-LSTM, named ST-CLSTM, to reduce the number of parameters and improve efficiency. ST-CLSTM uses coupled input and forget gates instead of separately deciding what to forget and what new information to add, as shown in Figure 3. Specifically, we remove the forget gate, and modify Eq. (11) and Eq. (12) to:

$$\hat{c}_t = (1 - i_t \odot T1_t \odot D1_t) \odot c_{t-1} + i_t \odot T1_t \odot D1_t \odot \tilde{c}_t \tag{15}$$
$$c_t = (1 - i_t) \odot c_{t-1} + i_t \odot T2_t \odot D2_t \odot \tilde{c}_t \tag{16}$$

Since the time gate $T1_t$ and the distance gate $D1_t$ are regarded as input filters, we replace the forget gate with $(1 - i_t \odot T1_t \odot D1_t)$ in Eq. (15). $T2_t$ and $D2_t$ are used to store time intervals and distance intervals respectively, thus we use $(1 - i_t)$ in Eq. (16).
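The coupled update can be illustrated with a few lines of NumPy. This is a sketch under the assumption that the forget gate is replaced by one minus the filtered input gate for the short-term state and by one minus the input gate for the long-term state; the function name `coupled_cell_update` and the toy values are hypothetical.

```python
import numpy as np

def coupled_cell_update(c_prev, c_tilde, i_t, T1, D1, T2, D2):
    """Cell-state updates with coupled input and forget gates (no separate f_t)."""
    short_in = i_t * T1 * D1  # input gate filtered by the short-term gates
    c_hat = (1.0 - short_in) * c_prev + short_in * c_tilde
    c_t = (1.0 - i_t) * c_prev + i_t * T2 * D2 * c_tilde
    return c_hat, c_t

c_prev = np.array([1.0, -1.0])
c_tilde = np.array([0.5, 0.5])
i_t = np.array([0.5, 0.5])
ones = np.ones(2)

# With all time and distance gates fully open, both states reduce to the
# plain coupled LSTM update (1 - i) * c_prev + i * c_tilde.
c_hat, c_t = coupled_cell_update(c_prev, c_tilde, i_t, ones, ones, ones, ones)
```

Because the forget gate is derived from the input gate, the cell no longer needs the forget gate's weight matrix and bias, which is where the parameter saving comes from.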

4.3 Training

The way we adapt our model to next POI recommendation is as follows. First, we transform $H^u$ into the sequence $[(v^u_{t_1}, t_2 - t_1, d(l_1, l_2)), (v^u_{t_2}, t_3 - t_2, d(l_2, l_3)), \dots]$, where $l_i$ denotes the geographical location of the POI visited at time $t_i$. Then $x_t$ in ST-LSTM is equivalent to $v^u_{t_i}$, $\Delta t_t$ is equivalent to $t_{i+1} - t_i$, and $\Delta d_t$ is equivalent to $d(l_i, l_{i+1})$, where $d(\cdot, \cdot)$ is the function computing the distance between two geographical points. Moreover, we make use of all users’ behavioral histories for learning and recommendation. We leverage the mini-batch learning method, and train the model on users’ existing histories until convergence. The model output is a probability distribution on all POIs calculated by $h_t$ and $v^u_{t_i}$. We then take a gradient step to optimize the loss based on the output and the one-hot representation of $v^u_{t_{i+1}}$.
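The transformation of a check-in history into model inputs can be sketched as below. The source only states that a distance function between two geographical points is used; the haversine great-circle distance, the units (hours and kilometres), and the tuple layout are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (an assumed choice of distance function)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def to_interval_sequence(checkins):
    """Turn time-sorted (poi_id, timestamp_s, lat, lon) tuples into training samples.

    Each sample is (poi_id, dt_hours, dd_km, next_poi_id): the current POI, the
    time and distance intervals to the next check-in, and the prediction target.
    """
    samples = []
    for (poi, t, lat, lon), (nxt, t2, lat2, lon2) in zip(checkins, checkins[1:]):
        dt_hours = (t2 - t) / 3600.0
        dd_km = haversine_km(lat, lon, lat2, lon2)
        samples.append((poi, dt_hours, dd_km, nxt))
    return samples

history = [(7, 0, 0.0, 0.0), (9, 7200, 0.0, 1.0), (7, 10800, 0.0, 1.0)]
seq = to_interval_sequence(history)
```

A history of n check-ins yields n - 1 samples, since the last check-in only serves as a target.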

We use Adam, a variant of Stochastic Gradient Descent (SGD), to optimize the parameters in ST-LSTM. It adapts the learning rate for each parameter by performing smaller updates for frequent parameters and larger updates for infrequent parameters. We use the projection operator described in [Rakhlin et al.2012] to meet the constraints $W_{t_1} \le 0$ in Eq. (7) and $W_{d_1} \le 0$ in Eq. (9). If we have $W_{t_1} > 0$ during the training process, we set $W_{t_1} = 0$. The parameter $W_{d_1}$ is handled in the same way.
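The projection step amounts to clipping the constrained weights back to the feasible set after each update. The sketch below uses a plain gradient step instead of Adam to keep the example short; that simplification, and the function names, are assumptions of the example rather than the training procedure of the paper.

```python
import numpy as np

def project_nonpositive(w):
    """Project weights onto the feasible set {w <= 0}: positive entries become 0."""
    return np.minimum(w, 0.0)

def projected_gradient_step(w, grad, lr):
    """One plain gradient step followed by the projection (Adam omitted for brevity)."""
    return project_nonpositive(w - lr * grad)

w_t1 = np.array([-0.20, 0.05])
grad = np.array([-1.0, -1.0])
new_w = projected_gradient_step(w_t1, grad, lr=0.1)  # unconstrained step gives [-0.1, 0.15]
```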

The computational complexity of learning LSTM models per weight and time step with the stochastic gradient descent (SGD) optimization technique is $O(1)$. Hence, the LSTM algorithm is very efficient, with an update complexity of $O(W)$ per time step, where $W$ is the number of weights, which is determined by the number of memory cells, the number of input units, and the number of output units. Similarly, the computational complexity of ST-LSTM is also $O(W)$, where $W$ additionally counts the weights of the time and distance gates. The training time of our proposed model for rounds of training on four datasets after data cleaning is about minutes on GPU M6000.

5 Experiments

In this section, we conduct experiments to evaluate the performance of our proposed model ST-LSTM on four real-world datasets. We first briefly depict the datasets, followed by baseline methods. Finally, we present our experimental results and discussions.

5.1 Dataset

We use four public LBSN datasets that contain user-POI interactions and the locations of POIs. The statistics of the four datasets are listed in Table 1. CA is a Foursquare dataset from users whose homes are in California, collected from January 2010 to February 2011 and used in [Gao et al.2012]. SIN is a Singapore dataset crawled from Foursquare and used by [Yuan et al.2013]. Gowalla and Brightkite are two widely used LBSN datasets, which have been used in many related research papers. We eliminate users with fewer than 10 check-ins and POIs visited by fewer than 10 users in the four datasets. Then, we sort each user’s check-in records in timestamp order, taking the first 70% as the training set and the remaining 30% as the test set.
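The preprocessing described above can be sketched as follows. The source does not say whether the two filters are applied once or repeated until no record is removed; the sketch applies them in a single pass, and the record layout `(user, poi, timestamp)` and function names are assumptions of the example.

```python
from collections import Counter, defaultdict

def filter_checkins(checkins, min_user_checkins=10, min_poi_users=10):
    """Drop users with too few check-ins and POIs visited by too few users (single pass)."""
    user_counts = Counter(u for u, _, _ in checkins)
    poi_users = defaultdict(set)
    for u, p, _ in checkins:
        poi_users[p].add(u)
    return [(u, p, t) for u, p, t in checkins
            if user_counts[u] >= min_user_checkins and len(poi_users[p]) >= min_poi_users]

def chronological_split(checkins, train_ratio=0.7):
    """Per user: sort by timestamp, first 70% for training, the rest for testing."""
    by_user = defaultdict(list)
    for u, p, t in checkins:
        by_user[u].append((t, p))
    train, test = {}, {}
    for u, visits in by_user.items():
        visits.sort()
        cut = int(round(len(visits) * train_ratio))
        train[u], test[u] = visits[:cut], visits[cut:]
    return train, test

# Toy data: ten users with ten check-ins each at POI "a", plus two sparse records.
raw = [(f"u{i}", "a", t) for i in range(10) for t in range(10)]
raw += [("u0", "b", 100), ("cold", "a", 5)]
kept = filter_checkins(raw)
train, test = chronological_split(kept)
```

Splitting per user in time order keeps every test check-in later than all of that user's training check-ins.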

Dataset      #Users   #POIs     #Check-ins   Density
CA           49,005   206,097   425,691      0.004%
SIN          30,887   18,995    860,888      0.014%
Gowalla      18,737   32,510    1,278,274    0.209%
Brightkite   51,406   772,967   4,747,288    0.012%
Table 1: Statistics of the four datasets

5.2 Baseline Methods

We compare our proposed model ST-LSTM with seven representative methods for next POI recommendation.

  • FPMC-LR [Cheng et al.2013]: It combines the personalized Markov chains with the user movement constraints around a localized region. It factorizes the transition tensor matrices of all users and predicts next location by computing the transition probability.

  • PRME-G [Feng et al.2015]: It utilizes the Metric Embedding method to avoid drawbacks of the MF. Specifically, it embeds users and POIs into the same latent space to capture the user transition patterns.

  • GE [Xie et al.2016]: It embeds four relational graphs (POI-POI, POI-Region, POI-Time, POI-Word) into a shared low dimensional space. The recommendation score is then calculated by a linear combination of inner products for these contextual factors.

  • RNN [Zhang et al.2014]: This method leverages the temporal dependency in user’s behavior sequence through a standard recurrent structure.

  • LSTM [Hochreiter and Schmidhuber1997]: This is a variant of the RNN model, which contains a memory cell and three multiplicative gates to allow long-term dependency learning.

  • GRU [Cho et al.2014]: This is a variant of RNN model, which is equipped with two gates to control the information flow.

  • ST-RNN [Liu et al.2016a]: Based on the standard RNN model, ST-RNN replaces the single transition matrix in RNN with time-specific transition matrices and distance-specific transition matrices to model spatial and temporal contexts.

5.3 Evaluation Metrics

To evaluate the performance of our proposed model ST-LSTM and compare with the seven baselines described above, we use two standard metrics Acc@K and Mean Average Precision (MAP). These two metrics are popularly used for evaluating recommendation results, such as [Liu et al.2016a, He et al.2016, Xie et al.2016]. Note that for an instance in testing set, Acc@K is 1 if the visited POI appears in the set of top-K recommendation POIs, and 0 otherwise. The overall Acc@K is calculated as the average value of all testing instances. In this paper, we choose K = {1, 5, 10, 15, 20} to illustrate different results of Acc@K.
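The two metrics can be sketched as below. The sketch assumes that each test instance has exactly one ground-truth next POI, in which case average precision reduces to the reciprocal of the rank of that POI; the source does not spell out its MAP computation, so this is an assumption, as are the function names and toy scores.

```python
import numpy as np

def rank_of_target(scores, target):
    """1-based rank of the target POI when POIs are sorted by descending score."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    return int(np.where(order == target)[0][0]) + 1

def acc_at_k(score_rows, targets, k):
    """Fraction of test instances whose visited POI is among the top-k recommendations."""
    hits = [rank_of_target(s, t) <= k for s, t in zip(score_rows, targets)]
    return float(np.mean(hits))

def mean_average_precision(score_rows, targets):
    """MAP with a single relevant POI per instance: the mean of 1 / rank."""
    return float(np.mean([1.0 / rank_of_target(s, t) for s, t in zip(score_rows, targets)]))

score_rows = [
    [0.1, 0.7, 0.2],  # target 1 is ranked 1st
    [0.5, 0.3, 0.2],  # target 2 is ranked 3rd
    [0.2, 0.5, 0.3],  # target 2 is ranked 2nd
]
targets = [1, 2, 2]
acc1 = acc_at_k(score_rows, targets, k=1)
acc2 = acc_at_k(score_rows, targets, k=2)
map_score = mean_average_precision(score_rows, targets)
```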

5.4 Results and Discussions

            CA                                   SIN
Method      Acc@1    Acc@5    Acc@10   MAP       Acc@1    Acc@5    Acc@10   MAP
FPMC-LR     0.0378   0.0493   0.0784   0.1791    0.0395   0.0625   0.0826   0.1724
PRME-G      0.0422   0.065    0.0813   0.1868    0.0466   0.0723   0.0876   0.1715
GE          0.0294   0.0329   0.0714   0.1691    0.0062   0.0321   0.0607   0.1102
RNN         0.0475   0.0901   0.1138   0.1901    0.1321   0.1867   0.2043   0.2186
LSTM        0.0486   0.0937   0.1276   0.1975    0.1261   0.1881   0.2019   0.2123
GRU         0.0483   0.0915   0.1216   0.1934    0.1237   0.1921   0.1992   0.2101
ST-RNN      0.0505   0.0922   0.1232   0.2075    0.1379   0.1957   0.2091   0.2239
ST-LSTM     0.0716   0.1232   0.1508   0.2208    0.1978   0.2436   0.2651   0.3194
ST-CLSTM    0.0801   0.1308   0.1612   0.2556    0.2037   0.2542   0.2861   0.3433

            Gowalla                              Brightkite
Method      Acc@1    Acc@5    Acc@10   MAP       Acc@1    Acc@5    Acc@10   MAP
FPMC-LR     0.0293   0.0524   0.0849   0.1745    0.1634   0.2475   0.3164   0.33
PRME-G      0.0334   0.0652   0.0869   0.1916    0.1976   0.2993   0.3495   0.3115
GE          0.0174   0.06     0.0947   0.1973    0.0521   0.1376   0.2118   0.2602
RNN         0.0473   0.0892   0.1207   0.1998    0.3401   0.4087   0.432    0.413
LSTM        0.0503   0.0967   0.1241   0.2004    0.3575   0.4146   0.4489   0.4303
GRU         0.0498   0.0931   0.1289   0.2045    0.331    0.4007   0.4377   0.4042
ST-RNN      0.0519   0.09532  0.1304   0.2187    0.3672   0.4231   0.4477   0.4369
ST-LSTM     0.0713   0.1355   0.1669   0.2338    0.4389   0.4807   0.5035   0.5266
ST-CLSTM    0.0778   0.1492   0.1818   0.2557    0.4443   0.4953   0.5231   0.5626
Table 2: Evaluation of next POI recommendation in terms of Acc@K and MAP on four datasets

Method Comparison. The performance of our proposed model ST-LSTM and the seven baselines on four datasets evaluated by Acc@K and MAP is shown in Table 2. The cell size and the hidden state size are set to 128 in our experiments. The number of epochs is set to 100 and the batch size is set to 10 for our proposed model. Other baseline parameters follow the best settings in their papers. From the experimental results, we make the following observations. RNN performs better than the Markov chain method FPMC-LR and the embedding method PRME-G, due to its capability of modeling sequential data and user interests with the RNN cell. Both LSTM and GRU slightly improve the performance compared with RNN because of their advantages in modeling long-term interests. The result of GE is poor because social and textual information is missing in our datasets. The performance of the state-of-the-art method ST-RNN is close to the standard RNN method, which may be caused by the difficulty of manually setting the windows of time and distance intervals. Another reason may be that the window setting does not model the relation between recently visited POIs and the next POI well. Our model ST-LSTM outperforms all baselines on the four datasets. The significant improvement of ST-LSTM indicates that it can model temporal and spatial contexts well. This is because we add time and distance gates to integrate time and distance intervals into the model. Moreover, ST-CLSTM not only reduces the number of parameters, but also improves the performance compared with ST-LSTM.

Effectiveness of Time and Distance Gates. There are two time gates and two distance gates in our ST-CLSTM model. We first investigate the effectiveness of time and distance gates on modeling time and distance intervals. Specifically, we set $D1_t = 1$ and $D2_t = 1$ in Eq. (9) and Eq. (10), respectively. That is, we close the two distance gates and only consider the time intervals. Similarly, we set $T1_t = 1$ and $T2_t = 1$ in Eq. (7) and Eq. (8), respectively. That is, we close the two time gates and only consider distance information. From Figure 4, we can observe that the time gates and distance gates have almost equal importance on the two datasets (i.e., Gowalla and CA). Moreover, both are critical for improving the recommendation performance.

We also investigate the effectiveness of time and distance gates on modeling short-term and long-term interests. We set $T2_t = 1$ and $D2_t = 1$ in Eq. (8) and Eq. (10), which means we close the time and distance gates on long-term interest and only activate the time and distance gates on short-term interest. Similarly, we set $T1_t = 1$ and $D1_t = 1$ in Eq. (7) and Eq. (9), which means we close the time and distance gates for short-term interest. As shown in Figure 4, all of these variants perform worse than the original ST-CLSTM, which means that time and distance intervals are not only critical to short-term interest but also important to long-term interest. Distance intervals may help model the user’s general spatial preference, and time intervals may help model the user’s long-term periodical behavior.
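The ablation idea of closing a gate can be illustrated with a small sketch. It assumes that closing a gate means fixing its activation to 1, so that it no longer filters anything, and for brevity it uses the uncoupled cell-state updates with an explicit forget gate rather than the coupled ST-CLSTM form; the helper names and toy values are hypothetical.

```python
import numpy as np

def close_gates(gates, closed):
    """Return a copy of the gate dict with the named gates fixed to 1."""
    return {name: (np.ones_like(value) if name in closed else value)
            for name, value in gates.items()}

def cell_states(c_prev, c_tilde, i_t, f_t, gates):
    """Short-term and long-term cell states for given gate activations."""
    c_hat = f_t * c_prev + i_t * gates["T1"] * gates["D1"] * c_tilde
    c_t = f_t * c_prev + i_t * gates["T2"] * gates["D2"] * c_tilde
    return c_hat, c_t

gates = {
    "T1": np.array([0.2, 0.9]), "D1": np.array([0.6, 0.4]),
    "T2": np.array([0.7, 0.3]), "D2": np.array([0.5, 0.8]),
}
c_prev, c_tilde = np.array([1.0, -1.0]), np.array([0.5, 0.5])
i_t, f_t = np.array([0.5, 0.5]), np.array([0.9, 0.9])

time_only = close_gates(gates, {"D1", "D2"})            # only time intervals are used
no_gates = close_gates(gates, {"T1", "T2", "D1", "D2"})  # reduces to the plain LSTM update
c_hat_plain, c_plain = cell_states(c_prev, c_tilde, i_t, f_t, no_gates)
```

With every gate closed, both cell states collapse to the standard LSTM cell update, which makes the contribution of each gate pair easy to isolate.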

Figure 4: The performance with different time and distance gates in ST-CLSTM. Panels: (a) Gowalla, Acc@K; (b) CA, Acc@K.

Performance of Cold Start. We also evaluate the performance of ST-LSTM against other competitors for cold-start users. If a user has visited only a few POIs, we consider the user cold. Specifically, we treat users with fewer than 5 check-ins as cold users in our experiments. We conduct the experiments on two datasets (i.e., Gowalla and Brightkite) and use Acc@K as the metric. As shown in Figure 5, ST-CLSTM performs the best among all methods under the cold-start scenario. The reason is that ST-CLSTM models long-term interests as well as short-term interests while considering time and distance intervals.

Figure 5: The performance of cold start on two datasets. Panels: (a) Gowalla, Acc@K; (b) Brightkite, Acc@K.

Impact of Parameters. In the standard RNN, different cell sizes and batch sizes may lead to different performances. We investigate the impact of these two parameters for ST-LSTM and ST-CLSTM. We vary cell sizes and batch sizes to observe the performance and the training time of our proposed two models. We only show the impact of the two parameters on Gowalla dataset due to space constraint. As shown in Figure 6, increasing the cell size can improve our model in terms of the Acc@10 metric, and a proper batch size can help achieve the best performance. The cell size determines the model complexity, and the cell with a larger size may fit the data better. Moreover, a small batch size may lead to local optimum, and a big one may lead to insufficient updating of parameters in our two models.

Figure 6: The performance with different cell sizes and batch sizes on Gowalla. Panels: (a) different cell sizes; (b) different batch sizes.

6 Conclusions

In this paper, a spatio-temporal recurrent neural network, named ST-LSTM, was proposed for next POI recommendation. Time and distance intervals between neighbor check-ins were modeled using time and distance gates in ST-LSTM. Specifically, we added a new cell state, so that there are two cell states to memorize users’ short-term and long-term interests respectively. We designed one pair of time and distance gates to control the user’s short-term interest update and another pair to control the long-term interest update, so as to improve next POI recommendation performance. We further coupled the input and forget gates to improve the efficiency of ST-LSTM. Experimental results on four large-scale real-world datasets demonstrated the effectiveness of our model, which performed better than the state-of-the-art methods. In future work, we will incorporate more context information, such as social networks and textual descriptions, into the model to further improve next POI recommendation accuracy.

