1. Introduction
Recommender systems have drawn much attention in recent decades and have achieved great success in many real-world applications such as video (Covington et al., 2016), e-commerce (Wang et al., 2018), and Point-of-Interest (POI) (e.g., restaurant and hotel) recommendation (Yang et al., 2017), where they help to solve the information overload problem. Taking POI recommendation as an example, most promising models are built in a centralized manner on the basis of collecting users’ private data, which causes serious privacy concerns (Lam et al., 2006; McSherry and Mironov, 2009; Riboni and Bettini, 2012).
A motivating example. Figure 1 shows the framework of most existing POI recommendation approaches, where the data include user profiles (e.g., age and gender), POI descriptions (e.g., category and visited count), and user-POI actions (e.g., click and check-in). Among them, both user profiles and user-POI actions are private, whereas POI descriptions are public to all the users. The model refers to the built recommendation model, e.g., the latent factors of the Matrix Factorization (MF) model (Koren, 2008), that predicts users’ preferences on POIs. First, besides the public POI data, users’ private data, including user profiles and user-POI actions, are collected; these data explicitly show users’ private information and may be abused by the recommender. Second, the models of most existing POI recommendation approaches implicitly indicate users’ private information, e.g., the latent factors of MF can be used to directly infer users’ ratings on items. Therefore, both the data and the models of most existing recommender systems are exposed to high privacy risks (Polat and Du, 2003; Ricci et al., 2015).
Some studies focus on protecting user privacy while building recommender systems, including applications to POI recommendation (Riboni and Bettini, 2012; Chen et al., 2018c). They mainly belong to two types. The first type protects the raw data by adding noise to them (Polat and Du, 2005; Berkovsky et al., 2007; McSherry and Mironov, 2009; Riboni and Bettini, 2012; Hua et al., 2015; Meng et al., 2018). These methods are efficient and easy to implement; however, the recommendation performance decreases when too much noise is added. The second type is based on cryptographic techniques (Canny, 2002; Polat and Du, 2005; Aïmeur et al., 2008; Erkin et al., 2010; Nikolaenko et al., 2013). These approaches can usually achieve performance comparable to that of traditional recommender systems; however, their efficiency is too low for them to be applied in practice. Therefore, how to build a privacy preserving recommender system that not only protects user data and model privacy, but also has comparable (or even better) recommendation accuracy and high efficiency, remains a challenge.
To address the above challenges, in this paper, we treat POI recommendation as a Click-Through Rate (CTR) prediction problem, and propose a novel Privacy preserving POI Recommendation (PriRec) framework, which has the following advantages.
PriRec protects data and model privacy. First, to protect data privacy, PriRec keeps users’ private data (features and actions) on their own side, e.g., a cellphone or Pad. To alleviate the storage costs on users’ devices, all the public POI data are still held by the recommender. These public data can be divided into two types: (1) the static POI data that describe the status of a POI, such as POI categories, and (2) the dynamic POI data that indicate the popularity of a POI, e.g., visited count. Since the user-POI actions are kept on users’ devices, to obtain the statistics of these action data, we propose to use a local differential privacy technique (Ding et al., 2017) to collect perturbed user-POI interaction data, from which the recommender generates POI dynamic features. Here, different from the existing models (Hua et al., 2015), which directly use the perturbed data to build models, we only use the perturbed user-POI interaction data for generating statistical features. Second, to protect model privacy, motivated by Factorization Machine (FM) (Rendle, 2012), we design the model of PriRec as two parts: (1) the linear models, which are decentralized on each user’s side since they directly indicate user preferences, and (2) the feature interaction model, which is kept by the recommender and has no privacy risk since it can only infer the interaction weights between features. As a result, both users’ private raw data and models are kept in their own hands, and PriRec is able to protect user privacy to a large extent.
PriRec has linear time complexity and promising recommendation accuracy. The learning process in PriRec includes two parts: the learning of linear models on each user’s side and the learning of the feature interaction model kept by the recommender. First, inspired by decentralized gradient descent (Nedic and Ozdaglar, 2009; Yuan et al., 2016), we propose a secure decentralized gradient descent protocol for users to learn their linear models collaboratively. Second, motivated by the parameter server distributed learning paradigm (Li et al., 2014) and federated learning (Konečnỳ et al., 2016; Bonawitz et al., 2017), we adopt the secure aggregation strategy of the federated learning paradigm to learn the feature interaction model. Both strategies are efficient and make the learning of PriRec scale linearly with data size in terms of both computation and communication complexities. Moreover, PriRec is a decentralized model and it learns the linear models for different users based on location networks. As a result, PriRec can capture users’ individual interests in different locations, and achieve promising recommendation accuracy.
We apply PriRec to real-world datasets, and comprehensive experiments demonstrate that, compared with the traditional ranking model, PriRec achieves comparable or even better recommendation performance while preserving user privacy.
Our main contributions are summarized as follows:

We propose a novel Privacy preserving POI Recommendation (PriRec) framework, in which we propose a secure decentralized gradient descent protocol for learning decentralized linear models and adopt the secure aggregation strategy of the federated learning paradigm to learn the feature interaction model. PriRec keeps users’ private raw data and models on users’ own side, and therefore protects user privacy to a large extent.

We propose to adopt local differential privacy techniques to generate dynamic POI popularity features from users’ local user-POI actions. This not only protects private user-POI actions, but also significantly improves recommendation performance, as we will show in experiments.

We conduct experiments on real-world datasets, and the results demonstrate the effectiveness and efficiency of PriRec.
2. Related Work
In this section, we review related knowledge, including the traditional recommender system, privacy preserving recommender system, local differential privacy, and secret sharing.
2.1. Traditional Recommender System
We first review the literature on traditional recommender systems, i.e., non-privacy-preserving approaches, including their applications to POI recommendation. The most famous traditional recommender system is Collaborative Filtering (CF) (Sarwar et al., 2001; Su and Khoshgoftaar, 2009; Ye et al., 2011), which is based on the assumption that users who behave similarly on some items will also behave similarly on other items. Among CF approaches, factorization-based models achieve promising performance (Koren, 2008; Li et al., 2015; Chen et al., 2018a); they aim to learn user and item latent factors based on known user-item action histories such as ratings and clicks. Popular factorization-based CF models include Matrix Factorization (MF) and its variants (Mnih and Salakhutdinov, 2007; Koren, 2008; Cheng et al., 2012; Yang et al., 2013; Lian et al., 2014), regression-based latent factor models (Agarwal and Chen, 2009), Bayesian personalized ranking (Rendle et al., 2009), deep MF (Xue et al., 2017), neural MF (He et al., 2017), and hash-based MF (Chen et al., 2018b).
Besides the above CF models, in practice, ads, merchandise, and POI recommendations are also treated as a Click-Through Rate (CTR) prediction problem (Ling et al., 2017). Logistic Regression (LR) is popularly used in most Internet companies, e.g., Microsoft (Richardson et al., 2007) and Google (McMahan et al., 2013), due to its simplicity, scalability, and online learning capability. Deep neural networks (DNN) have also been widely used due to their powerful representation ability (Zhang et al., 2016). Later on, Wide & Deep (Cheng et al., 2016) combined the advantages of both LR and DNN for better performance. Besides, Factorization Machine (FM) (Rendle, 2012) and its variations, e.g., DeepFM (Guo et al., 2017) and Field-aware FM (Juan et al., 2016), are also extensively used since they can capture the high-order interactions between features.
Although the traditional recommender systems achieve promising performance, they build centralized recommendation models on the basis of collecting users’ data. Both private data (features and actions) and models are held by the recommender, which causes serious privacy concerns (Lam et al., 2006; Vallet et al., 2014; Ricci et al., 2015). In this paper, we treat POI recommendation as a CTR prediction problem, and propose a novel Privacy preserving POI Recommendation (PriRec) framework for it. PriRec keeps users’ private data and models on users’ own side, e.g., a cellphone or Pad, and thus addresses the privacy issue.
2.2. Privacy Preserving Recommender System
To date, different approaches have been proposed to solve the privacy issues of traditional recommender systems. The first type is based on randomized perturbation or differential privacy techniques (Dwork, 2008). That is, they protect users’ original data by adding noise to them. Popular methods of this type include (Polat and Du, 2005; Berkovsky et al., 2007; McSherry and Mironov, 2009; Riboni and Bettini, 2012; Hua et al., 2015; Meng et al., 2018). These methods are efficient and easy to implement; however, there is a tradeoff between privacy and recommendation accuracy, i.e., the recommendation performance decreases when the privacy degree increases. The second type is based on cryptographic techniques such as homomorphic encryption (Gentry and Boneh, 2009) and secure Multi-Party Computation (MPC) (Yao, 1986), and typical methods include (Canny, 2002; Polat and Du, 2005; Aïmeur et al., 2008; Erkin et al., 2010; Nikolaenko et al., 2013). These approaches can usually achieve performance comparable to that of traditional recommender systems; however, the low efficiency of the cryptographic techniques limits their application in practice.
Besides the above privacy-preserving recommendation models, there are also approaches that focus on combining the private data of multiple parties, e.g., different hospitals and banks, while training machine learning models such as LR (Chaudhuri and Monteleoni, 2009; Mohassel and Zhang, 2017), which is the so-called collaborative learning or shared machine learning in the literature (Chen et al., 2020). They do this by using differential privacy or MPC. The fundamental difference between these works and ours is that they assume users’ data have been collected by several parties who want to protect their collected data from other parties, while our approach assumes users’ private raw data are kept on their own devices.
The most similar work to ours is Federated Learning (FL) (Konečnỳ et al., 2016; Bonawitz et al., 2017). However, PriRec is different from FL in two aspects: (1) FL assumes that data are decentralized on each user’s device and the model is kept by the server (recommender), while in PriRec, both users’ private data and models are decentralized on each user’s device, and therefore PriRec has better user privacy guarantees; (2) FL only uses a secure gradient aggregation strategy to learn a neural network model, while PriRec uses both a secure decentralized gradient descent protocol and a secure gradient aggregation strategy to learn an FM model.
2.3. Local Differential Privacy
Differential Privacy (DP) has been proposed in the global privacy context to ensure that an adversary should not be able to reliably infer whether or not a particular individual is participating in the database query, while Local Differential Privacy (LDP) was proposed in the local privacy context, i.e., when individuals disclose their personal information (Kairouz et al., 2014; Cormode et al., 2018). LDP can estimate statistical values of data, e.g., mean and histogram, without disclosing users’ raw data, and has been adopted by many companies, including Google (Erlingsson et al., 2014), Apple (Thakurta et al., 2017), and Microsoft (Ding et al., 2017). Recently, LDP has also been applied in recommender systems to protect private user-item ratings (Shen and Jin, 2016; Shin et al., 2018). However, directly using LDP to build models will decrease recommendation performance.
In this paper, we propose to adopt LDP to generate dynamic POI features (e.g., the visited count of a POI) instead of directly building models, which can protect user-POI actions and capture the popularity of POIs. We will show in experiments that the generated POI features can significantly improve recommendation performance.
2.4. Secret Sharing
Secret sharing was first proposed in (Shamir, 1979). The basic idea of secret sharing is to distribute a secret amongst a group of participants (parties), each of whom has a share of the secret. The secret can be reconstructed only when a sufficient number of shares are combined together, and individual shares are of no use on their own. We focus on $n$-out-of-$n$ Secret Sharing in this paper, i.e., all shares are needed to reconstruct a secret. To share an $\ell$-bit value $a$ for party $i$, party $i$ generates $\{a_j \in \mathbb{Z}_{2^\ell}, j \in \{1, \dots, n\}$ and $j \neq i\}$ uniformly at random, sends $a_j$ to party $j$, and keeps $a_i = a - \sum_{j \neq i} a_j$ mod $2^\ell$. We use $\langle a \rangle_i = a_i$ to denote the share of party $i$. To reconstruct a shared value $\langle a \rangle$, each party $i$ sends $\langle a \rangle_i$ to one who computes $\sum_i a_i$ mod $2^\ell$.
The above protocols cannot work directly with decimal numbers, since it is not possible to sample uniformly in $\mathbb{R}$ (Cock et al., 2015). We approximate decimal arithmetic following the existing work (Mohassel and Zhang, 2017). Suppose $a$ and $b$ are two decimal numbers with at most $\ell_F$ bits in the fractional part. To do fixed-point multiplication, we first transform them to integers by letting $a' = 2^{\ell_F} a$ and $b' = 2^{\ell_F} b$, and then calculate $z = a' b'$. Finally, we truncate the last $\ell_F$ bits of $z$ so that it has at most $\ell_F$ bits representing the fractional part. It has been proven that this truncation technique also works when $z$ is secret shared (Mohassel and Zhang, 2017).
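The sharing, reconstruction, and fixed-point steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the protocol implementation used in the paper: the ring size `ELL`, the number of fractional bits `FRAC_BITS`, and all function names are assumptions chosen for the example, and `fixed_mul` operates on plaintext encodings only (truncation on shared values, as in Mohassel and Zhang, 2017, is not reproduced here).

```python
import secrets

ELL = 64                 # assumed bit length of the ring Z_{2^ELL}
MOD = 1 << ELL
FRAC_BITS = 16           # assumed number of fractional bits


def share(value: int, n_parties: int) -> list[int]:
    """Split an integer into n additive shares modulo 2^ELL."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MOD)   # the share kept by the owner
    return shares


def reconstruct(shares: list[int]) -> int:
    """All n shares are needed: their sum modulo 2^ELL is the secret."""
    return sum(shares) % MOD


def to_signed(v: int) -> int:
    """Interpret a ring element as a signed integer."""
    return v - MOD if v >= MOD // 2 else v


def encode(x: float) -> int:
    """Map a decimal number to a fixed-point ring element."""
    return int(round(x * (1 << FRAC_BITS))) % MOD


def decode(v: int) -> float:
    """Map a fixed-point ring element back to a decimal number."""
    return to_signed(v) / (1 << FRAC_BITS)


def fixed_mul(a: int, b: int) -> int:
    """Multiply two encodings, then truncate the extra fractional bits."""
    return ((to_signed(a) * to_signed(b)) >> FRAC_BITS) % MOD


if __name__ == "__main__":
    sa, sb = share(encode(3.25), 5), share(encode(-1.5), 5)
    # Shares can be added locally; the sum of shares encodes the sum of secrets.
    summed = [(x + y) % MOD for x, y in zip(sa, sb)]
    print(decode(reconstruct(sa)), decode(reconstruct(summed)))
    print(decode(fixed_mul(encode(1.5), encode(-2.25))))
```

Because each share is uniformly random on its own, any strict subset of the shares reveals nothing about the secret, which is the property the secure protocols later in the paper rely on.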
Secret sharing has been widely used in many kinds of machine learning algorithms, including linear regression (Cock et al., 2015), neural networks (Mohassel and Zhang, 2017), and recommender systems (Chen et al., 2020). In this paper, we apply secret sharing to decentralized gradient descent, and propose a secure decentralized gradient descent protocol for users to learn the linear model of PriRec collaboratively, without compromising users’ private data and model.
3. The Proposed Privacy Preserving POI Recommendation Framework
In this section, we first describe motivations, notations, and problem definitions. Next, we present the Privacy preserving POI Recommendation (PriRec) framework, followed by its main components in detail. We then summarize the training and prediction algorithms of PriRec, and finally analyze their complexities.
3.1. Preliminary
We first describe the motivation of our proposed PriRec framework, and then present the notations and problem definition, and finally describe model optimization.
3.1.1. Motivation
User privacy in POI recommendation should include two parts, i.e., the data that explicitly expose user privacy and the model that implicitly indicates user preferences or interests.
Data privacy. Both user and item (POI) features are important to recommendation performance. User features show the private information of users, e.g., age, occupation, and consumption ability, which is the most important information that needs to be protected when building a privacy preserving POI recommender system. One reasonable way is to keep this private information decentralized on users’ own devices instead of collecting it. POI features show the static profile and dynamic operation status of the POI, both of which are public to all the users. The POI static features are usually POI profiles, e.g., the dish category of a restaurant. The POI dynamic features are usually operation status data, e.g., the check-in count of a hotel. However, these dynamic POI data are related to user-POI interaction histories, e.g., user-hotel check-in history, which are also a part of users’ private data. Thus, techniques that can not only protect individual user-POI actions but also estimate the user-POI action count for each POI should be considered.
To sum up, a privacy preserving POI recommender system should protect both user features and user-POI interaction data.
Model privacy. We treat POI recommendation as a CTR prediction problem and design our model by following Factorization Machine (FM) (Rendle, 2012), since FM and its variants are popularly used due to their scalability and capability of capturing high-order feature interactions. Suppose each sample has $D$ real-valued features $\textbf{x} \in \mathbb{R}^{D}$; its prediction under the 2-order FM model is defined as
$$\hat{y} = \sum_{d=1}^{D} w_d x_d + \sum_{d=1}^{D} \sum_{d'=d+1}^{D} \langle \textbf{v}_d, \textbf{v}_{d'} \rangle x_d x_{d'} \tag{1}$$
The FM model has two parts, i.e., the linear model and the high-order feature interaction model. First, $\textbf{w} \in \mathbb{R}^{D}$ is the linear model and each parameter $w_d$ denotes the weight of feature $x_d$. Obviously, the linear model indicates the users’ preferences on each feature and implicitly exposes users’ interests to some extent. Therefore, it should be kept private by each user, rather than being exposed to other users or the recommender. Second, $\textbf{V} \in \mathbb{R}^{D \times K}$ is the 2-order feature interaction model and $K$ is the dimensionality of feature interaction factorization. It can be seen that $\langle \textbf{v}_d, \textbf{v}_{d'} \rangle$ is used to capture the weight of each feature interaction pair $x_d x_{d'}$. Clearly, the weights of feature interaction pairs do not expose users’ data or interests, and therefore can be published to the recommender.
In summary, a privacy preserving POI recommender system should protect the sensitive models, e.g., the linear model of FM.
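To make the split between the private linear part and the public interaction part concrete, the following sketch computes a 2-order FM score in two ways: a direct double loop over feature pairs, and the linear-time reformulation from Rendle (2012). It is an illustration under assumptions: the names `w`, `V`, `D`, and `K` are chosen for the example, and no global bias term is included.

```python
import numpy as np


def fm_predict_naive(x: np.ndarray, w: np.ndarray, V: np.ndarray) -> float:
    """Direct evaluation: linear part plus all pairwise interactions, O(K * D^2)."""
    D = x.shape[0]
    score = float(w @ x)                       # linear part (private in PriRec)
    for d in range(D):
        for e in range(d + 1, D):
            # <v_d, v_e> is the weight of the feature pair (d, e)
            score += float(V[d] @ V[e]) * x[d] * x[e]
    return score


def fm_predict_fast(x: np.ndarray, w: np.ndarray, V: np.ndarray) -> float:
    """Equivalent O(K * D) form: 0.5 * sum_k [(sum_d v_dk x_d)^2 - sum_d v_dk^2 x_d^2]."""
    xv = V.T @ x                               # shape (K,)
    x2v2 = (V ** 2).T @ (x ** 2)               # shape (K,)
    return float(w @ x + 0.5 * np.sum(xv ** 2 - x2v2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D, K = 6, 3
    x, w, V = rng.normal(size=D), rng.normal(size=D), rng.normal(size=(D, K))
    print(fm_predict_naive(x, w, V), fm_predict_fast(x, w, V))
```

The fast form is what makes the per-sample cost linear in the number of features, which the complexity analysis later depends on.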
3.1.2. Notations and problem definition
Formally, let be the user set and be the private user features of user . Let be the item (POI) set and be the public POI feature of POI . Let be an interaction between user and item , be the feature of a sample with be the concatenation operation, and be the action, e.g., click or not. For simplicity, we do not formalize contextual features such as distance and period of time. Let be the training dataset, where all the user-item interactions are known.
Let W be the linear models of users, where each row denotes the private linear model saved on the device of user , and let be the public feature interaction model held by the recommender. The privacy preserving POI recommendation problem is to predict of unknown user-POI pairs while keeping and private. We summarize the notations used in this paper in Table 1.
Notation  Description 

user set  
item set  
private features of user  
public features of item  
an interaction between user and item  
feature of an interaction  
label of an interaction  
predicted label of an interaction  
W  private linear models 
private linear model of user  
V  public feature interaction models 
and  regularization parameters 
th element in the gradient of  
gradient of V  
learning rate  
neighbor of user  
relation strength between user and  
factorization dimension of V  
feature dimension  
logistic function with input  
th share of secret value  
LDP randomized algorithm  
training dataset 
3.1.3. Model optimization
In this paper, we take POI recommendation as a CTR prediction problem. The optimization task is to minimize the sum of losses over the training dataset
(2) 
where is the logistic function, and are the regularization parameters for linear models and feature interaction model respectively, and is defined in Equation (1). For each userPOI pair , its gradient with respect to each element in the linear model is
(3) 
Its gradient in terms of each element in the feature interaction model V is
(4) 
In the traditional centralized setting, all the data and models are kept by the recommender, and FM can be efficiently learnt by using gradient descent (Rendle, 2012). In contrast, as we described earlier, in our privacy preserving setting, the private data and the linear models are held by users in a decentralized manner. We will present how to learn the linear models and the feature interaction model in the privacy preserving setting in Section 3.4 and Section 3.5, respectively.
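As a sanity check on the kind of gradients this optimization requires, the sketch below writes out one consistent instantiation of the per-sample objective and its derivatives. It assumes labels in {-1, +1}, a per-sample loss of the form -log σ(y·ŷ), and L2 regularization with a factor of 1/2; these choices, and the names `lam_w` and `lam_v`, are assumptions for illustration and may differ in constants from Equations (2)-(4).

```python
import numpy as np


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + np.exp(-z))


def fm_score(x, w, V) -> float:
    """2-order FM score without bias, in its linear-time form."""
    xv = V.T @ x
    return float(w @ x + 0.5 * np.sum(xv ** 2 - (V ** 2).T @ (x ** 2)))


def loss(x, y, w, V, lam_w, lam_v) -> float:
    """Logistic loss for one sample plus L2 penalties (assumed form)."""
    data_term = -np.log(sigmoid(y * fm_score(x, w, V)))
    return float(data_term + 0.5 * lam_w * np.sum(w ** 2) + 0.5 * lam_v * np.sum(V ** 2))


def grads(x, y, w, V, lam_w, lam_v):
    """Analytic gradients with respect to the linear model w and the factors V."""
    g = (sigmoid(y * fm_score(x, w, V)) - 1.0) * y   # d(loss)/d(score)
    xv = V.T @ x                                     # shape (K,)
    grad_w = g * x + lam_w * w
    # d(score)/d(v_dk) = x_d * sum_e v_ek x_e - v_dk * x_d^2
    grad_V = g * (np.outer(x, xv) - V * (x ** 2)[:, None]) + lam_v * V
    return grad_w, grad_V


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, w, V = rng.normal(size=5), rng.normal(size=5), 0.1 * rng.normal(size=(5, 3))
    gw, gV = grads(x, 1.0, w, V, 0.01, 0.01)
    print(gw.shape, gV.shape)
```

In the privacy preserving setting, `grad_w` would be computed and used only on the user's device, while `grad_V` is the quantity that is later aggregated securely.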
3.2. Overview of Privacy Preserving POI Recommendation Framework
Our proposed PriRec framework, shown in Figure 2, can protect both private data and models. To protect data privacy, users’ private data, including features and actions, are decentralized on their own side, e.g., a cellphone or Pad. Besides, all the public POI data are kept by the recommender, and they are mainly of two types: the static data that describe the status of a POI, such as POI category, and the dynamic data that indicate the popularity of a POI, e.g., visited count. To protect model privacy, the linear models of PriRec are also decentralized on each user’s side, and we propose a secure decentralized gradient descent protocol for users to learn them collaboratively. The feature interaction model is kept by the recommender, since it can only infer the interaction weights between features, which carries no privacy risk. We adopt the secure aggregation strategy in federated learning to learn it. As a result, both users’ private data and models are kept in their own hands, and PriRec only collects the perturbed user-POI interaction data. Therefore, PriRec is able to protect both data and model privacy. We will present each part of the framework in detail in the following sections.
3.3. Generating POI Dynamic Feature
We propose to generate dynamic POI features, e.g., the click count of a restaurant, by using Local Differential Privacy (LDP) to collect perturbed user-POI interaction data. In LDP, each user randomizes his/her private data locally using a randomized algorithm (mechanism) before sending them to the data collector (recommender).
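Definition 3.1 ($\epsilon$-LDP).
A randomized algorithm $\mathcal{M}: \mathcal{X} \rightarrow \mathcal{Y}$ is $\epsilon$-locally differentially private ($\epsilon$-LDP) if for any pair of values $x, x' \in \mathcal{X}$ and any subset of output $S \subseteq \mathcal{Y}$, we have that
$$\Pr[\mathcal{M}(x) \in S] \le e^{\epsilon} \cdot \Pr[\mathcal{M}(x') \in S].$$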
LDP formalizes a type of plausible deniability: no matter what output is released, it is approximately equally as likely to have come from one data point as any other (Bassily and Smith, 2015; Ding et al., 2017). In other words, the recommender cannot tell whether a user has interacted with a POI or not, although it collects a perturbed user-POI interaction. The user-POI interaction $y_{ij}$ is a binary value (we take $y_{ij} \in \{0, 1\}$ when collecting data and $y_{ij} \in \{-1, 1\}$ when learning the model), which can be collected from users’ devices by using the following mechanism:
$$\tilde{y}_{ij} = \begin{cases} 1, & \text{with probability } \frac{1}{e^{\epsilon}+1} + y_{ij} \cdot \frac{e^{\epsilon}-1}{e^{\epsilon}+1}, \\ 0, & \text{otherwise.} \end{cases} \tag{5}$$
After that, the recommender obtains the bits $\tilde{y}_{ij}$ from all the users and the total interaction count for POI $j$ can be estimated as
$$\hat{c}_j = \sum_{i} \frac{\tilde{y}_{ij} \cdot (e^{\epsilon}+1) - 1}{e^{\epsilon}-1}. \tag{6}$$
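The collection and estimation steps can be sketched as follows. The sketch assumes the one-bit mechanism of Ding et al. (2017) specialized to binary inputs; the function names, the privacy budget `epsilon`, and the simulated population are illustrative assumptions rather than the settings used in the experiments.

```python
import numpy as np


def perturb_bits(y: np.ndarray, epsilon: float, rng: np.random.Generator) -> np.ndarray:
    """Run on each user's device: release 1 with a probability that depends on y in {0, 1}."""
    e = np.exp(epsilon)
    p_one = 1.0 / (e + 1.0) + y * (e - 1.0) / (e + 1.0)
    return (rng.random(y.shape) < p_one).astype(int)


def estimate_count(bits: np.ndarray, epsilon: float) -> float:
    """Run by the recommender: debias the reported bits and sum them."""
    e = np.exp(epsilon)
    return float(np.sum((bits * (e + 1.0) - 1.0) / (e - 1.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_actions = np.zeros(20000, dtype=int)
    true_actions[:5000] = 1                     # 5000 users really visited the POI
    reported = perturb_bits(true_actions, epsilon=1.0, rng=rng)
    print(estimate_count(reported, epsilon=1.0))
```

For a true bit of 1 the device reports 1 with probability e^ε/(e^ε+1), and for a true bit of 0 with probability 1/(e^ε+1), so the two output distributions differ by a factor of at most e^ε. The estimate is noisy for a single POI with few users and becomes more reliable as the number of reporting users grows.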
It can be proven that the above data collection mechanism preserves $\epsilon$-LDP, and meanwhile achieves an unbiased estimation of the POIs’ dynamic features (Ding et al., 2017). Besides the dynamic visited count, LDP can also be used to estimate dynamic real-valued features, e.g., the average consumption of a POI. Finally, Figure 3 shows how to generate dynamic visited count features using LDP.
3.4. Learning Linear Model
The linear models are decentralized on each user’s device for privacy reasons. Therefore, a key challenge is how users should collaboratively learn their linear models. To address this challenge, we first show the learning procedure of the linear model in the traditional centralized setting. By using gradient descent, the linear model is updated as follows
$$\textbf{w}^{t+1} = \textbf{w}^{t} - \alpha \nabla \textbf{w}^{t} \tag{7}$$
where $\alpha$ is the learning rate, and $\nabla \textbf{w}^{t}$ is the gradient of $\textbf{w}$ at time $t$. In the decentralized learning setting, data are held by individual learners and traditional gradient descent is no longer suitable. Existing research proposes to approximate Equation (7) by using Decentralized Gradient Descent (DGD) (Nedic and Ozdaglar, 2009; Yuan et al., 2016),
$$\textbf{w}_i^{t+1} = \sum_{f \in N(i)} S_{if} \textbf{w}_f^{t} - \alpha \nabla \textbf{w}_i^{t} \tag{8}$$
where $\textbf{w}_i^{t}$ is the linear model of user $i$ at time $t$, $N(i)$ denotes the neighbors of $i$ on a certain user network, and $S_{if}$ denotes the edge weight between $i$ and $f$. We argue that DGD is not secure in our privacy preserving setting, since directly calculating the weighted sum of neighbors’ linear models, i.e., $\sum_{f \in N(i)} S_{if} \textbf{w}_f^{t}$, needs the plaintext model of neighbors, i.e., $\textbf{w}_f^{t}$, which directly reflects the preferences of users.
To solve the above problem, we propose a secure decentralized gradient descent protocol, as shown in Algorithm 1. The main idea is to use secret sharing to calculate the summation of neighbors’ linear models. Its security and correctness can be found in (Shamir, 1979). Note that the linear models are usually real-valued vectors, and we adopt the efficient fixed-point arithmetic method described in Section 2.4, which has also been proven to work in secret sharing settings. With the proposed secure decentralized gradient descent protocol, we can train the linear model without compromising users’ private data and model.
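The idea of summing neighbors’ models through secret sharing can be illustrated with the sketch below. It is one plausible instantiation simulated in a single process, not a reproduction of Algorithm 1: each neighbor splits its fixed-point-encoded model into additive shares, every participant adds up the shares it receives, and only these partial sums are combined, so no plaintext model is ever sent. The ring size, the scale, the uniform `weight` parameter, and all function names are assumptions.

```python
import secrets

MOD = 1 << 64            # assumed ring Z_{2^64}
SCALE = 1 << 16          # assumed fixed-point scale


def encode(vec):
    return [int(round(v * SCALE)) % MOD for v in vec]


def decode(vec):
    return [((v - MOD) if v >= MOD // 2 else v) / SCALE for v in vec]


def split(vec, n):
    """Split an encoded vector into n additive shares (all n are needed)."""
    shares = [[secrets.randbelow(MOD) for _ in vec] for _ in range(n - 1)]
    last = [(v - sum(s[d] for s in shares)) % MOD for d, v in enumerate(vec)]
    return shares + [last]


def secure_neighbor_sum(models):
    """Sum the neighbors' models while exchanging only random-looking shares."""
    n, dim = len(models), len(models[0])
    # shares_of[p][q] is the share that neighbor p sends to neighbor q
    shares_of = [split(encode(m), n) for m in models]
    # each neighbor q adds up what it received and forwards only this partial sum
    partial = [[sum(shares_of[p][q][d] for p in range(n)) % MOD for d in range(dim)]
               for q in range(n)]
    total = [sum(col) % MOD for col in zip(*partial)]
    return decode(total)


def dgd_step(neighbor_models, own_grad, lr, weight=1.0):
    """One decentralized update: weighted neighbor sum minus a local gradient step."""
    summed = secure_neighbor_sum(neighbor_models)
    return [weight * s - lr * g for s, g in zip(summed, own_grad)]


if __name__ == "__main__":
    neighbors = [[0.5, -1.0], [1.5, 2.0], [-0.25, 0.25]]
    print(dgd_step(neighbors, own_grad=[1.0, 2.0], lr=0.5))
```

The receiving user learns the sum of the neighbors’ models but not any individual model, provided the participants follow the protocol and do not collude.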
The remaining challenge is how to choose neighbors for model propagation. We address this challenge by analyzing the real data in POI recommendation channels from the Koubei APP. Figure 4 shows the relationship between user-POI distances and actions. We can observe that, in practice, users tend to click the POIs nearby. In other words, POIs are likely to be interacted with by nearby users. Therefore, we build the user adjacent network by using user geographical information, similar to existing research (Ye et al., 2011; Cheng et al., 2012). Specifically, let $d_{if}$ be the distance between users $i$ and $f$; the edge weight between $i$ and $f$ is defined as $S_{if} = g(d_{if})$, where $g$ is a mapping function that transforms distance to edge weight. Various mapping functions have been proposed in the literature (Zhao et al., 2016).
In practice, a user cannot communicate with all the other users, because (1) the communication cost is high, and (2) only a handful of users’ devices are online. Therefore, for each user $i$, we randomly choose his/her top-$N$ closest neighbors based on the distance. Furthermore, for simplicity, we set the edge weights of the built user adjacent network to 1 after choosing neighbors, i.e., $S_{if} = 1$. We will empirically study the effect of the maximum number of neighbors ($N$) on our model performance.
3.5. Learning Feature Interaction Model
The feature interaction model is kept by the recommender, and the relationship between the recommender and individual learners is similar to that of server and worker in the parameter-server distributed learning paradigm (Li et al., 2014). Existing works propose a secure aggregation strategy to train neural network models in federated learning settings (Konečnỳ et al., 2016; Bonawitz et al., 2017). Motivated by this, we adopt the secure aggregation strategy in federated learning for users to learn the feature interaction model of FM collaboratively. Specifically, once a batch of online users have interactions with a POI, these users first pull the current feature interaction model $\textbf{V}$ from the recommender. They then calculate its gradient based on Equation (4). After that, they securely aggregate the gradients $\nabla \textbf{V}$. In this paper, we adopt the simplest secure aggregation protocol in (Bonawitz et al., 2017), i.e., one-time pad masking based on secret sharing. Please refer to Section 4.0.1 in (Bonawitz et al., 2017) for more details. Finally, the recommender updates $\textbf{V}$ as follows
$$\textbf{V} \leftarrow \textbf{V} - \alpha \nabla \textbf{V} \tag{9}$$
where $\nabla \textbf{V}$ here denotes the securely aggregated gradient of the batch.
This strategy follows a similar principle to the parameter server distributed learning paradigm (Li et al., 2014). That is, the server (i.e., the recommender) saves the model parameters (V), and the worker (i.e., each user) loads data and updates the models by communicating with the server. It becomes an asynchronous learning task when multiple users interact with POIs simultaneously, which is also a common task in the parameter server setting (Li et al., 2014).
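The masking idea behind this kind of secure aggregation can be sketched as follows. The example simulates, in one process, pairwise one-time pads that cancel in the sum: for every pair of users, one adds a random pad to its gradient and the other subtracts the same pad. It is a simplified illustration under assumptions: gradients are taken to be already fixed-point encoded as integers, the pads are generated centrally here rather than agreed between user pairs, and dropout handling from Bonawitz et al. (2017) is omitted.

```python
import secrets

MOD = 1 << 64            # assumed ring for the fixed-point encoded gradients


def pairwise_masks(n_users: int, dim: int) -> list[list[int]]:
    """Build one mask per user such that all masks sum to zero modulo MOD."""
    masks = [[0] * dim for _ in range(n_users)]
    for u in range(n_users):
        for v in range(u + 1, n_users):
            pad = [secrets.randbelow(MOD) for _ in range(dim)]
            for d in range(dim):
                masks[u][d] = (masks[u][d] + pad[d]) % MOD   # user u adds the pad
                masks[v][d] = (masks[v][d] - pad[d]) % MOD   # user v subtracts it
    return masks


def mask_gradient(grad: list[int], mask: list[int]) -> list[int]:
    """Run on a user's device before uploading the gradient."""
    return [(g + m) % MOD for g, m in zip(grad, mask)]


def aggregate(masked: list[list[int]]) -> list[int]:
    """Run by the recommender: the pads cancel, leaving the sum of gradients."""
    return [sum(col) % MOD for col in zip(*masked)]


if __name__ == "__main__":
    grads = [[1, 2], [3, 4], [5, 6]]             # toy integer-encoded gradients
    masks = pairwise_masks(len(grads), dim=2)
    uploads = [mask_gradient(g, m) for g, m in zip(grads, masks)]
    print(aggregate(uploads))
```

Each upload on its own is uniformly random to the recommender, so only the batch-level sum used in the update of V is revealed.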
3.6. Model Training and Prediction Algorithm
The training of PriRec includes two parts, i.e., learning the linear models and learning the feature interaction model, and we summarize it in Algorithm 2. As we have described in Section 3.4, linear models are decentralized on each user’s device for privacy reasons, and we propose a secure decentralized gradient descent protocol for users to learn them collaboratively. We summarize the learning algorithm in lines 9-10. The feature interaction model is kept by the recommender, and we adopt the secure aggregation strategy in federated learning for users to learn it collaboratively, as presented in Section 3.5, which corresponds to lines 12-15 in Algorithm 2.
The prediction of PriRec also needs communication between users and the recommender, as shown in Algorithm 3. In it, line 1 denotes the matching procedure before ranking. Different matching strategies can be used, e.g., the simplest location-based matching strategy. We do not describe the matching strategies in detail, because they are not the focus of this paper. Line 5 omits the contextual features for conciseness.
In summary, PriRec is able to protect users’ private data and model during model training and prediction procedures. Similar to most prior privacy preserving machine learning algorithms (Mohassel and Zhang, 2017), PriRec can only protect against semihonest adversary using secret sharing technique. That is, PriRec assumes the participants strictly follow the protocol execution. We leave how to solve malicious adversary as a future work.
3.7. Complexity Analysis
We now analyze the communication and computation complexities of Algorithm 2. Recall that is the training data size, is the feature size, is the dimensionality of feature interaction factorization, and denotes the number of maximum neighbors to be communicated.
Communication Complexity. For each user-POI pair, the communication consists of two parts. (1) users communicate with each other to learn linear models, i.e., lines 9-10, and its complexity is ; (2) users communicate with the recommender to learn the feature interaction model, i.e., lines 12-15, and its complexity is . Therefore, the total communication cost in Algorithm 2 is . In practice, since , the total communication complexity is linear in the data size.
Computation Complexity. For each user-POI pair, the computing bottleneck is Equation (1), and it has linear computation complexity after reformulating it (Rendle, 2012). Therefore, the computation complexity of learning linear models, i.e., lines 9-10, is ; the computation complexity of learning the feature interaction model, i.e., lines 12-15, is . In total, the computation cost in Algorithm 2 is also . Since , the total computation complexity is also linear in the data size.
For Algorithm 3, for predicting each user-POI pair, the communication complexity is and the computation complexity is , where denotes the feature size of POI . In practice is usually very small; therefore, the complexities are linear in .
4. Empirical Study
In this section, we empirically compare the performance of the proposed PriRec with the existing non-private POI recommendation method. We also study the effects of parameters on model performance.
4.1. Setting
Dataset  #user  #item  #interaction  #feature 

Foursquare  11,824  13,924  924,474  6 
Koubei  85,466  118,598  497,838  89 
Datasets. We choose two realworld userPOI interaction datasets for experiments, i.e., Foursquare and Koubei.
First, Foursquare a famous benchmark dataset for POI recommendation (Yang et al., 2016). It contains userPOI action histories in two cities, and we only choose the data in Tokyo. We filter the POIs which are interacted by less than 10 users. Since Foursquare only has positive userPOI interaction data, we randomly sample 1 negative userPOI interactions for each record, and therefore, the ratio of positive and negtive records is 1:1. The original dataset only has user features such as gender, friend count. We also generate POI dynamic features using our proposed local DP technique. Moreover, since the Foursquare dataset does not have the geographic locations when a user interacts with a POI, we can not build user geographic adjacent network. Instead, we build the user adjacent network by random. That is, we randomly select () neighbors for each userPOI interaction.
Second, the Koubei dataset is collected from the POI recommendation channel in Koubei (https://www.koubei.com/), which is a product of Alibaba and Ant Financial in China, and we filter out the users and POIs that have fewer than 5 interactions. There are many kinds of POIs in this channel, such as restaurants, cinemas, and markets. The Koubei dataset consists of two parts: a positive user-POI interaction () indicates that a user clicks on a POI, and a negative user-POI interaction () implies that a user ignores a POI after exposure. Their ratio is about 1:3. The Koubei dataset has three kinds of features, as described in Section 3.1.1, i.e., user features such as hometown and gender, POI static features such as POI category, and the generated POI dynamic features such as the recently clicked count of POIs. The Koubei dataset has geographic location information, with which we build the user geographic adjacent network, and we use to denote the maximum number of neighbors for each user.
Finally, Table 2 shows the statistics of both datasets after preprocessing.
Metrics. Since we treat POI recommendation as a CTR prediction problem in this paper, we adopt the Area Under the receiver operating characteristic Curve (AUC) as the evaluation metric, which is commonly used to evaluate CTR prediction quality (Ling et al., 2017). The AUC of a classifier is equivalent to the probability that the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance (Fawcett, 2006); therefore, the higher the better. We split both datasets with two strategies: (1) randomly sample 80% as the training set and use the remaining 20% as the test set, and (2) randomly sample 90% as the training set and use the remaining 10% as the test set. We use Foursquare80 and Koubei80 to denote the first strategy, and Foursquare90 and Koubei90 to denote the second strategy. We repeat this procedure three times and report the average results.
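The probabilistic reading of AUC given above can be made concrete with a small sketch. It enumerates every (positive, negative) pair and counts how often the positive instance receives the higher score, with ties counted as one half. The function name `auc` and the toy labels and scores are illustrative assumptions; this quadratic version is meant for exposition, not for datasets of the size used here.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    scores = [0.9, 0.1, 0.4, 0.6]
    # Three of the four (positive, negative) pairs are ordered correctly.
    assert auc(labels, scores) == 0.75
```

A model that scores all instances identically obtains 0.5 under this definition, which is why 0.5 is the random-guess baseline.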
Table 3. Characteristics of the compared models.

Model | Data | | Performance
MF | rating | rating |
DMF | rating | gradient | MF (Chen et al., 2018c)
FM | | | > MF (Rendle, 2012)
PriRec | | | ?FM
PriRec | | | ?FM
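Table 4. Comparison results (AUC) on Foursquare.

Datasets | Foursquare80 | | | Foursquare90 | |
Model | FM | PriRec (no dynamic POI features) | PriRec | FM | PriRec (no dynamic POI features) | PriRec
 | 0.8152 | 0.4777 | 0.7834 | 0.8106 | 0.4722 | 0.7818
 | 0.8145 | 0.4771 | 0.7831 | 0.8098 | 0.4727 | 0.7824
 | 0.8131 | 0.4749 | 0.7829 | 0.8083 | 0.4702 | 0.7816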
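Table 5. Comparison results (AUC) on Koubei.

Datasets | Koubei80 | | | Koubei90 | |
Model | FM | PriRec (no dynamic POI features) | PriRec | FM | PriRec (no dynamic POI features) | PriRec
 | 0.7154 | 0.7484 | 0.7605 | 0.7172 | 0.7495 | 0.7695
 | 0.7180 | 0.7519 | 0.7633 | 0.7205 | 0.7534 | 0.7713
 | 0.7192 | 0.7529 | 0.7643 | 0.7207 | 0.7546 | 0.7720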
Comparison methods. Our proposed PriRec framework is a novel decentralized version of the existing Factorization Machine (FM) (Rendle, 2012), and it belongs to the privacy-preserving decentralized recommendation approaches. FM has been shown to outperform the existing Matrix Factorization (MF) (Mnih and Salakhutdinov, 2007) model due to its ability to handle additional feature information besides the user-item interaction (rating) information. As long as the features are useful, which is typically the case in practice, FM consistently beats MF. Therefore, we only compare our proposed model with FM. Moreover, to study the contribution of the generated dynamic POI features to the accuracy of PriRec, we also evaluate the version of PriRec that does not use the dynamic POI features generated by LDP. We summarize the characteristics of the above-mentioned models in Table 3. From it, we can see that our proposed PriRec framework can utilize more information without compromising users' private data.
Hyperparameters. We set for LDP when generating dynamic POI features, following existing research (Ding et al., 2017). We vary the maximum number of neighbors () and the feature interaction factorization dimension () of FM and PriRec to study their effects on model performance, and vary the maximum number of iterations () to study its effect on model convergence. We find the best values of the other hyperparameters, including the learning rate () and the regularization parameters ( and ), in .
4.2. Comparison Results
We compare PriRec, with and without the dynamic POI features, with the classic FM model on both the Foursquare and Koubei datasets. Note that during the comparison, we use grid search to find the best parameters of each model.
Results on Foursquare. We first report the comparison results on Foursquare in Table 4. We find that:

In most cases, the recommendation performance of each model decreases with training data size and , where is the dimensionality of feature interaction factorization. This is because the Foursquare dataset only has 6 features, including 3 dynamic POI features generated by LDP, which causes an overfitting problem.

The AUC of PriRec without the dynamic POI features is even below 0.5 (random guess), which is unsatisfactory. This is because the original Foursquare dataset only has 3 user features, which are not enough to train a reasonable model.

Our proposed dynamic POI popularity features generated by LDP significantly improve the recommendation performance of PriRec. For example, the AUC of PriRec is 65.57% higher than that of the version without these features on Foursquare90 when .

PriRec and FM have comparable recommendation performance (0.78+ vs. 0.81+). That is, our proposed model protects user privacy while sacrificing little recommendation accuracy.
Results on Koubei. We then report the comparison results on Koubei in Table 5. We observe that:

The recommendation performance of each model increases with . is the dimensionality of feature interaction factorization, and therefore, with enough features, the bigger is, the better the learnt feature interaction model V captures the real relations between features.

Our proposed dynamic POI popularity features generated by LDP improve the recommendation performance of PriRec. For example, the AUC of PriRec is 2.67% higher than that of the version without these features on Koubei90 when .

PriRec consistently outperforms FM in all cases. For example, the AUC of PriRec is up to 6.30% higher than that of FM on Koubei80 when . Note that FM uses all the features, including the dynamic POI features, in the traditional centralized training setting. The reason is that, in POI recommendation scenarios, the user-POI interactions obey location aggregation, i.e., most users are only active in a certain location. Different from FM, which has a centralized linear model, PriRec is a decentralized model that learns the linear models for different users by using secure decentralized gradient descent. As a result, PriRec is able to capture users' individual interests in different locations. This is consistent with the reality that users in different places have different tastes.
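The relative improvements quoted in the two result lists can be checked directly against the first rows of Table 4 and Table 5. The short sketch below does this arithmetic; the helper name `relative_gain` is an illustrative assumption, and the AUC values are copied from the tables.

```python
def relative_gain(new_auc, base_auc):
    """Relative improvement of new_auc over base_auc, in percent."""
    return 100.0 * (new_auc - base_auc) / base_auc


if __name__ == "__main__":
    # Foursquare90: PriRec vs. PriRec without dynamic POI features.
    foursquare90 = relative_gain(0.7818, 0.4722)
    # Koubei90: PriRec vs. PriRec without dynamic POI features.
    koubei90 = relative_gain(0.7695, 0.7495)
    # Koubei80: PriRec vs. FM.
    koubei80 = relative_gain(0.7605, 0.7154)

    # Each value matches the reported figure to within rounding.
    assert abs(foursquare90 - 65.57) < 0.01
    assert abs(koubei90 - 2.67) < 0.01
    assert abs(koubei80 - 6.30) < 0.01
    print(f"{foursquare90:.2f}% {koubei90:.2f}% {koubei80:.2f}%")
```

All three reported percentages are relative gains, not absolute differences in AUC.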
4.3. Parameter Analysis
We first analyze the convergence of PriRec in this section. We show the average training loss and test loss of PriRec w.r.t. the number of iterations () in Figure 5, where we set and the maximum number of neighbors . It clearly shows that PriRec converges faster on Foursquare80 than on Koubei80. This is because there are only 6 features in the Foursquare dataset, whereas there are 89 features in Koubei.
Next, we study the effect of the maximum number of neighbors () on PriRec without and with the POI dynamic features, which is shown in Figure 6, where we set . We find that, as increases, the performance of both versions first increases and then stabilizes. This indicates that both versions can achieve stable performance with only a handful of neighbors () to communicate with, which matches the practical situation where only a small proportion of devices are online. This experiment demonstrates the practicality of our proposed models.
Finally, we study the complexity of PriRec. We show the training time of PriRec w.r.t. the training data size in Figure 7, where and . Note that our experiments are conducted on a single PC, so the network communication time is ignored. We find that the time complexity of PriRec is indeed linear with training data size, as we analyzed in Section 3.7, which supports the efficiency of PriRec.
5. Conclusion and Future Work
In this paper, we proposed a novel privacy preserving POI recommendation (PriRec) framework for the POI recommendation channel in Ant Financial. To do this, PriRec keeps users' private profiles on their own devices, and adopts a local differential privacy technique to collect perturbed user-POI interaction data on the server for generating dynamic POI popularity features. Motivated by Factorization Machine (FM), our proposed model of PriRec includes two parts: (1) the linear models that are decentralized on each user's side for privacy purposes, which are learnt collaboratively by our proposed secure decentralized gradient descent protocol, and (2) the feature interaction model that is kept by the recommender, which is learnt by a secure aggregation strategy in the federated learning paradigm. PriRec not only protects data and model privacy, but also enjoys promising scalability. We applied PriRec to real-world datasets, and comprehensive experiments demonstrated that, compared with FM, PriRec achieves comparable or even better recommendation performance.
In the future, we would like to deploy PriRec in real products. We will also study how to strengthen our algorithm to protect against malicious adversaries.
References
 Regression-based latent factor models. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 19–28. Cited by: §2.1.
 Alambic: a privacy-preserving recommender system for electronic commerce. International Journal of Information Security 7 (5), pp. 307–334. Cited by: §1, §2.2.
 Local, private, efficient protocols for succinct histograms. In Proceedings of the forty-seventh annual ACM symposium on Theory of Computing, pp. 127–135. Cited by: §3.3.
 Enhancing privacy and preserving accuracy of a distributed collaborative filtering. In Proceedings of the 2007 ACM conference on Recommender systems, pp. 9–16. Cited by: §1, §2.2.
 Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1175–1191. Cited by: §1, §2.2, §3.5.
 Collaborative filtering with privacy. In Proceedings 2002 IEEE Symposium on Security and Privacy, pp. 45–57. Cited by: §1, §2.2.
 Privacy-preserving logistic regression. In Advances in Neural Information Processing Systems, pp. 289–296. Cited by: §2.2.
 Semi-supervised learning meets factorization: learning to recommend with chain graph model. ACM Transactions on Knowledge Discovery from Data (TKDD) 12 (6), pp. 1–24. Cited by: §2.1.
 Secure social recommendation based on secret sharing. arXiv preprint arXiv:2002.02088. Cited by: §2.2, §2.4.
 Distributed collaborative hashing and its applications in ant financial. In SIGKDD, KDD ’18, New York, NY, USA, pp. 100–109. ISBN 9781450355520. Cited by: §2.1.
 Privacy preserving point-of-interest recommendation using decentralized matrix factorization. In Thirty-Second AAAI Conference on Artificial Intelligence, pp. 257–264. Cited by: §1, Table 3.
 Fused matrix factorization with geographical and social influence in location-based social networks. In Twenty-Sixth AAAI Conference on Artificial Intelligence, Vol. 12, pp. 17–23. Cited by: §2.1, §3.4.
 Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, pp. 7–10. Cited by: §2.1.
 Fast, privacy preserving linear regression over distributed datasets based on pre-distributed data. In Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security, pp. 3–14. Cited by: §2.4, §2.4.
 Privacy at scale: local differential privacy in practice. In Proceedings of the 2018 International Conference on Management of Data, pp. 1655–1658. Cited by: §2.3.
 Deep neural networks for youtube recommendations. In Proceedings of the 10th ACM conference on Recommender Systems, pp. 191–198. Cited by: §1.
 Collecting telemetry data privately. In Advances in Neural Information Processing Systems, pp. 3571–3580. Cited by: §1, §2.3, §3.3, §3.3, §4.1.
 Differential privacy: a survey of results. In International Conference on Theory and Applications of Models of Computation, pp. 1–19. Cited by: §2.2.
 Privacy enhanced recommender system. In Thirty-first symposium on information theory in the Benelux, pp. 35–42. Cited by: §1, §2.2.
 Rappor: randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC conference on Computer and Communications Security, pp. 1054–1067. Cited by: §2.3.
 An introduction to roc analysis. Pattern recognition letters 27 (8), pp. 861–874. Cited by: §4.1.
 A fully homomorphic encryption scheme. Vol. 20, Stanford University Stanford. Cited by: §2.2.
 Deepfm: a factorization-machine based neural network for ctr prediction. arXiv preprint arXiv:1703.04247. Cited by: §2.1.
 Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, pp. 173–182. Cited by: §2.1.
 Differentially private matrix factorization. In Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 1763–1770. Cited by: §1, §1, §2.2.
 Field-aware factorization machines for ctr prediction. In Proceedings of the 10th ACM Conference on Recommender Systems, pp. 43–50. Cited by: §2.1.
 Extremal mechanisms for local differential privacy. In Advances in Neural Information Processing Systems, pp. 2879–2887. Cited by: §2.3.
 Federated learning: strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492. Cited by: §1, §2.2, §3.5.
 Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 426–434. Cited by: §1, §2.1.
 Do you trust your recommendations? an exploration of security and privacy issues in recommender systems. pp. 14–29. Cited by: §1, §2.1.
 Scaling distributed machine learning with the parameter server. In Symposium on Operating Systems Design and Implementation, Vol. 14, pp. 583–598. Cited by: §1, §3.5, §3.5.
 Rank-geofm: a ranking based geographical factorization method for point of interest recommendation. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 433–442. Cited by: §2.1.
 GeoMF: joint geographical modeling and matrix factorization for point-of-interest recommendation. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 831–840. Cited by: §2.1.
 Model ensemble for click prediction in bing search ads. In Proceedings of the 26th International Conference on World Wide Web Companion, pp. 689–698. Cited by: §2.1, §4.1.
 Ad click prediction: a view from the trenches. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1222–1230. Cited by: §2.1.
 Differentially private recommender systems: building privacy into the netflix prize contenders. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 627–636. Cited by: §1, §1, §2.2.
 Personalized privacy-preserving social recommendation. In Thirty-Second AAAI Conference on Artificial Intelligence, Cited by: §1, §2.2.
 Probabilistic matrix factorization. In Advances in Neural Information Processing Systems, pp. 1257–1264. Cited by: §2.1, §4.1.
 SecureML: a system for scalable privacy-preserving machine learning. In IEEE Symposium on Security and Privacy, pp. 19–38. Cited by: §2.2, §2.4, §2.4, §3.6.
 Distributed subgradient methods for multi-agent optimization. IEEE Transactions on Automatic Control 54 (1), pp. 48–61. Cited by: §1, §3.4.
 Privacy-preserving matrix factorization. In Proceedings of the 2013 ACM SIGSAC conference on Computer and Communications Security, pp. 801–812. Cited by: §1, §2.2.
 Privacy-preserving collaborative filtering using randomized perturbation techniques. In IEEE International Conference on Data Mining, pp. 625–628. Cited by: §1.
 Privacy-preserving collaborative filtering. International journal of electronic commerce 9 (4), pp. 9–35. Cited by: §1, §2.2.
 BPR: bayesian personalized ranking from implicit feedback. In Proceedings of the twenty-fifth conference on Uncertainty in Artificial Intelligence, pp. 452–461. Cited by: §2.1.
 Factorization machines with libfm. ACM Transactions on Intelligent Systems and Technology (TIST) 3 (3), pp. 57. Cited by: §1, §2.1, §3.1.1, §3.1.3, §3.7, §4.1, Table 3.
 Private context-aware recommendation of points of interest: an initial investigation. In IEEE International Conference on Pervasive Computing and Communications Workshops, pp. 584–589. Cited by: §1, §1, §2.2.
 Recommender systems: introduction and challenges. In Recommender systems handbook, pp. 1–34. Cited by: §1, §2.1.
 Predicting clicks: estimating the click-through rate for new ads. In Proceedings of the 16th international conference on World Wide Web, pp. 521–530. Cited by: §2.1.
 Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th international conference on World Wide Web, pp. 285–295. Cited by: §2.1.
 How to share a secret. Communications of the ACM 22 (11), pp. 612–613. Cited by: §2.4, §3.4.
 Epicrec: towards practical differentially private framework for personalized recommendation. In Proceedings of the 2016 ACM SIGSAC conference on Computer and Communications Security, pp. 180–191. Cited by: §2.3.
 Privacy enhanced matrix factorization for recommendation with local differential privacy. IEEE Transactions on Knowledge and Data Engineering. Cited by: §2.3.
 A survey of collaborative filtering techniques. Advances in Artificial Intelligence 2009. Cited by: §2.1.
 Learning new words. Google Patents. Note: US Patent 9,594,741 Cited by: §2.3.
 Matrix factorization without user data retention. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 569–580. Cited by: §2.1.
 Billion-scale commodity embedding for e-commerce recommendation in alibaba. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 839–848. ISBN 9781450355520. Cited by: §1.
 Deep matrix factorization models for recommender systems. In IJCAI, pp. 3203–3209. Cited by: §2.1.
 Bridging collaborative filtering and semi-supervised learning: a neural approach for poi recommendation. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1245–1254. Cited by: §1.
 PrivCheck: privacy-preserving check-in data publishing for personalized location based services. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pp. 545–556. Cited by: §4.1.
 A sentiment-enhanced personalized location recommendation system. In Proceedings of the 24th ACM Conference on Hypertext and Social Media, pp. 119–128. Cited by: §2.1.
 How to generate and exchange secrets. In 27th Annual Symposium on Foundations of Computer Science, pp. 162–167. Cited by: §2.2.
 Exploiting geographical influence for collaborative point-of-interest recommendation. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, pp. 325–334. Cited by: §2.1, §3.4.
 On the convergence of decentralized gradient descent. SIAM Journal on Optimization 26 (3), pp. 1835–1854. Cited by: §1, §3.4.
 Deep learning over multi-field categorical data. In European Conference on Information Retrieval, pp. 45–57. Cited by: §2.1.
 A survey of point-of-interest recommendation in location-based social networks. arXiv preprint arXiv:1607.00647. Cited by: §3.4.