Practical Privacy Preserving POI Recommendation

03/05/2020 ∙ by Chaochao Chen, et al. ∙ Ant Financial, Zhejiang University, Peking University

Point-of-Interest (POI) recommendation has been extensively studied and successfully applied in industry. However, most existing approaches build centralized models on top of collected user data. Both the private data and the models are held by the recommender, which raises serious privacy concerns. In this paper, we propose a novel Privacy preserving POI Recommendation (PriRec) framework. First, to protect data privacy, users' private data (features and actions) are kept on their own devices, e.g., cellphone or pad. Meanwhile, the public data that all users need to access are kept by the recommender to reduce the storage costs of users' devices. These public data include: (1) static data related only to the status of a POI, such as POI categories, and (2) dynamic data that depend on user-POI actions, such as visited counts. The dynamic data could be sensitive, and we develop local differential privacy techniques to release such data to the public with privacy guarantees. Second, PriRec follows the representation of the Factorization Machine (FM), which consists of a linear model and a feature interaction model. To protect model privacy, the linear models are stored on the users' side, and we propose a secure decentralized gradient descent protocol for users to learn them collaboratively. The feature interaction model is kept by the recommender since it poses no privacy risk, and we adopt the secure aggregation strategy from the federated learning paradigm to learn it. In this way, PriRec keeps users' private raw data and models in users' own hands, and protects user privacy to a large extent. We apply PriRec to real-world datasets, and comprehensive experiments demonstrate that, compared with FM, PriRec achieves comparable or even better recommendation accuracy.

1. Introduction

Recommender systems have drawn much attention in recent decades and have achieved great success in many real-world applications, such as video (Covington et al., 2016), e-commerce (Wang et al., 2018), and Point-of-Interest (POI) (e.g., restaurant and hotel) recommendation (Yang et al., 2017), by addressing the information overload problem. Taking POI recommendation as an example, most promising models are built centrally on top of collected private user data, which raises serious privacy concerns (Lam et al., 2006; McSherry and Mironov, 2009; Riboni and Bettini, 2012).

Figure 1. Traditional POI recommendation framework. Users’ private data and models are kept centrally by the recommender, which raises serious privacy concerns.

A motivating example. Figure 1 shows the framework of most existing POI recommendation approaches, where the data include user profiles (e.g., age and gender), POI descriptions (e.g., category and visited count), and user-POI actions (e.g., click and check-in). Among them, both user profiles and user-POI actions are private, whereas POI descriptions are public to all the users. The model refers to the built recommendation model, e.g., the latent factors of the Matrix Factorization (MF) model (Koren, 2008), that predicts users’ preferences on POIs. First, besides the public POI data, users’ private data, including user profiles and user-POI actions, are collected. These data explicitly expose users’ private information and may be abused by the recommender. Second, the models of most existing POI recommendation approaches implicitly reveal users’ private information, e.g., the latent factors of MF can be used to directly infer users’ ratings on items. Therefore, both the data and the models of most existing recommender systems are exposed to high privacy risks (Polat and Du, 2003; Ricci et al., 2015).

Some studies have focused on protecting user privacy while building recommender systems, including applications to POI recommendation (Riboni and Bettini, 2012; Chen et al., 2018c). They mainly fall into two types. The first type protects the raw data by adding noise to them (Polat and Du, 2005; Berkovsky et al., 2007; McSherry and Mironov, 2009; Riboni and Bettini, 2012; Hua et al., 2015; Meng et al., 2018). These methods are efficient and easy to implement; however, the recommendation performance decreases when too much noise is added. The second type is based on cryptographic techniques (Canny, 2002; Polat and Du, 2005; Aïmeur et al., 2008; Erkin et al., 2010; Nikolaenko et al., 2013). These approaches can usually achieve performance comparable to traditional recommender systems; however, their efficiency is too low for practical use. Therefore, how to build a privacy preserving recommender system that not only protects user data and model privacy, but also has comparable (or even better) recommendation accuracy and high efficiency, remains a challenge.

To address the above challenges, in this paper we treat POI recommendation as a Click-Through Rate (CTR) prediction problem and propose a novel Privacy preserving POI Recommendation (PriRec) framework, which has the following advantages.

PriRec protects data and model privacy. First, to protect data privacy, PriRec keeps users’ private data (features and actions) on their own devices, e.g., cellphone or pad. To reduce the storage costs on users’ devices, all the public POI data are still held by the recommender. These public data can be divided into two types: (1) the static POI data that describe the status of a POI, such as POI categories, and (2) the dynamic POI data that indicate the popularity of a POI, e.g., visited count. Since the user-POI actions are kept on users’ devices, to obtain the statistics of these action data, we propose to use a local differential privacy technique (Ding et al., 2017) to collect perturbed user-POI interaction data, from which the recommender then generates dynamic POI features. Here, unlike existing models (Hua et al., 2015), which directly use the perturbed data to build models, we only use the perturbed user-POI interaction data for generating statistical features. Second, to protect model privacy, motivated by the Factorization Machine (FM) (Rendle, 2012), we design the model of PriRec as two parts: (1) the linear models, which are decentralized on each user’s side since they directly indicate user preferences, and (2) the feature interaction model, which is kept by the recommender and poses no privacy risk since it can only be used to infer the interaction weights between features. In this way, both users’ private raw data and models are kept in their own hands, and PriRec is able to protect user privacy to a large extent.

PriRec has linear time complexity and promising recommendation accuracy. The learning process in PriRec includes two parts: the learning of the linear models on each user’s side and the learning of the feature interaction model kept by the recommender. First, inspired by decentralized gradient descent (Nedic and Ozdaglar, 2009; Yuan et al., 2016), we propose a secure decentralized gradient descent protocol for users to learn their linear models collaboratively. Second, motivated by the parameter server distributed learning paradigm (Li et al., 2014) and federated learning (Konečnỳ et al., 2016; Bonawitz et al., 2017), we adopt the secure aggregation strategy from the federated learning paradigm to learn the feature interaction model. Both strategies are efficient and make the learning of PriRec scale linearly with data size in terms of both computation and communication complexity. Moreover, PriRec is a decentralized model and learns the linear models for different users based on location networks. As a result, PriRec can capture users’ individual interests in different locations and achieve promising recommendation accuracy.

We apply PriRec to real-world datasets, and comprehensive experiments demonstrate that, compared with the traditional ranking model, PriRec achieves comparable or even better recommendation performance while preserving user privacy.

Our main contributions are summarized as follows:

  • We propose a novel Privacy preserving POI Recommendation (PriRec) framework for POI recommendation, where we propose a secure decentralized gradient descent protocol for learning decentralized linear models and adopt secure aggregation strategy in federated learning paradigm to learn the feature interaction model. PriRec keeps users’ private raw data and models on users’ own side, and therefore protects user privacy to a large extent.

  • We propose to adopt local differential privacy techniques to generate dynamic POI popularity features from users’ local user-POI actions. This can not only protect private user-POI actions, but also significantly improve recommendation performance, as we will show in experiments.

  • We conduct experiments on real-world datasets, and the results demonstrate the effectiveness and efficiency of PriRec.

2. Related Work

In this section, we review related knowledge, including the traditional recommender system, privacy preserving recommender system, local differential privacy, and secret sharing.

2.1. Traditional Recommender System

We first review the literature on traditional recommender systems, i.e., non-privacy preserving approaches, including applications to POI recommendation. The best-known traditional recommendation approach is Collaborative Filtering (CF) (Sarwar et al., 2001; Su and Khoshgoftaar, 2009; Ye et al., 2011), which is based on the assumption that users who behave similarly on some items will also behave similarly on other items. Among CF methods, factorization based models achieve promising performance (Koren, 2008; Li et al., 2015; Chen et al., 2018a); they aim to learn user and item latent factors from known user-item action histories such as ratings and clicks. Popular factorization based CF models include Matrix Factorization (MF) and its variants (Mnih and Salakhutdinov, 2007; Koren, 2008; Cheng et al., 2012; Yang et al., 2013; Lian et al., 2014), regression-based latent factor models (Agarwal and Chen, 2009), Bayesian personalized ranking (Rendle et al., 2009), deep MF (Xue et al., 2017), neural MF (He et al., 2017), and Hash based MF (Chen et al., 2018b).

Besides the above CF models, in practice, ads, merchandise, and POI recommendations are also taken as a Click-Through Rate (CTR) prediction problem (Ling et al., 2017). Logistic Regression (LR) is popularly used in most Internet companies, e.g., Microsoft (Richardson et al., 2007) and Google (McMahan et al., 2013), due to its simplicity, scalability, and online learning capability. Deep neural network (DNN) has also been widely used due to its powerful representation ability (Zhang et al., 2016). Later on, Wide & deep (Cheng et al., 2016) combines the advantages of both LR and DNN for better performance. Besides, Factorization Machine (FM) (Rendle, 2012) and its variations, e.g., DeepFM (Guo et al., 2017) and Field-aware FM (Juan et al., 2016), are also extensively used since they can capture the high-order interactions between features.

Although traditional recommender systems achieve promising performance, they build centralized recommendation models on top of collected user data. Both the private data (features and actions) and the models are held by the recommender, which raises serious privacy concerns (Lam et al., 2006; Vallet et al., 2014; Ricci et al., 2015). In this paper, we treat POI recommendation as a CTR prediction problem and propose a novel Privacy preserving POI Recommendation (PriRec) framework for it. PriRec keeps users’ private data and models on users’ own devices, e.g., cellphone or pad, and thus addresses the privacy issue.

2.2. Privacy Preserving Recommender System

To date, different approaches have been proposed to solve the privacy issues of traditional recommender systems. The first type is based on randomized perturbation or differential privacy techniques (Dwork, 2008). That is, they protect users’ original data by adding noise to them. Popular methods of this type include (Polat and Du, 2005; Berkovsky et al., 2007; McSherry and Mironov, 2009; Riboni and Bettini, 2012; Hua et al., 2015; Meng et al., 2018). These methods are efficient and easy to implement; however, there is a trade-off between privacy and recommendation accuracy, i.e., the recommendation performance decreases as the degree of privacy increases. The second type is based on cryptographic techniques such as homomorphic encryption (Gentry and Boneh, 2009) and secure Multi-Party Computation (MPC) (Yao, 1986), and typical methods include (Canny, 2002; Polat and Du, 2005; Aïmeur et al., 2008; Erkin et al., 2010; Nikolaenko et al., 2013). These approaches can usually achieve performance comparable to traditional recommender systems; however, the low efficiency of the cryptographic techniques limits their application in practice.

Besides the above privacy-preserving recommendation models, some existing approaches focus on combining the private data of multiple parties, e.g., different hospitals and banks, while training machine learning models such as LR (Chaudhuri and Monteleoni, 2009; Mohassel and Zhang, 2017), which is the so-called collaborative learning or shared machine learning in the literature (Chen et al., 2020). They do this by using differential privacy or MPC. The fundamental difference between these works and ours is that they assume users’ data have been collected by several parties who want to protect their collected data from other parties, while our approach assumes users’ private raw data are kept on their own devices.

The most similar work to ours is Federated Learning (FL) (Konečnỳ et al., 2016; Bonawitz et al., 2017). However, PriRec is different from FL in two aspects: (1) FL assumes that data are decentralized on each user’s device and the model is kept by the server (recommender), while in PriRec, both users’ private data and models are decentralized on each user’s device, and therefore PriRec has better user privacy guarantees; (2) FL only uses secure gradient aggregation strategy to learn neural network model while PriRec uses both secure decentralized gradient descent protocol and secure gradient aggregation strategy to learn FM model.

2.3. Local Differential Privacy

Differential Privacy (DP) was proposed in the global privacy context to ensure that an adversary cannot reliably infer whether or not a particular individual participates in a database query, while Local Differential Privacy (LDP) was proposed in the local privacy context, where individuals disclose their personal information themselves (Kairouz et al., 2014; Cormode et al., 2018). LDP can estimate statistical values of data, e.g., mean and histogram, without disclosing users’ raw data, and has been adopted by many companies, including Google (Erlingsson et al., 2014), Apple (Thakurta et al., 2017), and Microsoft (Ding et al., 2017). Recently, LDP has also been applied in recommender systems to protect private user-item ratings (Shen and Jin, 2016; Shin et al., 2018). However, directly using LDP-perturbed data to build models decreases recommendation performance.

In this paper, we propose to adopt LDP to generate dynamic POI features (e.g., the visited count of a POI) instead of directly building models, which can protect user-POI actions and capture the popularity of POIs. We will show in experiments that the generated POI features can significantly improve recommendation performance.

2.4. Secret Sharing

Secret sharing was first proposed in (Shamir, 1979). The basic idea of secret sharing is to distribute a secret amongst a group of participants (parties), each of whom has a share of the secret. The secret can be reconstructed only when a sufficient number of shares are combined together, and individual shares are of no use on their own. We focus on $n$-out-of-$n$ Secret Sharing in this paper, i.e., all shares are needed to reconstruct a secret. To share an $\ell$-bit value $a$ for party $i$, party $i$ generates $\{a_j \in \mathbb{Z}_{2^\ell},\ j \in \{1, \dots, n\} \text{ and } j \neq i\}$ uniformly at random, sends $a_j$ to party $j$, and keeps $a_i = a - \sum_{j \neq i} a_j \bmod 2^\ell$. We use $\langle a \rangle_i = a_i$ to denote the share of party $i$. To reconstruct a shared value $\langle a \rangle$, each party $i$ sends $\langle a \rangle_i$ to one who computes $\sum_{i=1}^{n} \langle a \rangle_i \bmod 2^\ell$.

The above protocols cannot work directly with decimal numbers, since it is not possible to sample uniformly in $\mathbb{R}$ (Cock et al., 2015). We approximate decimal arithmetic following the existing work (Mohassel and Zhang, 2017). Suppose $a$ and $b$ are two decimal numbers with at most $l_F$ bits in the fractional part. To do fixed-point multiplication, we first transform them to integers by letting $a' = 2^{l_F} a$ and $b' = 2^{l_F} b$, and then calculate $z = a' b'$. Finally, we truncate the last $l_F$ bits of $z$ so that it has at most $l_F$ bits representing the fractional part. It has been proven that this truncation technique also works when $z$ is secret shared (Mohassel and Zhang, 2017).
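The sharing, reconstruction, and fixed-point steps described in this subsection can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the ring size (`ELL = 64`), the number of fractional bits (`FRAC_BITS = 16`), and all function names are hypothetical choices for the example, not values taken from the paper, and `fixed_mul` works on plaintext values rather than on shares.

```python
import secrets

ELL = 64            # assumed bit length of the ring Z_{2^ELL}
FRAC_BITS = 16      # assumed number of fractional bits (l_F)
MOD = 1 << ELL


def share(value, n_parties):
    """Split an integer in Z_{2^ELL} into n additive shares."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    # The dealer keeps the share that makes all shares sum to the secret.
    shares.append((value - sum(shares)) % MOD)
    return shares


def reconstruct(shares):
    """All n shares are needed: their sum mod 2^ELL is the secret."""
    return sum(shares) % MOD


def encode(x):
    """Map a decimal number to a fixed-point element of the ring."""
    return round(x * (1 << FRAC_BITS)) % MOD


def decode(v):
    """Inverse of encode; the upper half of the ring represents negatives."""
    if v >= MOD // 2:
        v -= MOD
    return v / (1 << FRAC_BITS)


def fixed_mul(a, b):
    """Plaintext fixed-point product: scale, multiply, truncate FRAC_BITS bits."""
    z = round(a * (1 << FRAC_BITS)) * round(b * (1 << FRAC_BITS))
    return (z >> FRAC_BITS) / (1 << FRAC_BITS)
```

Because the shares are additive, parties can add two shared values locally, share by share, and the reconstructed result is the sum of the two secrets. That property is what makes additive sharing convenient for summing model vectors.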

Secret sharing has been widely used in various machine learning algorithms, including linear regression (Cock et al., 2015), neural networks (Mohassel and Zhang, 2017), and recommender systems (Chen et al., 2020). In this paper, we apply secret sharing to decentralized gradient descent, and propose a secure decentralized gradient descent protocol for users to learn the linear model of PriRec collaboratively, without compromising users’ private data and model.

3. The Proposed Privacy Preserving POI Recommendation Framework

In this section, we first describe motivations, notations, and problem definitions. Next, we present the Privacy preserving POI Recommendation (PriRec) framework, followed by its main components in detail. We then summarize the training and prediction algorithms of PriRec, and finally analyze their complexities.

3.1. Preliminary

We first describe the motivation of our proposed PriRec framework, and then present the notations and problem definition, and finally describe model optimization.

Figure 2. Privacy preserving POI Recommendation (PriRec) framework.

3.1.1. Motivation

User privacy in POI recommendation should include two parts, i.e., the data that explicitly expose user privacy and the model that implicitly indicates user preferences or interests.

Data privacy. Both user and item (POI) features are important to recommendation performance. User features reveal private information about users, e.g., age, occupation, and consumption ability, and are the most important information to protect when building a privacy preserving POI recommender system. One reasonable approach is to keep this private information decentralized on users’ own devices instead of collecting it. POI features describe the static profile and dynamic operation status of a POI, both of which are public to all the users. The static POI features are usually POI profiles, e.g., the dish category of a restaurant. The dynamic POI features are usually operation status data, e.g., the check-in count of a hotel. However, these dynamic POI data are derived from user-POI interaction histories, e.g., user-hotel check-in history, which are also part of users’ private data. Thus, we need techniques that can protect individual user-POI actions while still estimating the user-POI action count for each POI.

To sum up, a privacy preserving POI recommender system should protect both user features and user-POI interaction data.

Model privacy. We treat POI recommendation as a CTR prediction problem and design our model by following the Factorization Machine (FM) (Rendle, 2012), since FM and its variants are popularly used due to their scalability and capability of capturing high-order feature interactions. Suppose each sample has $D$ real-valued features $\mathbf{x} \in \mathbb{R}^{D}$; its prediction under the 2-order FM model is defined as

$$\hat{y}(\mathbf{x}) = \sum_{d=1}^{D} w_d x_d + \sum_{d=1}^{D} \sum_{d'=d+1}^{D} \langle \mathbf{v}_d, \mathbf{v}_{d'} \rangle\, x_d x_{d'}. \tag{1}$$

The FM model has two parts, i.e., the linear model and the high-order feature interaction model. First, $\mathbf{w} \in \mathbb{R}^{D}$ is the linear model and each parameter $w_d$ denotes the weight of feature $x_d$. The linear model indicates the users’ preferences on each feature and thus implicitly exposes users’ interests to some extent. Therefore, it should be kept privately by each user and not exposed to other users or the recommender. Second, $\mathbf{V} \in \mathbb{R}^{D \times K}$ is the 2-order feature interaction model and $K$ is the dimensionality of the feature interaction factorization. Here $\langle \mathbf{v}_d, \mathbf{v}_{d'} \rangle$ captures the weight of each feature interaction pair $x_d x_{d'}$. The weights of feature interaction pairs do not expose users’ data or interests, and therefore can be published to the recommender.
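As an illustration of the two-part structure described above, the following minimal Python sketch computes a 2-order FM score as a linear term plus a pairwise interaction term. It assumes the standard FM form of Rendle (2012) without a bias term; the function names are hypothetical, and the interaction term uses the well-known identity that reduces the pairwise sum to linear time in the number of features.

```python
def fm_predict(x, w, V):
    """2-order FM score: linear part plus factorized pairwise interactions.

    x: list of D feature values
    w: list of D linear weights (the part kept on the user's side)
    V: D x K list of lists (the feature interaction factors)
    """
    D, K = len(x), len(V[0])
    linear = sum(w[d] * x[d] for d in range(D))
    interaction = 0.0
    for k in range(K):
        s = sum(V[d][k] * x[d] for d in range(D))
        s_sq = sum((V[d][k] * x[d]) ** 2 for d in range(D))
        # 0.5 * ((sum)^2 - sum of squares) equals the sum over pairs d < d'.
        interaction += 0.5 * (s * s - s_sq)
    return linear + interaction


def fm_predict_naive(x, w, V):
    """Same score computed with the explicit double sum, for checking."""
    D, K = len(x), len(V[0])
    score = sum(w[d] * x[d] for d in range(D))
    for d in range(D):
        for e in range(d + 1, D):
            dot = sum(V[d][k] * V[e][k] for k in range(K))
            score += dot * x[d] * x[e]
    return score
```

The split matters for the privacy argument: `w` is the only argument that encodes a user's per-feature preference, while `V` only encodes how strongly pairs of features interact.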

In summary, a privacy preserving POI recommender system should protect the sensitive models, e.g., the linear model of FM.

3.1.2. Notations and problem definition

Formally, let $\mathcal{U}$ be the user set and $\mathbf{x}_i$ be the private user features of user $i \in \mathcal{U}$. Let $\mathcal{I}$ be the item (POI) set and $\mathbf{x}_j$ be the public POI features of POI $j \in \mathcal{I}$. Let $o_{ij}$ be an interaction between user $i$ and item $j$, $\mathbf{x}_{ij} = \mathbf{x}_i \oplus \mathbf{x}_j$ be the feature of a sample [Footnote 1: For simplification, we do not formalize contextual features such as distance and period of time.], with $\oplus$ being the concatenation operation, and $y_{ij}$ be the action, e.g., click or not. Let $\mathcal{O}$ be the training dataset, where all the user-item interactions are known.

Let $\mathbf{W}$ be the linear models of users, where each row $\mathbf{w}_i$ denotes the private linear model saved on the device of user $i$, and let $\mathbf{V}$ be the public feature interaction model held by the recommender. The privacy preserving POI recommendation problem is to predict $\hat{y}_{ij}$ for unknown user-POI pairs while keeping $\mathbf{x}_i$ and $\mathbf{w}_i$ private. We summarize the notations used in this paper in Table 1.

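Notation                      Description
$\mathcal{U}$                 user set
$\mathcal{I}$                 item set
$\mathbf{x}_i$                private features of user $i$
$\mathbf{x}_j$                public features of item $j$
$o_{ij}$                      an interaction between user $i$ and item $j$
$\mathbf{x}_{ij}$             feature of an interaction $o_{ij}$
$y_{ij}$                      label of an interaction $o_{ij}$
$\hat{y}_{ij}$                predicted label of an interaction $o_{ij}$
$\mathbf{W}$                  private linear models
$\mathbf{w}_i$                private linear model of user $i$
$\mathbf{V}$                  public feature interaction models
$\lambda_w$ and $\lambda_v$   regularization parameters
$\nabla w_i^d$                $d$-th element in the gradient of $\mathbf{w}_i$
$\nabla \mathbf{V}$           gradient of $\mathbf{V}$
$\alpha$                      learning rate
$\mathcal{N}(i)$              neighbors of user $i$
$S_{if}$                      relation strength between user $i$ and $f$
$K$                           factorization dimension of $\mathbf{V}$
$D$                           feature dimension
$\sigma(x)$                   logistic function with input $x$
$\langle a \rangle_i$         $i$-th share of secret value $a$
$\mathcal{M}$                 $\epsilon$-LDP randomized algorithm
$\mathcal{O}$                 training dataset
Table 1. Notation and description.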

3.1.3. Model optimization

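In this paper, we treat POI recommendation as a CTR prediction problem. The optimization task is to minimize the sum of losses over the training dataset $\mathcal{O}$:

$$\min_{\mathbf{W}, \mathbf{V}} \sum_{(i,j) \in \mathcal{O}} -\ln \sigma\left(y_{ij}\, \hat{y}_{ij}\right) + \frac{\lambda_w}{2} \lVert \mathbf{W} \rVert_F^2 + \frac{\lambda_v}{2} \lVert \mathbf{V} \rVert_F^2, \tag{2}$$

where $\sigma(\cdot)$ is the logistic function, $\lambda_w$ and $\lambda_v$ are the regularization parameters for the linear models and the feature interaction model respectively, and $\hat{y}_{ij}$ is defined in Equation (1). For each user-POI pair $(i, j)$, its gradient with respect to each element $w_i^d$ in the linear model is

$$\nabla w_i^d = \left(\sigma(y_{ij}\, \hat{y}_{ij}) - 1\right) y_{ij}\, x_{ij}^d + \lambda_w w_i^d. \tag{3}$$

Its gradient in terms of each element $v_{d,k}$ in the feature interaction model $\mathbf{V}$ is

$$\nabla v_{d,k} = \left(\sigma(y_{ij}\, \hat{y}_{ij}) - 1\right) y_{ij} \left( x_{ij}^d \sum_{d'=1}^{D} v_{d',k}\, x_{ij}^{d'} - v_{d,k} \left(x_{ij}^d\right)^2 \right) + \lambda_v v_{d,k}. \tag{4}$$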
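To make the per-sample gradient computation concrete, here is a hedged Python sketch. It assumes a specific loss that is an assumption of this example: the logistic loss $-\ln\sigma(y\hat{y})$ with labels in $\{-1, 1\}$ and L2 regularization on both parts, applied to the standard 2-order FM score without a bias term. The function names are hypothetical, and `loss` is included only so the analytic gradients can be checked numerically.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fm_predict(x, w, V):
    """Standard 2-order FM score without bias (assumed form)."""
    D, K = len(x), len(V[0])
    score = sum(w[d] * x[d] for d in range(D))
    for k in range(K):
        s = sum(V[d][k] * x[d] for d in range(D))
        s_sq = sum((V[d][k] * x[d]) ** 2 for d in range(D))
        score += 0.5 * (s * s - s_sq)
    return score


def loss(x, y, w, V, lam_w, lam_v):
    """Assumed per-sample objective: logistic loss plus L2 penalties."""
    reg_w = 0.5 * lam_w * sum(wd * wd for wd in w)
    reg_v = 0.5 * lam_v * sum(v * v for row in V for v in row)
    return -math.log(sigmoid(y * fm_predict(x, w, V))) + reg_w + reg_v


def fm_gradients(x, y, w, V, lam_w, lam_v):
    """Gradients of `loss` with respect to w and V for one sample."""
    D, K = len(x), len(V[0])
    # Derivative of -ln(sigmoid(y * y_hat)) with respect to y_hat.
    coef = (sigmoid(y * fm_predict(x, w, V)) - 1.0) * y
    grad_w = [coef * x[d] + lam_w * w[d] for d in range(D)]
    sums = [sum(V[d][k] * x[d] for d in range(D)) for k in range(K)]
    grad_V = [
        [coef * (x[d] * sums[k] - V[d][k] * x[d] * x[d]) + lam_v * V[d][k]
         for k in range(K)]
        for d in range(D)
    ]
    return grad_w, grad_V
```

In the privacy preserving setting, `grad_w` would stay on the user's device, while `grad_V` is the quantity a user would contribute to the recommender through secure aggregation.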

In the traditional centralized setting, all the data and models are kept by the recommender, and FM can be efficiently learnt by using gradient descent (Rendle, 2012). In contrast, as we described earlier, in our privacy preserving setting, the private data and the linear models are held in a decentralized manner by users. We will present how to learn the linear models and the feature interaction model in the privacy preserving setting in Section 3.4 and Section 3.5, respectively.

3.2. Overview of Privacy Preserving POI Recommendation Framework

Our proposed PriRec framework, shown in Figure 2, protects both private data and models. To protect data privacy, users’ private data, including features and actions, are decentralized on their own devices, e.g., cellphone or pad. Besides, all the public POI data are kept by the recommender, and they are mainly of two types: the static data that describe the status of a POI, such as POI category, and the dynamic data that indicate the popularity of a POI, e.g., visited count. To protect model privacy, the linear models of PriRec are also decentralized on each user’s side, and we propose a secure decentralized gradient descent protocol for users to learn them collaboratively. The feature interaction model is kept by the recommender, since it can only be used to infer the interaction weights between features, which poses no privacy risk. We adopt the secure aggregation strategy from federated learning to learn it. In this way, both users’ private data and models are kept in their own hands, and PriRec only collects the perturbed user-POI interaction data. Therefore, PriRec is able to protect both data and model privacy. We present each part of the framework in detail in the following sections.

3.3. Generating POI Dynamic Feature

Figure 3. Generate dynamic features for POI using LDP.

We propose to generate dynamic POI features, e.g., click count of a restaurant, by using Local Differential Privacy (LDP) to collect perturbed user-POI interaction data. In LDP, each user randomizes his/her private data using a randomized algorithm (mechanism) locally, before sending them to data collector (recommender).

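Definition 3.1.

A randomized algorithm $\mathcal{M}: \mathcal{X} \rightarrow \mathcal{Y}$ is $\epsilon$-locally differentially private ($\epsilon$-LDP) if for any pair of values $x, x' \in \mathcal{X}$ and any subset of output $S \subseteq \mathcal{Y}$, we have that

$$\Pr[\mathcal{M}(x) \in S] \le e^{\epsilon} \cdot \Pr[\mathcal{M}(x') \in S].$$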

LDP formalizes a type of plausible deniability: no matter what output is released, it is approximately equally as likely to have come from one data point as any other (Bassily and Smith, 2015; Ding et al., 2017). In other words, the recommender cannot tell whether a user has interacted with a POI or not, although it collects a perturbed user-POI interaction. The user-POI interaction $y_{ij}$ is a binary value [Footnote 2: We take $y_{ij}$ as in $\{0, 1\}$ when collecting data and as in $\{-1, 1\}$ when learning the model.], which can be collected from users’ devices by using the following mechanism:

$$\tilde{y}_{ij} = \begin{cases} 1, & \text{with probability } \dfrac{1}{e^{\epsilon}+1} + y_{ij} \cdot \dfrac{e^{\epsilon}-1}{e^{\epsilon}+1}, \\ 0, & \text{otherwise}. \end{cases} \tag{5}$$

After that, the recommender obtains the bits $\tilde{y}_{ij}$ from all the users, and the total interaction count for POI $j$ can be estimated as

$$\hat{c}_j = \sum_{i \in \mathcal{U}} \frac{\tilde{y}_{ij}\,(e^{\epsilon}+1) - 1}{e^{\epsilon}-1}. \tag{6}$$

It can be proven that the above data collection mechanism preserves $\epsilon$-LDP while achieving an unbiased estimate of the POIs’ dynamic features (Ding et al., 2017). Besides the dynamic visited count, LDP can also be used to estimate dynamic real-valued features, e.g., the average consumption of a POI. Figure 3 shows how to generate dynamic visited count features using LDP.
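The collection and estimation steps can be sketched as follows. This is a hedged example that assumes the one-bit randomized mechanism of Ding et al. (2017) applied to a binary interaction in $\{0, 1\}$: each user reports 1 with probability $1/(e^{\epsilon}+1) + y\,(e^{\epsilon}-1)/(e^{\epsilon}+1)$, and the collector debiases the sum of reported bits. Function names and the privacy budget used in the usage note are illustrative assumptions.

```python
import math
import random


def prob_one(y, eps):
    """Probability of reporting 1 for a true bit y in {0, 1} (assumed mechanism)."""
    e = math.exp(eps)
    return 1.0 / (e + 1.0) + y * (e - 1.0) / (e + 1.0)


def perturb(y, eps, rng=random):
    """Run on the user's device: release one randomized bit."""
    return 1 if rng.random() < prob_one(y, eps) else 0


def estimate_count(bits, eps):
    """Run by the recommender: debiased estimate of the true count."""
    e = math.exp(eps)
    return sum((b * (e + 1.0) - 1.0) / (e - 1.0) for b in bits)
```

For instance, with `eps = 1.0` a user who visited the POI reports 1 with probability about 0.73, and a user who did not reports 1 with probability about 0.27. The ratio of the two is $e^{\epsilon}$, which is exactly the bound the $\epsilon$-LDP definition allows. The estimate is unbiased but noisy, so it is more useful for popular POIs with many reporting users.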

3.4. Learning Linear Model

The linear models are decentralized on each user’s device for privacy reasons. Therefore, a key challenge is how users should collaboratively learn their linear models. To address this challenge, we first show the learning procedure of the linear model in the traditional centralized setting. By using gradient descent, the linear model is updated as follows:

$$\mathbf{w}^{t+1} = \mathbf{w}^{t} - \alpha\, \nabla \mathbf{w}^{t}, \tag{7}$$

where $\alpha$ is the learning rate, and $\nabla \mathbf{w}^{t}$ is the gradient of $\mathbf{w}$ at time $t$. In the decentralized learning setting, data are held by individual learners and traditional gradient descent is no longer suitable. Existing research proposes to approximate Equation (7) by using Decentralized Gradient Descent (DGD) (Nedic and Ozdaglar, 2009; Yuan et al., 2016),

$$w_i^{d,t+1} = \sum_{f \in \mathcal{N}(i)} S_{if}\, w_f^{d,t} - \alpha\, \nabla w_i^{d,t}, \tag{8}$$

where $w_i^{d,t}$ is the $d$-th model parameter of user $i$ at time $t$, $\mathcal{N}(i)$ denotes the neighbors of $i$ on a certain user network, and $S_{if}$ denotes the edge weight between $i$ and $f$. We argue that DGD is not secure in our privacy preserving setting, since directly calculating the weighted sum of neighbors’ linear models, i.e., $\sum_{f \in \mathcal{N}(i)} S_{if}\, w_f^{d,t}$, needs the plaintext model of neighbors, i.e., $w_f^{d,t}$, which directly reflects the preferences of users.

Algorithm 1. Secure decentralized gradient descent protocol.

To solve the above problem, we propose a secure decentralized gradient descent protocol, as shown in Algorithm 1. The main idea is to use secret sharing to calculate the summation of neighbors’ linear models. Its security and correctness follow from (Shamir, 1979). Note that the linear models are usually real-valued vectors, and we adopt the efficient fixed-point arithmetic method described in Section 2.4, which has also been proven to work in secret sharing settings. With the proposed secure decentralized gradient descent protocol, we can train the linear model without compromising users’ private data and model.
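The idea of summing neighbors' models through secret sharing can be illustrated with the following Python sketch. It is not the paper's Algorithm 1, whose exact steps are not reproduced in this text. It is a single-process simulation under stated assumptions: every neighbor splits its fixed-point encoded model into additive shares, each participant only ever adds up the shares it receives, and only the total is decoded. The ring size, fractional bits, function names, and the uniform `weight` parameter are all hypothetical.

```python
import secrets

MOD = 1 << 64        # assumed ring Z_{2^64}
FRAC_BITS = 16       # assumed fixed-point precision


def encode(x):
    return round(x * (1 << FRAC_BITS)) % MOD


def decode(v):
    if v >= MOD // 2:
        v -= MOD
    return v / (1 << FRAC_BITS)


def share_vector(vec, n):
    """Split an encoded model vector into n additive share vectors."""
    shares = [[secrets.randbelow(MOD) for _ in vec] for _ in range(n - 1)]
    last = [(encode(x) - sum(s[d] for s in shares)) % MOD
            for d, x in enumerate(vec)]
    return shares + [last]


def secure_sum(models):
    """Sum the models without any party seeing another party's plaintext."""
    n, dim = len(models), len(models[0])
    # all_shares[owner][p] is the share that `owner` sends to participant p.
    all_shares = [share_vector(m, n) for m in models]
    # Each participant p adds up the shares it received (random-looking values).
    partial = [[sum(all_shares[o][p][d] for o in range(n)) % MOD
                for d in range(dim)] for p in range(n)]
    # Only the partial sums are combined, which reveals the total alone.
    total = [sum(partial[p][d] for p in range(n)) % MOD for d in range(dim)]
    return [decode(t) for t in total]


def dgd_step(neighbor_models, grad_i, lr, weight=1.0):
    """One DGD-style update using the securely computed neighbor sum."""
    agg = secure_sum(neighbor_models)
    return [weight * a - lr * g for a, g in zip(agg, grad_i)]
```

A design point worth noting is that the update only needs the sum of the neighbors' models, not the individual models, which is why additive sharing is sufficient here and no multiplication on shares is required.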

The remaining challenge is how to choose neighbors for model propagation. We address this challenge by analyzing real data from the POI recommendation channels of the Koubei APP. Figure 4 shows the relationship between user-POI distances and actions. We can observe that, in practice, users tend to click nearby POIs. In other words, POIs are likely to be interacted with by nearby users. Therefore, we build the user adjacent network by using user geographical information, similar to existing research (Ye et al., 2011; Cheng et al., 2012). Specifically, let $d_{if}$ be the distance between user $i$ and $f$; the edge weight between $i$ and $f$ is defined as $S_{if} = \phi(d_{if})$, where $\phi(\cdot)$ is a mapping function that transforms distance to edge weight. Various mapping functions have been proposed in the literature (Zhao et al., 2016).

Figure 4. Relationship between user-POI distance and click.

In practice, a user cannot communicate with all the other users, because (1) the communication cost is expensive, and (2) only a handful of users’ devices are online. Therefore, for each user $i$, we randomly choose his/her closest top $N$ neighbors based on the distance. Furthermore, for simplification, we set the edge weights of the built user adjacent network to 1 after choosing neighbors, i.e., $S_{if} = 1$. We will empirically study the effect of the maximum number of neighbors ($N$) on our model performance.
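A minimal sketch of building such a user adjacent network is given below. It makes several assumptions that are not from the paper: user locations are plain 2-D coordinates, distance is Euclidean, all users are treated as online, and the closest users are picked deterministically rather than sampled. The function name `build_neighbors` is hypothetical.

```python
import math


def build_neighbors(locations, max_neighbors):
    """Map each user to its closest users, with every edge weight set to 1.

    locations: dict of user id -> (x, y) coordinates (assumed representation)
    max_neighbors: upper bound on the number of neighbors per user
    """
    network = {}
    for u, (ux, uy) in locations.items():
        by_distance = sorted(
            (math.hypot(ux - vx, uy - vy), v)
            for v, (vx, vy) in locations.items()
            if v != u
        )
        # Edge weights are simplified to 1 after the neighbors are chosen.
        network[u] = {v: 1.0 for _, v in by_distance[:max_neighbors]}
    return network
```

Note that the resulting network is not necessarily symmetric: a user in a sparse area may choose a neighbor who, in turn, has closer users to choose from.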

3.5. Learning Feature Interaction Model

The feature interaction model is kept by the recommender, so the relationship between the recommender and individual learners is similar to that of server and worker in the parameter-server distributed learning paradigm (Li et al., 2014). Existing works propose a secure aggregation strategy to train neural network models in federated learning settings (Konečnỳ et al., 2016; Bonawitz et al., 2017). Motivated by this, we adopt the secure aggregation strategy from federated learning for users to learn the feature interaction model of FM collaboratively. Specifically, once a batch $\mathcal{B}$ of online users have interactions with a POI, these users first pull the current feature interaction model $\mathbf{V}$ from the recommender. Each user $i \in \mathcal{B}$ then calculates the gradient $\nabla \mathbf{V}_i$ based on Equation (4). After that, they securely aggregate the gradients $\sum_{i \in \mathcal{B}} \nabla \mathbf{V}_i$. In this paper, we adopt the simplest secure aggregation protocol in (Bonawitz et al., 2017), i.e., one-time pad masking based on secret sharing. Please refer to Section 4.0.1 in (Bonawitz et al., 2017) for more details. Finally, the recommender updates $\mathbf{V}$ as follows:

$$\mathbf{V} \leftarrow \mathbf{V} - \alpha \sum_{i \in \mathcal{B}} \nabla \mathbf{V}_i. \tag{9}$$

This strategy follows the same principle as the parameter server distributed learning paradigm (Li et al., 2014). That is, the server (i.e., the recommender) stores the model parameters ($\mathbf{V}$), and the workers (i.e., the users) load data and update the model by communicating with the server. It becomes an asynchronous learning task when multiple users interact with POIs simultaneously, which is also a common scenario for parameter servers (Li et al., 2014).
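The one-time pad masking idea behind secure aggregation can be sketched as follows. This is a simplified, single-process simulation and not the full protocol of Bonawitz et al. (2017): it assumes every pair of users has already agreed on a shared random pad, that no user drops out, and that gradients are flattened into lists and encoded as fixed-point integers. The ring size, precision, and function names are hypothetical.

```python
import secrets

MOD = 1 << 64        # assumed ring Z_{2^64}
FRAC_BITS = 16       # assumed fixed-point precision


def encode(x):
    return round(x * (1 << FRAC_BITS)) % MOD


def decode(v):
    if v >= MOD // 2:
        v -= MOD
    return v / (1 << FRAC_BITS)


def mask_gradients(gradients):
    """Return what each user would upload: its gradient hidden by pairwise pads."""
    n, dim = len(gradients), len(gradients[0])
    masked = [[encode(g) for g in grad] for grad in gradients]
    for u in range(n):
        for v in range(u + 1, n):
            # Pad known only to users u and v in a real deployment.
            pad = [secrets.randbelow(MOD) for _ in range(dim)]
            for d in range(dim):
                masked[u][d] = (masked[u][d] + pad[d]) % MOD
                masked[v][d] = (masked[v][d] - pad[d]) % MOD
    return masked


def aggregate(masked):
    """Server side: pads cancel in the sum, so only the total is revealed."""
    dim = len(masked[0])
    return [decode(sum(m[d] for m in masked) % MOD) for d in range(dim)]


def update_model(v_flat, masked, lr):
    """Gradient step on the flattened interaction model using the aggregate."""
    return [v - lr * g for v, g in zip(v_flat, aggregate(masked))]
```

Each uploaded vector looks uniformly random on its own, so the recommender learns the batch total of the gradients but not any single user's contribution.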

Input: training set ($\mathcal{O}$), learning rate ($\alpha$), regularization parameters ($\lambda_w$, $\lambda_v$), maximum propagation users ($N$), feature interaction factorization dimension ($K$), and maximum iterations ($T$)
Output: linear models for all the users ($\mathbf{W}$) and feature interaction model for the recommender ($\mathbf{V}$)
1 The recommender initializes $\mathbf{V}$
2 for each user $i \in \mathcal{U}$ do
3       Initialize $\mathbf{w}_i$
4 end for
5 for $t = 1$ to $T$ do
6       Shuffle training data $\mathcal{O}$
7       for each user-POI pair $(i, j) \in \mathcal{O}$, user $i$ do
8             # learn linear model
9             Calculate $\nabla \mathbf{w}_i$ based on Equation (3)
10             Update $\mathbf{w}_i$ based on the secure decentralized gradient descent protocol in Algorithm 1
11             # learn feature interaction model
12             Pull $\mathbf{V}$ from the recommender
13             Calculate $\nabla \mathbf{V}_i$ based on Equation (4)
14             Push $\nabla \mathbf{V}_i$ to the recommender using secure aggregation
15             The recommender updates $\mathbf{V}$ based on Equation (9)
16            
17       end for
18      
19 end for
return $\mathbf{W}$ and $\mathbf{V}$
Algorithm 2 PriRec Model Training

3.6. Model Training and Prediction Algorithm

The training of PriRec includes two parts, i.e., learning the linear models and learning the feature interaction model, and we summarize it in Algorithm 1. As described in Section 3.4, the linear models are decentralized on each user’s device for privacy reasons, and we propose a secure decentralized gradient descent protocol for users to learn them collaboratively; this corresponds to lines 9-10 of Algorithm 1. The feature interaction model is kept by the recommender, and we adopt the secure aggregation strategy in federated learning for users to learn it collaboratively, as presented in Section 3.5, which corresponds to lines 12-15 of Algorithm 1.

The prediction of PriRec also requires communication between users and the recommender, as shown in Algorithm 2. In it, line 1 denotes the matching procedure before ranking. Different matching strategies can be used, e.g., the simplest location-based matching strategy. We do not describe the matching strategies in detail because they are not the focus of this paper. Line 5 omits the contextual features for conciseness.

In summary, PriRec is able to protect users’ private data and models during both model training and prediction. Similar to most prior privacy-preserving machine learning algorithms (Mohassel and Zhang, 2017), PriRec relies on secret sharing and can only protect against semi-honest adversaries. That is, PriRec assumes that the participants strictly follow the protocol. We leave the defense against malicious adversaries as future work.

Input: features for the user, features for each POI, linear model for the user on his/her device, and feature interaction model on the server (V)
Output: the recommended top POIs for the user
1 Get the matched POI set
2 Pull V from the recommender
3 for each POI in the matched set, the user does
4       # combine features
5       Pull the POI’s features from the recommender
6       Concatenate the user’s features with the POI’s features
7       # predict score
8       Predict the user’s score on the POI based on Equation (1)
9 end for
10 Recommend the top POIs with the highest scores to the user
return the top POIs for the user
Algorithm 2 PriRec Model Prediction for User
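The prediction procedure above can be sketched on the user side as follows. This is only an illustration under stated assumptions: it assumes that Equation (1) has the standard second-order FM form (Rendle, 2012) and drops the bias term, it ignores the matching step and the contextual features, and the names `fm_score`, `recommend_top`, `matched_pois`, and `top_n` are hypothetical.

```python
def fm_score(x, w, V):
    """Second-order FM score without bias: linear part plus pairwise interactions."""
    linear = sum(w_i * x_i for w_i, x_i in zip(w, x))
    pairwise = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            pairwise += dot * x[i] * x[j]
    return linear + pairwise


def recommend_top(user_features, matched_pois, w_user, V, top_n):
    """Score each matched POI on the user's device and return the top_n POI ids."""
    scores = {}
    for poi_id, poi_features in matched_pois.items():
        x = list(user_features) + list(poi_features)  # concatenate the features
        scores[poi_id] = fm_score(x, w_user, V)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

In this sketch the user's features and linear model `w_user` never leave the device; only the POI features and `V` are pulled from the recommender.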

3.7. Complexity Analysis

We now analyze the communication and computation complexities of Algorithm 1. The relevant quantities are the training data size, the feature size, the dimensionality of the feature interaction factorization, and the maximum number of neighbors to be communicated with.

Communication Complexity. For each user-POI pair, the communication consists of two parts: (1) users communicate with each other to learn the linear models, i.e., lines 9-10, and (2) users communicate with the recommender to learn the feature interaction model, i.e., lines 12-15. The total communication cost of Algorithm 1 accumulates these two parts over all user-POI pairs. In practice, the total communication complexity is linear in the data size.

Computation Complexity. For each user-POI pair, the computing bottleneck is Equation (1), which has linear computation complexity after reformulation (Rendle, 2012). The computation cost of learning the linear models, i.e., lines 9-10, and of learning the feature interaction model, i.e., lines 12-15, therefore accumulates over all user-POI pairs, and the total computation complexity of Algorithm 1 is also linear in the data size.
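The reformulation referred to here is the well-known identity from Rendle (2012), which replaces the double sum over feature pairs by one pass per factor. The sketch below assumes the standard second-order FM interaction term; the function names are hypothetical, and `V` is taken to be a list of per-feature factor vectors.

```python
def interaction_naive(x, V):
    """Pairwise interaction term computed directly over all feature pairs."""
    total = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total


def interaction_linear(x, V):
    """Same term, computed in time linear in the number of features per factor."""
    total = 0.0
    for f in range(len(V[0])):
        weighted_sum = 0.0      # sum_i V[i][f] * x[i]
        weighted_sq_sum = 0.0   # sum_i (V[i][f] * x[i]) ** 2
        for i, x_i in enumerate(x):
            term = V[i][f] * x_i
            weighted_sum += term
            weighted_sq_sum += term * term
        total += 0.5 * (weighted_sum * weighted_sum - weighted_sq_sum)
    return total
```

For example, with `x = [1.0, 1.0]` and `V = [[1.0, 2.0], [3.0, 4.0]]` both functions return 11.0, the inner product of the two factor vectors.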

For Algorithm 2, we analyze that, for predicting each user-POI pair, the communication complexity is and the computation complexity is , where denotes the feature size of POI . In practice is usually very small, therefore, the complexities are linear with .

4. Empirical Study

In this section, we empirically compare the performance of the proposed PriRec with the existing non-private POI recommendation method. We also study the effects of parameters on model performance.

4.1. Setting

Dataset #user #item #interaction #feature
Foursquare 11,824 13,924 924,474 6
Koubei 85,466 118,598 497,838 89
Table 2. Dataset description

Datasets. We choose two real-world user-POI interaction datasets for experiments, i.e., Foursquare and Koubei.

First, Foursquare is a well-known benchmark dataset for POI recommendation (Yang et al., 2016). It contains user-POI action histories in two cities, and we only use the data from Tokyo. We filter out the POIs that have interactions with fewer than 10 users. Since Foursquare only has positive user-POI interaction data, we randomly sample one negative user-POI interaction for each record, so the ratio of positive to negative records is 1:1. The original dataset only has user features such as gender and friend count. We also generate POI dynamic features using our proposed local DP technique. Moreover, since the Foursquare dataset does not record the geographic location at which a user interacts with a POI, we cannot build the user geographic adjacent network. Instead, we build the user adjacent network at random. That is, we randomly select the neighbors for each user-POI interaction.
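The 1:1 negative sampling step can be sketched as below. The paper does not spell out the sampling procedure, so this sketch assumes that a negative is drawn uniformly from the POIs the user has no recorded interaction with; the function name and the record layout `(user, poi, label)` are hypothetical.

```python
import random


def sample_negatives(positives, all_pois, rng):
    """Return one (user, poi, 0) record per positive (user, poi) pair."""
    seen = {}
    for user, poi in positives:
        seen.setdefault(user, set()).add(poi)
    negatives = []
    for user, _ in positives:
        candidates = [p for p in all_pois if p not in seen[user]]
        if not candidates:  # the user interacted with every POI
            continue
        negatives.append((user, rng.choice(candidates), 0))
    return negatives
```

Passing a seeded `random.Random` instance as `rng` keeps the sampled negatives reproducible across runs.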

Second, the Koubei dataset is collected from the POI recommendation channel in Koubei (https://www.koubei.com/), which is a product of Alibaba and Ant Financial in China, and we filter out the users and POIs that have fewer than 5 interactions. There are many kinds of POIs in this channel, such as restaurants, cinemas, and markets. The Koubei dataset consists of two parts: a positive user-POI interaction indicates that a user clicks on a POI, and a negative user-POI interaction implies that a user ignores a POI after exposure; the ratio of positive to negative interactions is about 1:3. The Koubei dataset has three kinds of features, as described in Section 3.1.1, i.e., user features such as hometown and gender, POI static features such as POI category, and the generated POI dynamic features such as the recently clicked count of POIs. The Koubei dataset has geographic location information, with which we build the user geographic adjacent network, subject to a maximum number of neighbors for each user.

Finally, Table 2 shows the statistics of both datasets after preprocessing.

Metrics. Since we take POI recommendation as a CTR prediction problem in this paper, we adopt the Area Under the receiver operating characteristic Curve (AUC) as the evaluation metric, which is commonly used to evaluate CTR prediction quality (Ling et al., 2017). In practice, the AUC of a classifier is equivalent to the probability that the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance (Fawcett, 2006); therefore, the higher the better.
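The probabilistic reading of AUC can be made concrete with a direct pairwise computation. This is a small sketch for intuition rather than the evaluation code used in the experiments; it counts ties as half a win, which is a common convention, and its quadratic cost makes it suitable only for small examples.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive ranks higher."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


print(auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))  # 0.75
```

In the example, three of the four positive-negative pairs are ordered correctly, which gives 0.75.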

We split both datasets with two strategies: (1) randomly sample 80% as the training set and use the remaining 20% as the test set, and (2) randomly sample 90% as the training set and use the remaining 10% as the test set. We use Foursquare80 and Koubei80 to denote the first strategy, and Foursquare90 and Koubei90 to denote the second strategy. We repeat this procedure three times and report the average results.

Model | Data | Information Leakage | Performance
MF | rating | rating | -
DMF | rating | gradient | MF (Chen et al., 2018c)
FM | rating, user feature, POI static feature, POI dynamic feature | rating and user feature | > MF (Rendle, 2012)
PriRec- | rating, user feature, POI static feature | no private information leakage | ?FM
PriRec | rating, user feature, POI static feature, POI dynamic feature | no private information leakage | ?FM
Table 3. Summary of the existing models and our proposed models, where users’ private information is shown in italics and ?FM means we want to study the corresponding model performance with FM during experiments.

Foursquare80 FM | Foursquare80 PriRec- | Foursquare80 PriRec | Foursquare90 FM | Foursquare90 PriRec- | Foursquare90 PriRec
0.8152 | 0.4777 | 0.7834 | 0.8106 | 0.4722 | 0.7818
0.8145 | 0.4771 | 0.7831 | 0.8098 | 0.4727 | 0.7824
0.8131 | 0.4749 | 0.7829 | 0.8083 | 0.4702 | 0.7816
Table 4. AUC comparison on Foursquare datasets

Koubei80 FM | Koubei80 PriRec- | Koubei80 PriRec | Koubei90 FM | Koubei90 PriRec- | Koubei90 PriRec
0.7154 | 0.7484 | 0.7605 | 0.7172 | 0.7495 | 0.7695
0.7180 | 0.7519 | 0.7633 | 0.7205 | 0.7534 | 0.7713
0.7192 | 0.7529 | 0.7643 | 0.7207 | 0.7546 | 0.7720
Table 5. AUC comparison on Koubei datasets

Comparison methods. Our proposed PriRec framework is a novel decentralized variant of the existing Factorization Machine (FM) (Rendle, 2012), and it belongs to the family of privacy-preserving decentralized recommendation approaches. FM has been shown to outperform the existing Matrix Factorization (MF) model (Mnih and Salakhutdinov, 2007) due to its ability to handle additional feature information besides the user-item interaction (rating) information. As long as the features are useful, which is typically the case in practice, FM can beat MF consistently. Therefore, we only compare our proposed model with FM. Moreover, we would like to study the contribution of the generated dynamic POI features to the accuracy of PriRec, and therefore we use PriRec- to denote the version of PriRec that does not use the dynamic POI features generated by LDP. We summarize the characteristics of the above-mentioned models in Table 3. From it, we can see that our proposed PriRec framework can utilize more information without compromising users’ private data.

Hyper-parameters. We set the parameter for LDP when generating dynamic POI features following existing research (Ding et al., 2017). We vary the maximum number of neighbors and the feature interaction factorization dimension of FM and PriRec to study their effects on model performance, and vary the maximum number of iterations to study its effect on model convergence. We search for the best values of the other hyper-parameters, including the learning rate and the regularization parameters, over a set of candidate values.

4.2. Comparison Results

We compare PriRec and PriRec- with the classic FM model on both Foursquare and Koubei datasets. Note that during the comparison, we use grid search to find the best parameters of each model.

Results on Foursquare. We first report the comparison results on Foursquare in Table 4. From it, we find that:

  • In most cases, the recommendation performance of each model decreases as the training data size and the dimensionality of the feature interaction factorization grow. This is because the Foursquare dataset only has 6 features, including 3 dynamic POI features generated by LDP, which makes the models prone to over-fitting.

  • The AUC of PriRec- is even below 0.5 (random guess), which is unsatisfactory. This is because the original Foursquare dataset only has 3 user features, which are not enough to train a reasonable model.

  • Our proposed dynamic POI popularity features generated by using LDP can significantly improve the recommendation performance of PriRec. For example, the AUC of PriRec is 65.57% higher than that of PriRec- on Foursquare90 (first row of Table 4).

  • PriRec and FM have comparable recommendation performance (0.78+ vs. 0.81+). That is, our proposed model can protect user privacy while sacrificing little recommendation accuracy.

Results on Koubei. We then report the comparison results on Koubei in Table 5. We observe that:

  • The recommendation performance of each model increases with the dimensionality of the feature interaction factorization. With enough features, the larger this dimensionality is, the better the learnt feature interaction model V captures the real relations between features.

  • Our proposed dynamic POI popularity features generated by using LDP can significantly improve the recommendation performance of PriRec. For example, the AUC of PriRec is 2.67% higher than that of PriRec- on Koubei90 (first row of Table 5).

  • PriRec consistently outperforms FM in all cases. For example, the AUC of PriRec is 6.30% higher than that of FM on Koubei80 (first row of Table 5). Note that FM uses all the features, including the dynamic POI features, in the traditional centralized training setting. The reason for the improvement is that, in POI recommendation scenarios, user-POI interactions exhibit location aggregation, i.e., most users are only active in a certain location. Unlike FM, which has a centralized linear model, PriRec is a decentralized model that learns the linear models for different users by using secure decentralized gradient descent. As a result, PriRec is able to capture users’ individual interests in different locations. This is consistent with the reality that users in different places have different tastes.
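The relative improvements quoted in the two lists above can be reproduced from the first rows of Tables 4 and 5 with a one-line formula, (new - base) / base. The helper name below is hypothetical; the AUC values are copied from the tables.

```python
def relative_improvement(new, base):
    """Relative improvement of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


# AUC values from the first rows of Tables 4 and 5
print(round(relative_improvement(0.7818, 0.4722), 2))  # PriRec vs. PriRec- on Foursquare90: 65.57
print(round(relative_improvement(0.7695, 0.7495), 2))  # PriRec vs. PriRec- on Koubei90: 2.67
print(round(relative_improvement(0.7605, 0.7154), 2))  # PriRec vs. FM on Koubei80: 6.3
```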

Figure 5. Average training and test losses of PriRec w.r.t. the number of iterations. Panels: (a) average training loss on Foursquare80; (b) average test loss on Foursquare80; (c) average training loss on Koubei80; (d) average test loss on Koubei80.

4.3. Parameter Analysis

We first analyze the convergence of PriRec in this section. We show the average training loss and test loss of PriRec w.r.t. the number of iterations in Figure 5, with the remaining hyper-parameters, including the maximum number of neighbors, held fixed. The figure shows that PriRec converges faster on Foursquare80 than on Koubei80. This is because there are only 6 features in the Foursquare dataset, whereas there are 89 features in Koubei.

Next, we study the effect of the maximum number of neighbors on PriRec- and PriRec, which is shown in Figure 6, with the other hyper-parameters held fixed. We find that, as the maximum number of neighbors increases, the performance of PriRec- and PriRec first increases and then tends to be stable. This indicates that PriRec- and PriRec, without and with POI dynamic features respectively, can achieve stable performance with only a handful of neighbors to communicate with, which matches the practical situation in which only a small proportion of devices are online. This experiment demonstrates the practicality of our proposed models.

Figure 6. Effect of the maximum number of neighbors on the AUC of PriRec- and PriRec. Panels: (a) PriRec- on Foursquare80; (b) PriRec- on Foursquare90; (c) PriRec on Koubei80; (d) PriRec on Koubei90.

Finally, we study the complexity of PriRec. We show the training time of PriRec w.r.t. the training data size in Figure 7, with the other hyper-parameters held fixed. Note that our experiments are conducted on a single PC, so the network communication time is ignored. We find that the training time of PriRec is indeed linear in the training data size, as analyzed in Section 3.7, which supports the efficiency of PriRec.

Figure 7. Training time (in seconds) of PriRec w.r.t. training data size.

5. Conclusion and Future Work

In this paper, we proposed a novel privacy preserving POI recommendation (PriRec) framework for the POI recommendation channel in Ant Financial. To do this, PriRec keeps users’ private profiles on their own devices, and adopts a local differential privacy technique to collect perturbed user-POI interaction data on the server for generating dynamic POI popularity features. Motivated by Factorization Machine (FM), our proposed model of PriRec includes two parts: (1) the linear models that are decentralized on each user’s side for privacy purposes, which are learnt collaboratively by our proposed secure decentralized gradient descent protocol, and (2) the feature interaction model that is kept by the recommender, which is learnt by the secure aggregation strategy in the federated learning paradigm. PriRec can not only protect data and model privacy, but also enjoys promising scalability. We applied PriRec to real-world datasets, and comprehensive experiments demonstrated that, compared with FM, PriRec achieves comparable or even better recommendation performance.

In the future, we would like to deploy PriRec in real products. We will also study how to strengthen our algorithm to protect against malicious adversaries.

References

  • D. Agarwal and B. Chen (2009) Regression-based latent factor models. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 19–28. Cited by: §2.1.
  • E. Aïmeur, G. Brassard, J. M. Fernandez, and F. S. M. Onana (2008) Alambic: a privacy-preserving recommender system for electronic commerce. International Journal of Information Security 7 (5), pp. 307–334. Cited by: §1, §2.2.
  • R. Bassily and A. Smith (2015) Local, private, efficient protocols for succinct histograms. In Proceedings of the forty-seventh annual ACM symposium on Theory of Computing, pp. 127–135. Cited by: §3.3.
  • S. Berkovsky, Y. Eytani, T. Kuflik, and F. Ricci (2007) Enhancing privacy and preserving accuracy of a distributed collaborative filtering. In Proceedings of the 2007 ACM conference on Recommender systems, pp. 9–16. Cited by: §1, §2.2.
  • K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth (2017) Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1175–1191. Cited by: §1, §2.2, §3.5.
  • J. Canny (2002) Collaborative filtering with privacy. In Proceedings 2002 IEEE Symposium on Security and Privacy, pp. 45–57. Cited by: §1, §2.2.
  • K. Chaudhuri and C. Monteleoni (2009) Privacy-preserving logistic regression. In Advances in Neural Information Processing Systems, pp. 289–296. Cited by: §2.2.
  • C. Chen, K. C. Chang, Q. Li, and X. Zheng (2018a) Semi-supervised learning meets factorization: learning to recommend with chain graph model. ACM Transactions on Knowledge Discovery from Data (TKDD) 12 (6), pp. 1–24. Cited by: §2.1.
  • C. Chen, L. Li, B. Wu, C. Hong, L. Wang, and J. Zhou (2020) Secure social recommendation based on secret sharing. arXiv preprint arXiv:2002.02088. Cited by: §2.2, §2.4.
  • C. Chen, Z. Liu, P. Zhao, L. Li, J. Zhou, and X. Li (2018b) Distributed collaborative hashing and its applications in ant financial. In SIGKDD, KDD ’18, New York, NY, USA, pp. 100–109. ISBN 978-1-4503-5552-0. Cited by: §2.1.
  • C. Chen, Z. Liu, P. Zhao, J. Zhou, and X. Li (2018c) Privacy preserving point-of-interest recommendation using decentralized matrix factorization. In Thirty-Second AAAI Conference on Artificial Intelligence, pp. 257–264. Cited by: §1, Table 3.
  • C. Cheng, H. Yang, I. King, and M. R. Lyu (2012) Fused matrix factorization with geographical and social influence in location-based social networks. In Twenty-Sixth AAAI Conference on Artificial Intelligence, Vol. 12, pp. 17–23. Cited by: §2.1, §3.4.
  • H. Cheng, L. Koc, J. Harmsen, T. Shaked, T. Chandra, H. Aradhye, G. Anderson, G. Corrado, W. Chai, M. Ispir, et al. (2016) Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, pp. 7–10. Cited by: §2.1.
  • M. d. Cock, R. Dowsley, A. C. Nascimento, and S. C. Newman (2015) Fast, privacy preserving linear regression over distributed datasets based on pre-distributed data. In Proceedings of the 8th ACM Workshop on Artificial Intelligence and Security, pp. 3–14. Cited by: §2.4, §2.4.
  • G. Cormode, S. Jha, T. Kulkarni, N. Li, D. Srivastava, and T. Wang (2018) Privacy at scale: local differential privacy in practice. In Proceedings of the 2018 International Conference on Management of Data, pp. 1655–1658. Cited by: §2.3.
  • P. Covington, J. Adams, and E. Sargin (2016) Deep neural networks for youtube recommendations. In Proceedings of the 10th ACM conference on Recommender Systems, pp. 191–198. Cited by: §1.
  • B. Ding, J. Kulkarni, and S. Yekhanin (2017) Collecting telemetry data privately. In Advances in Neural Information Processing Systems, pp. 3571–3580. Cited by: §1, §2.3, §3.3, §3.3, §4.1.
  • C. Dwork (2008) Differential privacy: a survey of results. In International Conference on Theory and Applications of Models of Computation, pp. 1–19. Cited by: §2.2.
  • Z. Erkin, M. Beye, T. Veugen, and R. L. Lagendijk (2010) Privacy enhanced recommender system. In Thirty-first symposium on information theory in the Benelux, pp. 35–42. Cited by: §1, §2.2.
  • Ú. Erlingsson, V. Pihur, and A. Korolova (2014) Rappor: randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC conference on Computer and Communications Security, pp. 1054–1067. Cited by: §2.3.
  • T. Fawcett (2006) An introduction to roc analysis. Pattern recognition letters 27 (8), pp. 861–874. Cited by: §4.1.
  • C. Gentry and D. Boneh (2009) A fully homomorphic encryption scheme. Vol. 20, Stanford University Stanford. Cited by: §2.2.
  • H. Guo, R. Tang, Y. Ye, Z. Li, and X. He (2017) Deepfm: a factorization-machine based neural network for ctr prediction. arXiv preprint arXiv:1703.04247. Cited by: §2.1.
  • X. He, L. Liao, H. Zhang, L. Nie, X. Hu, and T. Chua (2017) Neural collaborative filtering. In Proceedings of the 26th International Conference on World Wide Web, pp. 173–182. Cited by: §2.1.
  • J. Hua, C. Xia, and S. Zhong (2015) Differentially private matrix factorization. In Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 1763–1770. Cited by: §1, §1, §2.2.
  • Y. Juan, Y. Zhuang, W. Chin, and C. Lin (2016) Field-aware factorization machines for ctr prediction. In Proceedings of the 10th ACM Conference on Recommender Systems, pp. 43–50. Cited by: §2.1.
  • P. Kairouz, S. Oh, and P. Viswanath (2014) Extremal mechanisms for local differential privacy. In Advances in Neural Information Processing Systems, pp. 2879–2887. Cited by: §2.3.
  • J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, and D. Bacon (2016) Federated learning: strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492. Cited by: §1, §2.2, §3.5.
  • Y. Koren (2008) Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 426–434. Cited by: §1, §2.1.
  • S. Lam, D. Frankowski, and J. Riedl (2006) Do you trust your recommendations? an exploration of security and privacy issues in recommender systems. pp. 14–29. Cited by: §1, §2.1.
  • M. Li, D. G. Andersen, J. W. Park, A. J. Smola, A. Ahmed, V. Josifovski, J. Long, E. J. Shekita, and B. Su (2014) Scaling distributed machine learning with the parameter server. In Symposium on Operating Systems Design and Implementation, Vol. 14, pp. 583–598. Cited by: §1, §3.5, §3.5.
  • X. Li, G. Cong, X. Li, T. N. Pham, and S. Krishnaswamy (2015) Rank-geofm: a ranking based geographical factorization method for point of interest recommendation. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 433–442. Cited by: §2.1.
  • D. Lian, C. Zhao, X. Xie, G. Sun, E. Chen, and Y. Rui (2014) GeoMF: joint geographical modeling and matrix factorization for point-of-interest recommendation. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 831–840. Cited by: §2.1.
  • X. Ling, W. Deng, C. Gu, H. Zhou, C. Li, and F. Sun (2017) Model ensemble for click prediction in bing search ads. In Proceedings of the 26th International Conference on World Wide Web Companion, pp. 689–698. Cited by: §2.1, §4.1.
  • H. B. McMahan, G. Holt, D. Sculley, M. Young, D. Ebner, J. Grady, L. Nie, T. Phillips, E. Davydov, D. Golovin, et al. (2013) Ad click prediction: a view from the trenches. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1222–1230. Cited by: §2.1.
  • F. McSherry and I. Mironov (2009) Differentially private recommender systems: building privacy into the netflix prize contenders. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 627–636. Cited by: §1, §1, §2.2.
  • X. Meng, S. Wang, K. Shu, J. Li, B. Chen, H. Liu, and Y. Zhang (2018) Personalized privacy-preserving social recommendation. In Thirty-Second AAAI Conference on Artificial Intelligence, Cited by: §1, §2.2.
  • A. Mnih and R. Salakhutdinov (2007) Probabilistic matrix factorization. In Advances in Neural Information Processing Systems, pp. 1257–1264. Cited by: §2.1, §4.1.
  • P. Mohassel and Y. Zhang (2017) SecureML: a system for scalable privacy-preserving machine learning. In IEEE Symposium on Security and Privacy, pp. 19–38. Cited by: §2.2, §2.4, §2.4, §3.6.
  • A. Nedic and A. Ozdaglar (2009) Distributed subgradient methods for multi-agent optimization. IEEE Transactions on Automatic Control 54 (1), pp. 48–61. Cited by: §1, §3.4.
  • V. Nikolaenko, S. Ioannidis, U. Weinsberg, M. Joye, N. Taft, and D. Boneh (2013) Privacy-preserving matrix factorization. In Proceedings of the 2013 ACM SIGSAC conference on Computer and Communications Security, pp. 801–812. Cited by: §1, §2.2.
  • H. Polat and W. Du (2003) Privacy-preserving collaborative filtering using randomized perturbation techniques. In IEEE International Conference on Data Mining, pp. 625–628. Cited by: §1.
  • H. Polat and W. Du (2005) Privacy-preserving collaborative filtering. International journal of electronic commerce 9 (4), pp. 9–35. Cited by: §1, §2.2.
  • S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme (2009) BPR: bayesian personalized ranking from implicit feedback. In Proceedings of the twenty-fifth conference on Uncertainty in Artificial Intelligence, pp. 452–461. Cited by: §2.1.
  • S. Rendle (2012) Factorization machines with libfm. ACM Transactions on Intelligent Systems and Technology (TIST) 3 (3), pp. 57. Cited by: §1, §2.1, §3.1.1, §3.1.3, §3.7, §4.1, Table 3.
  • D. Riboni and C. Bettini (2012) Private context-aware recommendation of points of interest: an initial investigation. In IEEE International Conference on Pervasive Computing and Communications Workshops, pp. 584–589. Cited by: §1, §1, §2.2.
  • F. Ricci, L. Rokach, and B. Shapira (2015) Recommender systems: introduction and challenges. In Recommender systems handbook, pp. 1–34. Cited by: §1, §2.1.
  • M. Richardson, E. Dominowska, and R. Ragno (2007) Predicting clicks: estimating the click-through rate for new ads. In Proceedings of the 16th international conference on World Wide Web, pp. 521–530. Cited by: §2.1.
  • B. Sarwar, G. Karypis, J. Konstan, and J. Riedl (2001) Item-based collaborative filtering recommendation algorithms. In Proceedings of the 10th international conference on World Wide Web, pp. 285–295. Cited by: §2.1.
  • A. Shamir (1979) How to share a secret. Communications of the ACM 22 (11), pp. 612–613. Cited by: §2.4, §3.4.
  • Y. Shen and H. Jin (2016) Epicrec: towards practical differentially private framework for personalized recommendation. In Proceedings of the 2016 ACM SIGSAC conference on Computer and Communications Security, pp. 180–191. Cited by: §2.3.
  • H. Shin, S. Kim, J. Shin, and X. Xiao (2018) Privacy enhanced matrix factorization for recommendation with local differential privacy. IEEE Transactions on Knowledge and Data Engineering. Cited by: §2.3.
  • X. Su and T. M. Khoshgoftaar (2009) A survey of collaborative filtering techniques. Advances in Artificial Intelligence 2009. Cited by: §2.1.
  • A. G. Thakurta, A. H. Vyrros, U. S. Vaishampayan, G. Kapoor, J. Freudiger, V. R. Sridhar, and D. Davidson (2017) Learning new words. Google Patents. Note: US Patent 9,594,741 Cited by: §2.3.
  • D. Vallet, A. Friedman, and S. Berkovsky (2014) Matrix factorization without user data retention. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 569–580. Cited by: §2.1.
  • J. Wang, P. Huang, H. Zhao, Z. Zhang, B. Zhao, and D. L. Lee (2018) Billion-scale commodity embedding for e-commerce recommendation in alibaba. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 839–848. ISBN 978-1-4503-5552-0. Cited by: §1.
  • H. Xue, X. Dai, J. Zhang, S. Huang, and J. Chen (2017) Deep matrix factorization models for recommender systems. In IJCAI, pp. 3203–3209. Cited by: §2.1.
  • C. Yang, L. Bai, C. Zhang, Q. Yuan, and J. Han (2017) Bridging collaborative filtering and semi-supervised learning: a neural approach for poi recommendation. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1245–1254. Cited by: §1.
  • D. Yang, D. Zhang, B. Qu, and P. Cudre-Mauroux (2016) PrivCheck: privacy-preserving check-in data publishing for personalized location based services. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing, pp. 545–556. Cited by: §4.1.
  • D. Yang, D. Zhang, Z. Yu, and Z. Wang (2013) A sentiment-enhanced personalized location recommendation system. In Proceedings of the 24th ACM Conference on Hypertext and Social Media, pp. 119–128. Cited by: §2.1.
  • A. C. Yao (1986) How to generate and exchange secrets. In 27th Annual Symposium on Foundations of Computer Science, pp. 162–167. Cited by: §2.2.
  • M. Ye, P. Yin, W. Lee, and D. Lee (2011) Exploiting geographical influence for collaborative point-of-interest recommendation. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, pp. 325–334. Cited by: §2.1, §3.4.
  • K. Yuan, Q. Ling, and W. Yin (2016) On the convergence of decentralized gradient descent. SIAM Journal on Optimization 26 (3), pp. 1835–1854. Cited by: §1, §3.4.
  • W. Zhang, T. Du, and J. Wang (2016) Deep learning over multi-field categorical data. In European Conference on Information Retrieval, pp. 45–57. Cited by: §2.1.
  • S. Zhao, I. King, and M. R. Lyu (2016) A survey of point-of-interest recommendation in location-based social networks. arXiv preprint arXiv:1607.00647. Cited by: §3.4.