
PrivMVMF: Privacy-Preserving Multi-View Matrix Factorization for Recommender Systems

09/29/2022
by   Peihua Mai, et al.
National University of Singapore

With an increasing focus on data privacy, there have been pilot studies on recommender systems in a federated learning (FL) framework, where multiple parties collaboratively train a model without sharing their data. Most of these studies assume that the conventional FL framework fully protects user privacy. However, our study shows that matrix factorization in federated recommender systems carries serious privacy risks. This paper first provides a rigorous theoretical analysis of the server reconstruction attack in four scenarios in federated recommender systems, followed by comprehensive experiments. The empirical results demonstrate that the FL server could infer users' information with accuracy above 80% in most cases, and the robustness analysis suggests that the reconstruction attack outperforms random guessing by more than 30% under a noise scale of 0.5 for all scenarios. The paper then proposes a new privacy-preserving framework based on homomorphic encryption, Privacy-Preserving Multi-View Matrix Factorization (PrivMVMF), to enhance user data privacy protection in federated recommender systems. The proposed PrivMVMF is implemented and tested thoroughly on the MovieLens dataset.


1 Introduction

Recommender systems rely on collecting users' personal information, such as purchase history, explicit feedback, and social relationships. Recently, laws and regulations have been enacted to protect user privacy, which places constraints on the collection and exchange of users' personal data.

To protect user privacy, one way is to develop a recommendation system in federated learning (FL) framework that enables the clients to jointly train a model without sharing their data. In the FL setting, each client computes the updated gradient locally and sends the model update instead of the original data to a central server. The server then aggregates the gradients and updates the global model [10].

Collaborative filtering (CF) is one of the most effective approaches in recommendation systems [21], and matrix factorization (MF) is a popular technique in CF algorithms. MF decomposes a user-item interaction matrix into two low-rank matrices, user latent factors and item latent factors, which are used to generate the preference prediction [11]. One disadvantage of MF-based recommendation is the cold-start problem: if an item or user has no rating information, the model cannot generate a latent factor representation for it, and MF-based recommendation therefore fails. A solution to the cold-start issue is to incorporate side information, i.e., user and item attributes, into matrix factorization.
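The decomposition idea can be sketched in a few lines of numpy (toy dimensions and random values, purely illustrative; a user or item with no ratings contributes no training signal for its row, which is exactly the cold-start gap that side information fills):

```python
import numpy as np

# Toy sketch of matrix factorization: a rating matrix is approximated by
# the product of user factors P and item factors Q, and a prediction is
# the inner product of one row of each.
rng = np.random.default_rng(0)
n_users, n_items, k = 4, 5, 2

P = rng.normal(size=(n_users, k))   # user latent factors
Q = rng.normal(size=(n_items, k))   # item latent factors
R_hat = P @ Q.T                     # all predicted ratings at once

u, i = 1, 3
assert np.isclose(R_hat[u, i], P[u] @ Q[i])   # prediction = p_u . q_i
```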

Various approaches have been proposed for centralized recommender systems [6, 19, 17, 9]. However, few studies have examined the topic in the federated setting. To the best of our knowledge, Flanagan et al. [8] were the first to propose a federated multi-view matrix factorization (MVMF) to address this problem. This method, however, assumed that the conventional FL framework could fully protect user privacy. In fact, severe privacy risks exist in the federated MVMF recommender system, which is susceptible to server reconstruction attacks, i.e., attacks that recover users' sensitive information.

To fill this gap, this paper first provides a theoretical analysis of the privacy threat of the federated MVMF method. In theoretical analysis, we develop server reconstruction attacks in four scenarios based on different treatments on unobserved ratings and methods to update user latent factors. The empirical study results indicate that the original federated MVMF method could leak users’ personal information. Then, we design a privacy-preserving federated MVMF framework using homomorphic encryption (HE) to enhance the user data privacy protection in federated recommender systems.

The main contributions of this paper are twofold:

(1) To the best of our knowledge, we are the first to provide a rigorous theoretical analysis of server reconstruction attacks in the federated MVMF recommender system. We also conducted comprehensive experiments, which show that the server could infer users' sensitive information with high accuracy using such attacks, and that the attack remains effective under a small amount of noise.

(2) To overcome the information leakage problem, we propose PrivMVMF, a privacy-preserving federated MVMF framework enhanced with HE. The proposed framework has two advantages: a) to balance the tradeoff between efficiency and privacy protection, it adopts a strategy in which some unrated items are randomly sampled and assigned a weight on their gradients; b) to reduce complexity, it allocates some decrypting clients to decrypt and transmit the aggregated gradients to the server. A prototype of PrivMVMF is implemented and tested on the MovieLens dataset.

2 Literature Review

Federated Matrix Factorization: Federated recommender systems enable parties to collaboratively train the model without putting all data on a centralized server. Several federation methods for recommender systems have been introduced in recent works. Ammad-ud-din et al. [1] proposed a federated matrix factorization method for implicit feedback, in which each client updates the user latent factor locally and sends back the item latent factor gradients to the server for aggregation and update. Duriakova et al. [7] presented a decentralized approach to matrix factorization without a central server, where each user exchanges gradients with their neighbors. Lin et al. [13] provided a matrix factorization framework based on federated meta learning by generating a private item embedding and rating prediction model. The above works have not considered cold-start recommendation. To address the problem, Flanagan et al. [8] devised a federated multi-view matrix factorization based on implicit feedback (e.g., clicks), where three matrices are factorized simultaneously with shared latent factors.

Cryptographic Techniques in Federated Recommender Systems: Some studies used encryption schemes to develop privacy-preserving recommendation systems. Chai et al. [5] introduced FedMF, a secure federated matrix factorization framework in which each client can encrypt the gradients uploaded to the server with HE to increase security. Shmueli et al. [18] proposed multi-party protocols for item-based collaborative filtering in vertically distributed settings. In the online phase, the parties communicate only with a mediator that performs computation on encrypted data, which reduces communication costs and allows each party to make recommendations independently of the other parties. Although both [5] and our paper adopt HE to enhance security, our work extends the method by introducing decrypting clients and sampling of unrated items. The decrypting clients improve the efficiency of parameter updates, and the unrated-item sampling strikes a balance between efficiency and privacy protection.

To the best of our knowledge, Flanagan et al. [8] were the first to devise a federated multi-view matrix factorization to address the cold-start problem, where the users directly upload the plaintext gradients to the server, and no work has considered the information leakage from the gradients. This paper first demonstrates the feasibility of the server reconstruction attack, and then proposes a framework to enhance privacy protection. The study is conducted under the assumption of honest clients and an honest-but-curious server [22].

3 Federated MVMF

The federated MVMF proposed by Flanagan et al. [8] is based on implicit feedback. In this section, we extend the framework to explicit feedback.

3.1 Notations

Table 1 lists the notations and their descriptions used throughout this paper.

Notation   Description                       Notation   Description
$n$        The number of users               $V$        Item feature latent factor
$m$        The number of items               $\alpha$   Uncertainty coefficient
$d_u$      Dimension of user attributes      $k$        Dimension of latent factor
$d_v$      Dimension of item attributes      $\lambda$  Regularization coefficient
$R$        Rating matrix                     $D_u$      Set of rated items for user $u$
$X$        User feature                      $Y$        Item feature
$U$        User feature latent factor        $\gamma$   Learning rate
$P$        User latent factor                $\beta$    Exponential decay rate
$Q$        Item latent factor                $\epsilon$ Small number
Table 1: Notations Used in the Paper

3.2 Multi-view Matrix Factorization

Multi-view matrix factorization is performed on three data sources: the rating matrix $R \in \mathbb{R}^{n \times m}$, the user attribute matrix $X \in \mathbb{R}^{n \times d_u}$, and the item content matrix $Y \in \mathbb{R}^{m \times d_v}$, for $n$ users with $d_u$ features and $m$ items with $d_v$ features. The decomposition of the three matrices is given as:

    R \approx P Q^\top, \quad X \approx P U^\top, \quad Y \approx Q V^\top    (1)

where $P \in \mathbb{R}^{n \times k}$, $Q \in \mathbb{R}^{m \times k}$, $U \in \mathbb{R}^{d_u \times k}$, $V \in \mathbb{R}^{d_v \times k}$, with $k$ representing the number of latent factors. For $P$ and $Q$, each row represents the latent factors for each user and item respectively. For $U$ and $V$, each row represents the latent factors for each feature of user and item respectively. The predicted rating of user $u$ on item $i$ is given as:

    \hat{r}_{ui} = p_u q_i^\top    (2)

The latent factor representation is learned by minimizing the following cost function:

    J = \sum_{u=1}^{n} \sum_{i=1}^{m} c_{ui} (r_{ui} - p_u q_i^\top)^2 + \mu \left( \|X - PU^\top\|_F^2 + \|Y - QV^\top\|_F^2 \right) + \lambda \left( \|P\|_F^2 + \|Q\|_F^2 + \|U\|_F^2 + \|V\|_F^2 \right)    (3)

where $\mu$ is used to adjust how much information the model should learn from the side data, and the $\lambda$ term is a regularization term to prevent overfitting. $r_{ui} = 0$ if the rating is unobserved, and equals the observed rating otherwise. $c_{ui}$ could be treated as a weight on the error term for each rating record. This paper considers two definitions of $c_{ui}$:

  • ObsOnly: $c_{ui} = 1$ if $r_{ui} \neq 0$, and $c_{ui} = 0$ if $r_{ui} = 0$. Then the loss function only minimizes the squared error on the observed ratings.

  • InclUnc: $c_{ui} = 1$ if $r_{ui} \neq 0$, and $c_{ui} = \alpha$ if $r_{ui} = 0$, where $\alpha \in (0, 1)$ is an uncertainty coefficient on the unobserved ratings. This case assigns a lower weight to the loss for unobserved ratings.

Matrix factorization for explicit feedback typically employs the first definition to reduce the bias of unobserved interactions and improve efficiency. However, employing the second definition reveals less information to the FL server. Furthermore, as shown in Sections 4 and 6, adopting the second definition presents a greater challenge for the server attack. Therefore, we consider both cases when designing the server attack.
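The two weighting schemes can be sketched as follows (a minimal numpy example; the function name and the symbols `C` and `alpha` are our own):

```python
import numpy as np

# Minimal sketch of the weighted squared-error part of the MVMF cost,
# contrasting the two treatments of unobserved ratings.
def rating_loss(R, P, Q, scheme="ObsOnly", alpha=0.1):
    C = (R != 0).astype(float)      # c_ui = 1 for observed ratings
    if scheme == "InclUnc":
        C[R == 0] = alpha           # c_ui = alpha for unobserved ratings
    E = R - P @ Q.T                 # residuals r_ui - p_u q_i^T
    return float(np.sum(C * E ** 2))

rng = np.random.default_rng(0)
R = np.array([[5.0, 0.0], [0.0, 3.0]])
P, Q = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))

# InclUnc also penalizes the zero entries, so its loss is at least
# as large as ObsOnly's for the same factors.
assert rating_loss(R, P, Q, "InclUnc") >= rating_loss(R, P, Q, "ObsOnly")
```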

3.3 Federated Implementation

The federated setting consists of three parties: the clients, the FL server, and the item server. Each client holds their ratings and attributes locally and performs the local update of the user latent factor $p_u$. The FL server receives the gradients from the clients and the item server, and updates $U$ and $Q$. The item server is introduced to facilitate the training process: it stores the item features $Y$ and updates $V$. The following explains the details of the updates of each latent factor matrix.

The user feature latent factor $U$ is updated on the FL server with the formula:

    U^{(t+1)} = U^{(t)} - \gamma \nabla_U J    (4)

where:

    \nabla_U J = \sum_{u=1}^{n} f_u^U + 2\lambda U, \quad f_u^U = -2\mu (x_u - p_u U^\top)^\top p_u    (5)

where $f_u^U$ is computed on each user locally.

The item latent factor $Q$ is updated on the FL server with the formula:

    Q^{(t+1)} = Q^{(t)} - \gamma \nabla_Q J    (6)

where:

    \nabla_Q J = f^Y + \sum_{u=1}^{n} f_u^Q + 2\lambda Q, \quad f^Y = -2\mu (Y - QV^\top) V, \quad f_{ui}^Q = -2 c_{ui} (r_{ui} - p_u q_i^\top) p_u    (7)

where $f^Y$ is computed on the item server, and $f_u^Q$ (whose $i$-th row is $f_{ui}^Q$) is computed on each user locally. Note that for ObsOnly, since $c_{ui} = 0$ when $r_{ui} = 0$, the user only computes and sends the gradients of items with $r_{ui} \neq 0$, i.e., the rated items. For InclUnc, the gradients for all items are sent to the server.

Both the user latent factor $P$ and the item feature latent factor $V$ admit two updating methods:

  • Semi-Alternating Least Squares (SemiALS): the optimal $P$ and $V$ are computed using a closed-form formula under fixed $U$ and $Q$. The other parameters are updated using the gradient descent method.

  • Stochastic Gradient Descent (SGD): all of the parameters are updated using the gradient descent method.

The per-iteration time complexity of SemiALS is higher than that of SGD, since each closed-form solve inverts a $k \times k$ matrix. However, SGD requires more iterations to reach the optimum [23].

The user latent factor $p_u$ is updated on each client locally. For SemiALS, it is updated with the formula:

    p_u = (r_u C_u Q + \mu x_u U)(Q^\top C_u Q + \mu U^\top U + \lambda I)^{-1}    (8)

where $C_u$ is a diagonal matrix with $C_u[i, i] = c_{ui}$.
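A minimal numpy sketch of this closed-form local update (the names `Q`, `U`, `c_u`, `mu`, `lam` and the toy sizes are our own choices; we verify it by checking first-order optimality of the local objective):

```python
import numpy as np

# SemiALS closed-form local update of one user latent factor:
# p_u = (r_u C_u Q + mu x_u U)(Q^T C_u Q + mu U^T U + lam I)^{-1}
def update_p(r_u, x_u, Q, U, c_u, mu=1.0, lam=0.1):
    k = Q.shape[1]
    A = Q.T @ (c_u[:, None] * Q) + mu * U.T @ U + lam * np.eye(k)
    b = (c_u * r_u) @ Q + mu * x_u @ U
    return np.linalg.solve(A, b)        # A is symmetric, so p_u A = b

rng = np.random.default_rng(1)
m, d_u, k = 6, 3, 2
Q, U = rng.normal(size=(m, k)), rng.normal(size=(d_u, k))
r_u, x_u = rng.normal(size=m), rng.normal(size=d_u)
c_u = np.array([1, 0, 1, 1, 0, 1], dtype=float)   # ObsOnly-style weights

p_u = update_p(r_u, x_u, Q, U, c_u)
# The closed form zeroes the gradient of the local objective.
grad = (-2 * (c_u * (r_u - p_u @ Q.T)) @ Q
        - 2 * (x_u - p_u @ U.T) @ U + 2 * 0.1 * p_u)
assert np.allclose(grad, 0, atol=1e-8)
```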

For SGD, it is updated with the formula:

    p_u^{(t+1)} = p_u^{(t)} - \gamma \nabla_{p_u} J    (9)

where:

    \nabla_{p_u} J = -2 \sum_{i=1}^{m} c_{ui} (r_{ui} - p_u q_i^\top) q_i - 2\mu (x_u - p_u U^\top) U + 2\lambda p_u    (10)

The item feature latent factor $V$ is updated on the item server. For SemiALS, it is updated with the formula:

    V = \mu Y^\top Q (\mu Q^\top Q + \lambda I)^{-1}    (11)

For SGD, it is updated with the formula:

    V^{(t+1)} = V^{(t)} - \gamma \nabla_V J    (12)

where:

    \nabla_V J = -2\mu (Y - QV^\top)^\top Q + 2\lambda V    (13)

Algorithm 1 outlines the federated implementation of MVMF (FedMVMF). The gradient descent updates of $U$ and $Q$ are performed using the Adaptive Moment Estimation (Adam) method to stabilize the convergence.

  FL Server:
  Initialize $U$ and $Q$.
  for t = 1 to T do
     Receive and aggregate $f_u^U$ and $f_u^Q$ from user $u$ for $u = 1, \ldots, n$.
     Receive $f^Y$ from item server.
     Update $U$ using equation (4).
     Update $Q$ using equation (6).
  end for
  
  Item Server:
  while True do
     Receive $Q$ from FL server.
     Compute local $V$ using equation (11).
     Compute item latent factor gradients $f^Y$.
     Transmit gradients to server.
  end while
  
  Client:
  while True do
     Receive $U$ and $Q$ from server.
     Compute local $p_u$ using equation (8).
     Compute gradients $f_u^U$ for $U$.
     Compute gradients $f_u^Q$ for $Q$.
     Transmit gradients to server.
  end while
Algorithm 1 FedMVMF

3.4 Cold-start recommendation

The recommendation for new users and items is discussed as follows.

Cold-start user recommendation: for any new user $u^*$, the system first generates the user latent factor $p_{u^*}$ based on the user's attributes $x_{u^*}$ and the user feature latent factor matrix $U$. Then the predicted rating of user $u^*$ on item $i$ is given by the inner product of $p_{u^*}$ and $q_i$. $p_{u^*}$ is calculated by minimizing the loss function:

    J(p_{u^*}) = \mu \|x_{u^*} - p_{u^*} U^\top\|^2 + \lambda \|p_{u^*}\|^2    (14)

The optimal solution of $p_{u^*}$ is defined as:

    p_{u^*} = \mu x_{u^*} U (\mu U^\top U + \lambda I)^{-1}    (15)

Cold-start item recommendation: given a new item $i^*$, the system first generates the item latent factor $q_{i^*}$ based on the item's features $y_{i^*}$ and the item feature latent factor matrix $V$. The estimated $q_{i^*}$ is then used to compute the predicted rating. $q_{i^*}$ is calculated by minimizing the loss function:

    J(q_{i^*}) = \mu \|y_{i^*} - q_{i^*} V^\top\|^2 + \lambda \|q_{i^*}\|^2    (16)

The optimal solution of $q_{i^*}$ is defined as:

    q_{i^*} = \mu y_{i^*} V (\mu V^\top V + \lambda I)^{-1}    (17)
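The cold-start computation amounts to a ridge-style closed form; a hedged numpy sketch (names and toy sizes are our own):

```python
import numpy as np

# Cold-start user inference: a new user with attributes x_new but no
# ratings gets a latent factor from the user-feature factor matrix U.
def cold_start_user(x_new, U, mu=1.0, lam=0.1):
    k = U.shape[1]
    return mu * x_new @ U @ np.linalg.inv(mu * U.T @ U + lam * np.eye(k))

rng = np.random.default_rng(2)
d_u, k, m = 4, 3, 5
U = rng.normal(size=(d_u, k))      # user feature latent factors
Q = rng.normal(size=(m, k))        # item latent factors
x_new = rng.normal(size=d_u)       # attributes of a brand-new user

p_new = cold_start_user(x_new, U)
preds = p_new @ Q.T                # predicted ratings for all m items
assert preds.shape == (m,)
```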

4 Server Reconstruction Attack Analysis

In FedMVMF, the FL server could reconstruct the user ratings and attributes based on the gradients it receives. In this section, we consider the attacks for both SemiALS and SGD updates on the user latent factor. Within each case, the attacks differ slightly between ObsOnly and InclUnc. The analysis is based on the assumption of honest clients and an honest-but-curious server.

4.1 Reconstruction Attack for SemiALS Update

For SemiALS, the FL server is able to recover the user information within only one epoch, given that the server has access to $f_u^U$ and $f_u^Q$.

Attack for ObsOnly: In this case, the clients only upload the gradients for items with observed ratings. Therefore, for any user $u$, the gradients which the FL server receives are given by:

    f_u^U = -2\mu (x_u - p_u U^\top)^\top p_u, \quad f_{ui}^Q = -2 (r_{ui} - p_u q_i^\top) p_u, \quad i \in D_u    (18)

where $f_u^U \in \mathbb{R}^{d_u \times k}$, each $f_{ui}^Q$ denotes a gradient vector of length $k$, $D_u$ denotes the collection of items rated by user $u$, and $d_u$ denotes the number of user attributes.

In SemiALS, $p_u$ is updated by equation (8). Given that $c_{ui} = 1$ when $i \in D_u$ and $c_{ui} = 0$ otherwise, the formula could be reduced to:

    p_u = (r_u^o Q_u + \mu x_u U)(Q_u^\top Q_u + \mu U^\top U + \lambda I)^{-1}    (19)

where $r_u^o$ is the vector of observed ratings, and $Q_u$ is the latent factors for items rated by user $u$.

Let $A_u = Q_u M_u$ and $B_u = \mu U M_u$ with $M_u = (Q_u^\top Q_u + \mu U^\top U + \lambda I)^{-1}$, both of which could be computed on the FL server. Then $p_u$ could be written as $p_u = r_u^o A_u + x_u B_u$. Plugging into equation (18), we have:

    f_u^U = -2\mu (x_u - p_u U^\top)^\top p_u, \quad f_u^Q = -2 \left( (r_u^o)^\top - Q_u p_u^\top \right) p_u    (20)

where $f_u^Q$ is the matrix with row $i$ being $f_{ui}^Q$, $Q_u$ is the matrix with row $i$ being $q_i$ for $i \in D_u$, and:

    p_u = r_u^o A_u + x_u B_u    (21)

Then the FL server obtains a second-order non-linear system with $(|D_u| + d_u)k$ equations in the $|D_u| + d_u$ variables $r_u^o$ and $x_u$. Therefore, it is plausible to find the solution of user ratings and user attributes using methods such as the Newton-Raphson algorithm. To reconcile the number of equations and variables, we choose a random factor index $j$, and solve the equation system under the fixed $j$.
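To make the attack concrete, the following self-contained toy (tiny random sizes and our own variable names; a Gauss-Newton refinement standing in for the Newton-Raphson method mentioned above) recovers a user's ratings and attributes from SemiALS-style gradients, starting from a slightly perturbed guess:

```python
import numpy as np

# Toy demonstration that the FedMVMF gradients pin down a user's ratings
# r and attributes x. All quantities are illustrative, not from the
# paper's experiments.
rng = np.random.default_rng(3)
nD, d_u, k, mu, lam = 3, 2, 2, 1.0, 0.1
Q = rng.normal(size=(nD, k))      # factors of the user's rated items
U = rng.normal(size=(d_u, k))     # user feature latent factors
M = np.linalg.inv(Q.T @ Q + mu * U.T @ U + lam * np.eye(k))

def grads(z):                     # z = [ratings r, attributes x]
    r, x = z[:nD], z[nD:]
    p = (r @ Q + mu * x @ U) @ M                  # SemiALS closed form
    fU = -2 * mu * np.outer(x - p @ U.T, p)       # gradient w.r.t. U
    fQ = -2 * ((r - p @ Q.T)[:, None] * p)        # gradients w.r.t. q_i
    return np.concatenate([fU.ravel(), fQ.ravel()])

z_true = rng.normal(size=nD + d_u)
target = grads(z_true)            # what the FL server observes

z = z_true + 0.05 * rng.normal(size=z_true.size)  # perturbed initial guess
for _ in range(50):               # Gauss-Newton with numerical Jacobian
    res = grads(z) - target
    J = np.empty((res.size, z.size))
    for j in range(z.size):
        dz = np.zeros_like(z); dz[j] = 1e-6
        J[:, j] = (grads(z + dz) - grads(z - dz)) / 2e-6
    z = z - np.linalg.pinv(J) @ res

assert np.allclose(z, z_true, atol=1e-4)          # ratings/attributes recovered
```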

Attack for InclUnc: In this case the client sends the gradients of all items to the FL server, with the gradients of unobserved items multiplied by the uncertainty coefficient $\alpha$. For any user $u$, the gradients the FL server receives are given by:

    f_u^U = -2\mu (x_u - p_u U^\top)^\top p_u, \quad f_{ui}^Q = -2 c_{ui} (r_{ui} - p_u q_i^\top) p_u, \quad i = 1, \ldots, m    (22)

Let $A_u = C_u Q M_u$ and $B_u = \mu U M_u$ with $M_u = (Q^\top C_u Q + \mu U^\top U + \lambda I)^{-1}$. Then $p_u$ can be written as:

    p_u = r_u A_u + x_u B_u    (23)

Plugging into equation (22), we can obtain the final equation system:

    f_u^U = -2\mu (x_u - p_u U^\top)^\top p_u, \quad f_u^Q = -2 C_u \left( r_u^\top - Q p_u^\top \right) p_u    (24)

where $r_u$ is user $u$'s ratings for all items, and $A_u$ and $B_u$ are dependent on $r_u$.

Since $C_u$ is a function of $r_u$, the system consists of $m + d_u$ variables and $(m + d_u)k$ equations. Therefore, it is possible to recover the user information by solving the equation system. Similarly, a random factor index $j$ is fixed to align the number of equations and variables.

4.2 Reconstruction Attack for SGD Update

For SGD, the FL server is able to recover the user information within only two epochs, given that the server has access to $f_u^U$ and $f_u^Q$.

Attack for ObsOnly: After two epochs, the gradients the FL server receives from user $u$ are given by:

    f_u^{U,(t)} = -2\mu (x_u - p_u^{(t)} U^{(t)\top})^\top p_u^{(t)}, \quad f_{ui}^{Q,(t)} = -2 (r_{ui} - p_u^{(t)} q_i^{(t)\top}) p_u^{(t)}, \quad i \in D_u, \; t = 1, 2    (25)

In pure SGD, the user latent factor is updated using equations (9) and (10). Plugging the update rule $p_u^{(2)} = p_u^{(1)} - \gamma g_u^{(1)}$ into the first-epoch quantities in equation (25), we have:

    f_{ui,j}^{Q,(2)} = -2 (r_{ui} - p_u^{(2)} q_i^{(2)\top}) p_{uj}^{(2)}    (26)

where $g_u^{(1)}$ denotes $\nabla_{p_u} J$ at epoch 1, $p_{uj}^{(2)}$ denotes the $j$-th element of $p_u^{(2)}$, and $f_{ui,j}^{Q,(2)}$ denotes the $j$-th element of $f_{ui}^{Q,(2)}$.

Equation (26) is a multiplication of two terms. Looking at the first term, we have:

    -2 (r_{ui} - p_u^{(2)} q_i^{(2)\top}) = -2 \left( r_{ui} - (p_u^{(1)} - \gamma g_u^{(1)}) q_i^{(2)\top} \right)    (27)

where:

    g_u^{(1)} = -2 \sum_{l \in D_u} (r_{ul} - p_u^{(1)} q_l^{(1)\top}) q_l^{(1)} - 2\mu (x_u - p_u^{(1)} U^{(1)\top}) U^{(1)} + 2\lambda p_u^{(1)}    (28)

Then we look at the second term of equation (26), which is given by:

    p_{uj}^{(2)} = p_{uj}^{(1)} - \gamma g_{uj}^{(1)}    (29)

where:

    g_{uj}^{(1)} = -2 \sum_{l \in D_u} (r_{ul} - p_u^{(1)} q_l^{(1)\top}) q_{lj}^{(1)} - 2\mu (x_u - p_u^{(1)} U^{(1)\top}) U_{\cdot j}^{(1)} + 2\lambda p_{uj}^{(1)}    (30)

Then equation (26) can be written as:

    f_{ui,j}^{Q,(2)} = -2 \left( r_{ui} - (p_u^{(1)} - \gamma g_u^{(1)}) q_i^{(2)\top} \right) \left( p_{uj}^{(1)} - \gamma g_{uj}^{(1)} \right)    (31)

For $i \in D_u$, $f_{ui}^{Q,(1)} = -2 (r_{ui} - p_u^{(1)} q_i^{(1)\top}) p_u^{(1)}$, where $p_u^{(1)}$ is among the variables to solve. Note that $q_i^{(1)}$, $q_i^{(2)}$, and $U^{(1)}$ could be computed on the FL server.

Since there are $|D_u| + d_u + k$ variables and $2(|D_u| + d_u)k$ equations, there should exist a solution satisfying the system (31). To reconcile the number of equations and variables, we choose a random item $i_0$, and solve the equation system under the fixed $i_0$.

After obtaining $p_u^{(1)}$, the server could compute $r_u$ and $x_u$ as follows:

    r_{ui} = p_u^{(1)} q_i^{(1)\top} - \frac{f_{ui,j}^{Q,(1)}}{2 p_{uj}^{(1)}}, \quad x_u = p_u^{(1)} U^{(1)\top} - \frac{(f_{u \cdot j}^{U,(1)})^\top}{2\mu\, p_{uj}^{(1)}}    (32)

where $f_{u \cdot j}^{U,(1)}$ denotes the $j$-th column of $f_u^{U,(1)}$.

Attack for InclUnc: Similarly, the FL server first obtains the equation system for $i = 1, \ldots, m$ given by:

    f_{ui,j}^{Q,(2)} = -2 c_{ui} \left( r_{ui} - (p_u^{(1)} - \gamma g_u^{(1)}) q_i^{(2)\top} \right) \left( p_{uj}^{(1)} - \gamma g_{uj}^{(1)} \right)    (33)

where:

    g_u^{(1)} = -2 \sum_{l=1}^{m} c_{ul} (r_{ul} - p_u^{(1)} q_l^{(1)\top}) q_l^{(1)} - 2\mu (x_u - p_u^{(1)} U^{(1)\top}) U^{(1)} + 2\lambda p_u^{(1)}    (34)

For the detailed derivation of equation (33), refer to Appendix .1. Note that $c_{ui}$ is a function of $r_{ui}$, which is dependent on $p_u^{(1)}$ based on equation (35). Therefore, $c_{ui}$ is linked with $p_u^{(1)}$.

Given $m + d_u + k$ variables and $2(m + d_u)k$ equations, the server should be able to find a solution for the system. Similarly, a random item $i_0$ is fixed when solving the equation system.

Then the ratings and user attributes could be computed as:

    r_{ui} = p_u^{(1)} q_i^{(1)\top} - \frac{f_{ui,j}^{Q,(1)}}{2 c_{ui}\, p_{uj}^{(1)}}, \quad x_u = p_u^{(1)} U^{(1)\top} - \frac{(f_{u \cdot j}^{U,(1)})^\top}{2\mu\, p_{uj}^{(1)}}    (35)

where $p_u^{(2)}$ and $g_u^{(1)}$ can be obtained from formulas (9) and (10).

5 Privacy-Preserving MVMF (PrivMVMF)

To prevent information leakage, we develop PrivMVMF, a privacy-preserving federated MVMF framework enhanced with homomorphic encryption (HE). In this framework, the client encrypts the gradients before sending them to the server, and the server can perform computation on the encoded gradients. The above attacks are based on access to individual gradients, while in HE, these gradients are sent to the server in encrypted form, rendering the reconstruction attacks infeasible.

5.1 Paillier Cryptosystem

This study utilizes a partially HE scheme, the Paillier cryptosystem [16], which consists of three parts: key generation, encryption, and decryption.

  • Key generation: based on the security parameter, key generation returns the public key $pk$ shared among all participants, and the secret key $sk$ distributed only among the clients. Before the training process, one of the users generates a key pair.

  • Encryption: $Enc(m, pk)$ encrypts message $m$ to ciphertext $c$ using the public key $pk$.

  • Decryption: $Dec(c, sk)$ reverses ciphertext $c$ to message $m$ using the secret key $sk$.

Given two plaintexts $m_1$ and $m_2$, the Paillier cryptosystem has the following properties:

  • Addition: $Dec(Enc(m_1, pk) \cdot Enc(m_2, pk), sk) = m_1 + m_2$.

  • Multiplication: $Dec(Enc(m_1, pk)^{m_2}, sk) = m_1 \cdot m_2$.

Number Encoding Scheme: Paillier encryption is only defined for non-negative integers, but the recommendation system involves floating-point and negative numbers. This study follows Chai et al.'s method to convert floating-point and negative numbers into unsigned integers [5].
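The additive homomorphism that PrivMVMF relies on can be demonstrated with a minimal toy Paillier implementation (deliberately tiny primes and a fixed nonce, insecure and for illustration only; real deployments use a vetted library and ~1024-bit primes):

```python
import math

# Toy Paillier: Enc(m) = g^m * r^n mod n^2 with g = n + 1.
# Multiplying ciphertexts adds plaintexts; exponentiation scales them.
p, q = 293, 433                 # toy primes; NOT secure
n = p * q
n2 = n * n
g = n + 1                       # standard choice of generator
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)            # valid because g = n + 1

def encrypt(m, r=17):           # r should be random and coprime to n
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return ((pow(c, lam, n2) - 1) // n * mu) % n

m1, m2 = 42, 1337
c = (encrypt(m1, r=17) * encrypt(m2, r=23)) % n2   # add under encryption
assert decrypt(c) == m1 + m2
assert decrypt(pow(encrypt(m1), 3, n2)) == 3 * m1  # scalar multiplication
```

In PrivMVMF the server multiplies the clients' encrypted gradients together, so it obtains only the encrypted sum, never any individual gradient.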

Sampling of Unrated Items: In its treatment of unrated items, this framework strikes a balance between efficiency and privacy protection. The ObsOnly method is efficient but reveals which items have been rated by the user. The InclUnc method leaks no information but is computation intensive. To reconcile the two objectives, we design a strategy that randomly samples a portion of the unrated items. Then $c_{ui}$ is given as follows:

    c_{ui} = \begin{cases} 1 & \text{if } r_{ui} \neq 0 \\ \alpha & \text{if } i \in S_u \\ 0 & \text{otherwise} \end{cases}    (36)

where $S_u$ is the set of sampled unrated items for user $u$. Users only send the gradients with $c_{ui} \neq 0$.

For each user, we determine the number of sampled unrated items as a multiple $\rho$ of the number of his rated items. Then the upper-bound probability that the FL server could correctly infer that a given uploaded item is rated by the user is $1/(1 + \rho)$.
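The sampling strategy and the resulting bound can be sketched as follows (toy sizes; `rho` denotes the sampling multiple, a name of our own):

```python
import numpy as np

# Each user sends gradients for all rated items plus rho times as many
# randomly sampled unrated items, so among the uploaded items the
# fraction that is truly rated is 1 / (1 + rho).
rng = np.random.default_rng(4)
m, rho = 1000, 3                      # number of items; sampling multiple
rated = rng.choice(m, size=40, replace=False)
unrated = np.setdiff1d(np.arange(m), rated)
sampled = rng.choice(unrated, size=rho * rated.size, replace=False)

sent = np.concatenate([rated, sampled])   # items whose gradients are sent
assert rated.size / sent.size == 1 / (1 + rho)
```

Larger `rho` shrinks the server's inference advantage toward the InclUnc extreme at the cost of more encrypted gradients per user.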

Decrypting Clients: It’s time-consuming to perform the update using the encrypted gradients. To reduce complexity, the server sends the aggregated gradient to some decrypting users for decryption, and uses the plaintext aggregated gradients to update the parameters.

Algorithms: The detailed steps of PrivMVMF are shown in Algorithm 2. Note that for the update of the user latent factor $P$ and the item feature latent factor $V$, we adopt the SemiALS strategy for the following reason: although SemiALS has a higher time complexity per iteration, it requires fewer iterations to reach the optimum and thus fewer encryption and decryption operations, the bottleneck of the HE scheme.

Privacy Analysis: The privacy of the algorithm is analyzed in terms of information leakage, which is characterized in two forms: i) original information, the observed user data, and ii) latent information, properties of user data [14]. We assume an honest-but-curious server for the analysis, i.e., the server will not deviate from the defined protocol but will attempt to learn information from legitimately received messages. During the training of PrivMVMF, the individual gradients are sent to the server in encrypted form, and only the plaintext aggregated gradients are available to the server. The following shows that the aggregated gradients leak only trivial original information about user data to the server.

Let $F_i^Q$, $F^U$ be the aggregated gradients for item $i$ and the user features, given by:

    F_i^Q = \sum_{u:\, i \in \tilde{D}_u} f_{ui}^Q, \quad F^U = \sum_{u=1}^{n} f_u^U    (37)

where $\tilde{D}_u$ denotes the set of items rated by, or appearing in the sampled unrated items of, user $u$.

In PrivMVMF, $p_u$ is updated by:

    p_u = (r_u C_u Q_u + \mu x_u U)(Q_u^\top C_u Q_u + \mu U^\top U + \lambda I)^{-1}    (38)

where $Q_u$ is the latent factors for items in $\tilde{D}_u$.

Let $A_u = C_u Q_u M_u$ and $B_u = \mu U M_u$ with $M_u = (Q_u^\top C_u Q_u + \mu U^\top U + \lambda I)^{-1}$. Then $p_u$ can be written as:

    p_u = r_u A_u + x_u B_u    (39)

Plugging into equation (37), we can obtain the equation system as follows:

    F_i^Q = -2 \sum_{u:\, i \in \tilde{D}_u} c_{ui} (r_{ui} - p_u q_i^\top) p_u, \quad F^U = -2\mu \sum_{u=1}^{n} (x_u - p_u U^\top)^\top p_u    (40)

where:

    p_u = r_u A_u + x_u B_u, \quad u = 1, \ldots, n    (41)

The non-linear system consists of $(m + d_u)k$ equations and approximately $n(m + d_u)$ variables. When $n \gg k$, i.e., the user size is large enough, it is hard for the server to derive the original information of users.

  Randomly select some clients as decrypters
  FL Server:
  Initialize $U$ and $Q$.
  for t = 1 to T do
     Receive and aggregate encrypted $f_u^U$ and $f_u^Q$ from user $u$ for $u = 1, \ldots, n$.
     Send encrypted $F^U$ and $F_i^Q$ to decrypters.
     Receive decrypted $F^U$ and $F_i^Q$ from decrypters.
     Receive $f^Y$ from item server.
     Update $U$ using equation (4).
     Update $Q$ using equation (6).
  end for
  
  Item Server:
  while True do
     Receive $Q$ from FL server.
     Compute local $V$ using equation (11).
     Compute item latent factor gradients $f^Y$.
     Transmit gradients to server.
  end while
  
  Client:
  while True do
     Receive $U$ and $Q$ from server.
     Compute local $p_u$ using equation (8).
     Compute gradients $f_u^U$ for $U$.
     Compute gradients $f_u^Q$ for $Q$.
     Transmit encrypted gradients to server.
  end while
  
  Decrypter:
  while True do
     Receive encrypted $F^U$ and $F_i^Q$ from FL server.
     Decrypt and transmit $F^U$ and $F_i^Q$ to FL server.
  end while
Algorithm 2 PrivMVMF

6 Experiments

6.1 Dataset and Experimental Setup

The experiment is performed on the MovieLens-1M dataset (https://grouplens.org/datasets/movielens/1m/). The dataset contains 914,676 ratings from 6,040 users on 3,952 movies, with each user submitting at least 20 ratings. The experiment is implemented on an Ubuntu Linux 20.04 server with a 32-core CPU and 128 GB RAM, where the programming language is Python.

We construct the rating matrix based on the explicit ratings, where the missing values are set to zero. The following user attributes are considered: Age, Gender, Occupation, and Zipcode. Age is discretized into seven groups with equal intervals, and Zipcode is linked to the US region. The movie features are described by the tag genome dataset containing tags for movies. To reduce dimensionality, we take the first 20 principal components of the tag features.
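The dimensionality-reduction step can be sketched with an SVD-based PCA (the tag matrix below is random stand-in data with made-up sizes, not the real tag genome):

```python
import numpy as np

# Project movie tag features onto their first 20 principal components.
rng = np.random.default_rng(5)
n_movies, n_tags, n_pc = 200, 1000, 20
tags = rng.normal(size=(n_movies, n_tags))   # stand-in tag-genome matrix

centered = tags - tags.mean(axis=0)          # PCA on mean-centered data
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
item_features = centered @ Vt[:n_pc].T       # first 20 principal components

assert item_features.shape == (n_movies, n_pc)
```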

We use a Bayesian optimization [20] approach based on four-fold cross-validation to optimize the hyperparameters. Table