Big data and deep learning technologies have enabled us to perform scalable data mining across multiple parties to build powerful prediction models. For example, it would be very appealing for different countries to collaborate, utilizing their medical data records to train prediction models for fighting the COVID-19 pandemic. However, many countries and regions have issued strong privacy protection laws and regulations, such as GDPR Regulation (2016), and it is very difficult to straightforwardly collect and combine the data from different parties for a data mining task. To circumvent this major obstacle to big data mining, a novel machine learning (ML) paradigm named federated learning (FL) has recently been proposed, which allows multiple clients coordinated by a central server to train a joint ML model in an iterative manner McMahan et al. (2017); He et al. (2020, 2021). In FL, no client can access any training data owned by other clients, leading to a privacy-aware paradigm for collaborative model training. Specific to the example mentioned above, FL can greatly facilitate the scenario where many hospitals hope to build a joint COVID-19 diagnosis ML model from their distributed data. A real-life case has been shown in Xu et al. (2020), where FL has been successfully adopted to build a promising ML model for COVID-19 diagnosis, using geographically distributed chest CT (Computed Tomography) data collected from patients at different hospitals.
However, many recent studies Melis et al. (2019); Nasr et al. (2019); Zhu et al. (2019); Wang et al. (2019); Truex et al. (2018) have demonstrated that FL fails to provide sufficient privacy guarantees, as sensitive information can be revealed in the training process. In FL, multiple clients send ML model weight or gradient updates derived from local training to a central server for global model training. The communication of model updates renders FL vulnerable to several recently developed privacy attacks, such as property inference attacks Ganju et al. (2018), reconstruction attacks Hitaj et al. (2017), and membership inference attacks (MIAs) Shokri et al. (2017). Among these attacks, MIAs aim to identify whether or not a data record was in the training dataset the model was built on (i.e., a member). This can impose severe privacy risks on individuals. For example, by identifying that a clinical record has been used to train a model associated with a certain disease, an MIA can infer that the owner of the clinical record has a high chance of having the disease.
However, while MIAs against FL models distinguish the training members of the model from the non-members, existing MIAs ignore the source of a training member, i.e., the information of which client owns the training member. It is essential to explore source privacy in FL beyond membership privacy, because the leakage of such information can lead to further privacy issues. For instance, in the scenario where multiple hospitals jointly train an FL model for COVID-19 diagnosis, MIAs can only reveal who has been tested for COVID-19, but the further identification of the source hospital where these people come from will make them more prone to discrimination, especially when the hospital is in a high-risk region or country Devakumar et al. (2020).
In this paper, we propose a novel inference attack called Source Inference Attack (SIA) in the context of FL. SIA aims to determine which client owns a training record in FL. In practice, the SIA can be considered a natural extension of MIAs, i.e., after determining which data instances are training members with MIAs, the adversary can further conduct the SIA to identify which client each of them comes from. To be practical, it is assumed that the adversary is an honest-but-curious central server, who knows the identities of clients and receives the updates from them. It is worth noting that the server can infer client-private information without interfering with the FL training or affecting the model prediction performance. While the adversary can be one of the clients, we argue that it is impractical for her to launch SIAs as she knows little about other clients’ identities and can only access the joint models.
Specifically, we explore the SIA from the Bayesian perspective, and demonstrate that a server can achieve the optimal estimate of the source of a training member in an SIA without violating the FL protocol. To this end, the prediction loss of local models on the training members is utilised to obtain the source information of the training members effectively and non-intrusively. Besides the theoretical formulation, we empirically evaluate the SIA in FL trained with one synthetic and five real-world datasets, with respect to several FL aspects such as data distributions across clients, the number of clients, and the number of local epochs. The experiment results validate the efficacy of the proposed source inference attack under various FL settings. An important finding is that the success of an SIA is directly related to the generalizability of local models and the diversity of the local data.
Our main contributions are summarized as follows.
First, we propose the source inference attack (SIA), a novel inference attack in FL that identifies the source of a training member. SIA can further breach the privacy of training members beyond membership inference attacks.
Second, we adopt the Bayesian perspective to demonstrate that an honest-but-curious central server can fulfil an effective SIA in a non-intrusive manner by optimally estimating the source of a training member, using prediction loss of local models.
Last, we perform an extensive empirical evaluation on both synthetic and real world datasets under various FL settings, and the results validate the efficacy of the proposed SIA.
We provide all proofs in the full version of our paper, and our source code is available at: https://github.com/HongshengHu/source-inference-FL.
In this section, we briefly review the background of federated learning and membership inference attacks.
2.1 Federated Learning
Federated learning allows multiple clients to jointly train an ML model in an iterative manner. It is an attractive framework for training ML models without direct access to diverse training data owned by different clients, especially for privacy-sensitive tasks McMahan et al. (2017); Zhao et al. (2018); Melis et al. (2019); Bagdasaryan et al. (2020). The federated averaging (FedAvg) algorithm McMahan et al. (2017) is the first and perhaps the most widely used FL algorithm. A central model is trained over multiple rounds of communication between the server and the clients. At each communication round, the server distributes the current central model to the local clients. The local clients then perform local optimization using their own data. To minimize communication, clients might update the local model for several epochs during a single communication round. Next, the optimized local models are sent back to the server, which averages them to produce a new central model. Depending on the performance of the new central model, either the training stops or a new communication round starts. In FL, clients never share data, only their model weights or gradients.
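The communication round described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: models are flattened to plain lists of floats, the names `fedavg_aggregate`, `fedavg_round`, and `local_updates` are hypothetical, and the average is weighted by each client's sample count, which is an assumption borrowed from the usual FedAvg formulation.

```python
from typing import Callable, List, Sequence, Tuple

Weights = List[float]                       # a model flattened to a list of floats
LocalUpdate = Callable[[Weights], Weights]  # hypothetical: a client's local training step


def fedavg_aggregate(local_models: Sequence[Weights],
                     num_samples: Sequence[int]) -> Weights:
    """Average the local models, weighting each client by its sample count (assumption)."""
    total = sum(num_samples)
    dim = len(local_models[0])
    return [
        sum(model[j] * n for model, n in zip(local_models, num_samples)) / total
        for j in range(dim)
    ]


def fedavg_round(central: Weights,
                 local_updates: Sequence[LocalUpdate],
                 num_samples: Sequence[int]) -> Tuple[Weights, List[Weights]]:
    """Run one communication round and return the new central model and the local models."""
    # Every client starts from a copy of the current central model and trains locally.
    local_models = [update(list(central)) for update in local_updates]
    # The server receives the local models and averages them into a new central model.
    return fedavg_aggregate(local_models, num_samples), local_models
```

Note that the server necessarily observes every entry of `local_models` before averaging, even though it never sees the clients' data.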
2.2 Membership Inference Attacks
Membership inference attacks aim to identify whether a data record was part of the target model’s training dataset or not. Shokri et al. (2017) present the first MIAs against ML models. Specifically, they demonstrate that an adversary can tell whether a data record has been used to train a classifier or not, solely based on the prediction vector of the data record. Since then, a growing body of work has further investigated the feasibility of MIAs on various ML models Hu et al. (2021). In particular, recent works Melis et al. (2019); Nasr et al. (2019) have demonstrated the success of MIAs on FL models. For example, Melis et al. (2019) have shown that an adversary can infer whether a specific location profile was used to train an FL model on the FourSquare location dataset with a high success rate. Although MIAs can distinguish the training members of the FL model from the non-members, existing inference attacks do not explore which client owns the training member identified by MIAs. In this paper, we fill this gap and show the possibility of breaching the source privacy of training members.
3 Source Inference Attacks
In this section, we formally analyze how an honest-but-curious server in FL can optimally estimate the source of a training member from the Bayesian perspective.
We focus on the supervised learning of classification tasks. The adversary is the honest-but-curious server who faithfully implements FedAvg while trying to determine where a training data record comes from. We assume the whole training dataset consists of i.i.d. data records from a data distribution. Each record is represented as (x, y), where x is an input vector and y is the class label. The source status of each record is represented by a K-dimensional (assuming there are K clients) multinomial vector in which one of the elements equals 1, and all remaining elements equal 0. We assume that multinomial source variables are independent, and that a training record comes from the client k with a given prior probability. Without loss of generality, taking the case of a single training record, the source inference is defined as follows:
Definition 1 (Source inference)
Given a locally optimized model and a training record, source inference aims to infer the posterior probability of the record belonging to the client k:
For the source inference given by Definition 1, we want to derive an explicit formula for this posterior probability from the Bayesian perspective, which establishes the optimal limit that our source inference can achieve. We collect the knowledge about the other training records and their source status in a set. The explicit formula for the posterior probability is given by the following theorem.
Theorem 1
Given a locally optimized model and a training record, the optimal source inference is given by:
is the sigmoid function.
We observe that Theorem 1 does not have the loss form and only relies on the posterior parameter in expectation given is a random variable. To make more explicit with the loss term, we assume that the parameters produced by an ML algorithm follow a posterior distribution. According to energy-based models LeCun et al. (2006); Du and Mordatch (2019), the posterior distribution of an ML model follows:
where is a temperature parameter controlling the stochasticity of . Following this assumption, given , the posterior distribution of follows:
We further define the posterior distribution of given training samples and their source status (i.e., given ):
where the denominator is a constant. The following theorem explicitly demonstrates how to conduct the optimal source inference with the loss term.
Theorem 2
Given a local resulting model and a training record, the optimal SIA is given by:
The term in Equation 9 is the gap between and . Since is a training set that does not contain any information about , corresponds to a posterior distribution of the parameters of an ML model that was trained without seeing . Note that is the local model ’s evaluation of the loss on the training record . Comparing Equation 7 and Equation 8, we can easily find that is the expectation of the loss over the typical models that have not seen . Thus, we can interpret as the difference between ’s loss on and other models’ (trained without ) average loss on .
In FL, the malicious server can implement an SIA in each communication round. The server receives the updated local models from the clients and conducts the SIA to identify whether a training record belongs to the client k. Let us qualitatively analyze the posterior probability in Theorem 2. It is decided by two important terms. In FL, the first term is the loss of client k’s updated local model on the record, and the second term is the average loss on the record under the local models that are updated without it. Note that the loss function measures the performance of a model on a data record. If the two terms are approximately equal, which means the client k behaves almost the same as other clients on the record, then the gap between them is approximately zero and the posterior probability is equal to the prior probability. Thus, we have no information gain on the record beyond the prior knowledge available in FL, and the source inference is equal to a random guess. However, if the first term is smaller than the second, that is, the client k performs better than other clients on the record, the gap becomes positive. In that case the posterior probability exceeds the prior, and thus we gain non-trivial source information on the record. Moreover, since the sigmoid function is non-decreasing, a smaller loss of client k’s local model indicates a higher probability that the record belongs to the client k.
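The qualitative behaviour described above can be checked numerically. The sketch below assumes, rather than quotes, a posterior of the form sigmoid(gap + log(prior / (1 - prior))), where the gap is the other models' average loss minus client k's loss; this form is chosen only because it uses the sigmoid and returns the prior when the gap is zero, as the text requires. The expectation over the other training records and any temperature scaling are omitted, and the function names are hypothetical.

```python
import math


def sigmoid(u: float) -> float:
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-u))


def source_posterior(loss_gap: float, prior: float) -> float:
    """Assumed posterior: sigmoid of the loss gap plus the log-odds of the prior.

    loss_gap = (average loss of models trained without the record)
               - (loss of client k's local model on the record).
    """
    return sigmoid(loss_gap + math.log(prior / (1.0 - prior)))
```

Under this assumed form, a zero gap gives back the prior, a positive gap raises the posterior above the prior, and the posterior grows monotonically as client k's loss shrinks.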
We conclude that the smaller the loss of client k’s local model on a training record, the higher the posterior probability that the record belongs to the client k. This motivates us to design the SIA in FL such that the client whose local model has the smallest loss on a training record is taken to be the owner of this record. Moreover, if the client’s local model’s behavior on its local training data is different from that of other clients, our attack will always achieve better performance than random guess. We give more empirical evidence in Section 4. Based on the conclusion above, we propose FedSIA as described in Algorithm 1, an FL framework based on FedAvg McMahan et al. (2017) that allows an honest-but-curious server to implement SIAs without violating the FedAvg protocol.
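The decision rule stated above (attribute a record to the client whose local model has the smallest loss on it) can be sketched as follows. This is not Algorithm 1, which is not reproduced here; it is only the per-record rule, with a hypothetical function name `infer_source` and a caller-supplied `loss_fn`.

```python
from typing import Callable, Sequence, TypeVar

Model = TypeVar("Model")
Record = TypeVar("Record")


def infer_source(local_models: Sequence[Model],
                 record: Record,
                 loss_fn: Callable[[Model, Record], float]) -> int:
    """Return the index of the client whose local model has the smallest loss on the record."""
    losses = [loss_fn(model, record) for model in local_models]
    # Ties are broken in favour of the lowest client index.
    return min(range(len(losses)), key=losses.__getitem__)
```

As a toy usage example, with scalar "models" `[0.0, 2.0, 5.0]`, a record `1.8`, and squared error as the loss, the rule selects client index 1.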
4.1 Datasets and Model Architectures
In the experiments, we evaluate SIAs on six datasets, i.e., Synthetic, Location (https://sites.google.com/site/yangdingqi/home/foursquare-dataset), Purchase (https://www.kaggle.com/c/acquire-valued-shoppers-challenge/data), CHMNIST (https://www.kaggle.com/kmader/colorectal-histology-mnist), MNIST, and CIFAR-10 (https://www.cs.toronto.edu/~kriz/cifar.html). Among them, Synthetic is a synthetic i.i.d. dataset, which allows us to manipulate data heterogeneity more precisely. We follow the same generation setup as described in Li et al. (2020b, 2019). Location, Purchase, CHMNIST, MNIST, and CIFAR-10 are realistic datasets which are widely used for evaluating privacy leakage on ML models Shokri et al. (2017); Jayaraman and Evans (2019); Ganju et al. (2018); Wang et al. (2019). For MNIST and CIFAR-10, we use the training dataset and testing dataset given. For the rest of the datasets, we randomly select % samples as the training records and use the remaining % samples as the testing records.
We consider deep neural networks (DNNs) as the collaborative models for the classification tasks. In particular, for MNIST, CHMNIST, and CIFAR-10, we use a convolutional neural network with two 5x5 convolution layers (the first with 32 channels, the second with 64, each followed by 2x2 max pooling), two fully connected layers with 512 and 128 units and ReLU activation, and a final softmax output layer. For Synthetic, Location, and Purchase, we use a fully connected neural network with one hidden layer of 200 units using ReLU activations. For each client in FL, we set a local mini-batch size of for all the experiments. For all models, we use SGD with the learning rate of . Our DNN architecture does not necessarily achieve the highest classification accuracy for the considered datasets, as our goal is not to attack the best DNN architecture. Our goal is to show that SIAs can identify which local client a training record comes from when the DNN classifier is trained in a federated manner.
In our experiment, we randomly select training records from each client as the target training examples of which the server wants to identify the source. We set the fraction of the clients to 1 in FL to simplify our experiments as we ignore the efficiency of the FL training when analyzing the privacy leakage. We consider attack success rate (ASR) as the evaluation metric for the source inference. The ASR is defined as the fraction of the target records whose source status is correctly identified by the server. As the performance baseline of an SIA, we consider a trivial random guessing attack, which randomly selects a client as the source of the target training record. For all the learning tasks, we train the central model for rounds, which is enough for the central model to converge. We record the ASR during each communication round and report the highest ASR. All experiments are implemented using PyTorch with a single NVIDIA Tesla P40 GPU.
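The ASR metric and the random guessing baseline defined above amount to the following computation. This is a minimal sketch with hypothetical function names; it assumes the guess is uniform over clients, in which case the expected ASR of the baseline is one over the number of clients.

```python
from typing import Sequence


def attack_success_rate(predicted: Sequence[int], true: Sequence[int]) -> float:
    """Fraction of target records whose source client is identified correctly."""
    if len(predicted) != len(true) or len(true) == 0:
        raise ValueError("predicted and true must be non-empty and of equal length")
    return sum(p == t for p, t in zip(predicted, true)) / len(true)


def random_guess_baseline(num_clients: int) -> float:
    """Expected ASR of picking a client uniformly at random (assumes a uniform guess)."""
    return 1.0 / num_clients
```

Reporting the highest ASR over training then corresponds to taking the maximum of `attack_success_rate` across communication rounds.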
4.2 Factors in Source Inference Attack
Data Distribution. The training data across clients are usually non-i.i.d. (heterogeneous) in FL. That is, a client’s local data cannot be regarded as samples drawn from the overall data distribution. If the training data is more heterogeneous, the locally optimized models will differ more from each other during the FL training, which benefits SIAs. Intuitively, an SIA is more effective when the degree of data heterogeneity increases. To simulate heterogeneity of training data, we follow the method used in Xie et al. (2019); Bagdasaryan et al. (2020); Lin et al. (2020) and use a Dirichlet distribution to divide the training records. The degree of data heterogeneity is controlled by a hyperparameter of the Dirichlet distribution. In general, the degree of data heterogeneity varies inversely with the magnitude of this hyperparameter.
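A Dirichlet-based split of this kind can be sketched as below. The exact procedure of the cited works is not given in this text, so the details are assumptions: for each class, client proportions are drawn from a symmetric Dirichlet distribution with concentration `alpha`, and that class's records are divided accordingly. The function name `dirichlet_partition` and its parameters are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def dirichlet_partition(labels: Sequence[int], num_clients: int,
                        alpha: float, seed: int = 0) -> List[List[int]]:
    """Split record indices across clients, class by class, with Dirichlet proportions."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients: List[List[int]] = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # A symmetric Dirichlet sample is a vector of Gamma(alpha, 1) draws, normalized.
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(draws) or 1.0  # guard against all draws underflowing to zero
        start, cumulative = 0, 0.0
        for k in range(num_clients):
            cumulative += draws[k] / total
            # The last client takes whatever remains so every index is assigned once.
            end = len(indices) if k == num_clients - 1 else round(cumulative * len(indices))
            clients[k].extend(indices[start:end])
            start = end
    return clients
```

With a small `alpha`, most of a class's records tend to land on a few clients (more heterogeneous); with a large `alpha`, the per-class split approaches an even one.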
Number of Local Epochs. In each communication round of FL, the client locally runs SGD on the current central model using its entire training dataset for several epochs and then submits the optimized model to the server. Recent studies Song et al. (2017); Carlini et al. (2019) have demonstrated that ML models are prone to memorize their training data. Intuitively, if a client updates the model on its local dataset with more epochs in each communication round, its local resulting model remembers the information of the local dataset better, which benefits SIAs.
4.3 Source Inference on Synthetic Dataset
We first conduct experiments on Synthetic to investigate how data distribution affects SIAs, because synthetic data allows us to manipulate the heterogeneity of the training data precisely. Without loss of generality, we assume there are clients and . Fig. 1 depicts the ASR of SIAs in each communication round during the FL training. We observe that our proposed SIAs always perform better than the random guessing baseline. This serves as empirical evidence for our theoretical analysis that random guess is the lower bound of our optimal source inference. The attacker performs better when the local data changes from i.i.d. to non-i.i.d., and the ASR increases as the heterogeneity of data increases.
4.4 Source Inference on All Datasets
We have demonstrated that SIAs are effective on synthetic data in both i.i.d. and non-i.i.d. settings. Now we use real datasets to further validate the effectiveness of SIAs and investigate the factors affecting the performance of SIAs. SIAs leverage the local models’ different prediction losses on the training examples. Intuitively, if a local model is overfitted, it will perform much better on its training members than on other data, i.e., the prediction loss on the local training data is distinguishable from that on other data. We link the level of non-i.i.d. and the number of local epochs to overfitting to study how the two factors affect the performance of SIAs.
Fig. 2 shows the SIAs’ ASR of different FL models under different overfitting levels. The overfitting level of the FL model here is calculated as the average of all local models’ generalization gaps. As we can see, increasing the level of non-i.i.d. across clients (i.e., changing from to ) increases the ASR of SIAs in all models, as increasing the level of non-i.i.d. will inevitably increase the level of overfitting. However, when we increase the number of local epochs from to , the ASRs of SIAs on CHMNIST, CIFAR-10, and Location increase, while on MNIST, Synthetic, and Purchase the ASRs do not vary much. This is because changing the number of local epochs does not increase the overfitting level of all models.
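The overfitting level used above can be computed as sketched below. The text does not define the generalization gap precisely; this sketch assumes it is a local model's training accuracy minus its testing accuracy, and the function name is hypothetical.

```python
from typing import Sequence


def overfitting_level(train_accuracies: Sequence[float],
                      test_accuracies: Sequence[float]) -> float:
    """Average generalization gap (assumed: train accuracy minus test accuracy) over local models."""
    if len(train_accuracies) != len(test_accuracies) or len(train_accuracies) == 0:
        raise ValueError("need one train and one test accuracy per local model")
    gaps = [train - test for train, test in zip(train_accuracies, test_accuracies)]
    return sum(gaps) / len(gaps)
```

For instance, two local models with training accuracies 0.9 and 0.8 and a shared testing accuracy of 0.7 give an overfitting level of about 0.15.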
Success of SIAs is directly related to the generalizability of the local models and the diversity of the local training data. If a local model generalizes well to inputs beyond its local training members, it will not leak too much source information about its local data. Moreover, if the local training set fails to represent the overall training data distribution, the local model leaks significant information about its local data and the ASR of SIAs remains high. Recent works Zhao et al. (2018); Li et al. (2019, 2020a) have demonstrated that the non-i.i.d. of training data in FL has brought statistical heterogeneity challenges for model convergence guarantees. In this paper, we show another harm of non-i.i.d.: the leakage of source privacy for local data.
|Dataset||FL without DP||FL with DP|
Many works Shokri and Shmatikov (2015); Geyer et al. (2017); McMahan et al. (2018a) suggest differential privacy Dwork et al. (2006) can be used against inference attacks due to its theoretical guarantee of privacy protection. Here, we test differential privacy as a defense technique against SIAs in FL. In this experiment, we evaluate the defense on Location and CIFAR-10, as shown in Table 1. In the experimental setting, we set , , for Location, and , , for CIFAR-10. From the results, we can see that the ASRs drop from % to % on Location, and % to % on CIFAR-10, when applying differential privacy. However, although differential privacy can defend against the SIA, it also hurts the performance of the model on its tasks, where the model utility drops from 71.2% to 12.3% on Location, and 68.3% to 10.0% on CIFAR-10. In this case, we can conclude that vanilla DP is not an effective solution for SIA in FL, which provides future research opportunities.
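The text does not specify which differentially private mechanism is applied, and the parameter values are elided here. For orientation only, the sketch below shows the generic clip-and-add-Gaussian-noise step that many DP training schemes apply to a model update; it is an assumption, not the mechanism evaluated in Table 1, and `clip_and_noise`, `clip_norm`, and `noise_multiplier` are hypothetical names.

```python
import math
import random
from typing import List, Sequence


def clip_and_noise(update: Sequence[float], clip_norm: float,
                   noise_multiplier: float, rng: random.Random) -> List[float]:
    """Clip an update to a maximum L2 norm, then add Gaussian noise to each coordinate."""
    norm = math.sqrt(sum(u * u for u in update))
    # Scale the update down only if its norm exceeds the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    # Noise standard deviation is proportional to the clipping bound.
    std = noise_multiplier * clip_norm
    return [u * scale + rng.gauss(0.0, std) for u in update]
```

The trade-off reported above is visible in this step: a larger `noise_multiplier` blurs the differences between local models that an SIA relies on, but it also perturbs the update that the central model learns from.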
6 Related work
6.1 Inference Attacks in FL
McMahan et al. (2017) first propose the federated learning framework, which can mitigate the privacy leakage of model training with limited, unbalanced, massively distributed, or even non-IID data among distributed devices, in applications such as mobile phones Pan and Sun (2021) and healthcare data Xu et al. (2021). The motivation is to share the model weights instead of the private data for better privacy protection. However, recent works Melis et al. (2019); Nasr et al. (2019); Zhu et al. (2019); Wang et al. (2019); Yang et al. (2020) investigate several privacy attacks in FL, including property inference attacks Ganju et al. (2018), reconstruction attacks Hitaj et al. (2017), and membership inference attacks Shokri et al. (2017); Wang and Sun (2021). MIAs in FL allow a malicious participant or server to distinguish the training members of the trained model from the non-members. Melis et al. (2019) first explore MIAs in FL and demonstrate that an adversary can infer whether a specific location profile was used to train an FL model on the FourSquare location dataset with 0.99 precision and perfect recall. Nasr et al. (2019) suggest that an adversary can actively craft his updated model to extract more membership information about other clients. For training members of the FL model, the existing inference attacks fail to explore which client owns them. The source inference attacks proposed in this paper fill this gap.
6.2 Privacy Defenses in FL
To enhance privacy protection, differential privacy and other privacy protection mechanisms, e.g., secure aggregation, have been recently applied to federated learning Lyu et al. (2020); Bhowmick et al. (2018); Geyer et al. (2017); McMahan et al. (2018b); Bonawitz et al. (2017); Sun and Lyu (2020). Previous works mostly focus on either the centralized differential privacy mechanism that requires a central trusted party Geyer et al. (2017); McMahan et al. (2018b), or local differential privacy, in which each user perturbs its updates randomly before sending them to an untrusted aggregator Truex et al. (2019); Sun et al. (2021). These privacy-preserving approaches have been shown to be effective against inference and other attacks Geyer et al. (2017); McMahan et al. (2018a); Bonawitz et al. (2017); Li and Wang (2019); Sun et al. (2021) in FL. However, no protection approaches have been explored for SIAs. As discussed in the last section, applying differential privacy in FL is not an effective solution, since it suffers from the trade-off between model utility and defense performance against SIAs, providing future research opportunities.
In this paper, we propose a new inference attack named source inference attack in the context of FL, which enables a malicious server to infer which client a training example comes from. We formally derive an optimal attack strategy, showing that the malicious server is able to gain non-trivial source information about the training members by evaluating the local models’ loss. We evaluate SIAs in FL with many real datasets and different settings. The extensive experimental results demonstrate the effectiveness of SIAs in practice.
- How to backdoor federated learning. In AISTATS.
- Protection against reconstruction and its applications in private federated learning. CoRR, arXiv:1812.00984.
- Practical secure aggregation for privacy-preserving machine learning. In CCS.
- The secret sharer: evaluating and testing unintended memorization in neural networks. In USENIX Security.
- Racism and discrimination in COVID-19 responses. The Lancet.
- Implicit generation and modeling with energy-based models. In NeurIPS.
- Calibrating noise to sensitivity in private data analysis. In TCC.
- Property inference attacks on fully connected neural networks using permutation invariant representations. In CCS.
- Differentially private federated learning: a client level perspective. arXiv preprint arXiv:1712.07557.
- FedGraphNN: a federated learning benchmark system for graph neural networks.
- FedML: a research library and benchmark for federated machine learning. arXiv preprint arXiv:2007.13518.
- Deep models under the GAN: information leakage from collaborative deep learning. In CCS.
- Membership inference attacks on machine learning: a survey. arXiv preprint arXiv:2103.07853.
- Evaluating differentially private machine learning in practice. In USENIX Security.
- A tutorial on energy-based learning. Predicting structured data.
- FedMD: heterogenous federated learning via model distillation. arXiv preprint arXiv:1910.03581.
- Federated learning: challenges, methods, and future directions. IEEE Signal Processing Magazine.
- Federated optimization in heterogeneous networks. In MLSys.
- On the convergence of FedAvg on non-IID data. In ICLR.
- Ensemble distillation for robust model fusion in federated learning. arXiv preprint arXiv:2006.07242.
- Privacy and robustness in federated learning: attacks and defenses. arXiv preprint arXiv:2012.06337.
- Communication-efficient learning of deep networks from decentralized data. In AISTATS.
- Learning differentially private recurrent language models. In ICLR.
- Learning differentially private recurrent language models. In ICLR.
- Exploiting unintended feature leakage in collaborative learning. In S&P.
- Comprehensive privacy analysis of deep learning: passive and active white-box inference attacks against centralized and federated learning. In S&P.
- Global knowledge distillation in federated learning. arXiv preprint arXiv:2107.00051.
- Regulation EU 2016/679 of the European Parliament and of the Council of 27 April 2016. Official Journal of the European Union.
- Privacy-preserving deep learning. In CCS.
- Membership inference attacks against machine learning models. In S&P.
- Machine learning models that remember too much. In CCS.
- Federated model distillation with noise-free differential privacy. arXiv preprint arXiv:2009.05537.
- LDP-FL: practical private aggregation in federated learning with local differential privacy. In IJCAI.
- A hybrid approach to privacy-preserving federated learning. In 12th ACM Workshop on Artificial Intelligence and Security.
- Towards demystifying membership inference attacks. arXiv preprint arXiv:1807.09173.
- Membership inference attacks on knowledge graphs. arXiv preprint arXiv:2104.08273.
- Beyond inferring class representatives: user-level privacy leakage from federated learning. In INFOCOM.
- DBA: distributed backdoor attacks against federated learning. In ICLR.
- FedMood: federated learning on mobile health data for mood detection. arXiv preprint arXiv:2102.09342.
- A collaborative online AI engine for CT-based COVID-19 diagnosis. medRxiv.
- Secure deep graph generation with link differential privacy. arXiv preprint arXiv:2005.00455.
- Federated learning with non-IID data. arXiv preprint arXiv:1806.00582.
- Deep leakage from gradients. In NeurIPS.