On Safeguarding Privacy and Security in the Framework of Federated Learning

Motivated by the advancing computational capacity of wireless end-user equipment (UE), as well as the increasing concerns about sharing private data, a new machine learning (ML) paradigm has emerged, namely federated learning (FL). Specifically, FL allows a decoupling of data provision at UEs and ML model aggregation at a central unit. By training models locally, FL is capable of avoiding data leakage from the UEs, thereby preserving privacy and security to some extent. However, even if raw data are not disclosed from UEs, an individual’s private information can still be extracted by some recently discovered attacks on the FL architecture. In this work, we analyze the privacy and security issues in FL, and raise several challenges on preserving privacy and security when designing FL systems. In addition, we provide extensive simulation results to illustrate the discussed issues and possible solutions.


I Introduction

Recent technological advancements are transforming the ways in which data is created and processed. With the advent of the internet-of-things (IoT), the number of intelligent devices in the world has grown rapidly in the last couple of years. Many of these devices are equipped with various sensors and increasingly powerful hardware, which allow them not just to collect but, more importantly, to process data at unprecedented scales. In a concurrent development, artificial intelligence (AI) has revolutionized the ways in which information is extracted, with groundbreaking successes in areas such as computer vision, natural language processing, and voice recognition [14]. Therefore, there is high demand for harnessing the rich data provided by distributed devices to improve machine learning models.

At the same time, data privacy has become a growing concern for clients. In particular, the emergence of centralized searchable data repositories has made the leakage of private information, e.g. health conditions, travel information, and financial data, an urgent social problem [8]. Furthermore, the diverse set of open data applications, such as census data dissemination and social networks, place more emphasis on privacy concerns. In such practices, the access to real-life datasets may cause information leakage even in pure research activities. Consequently, privacy preservation has become a critical issue.

To tackle the challenge of protecting individuals’ privacy, a new paradigm has emerged, i.e., federated learning (FL) [7], which allows a decoupling of data provision at end-user equipment (UE) and machine learning model aggregation at a central server. The purpose of FL is to cooperatively learn a global model without sacrificing the privacy of data. In particular, FL has distinct privacy advantages compared to data center training on a dataset. At a server, holding even an “anonymized” dataset can still put client privacy at risk via linkage to other datasets. In contrast, the information transmitted for FL consists of the minimal updates needed to improve a particular machine learning model. The updates themselves can be ephemeral, and will never contain more information than the raw training data (by the data processing inequality). Further, the source of the updates is not needed by the aggregation algorithm, so updates can be transmitted without identifying metadata over a mix network such as Tor [3] or via a trusted third party. Generic privacy-preserving approaches include de-identification methods like anonymization [9], obfuscation methods like differential privacy [4], cryptographic techniques like homomorphic encryption [10], and secure multi-party computation (SMC) protocols like oblivious transfer and garbled circuits [12].

However, although the data is not explicitly shared in its original format, it is still possible for adversaries to reconstruct the raw data approximately, especially when the architecture and parameters are not completely protected. In addition, FL can expose intermediate results such as parameter updates from an optimization algorithm like stochastic gradient descent (SGD), and the transmission of these gradients may actually leak private information [11] when exposed together with a data structure such as image pixels. Moreover, the existence of malicious users may induce further security issues. Therefore, the design of FL still needs further protection of parameters as well as investigation of the tradeoffs between the privacy-security level and the system performance.

Inspired by this research gap, we briefly investigate the potential privacy and security issues in FL. Specifically, we clarify that the current protection methods are mainly focused on the server and client sides, and then investigate four important aspects of current designs: convergence, data poisoning, scaling up, and model aggregation. The remainder of this article is organized as follows. Section II introduces the basic model and key directions on the protection of FL. Section III illustrates challenges and opportunities in developing private and secure FL, and Section IV provides experimental results, possible solutions, and future work for discussion. Finally, conclusions are drawn in Section V.

II Background

We first introduce the basic model of FL, which is illustrated in Fig. 1. As can be seen from Fig. 1, each client downloads a globally shared model from the broadcasting server for local training, whereas the server periodically collects all trained parameters to perform a global average and then redistributes the improved model back to the clients. After sufficient training and updating iterations between the server and its associated clients, usually termed communication rounds, the objective function is able to converge to the global optimum, and the convergence property of FL can be quantitatively demonstrated.
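The download, local training, and averaging loop described above can be sketched as follows. This is a minimal illustration rather than the setup used in the experiments: the one-parameter model y = w * x, the learning rate, the step counts, and the three toy clients are all assumptions made for the example.

```python
import random

def local_sgd(w, data, lr=0.05, steps=20):
    """Run a few SGD steps on one client's data for the toy model y = w * x."""
    for _ in range(steps):
        x, y = random.choice(data)
        grad = 2.0 * (w * x - y) * x  # gradient of the squared error (w*x - y)^2
        w -= lr * grad
    return w

def fed_avg(global_w, client_datasets, rounds=50):
    """Each round: broadcast the model, train locally, average by data size."""
    total = sum(len(d) for d in client_datasets)
    for _ in range(rounds):
        local_ws = [local_sgd(global_w, d) for d in client_datasets]
        global_w = sum(len(d) / total * w
                       for d, w in zip(client_datasets, local_ws))
    return global_w

random.seed(0)
# Three hypothetical clients holding different amounts of noiseless y = 3x data.
clients = [[(x / 10.0, 3.0 * x / 10.0) for x in range(1, n)] for n in (5, 8, 11)]
w_final = fed_avg(0.0, clients)
print(round(w_final, 3))
```

Only the scalar `w` travels between the clients and the server in this sketch; the `(x, y)` pairs never leave the client that holds them.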

Fig. 1: The structure of private and secure federated learning framework

II-A Difference between Security and Privacy

Although security and privacy are often used interchangeably in the literature, it is important to highlight the difference between them. On one hand, security issues refer to unauthorized or malicious access to, modification of, or denial of data. Such attacks are usually launched by hackers with expert knowledge of the target system or network. Hence, the three fundamental goals of security are confidentiality, integrity, and availability.

On the other hand, privacy issues generally refer to unintentional disclosure of personal information, usually from open-access data. For example, from a side-by-side comparison of a voter registration dataset and an anonymous set of healthcare sensor records (e.g., with no individual names or IDs), an attacker may be able to identify certain individuals and learn about their health conditions. This is because some quasi-identifiers such as gender, birth date, and zip code are the same in both datasets. As can be seen from the above example, privacy attacks only require common sense and involve no hacking activities. The fundamental reason for privacy issues is that a seemingly harmless open dataset may contain clues to individuals’ private information in real life. Hence, alternative goals such as anonymity, unlinkability, and unobservability have been proposed for privacy protection.
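The linkage example above can be made concrete with a short sketch. All records, names, and field names below are invented for illustration; the point is only that a join on the quasi-identifiers needs no hacking at all.

```python
# Hypothetical toy records; every name and value here is made up.
voters = [
    {"name": "Alice", "gender": "F", "birth": "1980-02-14", "zip": "02139"},
    {"name": "Bob", "gender": "M", "birth": "1975-07-01", "zip": "02139"},
]
health = [
    {"gender": "F", "birth": "1980-02-14", "zip": "02139", "condition": "asthma"},
    {"gender": "M", "birth": "1990-11-30", "zip": "10001", "condition": "diabetes"},
]
QUASI_IDS = ("gender", "birth", "zip")

def link(named, anonymous):
    """Join the two datasets on their shared quasi-identifiers."""
    index = {tuple(r[k] for k in QUASI_IDS): r["name"] for r in named}
    matches = []
    for record in anonymous:
        key = tuple(record[k] for k in QUASI_IDS)
        if key in index:
            matches.append((index[key], record["condition"]))
    return matches

print(link(voters, health))
```

Any "anonymous" record whose quasi-identifier combination is unique in the named dataset is re-identified by this join.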

II-B Security and Privacy Protection for FL

During the learning process there exist several privacy and security issues, and we can generally classify the corresponding protection methods into three categories: privacy protection at the client side, privacy protection at the server side, and security protection for the FL framework.

II-B1 Privacy protection at the client side

In FL, clients will upload their learning results including parameter values and weights to the server, but they may not trust the server since a curious server might have a look at the uploaded data to infer private information. To alleviate this concern, clients can employ some privacy-preservation technologies as follows:

  • Perturbation: The idea of perturbation is that clients add noise to the parameters they upload. This line of work often uses differential privacy [4] to obscure certain sensitive attributes so that a third party is not able to distinguish the individual, thereby making the data impossible to restore and protecting user privacy. In [5], the authors introduced a differential privacy approach to FL in order to add protection to client-side data. However, these methods still require that data are transmitted elsewhere, and they usually involve a trade-off between accuracy and privacy, which needs to be tuned.

  • Dummy: The dummy method stems from location privacy protection [6]. Dummy model parameters are sent to the server along with the true ones, which may hide a client’s contribution during training. Because of the aggregation performed at the server, the system performance can still be guaranteed.
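As a rough illustration of the perturbation idea, the sketch below clips a client update and adds zero-mean noise before upload. The function names, the clipping step, and the default values are assumptions for the example, and the noise scale is not calibrated to any formal differential privacy budget.

```python
import math
import random

def clip(update, max_norm):
    """Scale the update so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]

def perturb(update, max_norm=1.0, sigma=0.1, kind="gaussian", rng=random):
    """Clip the update, then add noise with standard deviation sigma."""
    clipped = clip(update, max_norm)
    if kind == "gaussian":
        noise = [rng.gauss(0.0, sigma) for _ in clipped]
    elif kind == "laplace":
        b = sigma / math.sqrt(2.0)  # Laplace scale b has variance 2 * b^2
        # The difference of two i.i.d. exponentials is Laplace distributed.
        noise = [rng.expovariate(1.0 / b) - rng.expovariate(1.0 / b)
                 for _ in clipped]
    else:
        raise ValueError("kind must be 'gaussian' or 'laplace'")
    return [v + n for v, n in zip(clipped, noise)]

rng = random.Random(1)
noisy = perturb([3.0, 4.0], max_norm=1.0, sigma=0.1, rng=rng)
print(noisy)
```

Choosing the Laplace scale as `sigma / sqrt(2)` gives both noise types the same power, which makes it possible to compare distributions at equal noise power.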

II-B2 Privacy protection at the server side

After collecting updated parameters from clients, the server performs a weighted average of these parameters according to data size. However, when the server broadcasts the aggregated parameters to clients for model synchronization, this information may leak because eavesdroppers may be present. Thus, protection at the server side is also important.

  • Aggregation: The key idea of aggregation is to collect data or model parameters from different clients on the server side. After aggregation, adversaries or an untrusted server cannot infer an individual client’s information from the aggregated parameters. In addition, in some scenarios, the server has the liberty to select clients with high-quality parameters or non-sensitive requirements. However, how to design an appropriate aggregation mechanism is still a challenging task for current FL.

  • Secure Multi-Party Computation (SMC): The core of SMC is to use encryption so that individual devices’ updates cannot be inspected by the server, which instead only learns the sum after a sufficient number of updates has been received [12]. In detail, SMC is a four-round interactive protocol optionally enabled during the reporting phase of a given communication round. In each protocol round, the server gathers messages from all devices, then uses the set of device messages to compute an independent response and returns it to each device. The first two rounds constitute a preparation phase, in which shared secrets are established. The third round constitutes a commit phase, during which devices upload cryptographically masked model updates to the server. Finally, there is a finalization phase in which devices reveal sufficient cryptographic secrets to allow the server to unmask the aggregated model update.
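The masking idea behind the commit and finalization phases can be sketched with pairwise masks that cancel in the sum. This is a simplified illustration under strong assumptions: updates are already encoded as integers modulo `MOD`, no device drops out, and the shared masks come from a seeded generator that stands in for a real key agreement. None of these names are taken from the protocol itself.

```python
import random

MOD = 2 ** 32  # assumed modulus for integer-encoded updates

def pairwise_masks(num_clients, dim, seed=0):
    """masks[(i, j)] is a vector shared by clients i < j (key-agreement stand-in)."""
    rng = random.Random(seed)
    return {(i, j): [rng.randrange(MOD) for _ in range(dim)]
            for i in range(num_clients) for j in range(i + 1, num_clients)}

def mask_update(i, update, masks, num_clients):
    """Client i adds masks shared with higher-indexed peers, subtracts the rest."""
    masked = list(update)
    for j in range(num_clients):
        if j == i:
            continue
        pair = (min(i, j), max(i, j))
        sign = 1 if i < j else -1
        masked = [(m + sign * s) % MOD for m, s in zip(masked, masks[pair])]
    return masked

def server_sum(masked_updates):
    """The server only sums; every pairwise mask cancels in the total."""
    return [sum(col) % MOD for col in zip(*masked_updates)]

updates = [[1, 2], [10, 20], [100, 200]]
masks = pairwise_masks(len(updates), dim=2)
masked = [mask_update(i, u, masks, len(updates)) for i, u in enumerate(updates)]
print(server_sum(masked))
```

Each individual entry of `masked` looks uniformly random to the server, while the sum equals the sum of the true updates.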

II-B3 Security protection for FL framework

The security of the whole FL framework mainly concerns model-stealing attacks. In particular, any participant in FL may introduce hidden backdoor functionality into the joint global model, e.g., to ensure that an image classifier assigns an attacker-chosen label to images with certain features, or that a word predictor completes certain sentences with an attacker-chosen word. Consequently, there are also some protective measures in the security design for FL.

  • Homomorphic Encryption: Homomorphic encryption [10] is adopted to protect user data by exchanging parameters under an encryption mechanism. That is, the parameters are encrypted before uploading, and the public and private keys also need to be transmitted, which may cause extra communication cost.

  • Back-door Defender: Existing defenses against backdoor attacks are not effective, as most of them require access to the training data [2]. In addition, the FL system cannot ensure that no client is malicious, has no visibility into what participants are doing locally, and prevents anyone from auditing participants’ updates to the joint model.
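To illustrate how encrypted parameters can still be aggregated, the sketch below implements a toy additively homomorphic scheme in the style of Paillier. It is an assumption-laden example, not the scheme of the cited works: the primes are far too small to be secure, and updates are assumed to be non-negative integers smaller than `n`.

```python
import math
import random

def keygen(p=1009, q=1013):
    """Toy key generation with tiny primes (insecure, for illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, Python 3.8+
    return n, (lam, mu)

def encrypt(n, m, rng=random):
    """Encrypt integer m with generator g = n + 1, so g^m = 1 + m*n mod n^2."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)

n, priv = keygen()
ciphertexts = [encrypt(n, m) for m in (5, 17, 20)]  # three clients' values
total = ciphertexts[0]
for c in ciphertexts[1:]:
    total = add_encrypted(n, total, c)
print(decrypt(n, priv, total))
```

Whoever performs the aggregation only needs `n` to combine ciphertexts, while decryption requires the private pair `(lam, mu)`.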

III Challenges on Private and Secure FL

In this section, we identify four main issues in private and secure FL systems and discuss each of them.

III-A Convergence: An Issue Caused by Privacy Protection

As pointed out in [13], theoretical convergence guarantees have not been fully explored for federated averaging, even though recent works can provide approximate convergence guarantees to some extent. However, these works always assume unrealistic scenarios, e.g., (i) the data is either shared across devices or distributed in an independent and identically distributed (i.i.d.) manner, and (ii) all devices are involved in communication at each round.

If privacy protection is considered, the convergence of FL cannot be guaranteed for the current system setting. The main reason is that the learning parameters will be non-i.i.d. if the perturbation method is applied at the client side. Moreover, even if convergence can be achieved when appropriate measures are taken, the learning performance should be properly characterized. Previous work in [1] has shown that convergence can be guaranteed when artificial noises are added into a deep learning network, but the learning accuracy decreases by around 40% when solving an MNIST classification problem. As such, the following aspects need to be addressed:

  • Theoretical results should be provided about the convergence of privacy-preserving FL.

  • Learning performances, i.e., learning accuracy, communication rounds and variations of loss functions, need to be investigated when privacy protection is considered.

  • Privacy protection algorithms that are sound both theoretically and empirically should be devised. In addition, the tradeoff between the privacy level and the convergence speed also needs further investigation.

To explain this using a concrete example, let us consider the perturbation method described in Section II-B1. If artificial noises are added at the client side, the aggregated noise power will influence the updated system parameters, and these parameters are not i.i.d. Thus, the global weighted parameters at the server side may differ from the original ones without noise. When SGD is applied, the descent direction may change to a different or even an opposite direction if inappropriate noise is added. In this case, we cannot guarantee the convergence of the algorithm. In addition, even if convergence can be achieved, the reduction in convergence speed, i.e., the number of communication rounds between clients and the server, and the learning performance, i.e., the classification accuracy, should be carefully quantified and analyzed.
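The effect of noise on the descent direction can be illustrated with a small Monte Carlo sketch. It assumes a scalar gradient, equal aggregation weights, and independent zero-mean Gaussian noise at every client; the function name and all parameter values are hypothetical.

```python
import random

def flip_rate(grad, sigma, num_clients, trials=5000, seed=0):
    """Fraction of trials in which the averaged noisy gradient has the wrong sign."""
    rng = random.Random(seed)
    flips = 0
    for _ in range(trials):
        noisy = [grad + rng.gauss(0.0, sigma) for _ in range(num_clients)]
        avg = sum(noisy) / num_clients
        if avg * grad < 0:  # averaged direction opposes the true gradient
            flips += 1
    return flips / trials

single = flip_rate(grad=0.1, sigma=1.0, num_clients=1)
many = flip_rate(grad=0.1, sigma=1.0, num_clients=100)
print(single, many)
```

Under these assumptions the noise in the equal-weight average has variance sigma^2 / K for K clients, so the wrong-direction probability shrinks as more noisy uploads are averaged but does not vanish when the noise is large relative to the gradient.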

III-B Data Poisoning: A Security Issue

In FL, clients, who previously acted only as passive data providers, can now observe intermediate model states and contribute arbitrary updates as part of the decentralized training process. This creates an opportunity for malicious clients to manipulate the training process with little restriction. In particular, adversaries posing as honest clients can send erroneous updates that maliciously influence the performance of the trained model, a process known as model poisoning.

Traditional poisoning attacks compromise the training data to change the model’s behavior at inference time. Researchers have considered the situation in which one member of an FL system maliciously attacks the others by inserting a backdoor to steal their data [2]. They showed that an adversarial participant can infer membership as well as properties associated with a subset of the training data. In addition, some malicious clients may upload unreasonable parameters, which in turn harm the system performance. On the other hand, eavesdroppers may be present while the server broadcasts the intermediate machine learning model states. Thus, the security issues related to data poisoning can be summarized as follows:

  • How can the performance loss be measured if malicious clients carry out data or model poisoning?

  • How can these poisoning behaviors by clients be recognized and prevented?

  • How can the security level be improved by guarding against eavesdroppers during communication?

III-C Scaling Up Issue: A Privacy and Security Issue

It is straightforward to extend the current FL system into a large one, e.g., with hundreds or thousands of clients, due to the availability of high-performance and low-price devices. However, this vast scale raises several practical issues: device availability that correlates with the local data distribution in complex ways (e.g., time zone dependency); unreliable device connectivity and interrupted execution; orchestration of lock-step execution across devices with varying availability; and limited device storage and compute resources. All these issues can be summarized as scaling up issues, and the most important and urgent question is what will happen if more UEs are able to participate in FL. Specifically, the following aspects need to be addressed:

  • If more UEs participate in FL, fewer communication rounds are expected thanks to more computation in each round, which should be an obvious advantage.

  • If more UEs participate in FL, data poisoning attacks are expected to have less impact because it becomes difficult for an adversary to control a large number of UEs.

  • If more UEs participate in FL, will it provide better privacy protection? The intuition is that hiding a UE in a larger dataset is easier than doing the same in a smaller dataset.

In summary, it is unknown whether having more UEs helps to reduce the learning time or improve the accuracy, and we provide related experimental results in Section IV-C. In addition, a typical wireless scenario for scheme design and performance investigation is one in which multiple communication modes, e.g., LTE, WiFi, and 5G, coexist in the uploading process. Resource allocation for these multiple modes needs to be optimized, as most works do not consider wireless transmission. In a wireless setting, the communication links between the server and the clients are uncertain and imperfect, and this effect needs to be carefully studied in the design of the FL system, especially in large-scale ones [15].

III-D Model Aggregation: A Security Issue

Aggregation is mainly performed at the server, which collects the individual clients’ parameters and updates the global model. This process is particularly important as it should absorb the advantages of the clients and determine the end of learning. If a protection method is applied at the client side, such as perturbation applied before the model parameters are collected, the aggregation cannot simply be a conventional averaging process. The main reasons are: (i) the noise power of the perturbation increases with the number of clients; (ii) the server should know the stochastic information from clients, and the design of the aggregation method needs to distinguish the privacy-sensitive clients from privacy-insensitive ones. Therefore, a more intelligent aggregation process should be provided as follows:

  • An intelligent aggregator should recognize the differences between clients and employ different aggregation strategies for them.

  • An intelligent aggregator should handle the added noise introduced by the privacy protection. For example, a minimum mean square error (MMSE) aggregator can serve as an effective candidate.

  • An intelligent aggregator should update parameter weights for the participating clients during different communication rounds.

In particular, some form of recognition mechanism can be integrated into the aggregation process, so that it is able to adjust the parameter weights according to the quality of the parameters or system feedback. Furthermore, some anomaly detection schemes can be considered to identify outliers during communications. The aggregator should be sufficiently intelligent that it can select appropriate clients for learning to achieve fast convergence and high performance.
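One possible form of such an anomaly detection step is sketched below: the server compares each upload with the coordinate-wise median and drops uploads that lie unusually far from it before averaging. The rule, the `tolerance` value, and the function name are assumptions chosen for illustration, not a scheme evaluated in this article.

```python
import math
from statistics import median

def filter_and_average(updates, tolerance=3.0):
    """Drop updates far from the coordinate-wise median, then average the rest.

    tolerance is assumed to be at least 1, so at least half the updates are kept.
    """
    center = [median(col) for col in zip(*updates)]
    dists = [math.dist(u, center) for u in updates]
    cutoff = tolerance * median(dists) + 1e-12
    kept = [u for u, d in zip(updates, dists) if d <= cutoff]
    avg = [sum(col) / len(kept) for col in zip(*kept)]
    return avg, len(updates) - len(kept)

uploads = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [-1.0, -1.0]]
aggregate, dropped = filter_and_average(uploads)
print(aggregate, dropped)
```

A fixed distance rule like this one cannot distinguish a malicious upload from an honest client with unusual data, which is one reason the weights may need to be adapted across communication rounds.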

IV Experiment Results and Possible Solutions

In this section, we provide simulations to demonstrate the aforementioned issues and discuss some possible solutions. For each experiment, we first partition the original training data into disjoint non-i.i.d. training sets, locally compute SGD updates on each dataset, and then aggregate the updates using an averaging method to train a globally shared classifier. We evaluate the prototype on the well-known MNIST dataset, a digit classification problem that distinguishes the 10 digits from 0 to 9; the system is considered to fail at classification if the accuracy cannot exceed 10%. MNIST is divided into 60,000 training examples and 10,000 test examples. The global epoch is set to 300 iterations at the server side, while 120 iterations are run at each client, and the local batch size is set to 1200. In the following figures, we collect 20 runs for each experiment and record the average results.
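The exact partitioning procedure is not specified here. One common way to obtain disjoint non-i.i.d. client datasets, shown below purely as an assumed example, is to sort the sample indices by label, cut them into shards, and give each client a small number of shards.

```python
import random

def partition_non_iid(labels, num_clients, shards_per_client=2, seed=0):
    """Split sample indices into disjoint, label-skewed client datasets."""
    rng = random.Random(seed)
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    num_shards = num_clients * shards_per_client
    shard_size = len(order) // num_shards
    shards = [order[s * shard_size:(s + 1) * shard_size]
              for s in range(num_shards)]
    rng.shuffle(shards)  # random assignment of shards to clients
    return [sum(shards[c * shards_per_client:(c + 1) * shards_per_client], [])
            for c in range(num_clients)]

# Hypothetical label vector: 1000 samples with 100 samples per digit.
labels = [i % 10 for i in range(1000)]
parts = partition_non_iid(labels, num_clients=10)
print([sorted({labels[i] for i in p}) for p in parts])
```

With these toy settings every client sees at most two digit classes, which is a strong form of label skew.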

IV-A Convergence

In this subsection, we show some experimental results related to the added noise power and the convergence time. To achieve privacy protection, we employ the perturbation method at the client side. In detail, different artificial noises with the same power, i.e., Gaussian noise and Laplace noise, are added to the local parameters, respectively.

Fig. 2: Communication rounds versus accuracy with different noise powers in CNN

In Fig. 2, we first show the classification accuracy with different noise powers, where local learning applies a convolutional neural network (CNN). From the figure, we can observe that the accuracy is largely affected by the power of the added noise while being less influenced by the particular distribution of the noise. In addition, the accuracy improves with an increasing number of communication rounds. This means that adding noise to the FL system will not affect its convergence. Nevertheless, it will lead to poor performance or even system failure when large noises, i.e., , are added. This is due to the fact that the SGD algorithm has converged to a poor local minimum. Thus, in the noise-added FL system, the convergence analysis should be carried out together with the learning performance.

Fig. 3: Communication rounds versus accuracy with different noise powers in MLP

In addition, we verify this observation by applying a multi-layer perceptron (MLP) at the clients. As can be seen in Fig. 3, the added noise seems to have only a slight influence on the accuracy. This is mainly because the MLP system has an auto-filtering process that can delete perceptrons or parameters with bad performance.

IV-B Data Poisoning

Fig. 4: Performance comparison with different number of malicious clients

In Fig. 4, we show the performance comparison with different numbers of malicious clients. We set up a CNN system with 30 clients, and the malicious clients upload fake parameter values in each communication round. The fake value can be the opposite of the true value, or a random number within [-1, 1]. From Fig. 4, we can see that the system performance is affected if malicious clients exist. In addition, the system fails when more malicious clients participate.
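The two kinds of fake upload described above can be sketched as follows. The helper names and the toy update vector are assumptions; the sketch also shows why plain averaging breaks down as the share of sign-flipping clients grows.

```python
import random

def malicious_update(true_update, mode="flip", rng=random):
    """Return a fake upload: the negated update or random values in [-1, 1]."""
    if mode == "flip":
        return [-v for v in true_update]
    if mode == "random":
        return [rng.uniform(-1.0, 1.0) for _ in true_update]
    raise ValueError("mode must be 'flip' or 'random'")

def average(updates):
    return [sum(col) / len(updates) for col in zip(*updates)]

def poisoned_average(honest_update, num_clients, num_malicious, mode="flip"):
    """Average of identical honest uploads mixed with malicious ones."""
    honest = [list(honest_update) for _ in range(num_clients - num_malicious)]
    fake = [malicious_update(honest_update, mode) for _ in range(num_malicious)]
    return average(honest + fake)

honest_update = [0.5, -0.2]  # hypothetical update shared by all honest clients
clean = poisoned_average(honest_update, num_clients=30, num_malicious=0)
half = poisoned_average(honest_update, num_clients=30, num_malicious=15)
print(clean, half)
```

With m sign-flipping clients out of 30 and identical honest updates, the plain average is scaled by (30 - 2m) / 30, so it vanishes at m = 15 and points the wrong way beyond that.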

There are two main ways to prevent data poisoning in privacy-aware FL systems. The first is to recognize malicious clients when the system is set up. In this scenario, machine learning can be utilized. For example, a supervised learning algorithm can be used to find malicious clients during each communication round. The second focuses on the aggregation process. After each aggregation, the server can update the aggregation weight of each client according to the updated learning parameters. In this way, the server can select the clients that are helpful for fast convergence or high performance. In addition, concepts from social networks can be applied to update the weights in each communication round by exploiting the social influence of each client on the overall system performance.

IV-C Scaling Up Issue

For the scaling up issue, one promising method is to set an uploading delay deadline for each communication round to address the long waiting time. At each learning epoch, the server will collect at least clients’ information before executing the next process within a limited time deadline. If the waiting time exceeds this deadline, the current learning epoch is abandoned.
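The deadline rule can be sketched in a few lines. The minimum number of clients is not given in the text above, so `min_clients`, like the function name and the toy arrival times, is a hypothetical parameter.

```python
def collect_round(arrival_times, deadline, min_clients):
    """Return indices of clients that upload in time, or None if the round is abandoned."""
    on_time = [i for i, t in enumerate(arrival_times) if t <= deadline]
    return on_time if len(on_time) >= min_clients else None

# Hypothetical upload delays (in seconds) for four clients.
delays = [0.5, 2.0, 0.9, 1.5]
print(collect_round(delays, deadline=1.0, min_clients=2))
print(collect_round(delays, deadline=1.0, min_clients=3))
```

A tighter deadline shortens each round but raises the chance that the round is abandoned, so both parameters would need to be tuned to the delay distribution of the clients.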

Fig. 5: Performance comparison with different number of clients

In the following, we first show the classification accuracy with different numbers of clients. From Fig. 5 we can see that the performance does not show much gain as the number of clients increases. However, the total delay can be largely reduced when more clients exist. In particular, the clients are randomly distributed in a km square area and we record the maximum calculation and transmission time in each communication round for different numbers of clients. Then we stop the learning when the accuracy exceeds and record the total number of communication rounds, and calculate the total delay.

In addition, to deal with the large number of clients, we can use the concept of clusters from game theory. By artificially partitioning clients into different clusters, each cluster of clients will work together to complete the ultimate learning goal. The server will also provide benefits in return. In this new structure design, the large number of clients will be separated by their common interests, similar physical locations, or the same uploading modes. Different clusters will compete with each other to obtain the learning opportunity.

IV-D Model Aggregation

The model aggregation should be intelligent. It should not only deal with a large amount of noise while guaranteeing the system performance, but also apply different aggregation methods for different clients. The current aggregation weight depends on the training data size, but a more intelligent aggregator should be designed for multiple objectives. In addition, the selection of the updated parameters can also be adjusted. For example, the server can choose the uploads with better channel or parameter quality.

Fig. 6: Performance comparison with different number of malicious clients under the proposed aggregation method

In Fig. 6, we propose an intelligent aggregation model to address the problem of malicious clients. The proposed algorithm includes two parts: 1) add a test process at the server side, and update the aggregation weight of each uploaded parameter set according to its testing performance; 2) increase the number of local epochs for each client. As can be seen in the figure, the proposed algorithm can effectively mitigate the performance degradation caused by the malicious clients. In addition, more local epochs are needed when more malicious clients exist.
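The first part of the algorithm, weighting uploads by their test performance at the server, might look like the sketch below. The exact weighting rule is not stated above, so the rule used here (weights proportional to the accuracy gain over a 10% chance level) and all names and numbers are assumptions.

```python
def performance_weights(scores, baseline=0.1):
    """Weights proportional to each upload's test accuracy above the baseline."""
    gains = [max(s - baseline, 0.0) for s in scores]
    total = sum(gains)
    if total == 0.0:
        # No upload beats the baseline: fall back to equal weights.
        return [1.0 / len(scores)] * len(scores)
    return [g / total for g in gains]

def weighted_aggregate(updates, weights):
    return [sum(w * v for w, v in zip(weights, col)) for col in zip(*updates)]

# Hypothetical server-side test accuracies for three uploaded models.
scores = [0.9, 0.9, 0.05]
uploads = [[1.0, 2.0], [3.0, 4.0], [-100.0, -100.0]]
weights = performance_weights(scores)
print(weights, weighted_aggregate(uploads, weights))
```

An upload that performs at or below chance level on the server's test data receives zero weight, so it no longer affects the global model in that round.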

V Conclusion

In this article, we have investigated potential privacy and security issues in federated learning (FL). We have noted that privacy protection can be applied at the client or the server side, while security protection is mainly focused on the system level. In addition, we have argued that the considered issues can be classified into convergence, data poisoning, scaling up, and model aggregation issues. Lastly, we have also provided some possible solutions for protecting privacy and security, which may inform system design in the FL framework.


  • [1] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang (2016) Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 308–318.
  • [2] E. Bagdasaryan, A. Veit, Y. Hua, D. Estrin, and V. Shmatikov (2018) How to backdoor federated learning. arXiv preprint arXiv:1807.00459.
  • [3] D. L. Chaum (1981) Untraceable electronic mail, return addresses, and digital pseudonyms. Communications of the ACM 24 (2), pp. 84–90.
  • [4] C. Dwork, F. McSherry, K. Nissim, and A. Smith (2006) Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pp. 265–284.
  • [5] R. C. Geyer, T. Klein, and M. Nabi (2017) Differentially private federated learning: a client level perspective. arXiv preprint arXiv:1712.07557.
  • [6] H. Kido, Y. Yanagisawa, and T. Satoh (2005) Protection of location privacy using dummies for location-based services. In 21st International Conference on Data Engineering Workshops (ICDEW’05), pp. 1248–1248.
  • [7] J. Konečný, H. B. McMahan, D. Ramage, and P. Richtárik (2016) Federated optimization: distributed machine learning for on-device intelligence. arXiv preprint arXiv:1610.02527.
  • [8] B. Liu, M. Ding, T. Zhu, Y. Xiang, and W. Zhou (2019) Adversaries or allies? privacy and deep learning in big data era. Concurrency and Computation: Practice and Experience, pp. e5102.
  • [9] A. Machanavajjhala, J. Gehrke, D. Kifer, and M. Venkitasubramaniam (2006) L-diversity: privacy beyond k-anonymity. In 22nd International Conference on Data Engineering (ICDE’06), pp. 24–24.
  • [10] N. Papernot, M. Abadi, U. Erlingsson, I. Goodfellow, and K. Talwar (2016) Semi-supervised knowledge transfer for deep learning from private training data. arXiv preprint arXiv:1610.05755.
  • [11] L. T. Phong, Y. Aono, T. Hayashi, L. Wang, and S. Moriai (2018) Privacy-preserving deep learning via additively homomorphic encryption. IEEE Transactions on Information Forensics and Security 13 (5), pp. 1333–1345.
  • [12] M. Rosulek (2017) Improvements for gate-hiding garbled circuits. In International Conference on Cryptology in India, pp. 325–345.
  • [13] A. K. Sahu, T. Li, M. Sanjabi, M. Zaheer, A. Talwalkar, and V. Smith (2018) On the convergence of federated optimization in heterogeneous networks.
  • [14] X. Wang, Y. Han, C. Wang, Q. Zhao, X. Chen, and M. Chen (2018) In-edge ai: intelligentizing mobile edge computing, caching and communication by federated learning. arXiv preprint arXiv:1809.07857.
  • [15] H. H. Yang, Z. Liu, T. Q. Quek, and H. V. Poor (2019) Scheduling policies for federated learning in wireless networks. IEEE Trans. Commun., revised. Available as arXiv:1908.06287.