
A Survey on Federated Learning Systems: Vision, Hype and Reality for Data Privacy and Protection

Federated learning has been a hot research area in enabling the collaborative training of machine learning models among different organizations under privacy restrictions. As researchers try to support more machine learning models with different privacy-preserving approaches, there is a need to develop systems and infrastructures that ease the development of various federated learning algorithms. Just as deep learning systems such as Caffe, PyTorch, and TensorFlow boost the development of deep learning algorithms, federated learning systems are equally important, and they face challenges from various issues such as impractical system assumptions, scalability and efficiency. Inspired by federated systems in other fields such as databases and cloud computing, we investigate the characteristics of existing federated learning systems. We find that two important features of federated systems in other fields, i.e., heterogeneity and autonomy, are rarely considered in the existing federated learning systems. Moreover, we provide a thorough categorization of federated learning systems according to six different aspects, including data partition, machine learning model, privacy mechanism, communication architecture, scale of federation and motivation of federation. The categorization can help the design of federated learning systems, as shown in our case studies. Lastly, we make a systematic comparison among the existing federated learning systems and present future research opportunities and directions.



1 Introduction

Many machine learning algorithms are data hungry, and in reality, data are dispersed over different organizations under the protection of privacy restrictions. Due to these factors, federated learning (FL) [132] has become a hot research topic in machine learning and data mining. For example, data of different hospitals are isolated and become “data islands”. Since the size or the characteristics of the data in each data island are limited, a single hospital may not be able to train a high-quality model with good predictive accuracy for a specific task. Ideally, hospitals can benefit more if they can collaboratively train a machine learning model with the union of their data. However, the data cannot simply be shared among the hospitals due to various policies and regulations. Such “data islands” are commonly seen in many areas such as finance, government, and supply chains. Policies such as the General Data Protection Regulation (GDPR) [5] stipulate rules on data sharing among different organizations. Thus, it is challenging to develop a federated learning system which has good predictive accuracy while protecting data privacy.

Many efforts have been devoted to implementing federated learning algorithms that support effective machine learning models. Specifically, researchers try to support more machine learning models with different privacy-preserving approaches, including deep neural networks [96, 172, 16, 123, 99], gradient boosting decision trees (GBDTs) [175, 35], logistic regression [110, 33] and support vector machines (SVMs) [135]. For instance, Nikolaenko et al. [110] and Chen et al. [33] proposed approaches to conduct FL based on linear regression. Hardy et al. [63] implemented an FL framework to train a logistic regression model. Since GBDTs have become very successful in recent years [31, 159], the corresponding FLSs have also been proposed by Zhao et al. [175] and Cheng et al. [35]. Moreover, there are also many neural network based FLSs. Google has proposed a scalable production system which enables tens of millions of devices to train a deep neural network [16]. Yurochkin et al. [172] developed a probabilistic FL framework for neural networks by applying Bayesian nonparametric machinery.

Several methods try to combine FL with machine learning techniques such as multi-task learning and transfer learning. Smith et al. [135] combined FL with multi-task learning to allow multiple parties to complete separate tasks. To address the scenario where the label information only exists in one party, Yang et al. [163] adopted transfer learning to collaboratively learn a model.

Among the studies on customizing machine learning algorithms under the federated context, we have identified a few commonly used methods to provide privacy guarantees. One common method is to use cryptographic techniques [18] such as secure multi-party computation [103] and homomorphic encryption. The other popular method is differential privacy [175], which adds noise to the model parameters to protect individual records. For example, Google’s federated learning system [18] adopts both secure aggregation and differential privacy to enhance the privacy guarantee.

As there are common methods and building blocks for building federated learning algorithms, it is possible to develop systems and infrastructures to ease the development of various federated learning algorithms. Systems and infrastructures allow algorithm developers to reuse the common building blocks and avoid building algorithms from scratch every time. Just as deep learning systems such as Caffe [69], PyTorch [117], and TensorFlow [2] boost the development of deep learning algorithms, federated learning systems (FLSs) are equally important for the success of federated learning. However, building federated learning systems faces challenges from various issues such as impractical system assumptions, scalability and efficiency.

In this paper, we survey the existing FLSs with a focus on drawing the analogy and differences to traditional federated systems in other fields such as databases [131] and cloud computing [83]. First, we consider heterogeneity and autonomy as two important characteristics of FLSs, which are often ignored in the existing designs in federated learning.
Second, we categorize FLSs based on six different aspects: data distribution, machine learning model, privacy mechanism, communication architecture, scale of federation, and motivation of federation. These aspects can direct the design of a federated learning system. Furthermore, based on these aspects, we compare the existing FLSs and identify their key limitations. Last, to make FL more practical and powerful, we present future research directions. We believe that systems and infrastructures are essential for the success of federated machine learning. More work has to be carried out to address the system research issues in security and privacy, efficiency and scalability.

There have been several surveys and white papers on federated learning. A seminal survey written by Yang et al. [163] introduced the basics and concepts of federated learning, and further proposed a comprehensive secure federated learning framework. Later, WeBank [1] published a white paper introducing the background and related work in federated learning and, most importantly, presented a development roadmap including establishing local and global standards, building use cases and forming an industrial data alliance. The survey and the white paper mainly target a relatively small number of parties, which are typically enterprise data owners. Lim et al. [91] conducted a survey of federated learning specific to mobile edge computing. Li et al. [88] summarized challenges and future directions of federated learning in massive networks of mobile and edge devices.

In comparison with the previous surveys, the main contributions of this paper are as follows. (1) By analogy with the previous federated systems, we analyze two dimensions of federated learning systems: heterogeneity and autonomy. These dimensions can play an important role in the design of FLSs. (2) We provide a comprehensive taxonomy of federated learning systems on six different aspects, including data distribution, machine learning model, privacy mechanism, communication architecture, scale of federation, and motivation of federation, which can be used to direct the design of FLSs. (3) We summarize the characteristics of existing FLSs and present the limitations of existing FLSs and a vision for future generations of FLSs.

2 Federated Systems

In this section, we review key conventional federated systems and present the federated learning systems.

2.1 Conventional Federated Systems

The concept of federation can be found with its counterparts in the real world such as business and sports. The main characteristic of federation is cooperation. Federation not only commonly appears in society, but also plays an important role in computing. In computer science, federated computing systems have been an attractive area of research. Around 1990, there were many studies on federated database systems (FDBSs) [131]. An FDBS is a collection of autonomous databases cooperating for mutual benefit. As pointed out in a previous study [131], three important components of an FDBS are autonomy, heterogeneity, and distribution. First, a database system (DBS) that participates in an FDBS is autonomous, which means it is under separate and independent control. The parties can still manage the data without the FDBS. Second, differences in hardware, system software, and communication systems are allowed in an FDBS. A powerful FDBS can run in heterogeneous hardware or software environments. Last, due to the existence of multiple DBSs before an FDBS is built, the data distribution may differ in different DBSs. An FDBS can benefit from the data distribution if designed properly. Generally, FDBSs focus on the management of distributed data. More recently, with the development of cloud computing, many studies have been done for federated cloud computing [83]. A federated cloud (FC) is the deployment and management of multiple external and internal cloud computing services. The concept of cloud federation enables further reduction of costs due to partial outsourcing to more cost-efficient regions. Resource migration and resource redundancy are two basic features of federated clouds [83]. First, resources may be transferred from one cloud provider to another. Migration enables the relocation of resources. Second, redundancy allows concurrent usage of similar service features in different domains. 
For example, the data can be broken down and processed at different providers following the same computation logic. Overall, the scheduling of different resources is a key factor in the design of a federated cloud system.

2.2 Federated Learning Systems

While machine learning, especially deep learning, has attracted much attention again recently, the combination of federation and machine learning is emerging as a new and hot research topic. In federated learning, the goal is to conduct collaborative machine learning among different organizations under restrictions on user privacy. Here we give a formal definition of federated learning systems. We assume that there are $N$ different parties, and each party is denoted by $T_i$, where $i \in [1, N]$. We use $D_i$ to denote the data of $T_i$. For the non-federated setting, each party $T_i$ uses only its local data $D_i$ to train a machine learning model $M_i$. The performance of $M_i$ is denoted as $P_i$. For the federated setting, all the parties jointly train a model $\hat{M}_f$ while each party $T_i$ protects its data $D_i$ according to its specific privacy restrictions. The performance of $\hat{M}_f$ is denoted as $\hat{P}_f$. Then, for a valid federated learning system, there exists $i \in [1, N]$ such that $\hat{P}_f > P_i$. Note that this definition only requires that at least one party achieves a higher model utility from the FLS. Even though some parties may not get a better model from an FLS, they may make an agreement with the other parties to ask for other kinds of benefits (e.g., money).
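The validity condition in this definition can be written as a one-line check. The sketch below is only an illustration: the function name `is_valid_fls` and the accuracy values are hypothetical, and "performance" is assumed to be a single number where larger is better.

```python
def is_valid_fls(local_performance, federated_performance):
    """Return True if at least one party does strictly better with the
    jointly trained model than with the model trained on its own data.

    local_performance: one score per party (hypothetical values).
    federated_performance: score of the jointly trained model.
    """
    return any(federated_performance > p for p in local_performance)


# Three parties with illustrative local accuracies.
local = [0.81, 0.90, 0.86]
print(is_valid_fls(local, 0.88))  # True: parties 1 and 3 improve
print(is_valid_fls(local, 0.80))  # False: no party improves
```

Note that the second party in the first call does not improve (0.88 < 0.90), yet the system still counts as valid, which matches the remark that some parties may need other kinds of benefits.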

2.3 Analogy among Federated Systems

Fig. 1: Federated database, federated cloud, and federated learning

Figure 1 shows the frameworks of federated database systems, federated clouds, and federated learning systems. There are some similarities and differences between federated learning systems and conventional federated systems. On the one hand, the concept of federation still applies. The common and basic idea is about the cooperation of multiple independent parties. Therefore, the perspective of considering heterogeneity and autonomy among the parties can still be applied in FLSs. Furthermore, some factors in the design of distributed systems are still important for FLSs. For example, how the data are shared between the parties can influence the efficiency of the systems. On the other hand, these federated systems have differences. While FDBSs focus on the management of distributed data and FCs focus on the scheduling of the resources, FLSs care more about the secure computation among multiple parties. FLSs induce new challenges such as the algorithm designs of the distributed training and the data protection under the privacy restrictions. With these findings, we analyze the existing FLSs and figure out the potential future directions of FLSs in the following sections.

3 System Characteristics

While existing FLSs pay considerable attention to user privacy and machine learning models, two important characteristics of previous federated systems (i.e., heterogeneity and autonomy) are rarely addressed.

3.1 Heterogeneity

We consider heterogeneities between different parties in three aspects: data, privacy requirements and tasks.

3.1.1 Differences in data

Parties typically have different data distributions. For example, due to the ozone hole, countries in the Southern Hemisphere may have more skin cancer patients than those in the Northern Hemisphere. Thus, hospitals in different countries tend to have very different distributions of patient records. The difference in data distributions may be a very important factor in the design of FLSs. The parties can potentially gain a lot from FL if they have varied and partially representative distributions for a specific task. Furthermore, if party Alice has fully representative data for task A and party Bob has fully representative data for task B, Alice and Bob can make a deal to conduct FL for both tasks, so that Alice improves her performance on task B and Bob improves his on task A. Besides data distributions, the size of data may also differ among parties. FL should enable collaboration among parties of different scales. Furthermore, for fairness, the parties that provide more data should benefit more from FL.

3.1.2 Differences in privacy restrictions

Different parties typically face different policies and regulations that restrict data sharing. For example, companies in the EU have to comply with the General Data Protection Regulation (GDPR) [5], while China recently issued a new regulation, the Personal Information Security Specification (PISS). Furthermore, even in the same region, different parties still have different detailed privacy rules. The privacy restrictions play an important role in the design of FLSs. Generally, the parties are able to gain more from FL if the privacy restrictions are looser. Many studies assume that the parties have the same privacy level [33, 35]. The scenario where the parties have different privacy restrictions is more complicated and more meaningful. It is very challenging to design an FLS which can maximize the utilization of each party's data while not violating their respective privacy restrictions.

3.1.3 Differences in tasks

The tasks of different parties may also vary. A bank may want to know whether a person can repay the loan but an insurance company may want to know whether the person will buy their products. The bank and the insurance company can also adopt FL although they want to perform different tasks. Multiple machine learning models may be learned during the FL process. Techniques like multi-task learning can also be adopted in FL [135].

3.2 Autonomy

The parties are often autonomous and under independent control. These parties are willing to share the information with the others only if they retain control. It is important to address the autonomy property when designing an FLS.

3.2.1 Association autonomy

A party can decide whether to associate or disassociate itself from FL and can participate in one or more FLSs. Ideally, an FLS should be robust enough to tolerate the entry or departure of any party. Thus, the FLS should not fully depend on any single party. Since this goal is hard to achieve, in practice, the parties can make an agreement to regularize the entry or departure to ensure that the FLS works properly.

3.2.2 Communication autonomy

A party should have the ability to decide how much information to communicate with others. The party can also choose how much of its data participates in the FL. An FLS should be able to handle different communication scales during the learning process. As mentioned in Section 3.1.1, the benefit of a party should be commensurate with its contribution. A party may gain more if it is willing to share more information, while the risk of exposing user privacy may also be higher.

4 Taxonomy

Through the analysis of many application scenarios in building FLSs, we classify FLSs by six aspects: data distribution, machine learning model, privacy mechanism, communication architecture, scale of federation, and motivation of federation. These aspects include common factors (e.g., data distribution, communication architecture) in previous federated systems and unique considerations (e.g., machine learning model and privacy mechanism) for FLSs. Furthermore, these aspects can be used to direct the design of FLSs. Figure 2 shows the summary of the taxonomy of FLSs.

Fig. 2: Taxonomy of federated learning systems (FLSs)

Let us explain the six aspects with an intuitive example. Hospitals in different regions want to conduct federated learning to improve the performance of a lung cancer prediction task. The six aspects then have to be considered to design such a federated learning system. First, we should consider how the patient records are distributed among hospitals. While the hospitals may have different patients, they may also have different knowledge about a common patient. Thus, we have to utilize both the non-overlapping instances and the non-overlapping features in federated learning. Second, we should figure out which machine learning model should be adopted for such a task. For example, we may adopt gradient boosting decision trees, which perform well on many classification problems. Third, we have to decide what techniques to use for privacy protection. Since the patient records cannot be exposed to the public, differential privacy is an option to achieve the privacy guarantee. Fourth, the communication architecture also matters. We may need a centralized server to control the updates of the model. Fifth, the number of hospitals and the computation power in each hospital should also be studied. Unlike federated learning on mobile devices, the federation in this scenario has a relatively small scale and good stability. Last, we should consider the incentive for each party. A clear and straightforward motivation for the hospitals is to increase the accuracy of lung cancer prediction. It is therefore important to achieve a highly accurate machine learning model by federated learning. Next, we discuss these aspects in detail.

4.1 Data Distribution

Based on how data are distributed over the sample and feature spaces, FLSs can typically be categorized into horizontal, vertical, and hybrid FLSs [163].

In horizontal FL, the datasets of different organizations have the same feature space but little intersection on the sample space. Such a system typically uses a server to aggregate the information from the devices and adopts differential privacy [100] and secure aggregation to enhance the privacy guarantee. Wake-word recognition [84], such as ‘Hey Siri’ and ‘OK Google’, is a typical application of horizontal partition because each user speaks the same sentence with a different voice.

In vertical FL, the datasets of different organizations have the same sample space but differ in the feature space. Vaidya et al. have proposed multiple secure models on vertically partitioned data, including association rule mining [145], k-means, naive Bayes classifiers [147] and decision trees [148]. A vertical FLS usually adopts entity alignment techniques to collect the overlapping samples of the organizations. The overlapping data are then used to train the machine learning model using encryption methods. Cheng et al. [35] proposed a lossless vertical FLS to enable parties to collaboratively train gradient boosting decision trees. They use privacy-preserving entity alignment to find the common users between two parties, whose gradients are used to jointly train the decision trees. Cooperation among government agencies can be treated as a case of vertical partition. Suppose the department of taxation requires the housing data of residents, which is stored in the department of housing, to formulate tax policies. Meanwhile, the department of housing also needs the tax information of residents, which is kept by the department of taxation, to adapt its housing policies. These two departments share the same sample space (i.e., all the residents) but each of them has only one part of the features (e.g., housing data and tax data).

While existing FLSs mostly focus on one kind of partition, in many other applications the partition of data among the parties may be a hybrid of horizontal and vertical partition. Take a cancer diagnosis system as an example. A group of hospitals wants to build a federated system for cancer diagnosis, but each hospital has different patients as well as different kinds of medical examination results. Transfer learning [115] is a possible solution for such scenarios. Liu et al. [96] proposed a secure federated transfer learning system which can learn a representation among the features of the parties using common instances.
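The three partition types can be told apart by comparing the feature names and sample identifiers that two parties hold. The following sketch is only an illustration of that idea: the function names, the Jaccard-overlap test and the 0.5 threshold are assumptions made here, not part of any cited system. The `align_entities` helper is a plain set intersection and is therefore not privacy-preserving; a real vertical FLS would use a privacy-preserving entity alignment protocol such as the one in [35].

```python
def jaccard(x, y):
    """Jaccard overlap of two collections, in [0, 1]."""
    x, y = set(x), set(y)
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def partition_kind(features_a, features_b, samples_a, samples_b, threshold=0.5):
    """Classify how two parties' datasets relate (illustrative heuristic)."""
    same_features = jaccard(features_a, features_b) >= threshold
    same_samples = jaccard(samples_a, samples_b) >= threshold
    if same_features and not same_samples:
        return "horizontal"  # shared feature space, mostly different samples
    if same_samples and not same_features:
        return "vertical"    # shared sample space, mostly different features
    return "hybrid"          # anything else is treated as hybrid in this sketch


def align_entities(samples_a, samples_b):
    """Plain, NON-private entity alignment: sorted IDs held by both parties."""
    return sorted(set(samples_a) & set(samples_b))


# Taxation and housing departments: same residents, different features.
tax_features, housing_features = ["income", "tax_paid"], ["house_size", "house_price"]
tax_ids, housing_ids = ["r1", "r2", "r3", "r4"], ["r2", "r3", "r4", "r5"]
print(partition_kind(tax_features, housing_features, tax_ids, housing_ids))  # vertical
print(align_entities(tax_ids, housing_ids))  # ['r2', 'r3', 'r4']
```

In the vertical case, only the aligned identifiers would take part in training, which is why the quality of entity alignment directly bounds how much data the federation can use.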

4.2 Machine Learning Models

While there are many kinds of machine learning models, here we consider three kinds of models that current FLSs mainly support: linear models, decision trees, and neural networks. Linear models include some basic statistical methods such as linear regression and logistic regression [33]. There are many well-developed systems for linear regression and logistic regression [110, 63]. These linear models are easy to learn compared with more complex models (e.g., neural networks). A tree based FLS is designed for training a single decision tree or multiple decision trees (i.e., gradient boosting decision trees and random forests). GBDTs have become especially popular recently and perform very well in many classification and regression tasks. Zhao et al. [175] and Cheng et al. [35] proposed FLSs for GBDTs on horizontally and vertically partitioned data, respectively. A neural network based system aims to train neural networks, which are an extremely hot topic in machine learning. There are many studies on federated stochastic gradient descent [99, 16], which can be used to learn the parameters of neural networks. Generally, the designs of FLSs differ for different machine learning models. It is still challenging to propose a practical tree based or neural network based FLS. Moreover, due to the rapid development of machine learning, FLSs lag behind in supporting state-of-the-art models.
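As a rough illustration of federated stochastic gradient descent with model averaging in the style of [99], the sketch below trains a small linear model. It is a minimal sketch under stated assumptions: the linear least-squares model, the learning rate, the number of local epochs, the toy data and all function names are choices made here for illustration. Each round, every party starts from the current global weights, runs SGD on its own data, and a server averages the resulting weights, weighted by local dataset size. Raw data never leave a party; only weights are exchanged.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of SGD on one party's data for a linear model y ~ w.x."""
    w = list(weights)
    for _ in range(epochs):
        for x, y in data:
            error = sum(wi * xi for wi, xi in zip(w, x)) - y
            # Gradient of 0.5 * error^2 with respect to w is error * x.
            w = [wi - lr * error * xi for wi, xi in zip(w, x)]
    return w


def federated_averaging(parties, dim, rounds=20):
    """Average locally trained weights, weighted by each party's data size."""
    global_w = [0.0] * dim
    total = sum(len(data) for data in parties)
    for _ in range(rounds):
        local_models = [local_update(global_w, data) for data in parties]
        global_w = [
            sum(len(data) * w[j] for data, w in zip(parties, local_models)) / total
            for j in range(dim)
        ]
    return global_w


# Toy data generated from y = 2*x1 - 1*x2, split across two parties.
party_a = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0)]
party_b = [((1.0, 1.0), 1.0), ((2.0, 1.0), 3.0), ((1.0, 2.0), 0.0)]
print(federated_averaging([party_a, party_b], dim=2))  # close to [2.0, -1.0]
```

The same loop structure applies to neural networks; only `local_update` would change. The toy data here are noiseless and consistent across parties, which is the easy case; with heterogeneous local distributions the averaged model can behave quite differently.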

4.3 Privacy Mechanisms

Privacy has been shown to be an important issue in machine learning, and there have been many attacks against machine learning models [152, 49, 24, 134, 107, 102]. There are also many privacy mechanisms, such as differential privacy [45] and k-anonymity [47], which provide different privacy guarantees. The characteristics of existing privacy mechanisms are summarized in the survey [151]. Here we introduce three popular approaches that are adopted in current FLSs for data protection: model aggregation, cryptographic methods, and differential privacy.

Model aggregation (or model averaging) [99, 101] is a widely used framework to avoid the communication of raw data in federated learning. Specifically, a global model is trained by aggregating the model parameters from local parties. A typical algorithm is federated averaging [99] based on stochastic gradient descent (SGD), which aggregates the locally computed models and then updates the global model in each round. PATE [116] combines multiple black-box local models to learn a global model, which predicts an output chosen by noisy voting among all of the local models. Yurochkin et al. [172] developed a probabilistic FL framework by applying Bayesian nonparametric machinery. They use a Beta-Bernoulli process informed matching procedure to generate a global model by matching the neurons in the local models. Smith et al. [135] combined FL with multi-task learning to allow multiple parties to locally learn models for different tasks. A challenge of model aggregation methods is to ensure that the global model has better utility than the local models.

Cryptographic methods such as homomorphic encryption [8, 63, 19, 27, 60, 121, 122, 171, 173, 94] and secure multi-party computation [130, 29, 17, 43, 18, 85, 11, 52, 77, 153, 32, 54] are widely used in privacy-preserving machine learning algorithms. Basically, the parties have to encrypt their messages before sending, operate on the encrypted messages, and decrypt the encrypted output to get the result. By applying these methods, the user privacy of federated learning systems can usually be well protected [75, 168, 76, 113, 169]. For example, secure multi-party computation [56] guarantees that no party can learn anything except the output. However, such systems are usually not efficient and have a large computation and communication overhead.

Differential privacy [45, 46] guarantees that a single record does not have much influence on the output of a function. Many systems adopt differential privacy [28, 12, 3, 160, 175, 67, 87, 141] for data privacy protection, so that the parties cannot know whether an individual record participates in the learning or not. By adding random noise to the data or the model parameters [3, 87, 136], differential privacy provides statistical privacy for individual records and protection against inference attacks on the model. Due to the noise in the learning process, such systems tend to produce less accurate models.

Note that the above methods are independent of each other, and a federated learning system can adopt multiple methods to enhance the privacy guarantees [58]. While most of the existing FLSs adopt cryptographic techniques or differential privacy to achieve good privacy guarantees, the limitations of these approaches currently seem hard to overcome.
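To illustrate how adding random noise yields a differential privacy guarantee, the following sketch applies the Laplace mechanism to the mean of bounded values. This is one standard instance of differential privacy and is not necessarily the mechanism used by the systems cited above. The function names, the clipping bounds and the choice of releasing a mean are assumptions for illustration; the number of records is assumed to be public.

```python
import random


def laplace_noise(scale, rng):
    """Draw from Laplace(0, scale): the difference of two i.i.d. exponential
    variables with mean `scale` follows this distribution."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng=None):
    """Release an epsilon-differentially-private mean of `values`.

    Each value is clipped to [lower, upper], so replacing one record changes
    the mean by at most (upper - lower) / n. Laplace noise with scale
    sensitivity / epsilon then masks the contribution of any single record.
    """
    rng = rng or random.Random()
    clipped = [min(max(v, lower), upper) for v in values]
    n = len(clipped)
    sensitivity = (upper - lower) / n
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


# Example: 1000 records in [0, 1]; a smaller epsilon gives a noisier answer.
records = [i / 999 for i in range(1000)]
print(private_mean(records, 0.0, 1.0, epsilon=1.0, rng=random.Random(0)))
print(private_mean(records, 0.0, 1.0, epsilon=0.01, rng=random.Random(0)))
```

The noise scale grows as epsilon shrinks and as the number of records falls, which is the accuracy cost described above: stronger privacy or smaller datasets mean a less accurate released value.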
While trying to minimize the side effects of cryptographic techniques and differential privacy, it may also be a good choice to look for novel approaches that protect data privacy and support flexible privacy requirements. For example, Liu et al. [96] adopt a weaker security model [42], which can make the system more practical. Related to the privacy level, the threat models also vary in FLSs. A common assumption is that all parties are honest-but-curious [163, 33, 41], meaning that they follow the protocol but try to find out as much as possible about the data of the other parties. In such a scenario, inference attacks may be conducted to extract user information. For example, in a membership inference attack [134, 102, 107], the attacker infers whether a given record was part of the training dataset, given access to the machine learning model. There may also be malicious parties [64] in federated learning. One threat model is that the parties may conduct poisoning and adversarial attacks [10, 13, 51] to backdoor federated learning. Another threat model is that the parties may suffer Byzantine faults [26, 15, 34, 139], where the parties behave arbitrarily badly against the system.
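Under the honest-but-curious assumption, the core idea behind secure aggregation can be shown with pairwise additive masks: every pair of parties shares a random mask that one adds and the other subtracts, so the server sees only masked values, yet the masks cancel in the sum. The sketch below is a heavily simplified illustration and not the protocol of any cited system. The masks are generated in one place purely for demonstration (in a real protocol each pair would agree on its mask privately), there is no handling of parties that drop out, and the names and modulus are assumptions.

```python
import random

MODULUS = 2 ** 32  # all arithmetic is done modulo this value


def mask_inputs(values, rng):
    """Return one masked value per party; their sum mod MODULUS equals the
    sum of the original values."""
    n = len(values)
    masked = [v % MODULUS for v in values]
    for i in range(n):
        for j in range(i + 1, n):
            # Illustration only: parties i and j would derive this shared
            # mask privately, without the server ever seeing it.
            m = rng.randrange(MODULUS)
            masked[i] = (masked[i] + m) % MODULUS
            masked[j] = (masked[j] - m) % MODULUS
    return masked


def aggregate(masked):
    """Server-side step: sum the masked values; pairwise masks cancel."""
    return sum(masked) % MODULUS


private_values = [5, 17, 20]           # e.g., quantized local updates
masked = mask_inputs(private_values, random.Random(1))
print(masked)                           # individually look random
print(aggregate(masked))                # 42
```

Each masked value on its own is uniformly distributed and so reveals nothing about the underlying input, while the aggregate is exact. The guarantee only covers parties that follow the protocol; it does not address the malicious or Byzantine behaviours described above.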

4.4 Communication Architecture

There are two major ways of communication in FLSs: centralized design and distributed design. In the centralized design, the data flow is often asymmetric, which means one server or a specific party is required to aggregate the information (e.g., gradients) from the other parties and send back the training result [16]. The parameter updates on the global model are always done in this server. The communication between the server and the local parties can be synchronous [99] or asynchronous [161, 137]. In a distributed design, the communications are performed among the parties [175] and every party is able to update the global parameters directly. Google Keyboard [164] is a case of centralized architecture. Google aggregates information from users’ Android phones, collaboratively trains models with it, and returns the prediction results to users. How to reduce the communication cost is a vital problem in this kind of architecture. Some algorithms, such as deep gradient compression [92], have been proposed to address this problem. While the centralized design is widely used in existing FLSs, the distributed design is preferred in some respects, since concentrating information on one server may bring potential risks or unfairness. Recently, blockchain [177] has become a popular distributed platform for consideration. It is still challenging to design a distributed system for FL in which each party is treated nearly equally in terms of communication during the learning process and no trusted server is needed. A distributed cancer diagnosis system among hospitals is an example of distributed architecture. Each hospital shares the model trained with data from its patients and gets the global model for diagnosis. In the distributed design, the major challenge is that it is hard to design a protocol that treats every member fairly.
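One way to reduce the communication cost between a party and the server is gradient sparsification, which is the basic idea behind deep gradient compression [92]: a party sends only the largest-magnitude gradient entries and keeps the rest locally until they grow large enough to be sent. The sketch below shows only this basic idea and is not a reproduction of the full method in [92], which involves more than top-k selection. The function name and the dictionary-based sparse format are assumptions.

```python
def sparsify_top_k(gradient, residual, k):
    """Select the k largest-magnitude entries to transmit.

    gradient: this round's local gradient (list of floats).
    residual: entries withheld in earlier rounds (same length).
    Returns (sparse_update, new_residual), where sparse_update maps
    index -> value and new_residual holds everything not sent.
    """
    accumulated = [g + r for g, r in zip(gradient, residual)]
    top = sorted(range(len(accumulated)),
                 key=lambda i: abs(accumulated[i]), reverse=True)[:k]
    sparse_update = {i: accumulated[i] for i in top}
    new_residual = [0.0 if i in sparse_update else v
                    for i, v in enumerate(accumulated)]
    return sparse_update, new_residual


grad = [0.5, -0.01, 0.02, -0.9]
update, residual = sparsify_top_k(grad, [0.0] * 4, k=2)
print(update)    # only indices 0 and 3 are sent
print(residual)  # [0.0, -0.01, 0.02, 0.0] is kept for later rounds
```

Here half of the entries are withheld in the first round; with realistic model sizes, k is typically a small fraction of the gradient length, so the message a party sends per round shrinks accordingly.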

4.5 Scale of Federation

FLSs can be categorized into two typical types by the scale of federation: private FLSs and public FLSs. The differences between them lie in the number of parties and the amount of data stored in each party. In a private FLS, there are usually a relatively small number of parties and each of them has a relatively large amount of data as well as computational power. For example, Amazon wants to recommend items to users by training on the shopping data collected from hundreds of data centers around the world. Each data center possesses a huge amount of data as well as sufficient computational resources. One challenge that a private FLS faces is how to efficiently distribute computation to data centers under the constraint of privacy models [178]. In a public FLS, on the contrary, there are a relatively large number of parties and each party has a relatively small amount of data as well as computational power [155]. Google Keyboard [164] is a good example of a public FLS. Google tries to improve the query suggestions of Google Keyboard with the help of federated learning. There are millions of Android devices and each device only has its own users’ data. Meanwhile, due to energy consumption concerns, the devices cannot be asked to conduct complex training tasks. In this setting, the system should be powerful enough to manage a huge number of parties and deal with possible issues such as unstable connections between the devices and the server.

4.6 Motivation of Federation

In real-world applications of federated learning, individual parties need motivation to get involved in the federated system. The motivation can be regulations or incentives. Federated learning inside a company or an organization is usually motivated by regulations. But in some forms of cooperation, parties cannot be forced by regulations to provide their data. Taking Google Keyboard [164] as an example, Google cannot prevent users who do not provide data from using the app. But those who agree to upload input data may enjoy a higher accuracy of word prediction. This kind of incentive can encourage users to provide their data to improve the performance of the overall model. However, how to design such a reasonable protocol remains a challenge. Incentive mechanism design can be very important for the success of a federated learning system. There have been some successful cases of incentive design in blockchain [181, 48]. The parties inside the system can be collaborators and competitors at the same time. Other incentive designs such as [74, 73] have been proposed to attract participants with high-quality data for federated learning. We expect that different game theory models [128, 72, 106] and their equilibrium designs should be revisited under federated learning systems. Even in the case of Google Keyboard, the users need to be motivated to participate in this collaborative learning process.

5 Summary of Existing Studies

In this section, we compare the existing studies on FLSs according to the aspects considered in Section 4.

5.1 Methodology

To discover the existing studies, we search the keyword “federated learning” in Google Scholar and arXiv. Here we only consider the published studies in the computer science community. Since the scale of federation and the motivation of federation are problem dependent, we do not compare the studies by these two aspects. For ease of presentation, we use “NN”, “DT” and “LM” to denote neural networks, decision trees and linear models, respectively. Also, we use “CM”, “DP” and “MA” to denote cryptographic methods, differential privacy and model aggregation, respectively. Note that the algorithms (e.g., federated stochastic gradient descent) in some studies can be used to learn many machine learning models (e.g., logistic regression, neural networks). Thus, in the “model implementation” column, we present the models that are already implemented in the corresponding papers. Moreover, in the “main focus” column, we indicate the major area that each paper studies.

5.2 Individual Studies

Table I shows a summary of comparison among existing published studies on federated learning, which mainly focus on individual algorithms. From Table I, we have the following findings. First, most of the existing studies consider horizontal data partitioning. We suspect that part of the reason is that experimental studies and benchmarks are more mature for horizontal data partitioning than for vertical data partitioning. Also, in vertical data partitioning, how to align data sets with different features is problem dependent and thus can be very challenging. The scenarios with vertical data partitioning need to be further investigated. Second, most approaches in the existing studies can only be applied to one kind of machine learning model. While an algorithm designed for one specific model may achieve higher model utility, a general federated learning framework may be more practical and easier to use. Third, due to the property of stochastic gradient descent (SGD), the model aggregation method can be easily applied to SGD and is currently the most popular approach to implementing federated learning without directly exposing the user data. We look forward to more novel FL frameworks. Lastly, the centralized design is the mainstream of current implementations, and these studies assume a trusted server. It is more challenging to propose a framework that does not require a centralized server.

Federated Learning
Linear Regression FL [126]
vertical LM CM centralized
Logistic Regression FL [63] LM
Federated Transfer Learning [96] NN
SecureBoost [35] DT
FedXGB [97] horizontal
Tree-based FL [175] DP distributed
Federated GBDTs [86] hashing
Ridge Regression FL [110] LM CM centralized
PPRR [33] CM
BlockFL [79] MA
Federated Collaborative Filtering [7]
Federated SVRG [80]
Byzantine Gradient Descent [34]
Federated MTL [135]
Variational Federated MTL [37] NN
Krum [15]
Federated Averaging [99]
Federated Meta-Learning [30]
LFRL [93]
Bayesian Nonparametric FL [172]
Federated Generative Privacy [142] GAN
Secure Aggregation FL [18] CM, MA
DSSGD [133] DP, MA
Client-Level DP FL [53]
FL-LSTM [100]
Protection Against Reconstruction [14] LM, NN
Agnostic FL [104] MA
Fair FL [89] MA
PATE [116] LM, DT, NN DP, MA
Hybrid FL [143] LM, DT, NN CM, DP, MA
Communication Efficient FL [81]
Multi-Objective Evolutionary FL [180]
On-Device ML [68]
Sparse Ternary Compression [129]
FedCS [112]
DRL-MEC [157]
FL for Keyboard Prediction [62]
FFL-ERL [144]
FL for Vehicular Communication [125] GPD
Resource-Constrained MEC [156] LM, NN
FLs Performance Evaluation [111]
LEAF [22]
TABLE I: Comparison among existing published studies. LM denotes Linear Models. DT denotes Decision Trees. NN denotes Neural Networks. CM denotes Cryptographic Methods. DP denotes Differential Privacy. MA denotes Model Aggregation.

In the following, we review these studies grouped by their main focus: algorithm design, efficiency improvement, application, and benchmark.

5.2.1 Algorithm Design

Sanil et al. [126] presented a secure regression model on vertically partitioned data. They focused on the linear regression model and applied secret sharing to ensure privacy in their solution. Hardy et al. [63] presented a solution for two-party vertical federated logistic regression. They used entity resolution and additively homomorphic encryption, and also studied the impact of entity resolution errors on learning. Liu et al. [96] introduced an FL framework combined with transfer learning for neural networks. They addressed a specific scenario where two parties share a part of their samples and all the label information is held by one party. They used additively homomorphic encryption to encrypt the model parameters and protect data privacy. Cheng et al. [35] implemented a vertical tree-based FLS called SecureBoost. In their assumptions, only one party has the label information. They used an entity alignment technique to find the common data and then built the decision trees. Additively homomorphic encryption is used to protect the gradients.

Liu et al. [97] proposed a federated extreme boosting learning framework for mobile crowdsensing. They adopted secret sharing to achieve privacy-preserving learning of GBDTs. Zhao et al. [175] proposed a horizontal tree-based FLS. Each decision tree is trained locally without communication between parties. The trees trained in one party are sent to the next party, which continues training a number of trees. Differential privacy is used to protect the decision trees. Li et al. [86] exploited similarity information in the building of federated GBDTs by using locality-sensitive hashing [40]. They utilized the data distribution of local parties by aggregating gradients of similar instances. Under a weaker privacy model than secure multi-party computation, their approach is effective and efficient.

Nikolaenko et al. [110] proposed a system for privacy-preserving ridge regression. Their approach combines homomorphic encryption and Yao’s garbled circuit to achieve the privacy requirements. An extra evaluator is needed to run the algorithm. Chen et al. [33] also proposed a system for privacy-preserving ridge regression. Their approach combines secure summation and homomorphic encryption to achieve the privacy requirements. They provided a complete comparison of communication and computation overhead between their approach and the previous state-of-the-art approaches.

Kim et al. [79] combined blockchain architecture with federated learning. On the basis of federated averaging, they used a blockchain network to exchange the devices’ local model updates. Ammad-ud-din et al. [7] formulated the first federated collaborative filtering method. Based on a stochastic gradient approach, the item-factor matrix is trained on a global server by aggregating the local updates. Konečný et al. [80] proposed the federated SVRG algorithm, which is based on stochastic variance reduced gradient [71]. They compared their algorithm with other baselines such as CoCoA+ [98] and simple distributed gradient descent. Their method can achieve better accuracy with the same number of communication rounds for the logistic regression model. Smith et al. [135] combined federated learning with multi-task learning (MTL) [25, 174]. Their method considers the issues of high communication cost, stragglers, and fault tolerance for MTL in the federated environment. Corinzia et al. [37] proposed a federated MTL method with non-convex models. They treated the central server and the local parties as a Bayesian network and performed inference using variational methods.

Chen et al. [34] and Blanchard et al. [15] studied the scenario where the parties may be Byzantine and try to compromise the FLS. The former proposed an aggregation rule based on the geometric median of means of the gradients. The latter proposed Krum, which selects the gradient vector closest to the barycenter among the proposed parameter vectors.

McMahan et al. [99] implemented federated averaging on TensorFlow, focusing on improving communication efficiency. The methods they use on deep networks are effective in reducing communication costs compared to synchronized stochastic gradient descent. Chen et al. [30] designed a federated meta-learning framework. Specifically, the users’ information is shared at the algorithm level, where a server updates the algorithm with feedback from the users. Liu et al. [93] proposed a lifelong federated reinforcement learning framework. Adopting transfer learning techniques, a global model is trained to effectively remember what the robots have learned. Yurochkin et al. [172] developed a probabilistic FL framework by applying Bayesian nonparametric machinery. They used a Beta-Bernoulli process informed matching procedure to combine the local models into a federated global model. Triastcyn et al. [142] used generative adversarial networks (GAN) [57] to generate artificial data, which are then directly used to train machine learning models. Their privacy guarantee is weaker than differential privacy.

Bonawitz et al. [18] applied secure aggregation to protect the local parameters on the basis of federated averaging. Shokri et al. [133] proposed a distributed selective SGD algorithm, where a fraction of the local parameters is used to update the global parameters in each round. Differential privacy can be applied to protect the uploaded parameters. Geyer et al. [53] applied differential privacy in federated averaging from a client-level perspective. They used the Gaussian mechanism to distort the sum of gradient updates in order to protect a client’s whole dataset instead of a single data point. McMahan et al. [100] deployed federated averaging in the training of Long Short-Term Memory (LSTM) recurrent neural networks (RNNs). In addition, they used user-level differential privacy to protect the parameters. Bhowmick et al. [14] applied local differential privacy to protect the parameters in federated learning. To increase the model utility, they considered a practical threat model in which the adversary wishes to decode individuals’ data but has little prior information on them. Under this assumption, they could use a much larger privacy budget.

Mohri et al. [104] proposed a new framework named agnostic federated learning. Instead of minimizing the loss with respect to the uniform distribution, which is an average distribution among the data distributions of the local clients, they tried to train a centralized model optimized for any possible target distribution formed by a mixture of the client distributions. Li et al. [89] proposed a new objective that takes fairness into consideration. Specifically, the smaller the variance of the model’s performance across the devices, the fairer the model. Based on their objective, they proposed an extension of federated SGD, which uses a dynamic step size instead of a fixed step size. Papernot et al. [116] designed a general framework for federated learning, which can be applied to any model. In a black-box setting, they considered the locally trained models as teacher models. Then, they learned a student model by noisy voting among all of the teachers. Truex et al. [143] combined secure multiparty computation and differential privacy for privacy-preserving federated learning. They used differential privacy to inject noise into the local updates. The noisy updates are then encrypted using the Paillier cryptosystem [114] before being sent to the central server.
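The noisy voting step used in the teacher-student framework of Papernot et al. [116] can be illustrated with a short sketch: each teacher votes for a class, Laplace noise is added to every vote count, and the class with the largest noisy count becomes the label shown to the student. The function names and the `noise_scale` parameter are illustrative assumptions, and the sketch leaves out the privacy accounting that determines how large the noise must be.

```python
import random
from collections import Counter
from typing import Sequence


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_vote(teacher_labels: Sequence[int],
               num_classes: int,
               noise_scale: float,
               rng: random.Random) -> int:
    """Return the class whose noisy vote count is largest.

    teacher_labels holds one predicted class per teacher model;
    noise_scale must be positive.
    """
    counts = Counter(teacher_labels)
    noisy_counts = [
        counts.get(c, 0) + laplace_noise(noise_scale, rng)
        for c in range(num_classes)
    ]
    return max(range(num_classes), key=lambda c: noisy_counts[c])


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical votes of ten teachers on one unlabeled query.
    votes = [2, 2, 2, 1, 2, 0, 2, 2, 1, 2]
    print(noisy_vote(votes, num_classes=3, noise_scale=1.0, rng=rng))
```

When the teachers largely agree, the noise rarely changes the outcome; when they are split, the returned label is close to random, which is what limits the information leaked about any single teacher's training data.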

5.2.2 Efficiency Improvement

Konečný et al. [81] proposed two ways, structured updates and sketched updates, to reduce the communication costs in federated averaging. Their methods can reduce the communication cost by two orders of magnitude with a slight degradation in convergence speed. Zhu and Jin [180] designed a multi-objective evolutionary algorithm to minimize the communication costs and the global model test errors simultaneously. Treating the minimization of the communication cost and the maximization of the global learning accuracy as two objectives, they formulated federated learning as a bi-objective optimization problem and solved it with the multi-objective evolutionary algorithm. Jeong et al. [68] proposed a federated learning framework for devices with non-i.i.d. local data. They designed federated distillation, whose communication size depends on the output dimension but not on the model size. Also, they proposed a data augmentation scheme using a generative adversarial network (GAN) to make the training dataset i.i.d. Many other studies also design specialized approaches for non-i.i.d. data [176, 90, 95, 167]. Sattler et al. [129] proposed a new compression framework named sparse ternary compression (STC). Specifically, STC compresses the communication using sparsification, ternarization, error accumulation and optimal Golomb encoding. Their method is robust to non-i.i.d. data and large numbers of parties.

There are also other studies working on reducing the communication cost of federated learning. Yao et al. [165] adopted a two-stream model with an MMD (Maximum Mean Discrepancy) constraint, instead of the single model trained on devices in standard federated learning settings, to reduce the communication cost. Zhu et al. [179] proposed a multi-access Broadband Analog Aggregation (BAA) design for communication-latency reduction in federated learning based on the concept of over-the-air computation [162]. Later, Amiri et al. [6] added error accumulation and gradient sparsification to over-the-air computation to obtain a faster convergence speed. Caldas et al. [21] proposed lossy compression and federated dropout to reduce server-to-participant communication costs.
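Three of the four ingredients named for sparse ternary compression [129] (sparsification, ternarization and error accumulation) can be sketched in a few lines. This is a simplified illustration under our own naming (`ternarize_top_k`, `Compressor`), assuming the nonzero entries are set to the mean magnitude of the retained values; the Golomb encoding of the resulting sparse vector is omitted.

```python
from typing import List, Sequence


def ternarize_top_k(vector: Sequence[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries and map them to +mu or -mu.

    mu is the mean magnitude of the kept entries; all others become 0.
    Requires 1 <= k <= len(vector).
    """
    order = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    kept = order[:k]
    mu = sum(abs(vector[i]) for i in kept) / k
    out = [0.0] * len(vector)
    for i in kept:
        if vector[i] > 0:
            out[i] = mu
        elif vector[i] < 0:
            out[i] = -mu
    return out


class Compressor:
    """Compresses updates and carries the compression error to the next round."""

    def __init__(self, dim: int) -> None:
        self.residual = [0.0] * dim

    def compress(self, update: Sequence[float], k: int) -> List[float]:
        # Add back what earlier rounds failed to transmit (error accumulation).
        corrected = [u + r for u, r in zip(update, self.residual)]
        sent = ternarize_top_k(corrected, k)
        self.residual = [c - s for c, s in zip(corrected, sent)]
        return sent


if __name__ == "__main__":
    compressor = Compressor(dim=4)
    print(compressor.compress([0.5, -3.0, 1.0, 0.1], k=2))  # [0.0, -2.0, 2.0, 0.0]
    print(compressor.residual)                              # [0.5, -1.0, -1.0, 0.1]
```

Because the transmitted vector takes only three distinct values, a party needs to send the positions of the nonzero entries, their signs and a single magnitude, rather than a dense vector of floats.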

5.2.3 Application Studies

Nishio et al. [112] implemented federated averaging in practical mobile edge computing (MEC) frameworks. They used an operator of MEC frameworks to manage the resources of heterogeneous clients. Federated learning is promising in edge computing [109, 170, 119, 44]. Wang et al. [157] adopted federated averaging to implement distributed deep reinforcement learning (DRL) in a mobile edge computing system. The combination of DRL and FL can effectively optimize mobile edge computing, caching, and communication. Hard et al. [62] applied federated learning to mobile keyboard prediction. They adopted the federated averaging method to learn a variant of LSTM. Ulm et al. [144] implemented federated learning in Erlang (FFL-ERL), which is a functional programming language. Based on federated averaging, they created a functional implementation of an artificial neural network in Erlang. Samarakoon et al. [125] were the first to adopt federated learning in the context of ultra-reliable low-latency communication. To model reliability in terms of probabilistic queue lengths, they used model averaging to learn a generalized Pareto distribution (GPD). Wang et al. [156] performed federated learning on resource-constrained MEC systems. They addressed the problem of how to efficiently utilize the limited computation and communication resources at the edge. Using federated averaging, they implemented many machine learning algorithms including linear regression, SVM, and CNN.

While edge computing is an appropriate scenario for federated learning, federated learning has also been applied in many other areas. Brisimi et al. [20] developed models to predict hospitalizations for cardiac events using a federated distributed algorithm. They developed a general decentralized optimization framework enabling multiple data holders to collaborate and converge to a common predictive model without explicitly exchanging raw data. Other applications in health AI are shown in [66]. Verma et al. [149] used federated learning algorithms to help foster collaboration across multiple government agencies.

5.2.4 Benchmarking Studies

Nilsson et al. [111] conducted a performance comparison among three different federated learning algorithms: federated averaging [99], federated stochastic variance reduced gradient [81], and CO-OP [158]. They ran experiments using a multi-layer perceptron on the MNIST dataset with both i.i.d. and non-i.i.d. partitions of the data. Their experiments showed that federated averaging can achieve better performance on MNIST than the other two algorithms. Caldas et al. [22] proposed LEAF, a modular benchmark for federated learning. LEAF includes public federated datasets, an array of statistical and systems metrics, and a set of reference implementations.
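The difference between i.i.d. and non-i.i.d. partitions in such benchmarks can be illustrated as follows. This is a generic sketch and not necessarily the exact procedure of the studies above: the i.i.d. split shuffles sample indices before dealing them out, while the label-sorted split (an assumed, deliberately extreme construction) hands each party a contiguous block of samples ordered by label. Both function names are ours.

```python
import random
from typing import List, Sequence


def iid_partition(labels: Sequence[int],
                  num_parties: int,
                  rng: random.Random) -> List[List[int]]:
    """Shuffle the sample indices and deal them out round-robin."""
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    return [indices[p::num_parties] for p in range(num_parties)]


def label_sorted_partition(labels: Sequence[int],
                           num_parties: int) -> List[List[int]]:
    """Sort indices by label and give each party one contiguous block.

    Each party then sees only a few classes, which makes the local
    data distributions differ strongly from the global one.
    """
    indices = sorted(range(len(labels)), key=lambda i: labels[i])
    size = len(indices) // num_parties
    parts = [indices[p * size:(p + 1) * size] for p in range(num_parties)]
    parts[-1].extend(indices[num_parties * size:])  # leftover samples
    return parts


if __name__ == "__main__":
    labels = [i % 4 for i in range(40)]  # toy dataset with 4 balanced classes
    for part in label_sorted_partition(labels, num_parties=4):
        print(sorted({labels[i] for i in part}))  # one class per party
```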

5.3 Open Source Systems

Federated Learning
Google TensorFlow Federated (TFF) [16, 140] horizontal LM, NN CM, DP, MA centralized
PySyft [123]
PhotoLabeller NN MA
Federated AI Technology Enabler (FATE) hybrid LM, DT, NN CM distributed
TABLE II: Comparison among some existing FLSs. The notations used in this table are the same as Table I.

Table II shows a summary of comparison among some representative FLSs. Here we only consider open source systems that support data protection. From Table II, we have the following findings. First, most of the current open source systems implement only one kind of partitioning method and one or two kinds of machine learning models. We think that many systems are still at an early stage, and we expect the community to put in more development effort. A general and complete FLS that supports multiple kinds of data partitioning or machine learning models is still on the way. Second, despite their costly processing, cryptographic methods seem to be the most popular technique for providing privacy guarantees. However, there is no final conclusion as to which approach is better in terms of system performance and model utility; it should depend on the privacy restrictions. Last, many systems still adopt a centralized communication design since they need a server to aggregate model parameters. From the system perspective, this introduces a single point of failure and control, and we believe that a more advanced mechanism to avoid this centralized design is needed.

Now we give more details on these FLSs. Google proposed a scalable FLS which enables tens of millions of Android devices to learn a deep neural network based on TensorFlow [16]. In their design, a server uses federated averaging [100] to aggregate the model updates, which are computed by the devices locally in synchronous rounds. Differential privacy and secure aggregation are used to enhance privacy guarantees. The OpenMined community introduced an FL system named PySyft [123] built on PyTorch, which applies both differential privacy and multi-party computation (MPC) to provide privacy guarantees. It adopts SPDZ [39] and the moments accountant [3] for MPC and DP, respectively, in a federated learning context. Corbacho implemented PhotoLabeller, which gives a practical use case of an FLS. It uses Android phones to train models locally and uses federated averaging on the server to aggregate the model. Finally, the trained model is shared across all clients for photo labeling. WeBank implemented the FL platform Federated AI Technology Enabler (FATE), which supports multiple kinds of data partitioning and algorithms. Its secure computation protocols are based on homomorphic encryption and multi-party computation. It supports many machine learning algorithms including logistic regression and gradient boosting decision trees.
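Several of these systems rely on multi-party computation so that a sum of private values can be obtained without revealing any individual value. The following minimal sketch shows the underlying idea with additive secret sharing over a prime field. It is an illustration only: protocols such as SPDZ [39] add authentication and preprocessing that are omitted here, and the modulus `PRIME` and the function names are our own assumptions.

```python
import secrets
from typing import List, Sequence

PRIME = 2**61 - 1  # field modulus; inputs and their sum must stay below it


def share(value: int, num_parties: int) -> List[int]:
    """Split value into num_parties random shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(num_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares: Sequence[int]) -> int:
    """Recover the shared value by adding all shares mod PRIME."""
    return sum(shares) % PRIME


def secure_sum(private_values: Sequence[int]) -> int:
    """Sum non-negative integers without revealing any single input."""
    n = len(private_values)
    # Party i splits its value and sends the j-th share to party j.
    all_shares = [share(v, n) for v in private_values]
    # Party j adds the shares it received and publishes only this partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    return reconstruct(partial_sums)


if __name__ == "__main__":
    print(secure_sum([3, 5, 10]))  # 18
```

Any proper subset of the shares of a value is uniformly random, so a party learns nothing about another party's input beyond what the final sum reveals. Real-valued model updates would first have to be encoded as fixed-point integers.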

6 Case Study

There are many interesting applications for federated learning systems. We review two case studies to examine the system aspects surveyed above.

6.1 Keyboard Word Suggestion

Horizontal Partitioning
Neural Networks
Public FL
Centralized Communication
Privacy Protection
Incentive Motivated
TABLE III: GBoard requirements

Here we analyze the keyboard word suggestion application and identify which type of FLS is suitable for such applications. Keyboard word suggestion aims to predict which word users will input next according to their previous input words [164]. Keyboard word suggestion models can be trained in the federated learning manner, such that more user input data can be exploited to improve the model quality while obeying user data regulations such as GDPR [150, 38] in Europe, PDPA [36] in Singapore and CCPA [23] in the US. For training such word suggestion models, the training data (i.e., users’ input records) is “partitioned” horizontally, where each user has her input data on her own device (e.g., mobile phone). Furthermore, many word suggestion models are based on neural networks [164, 120], so the desired FLS should support neural networks. Privacy protection mechanisms need to be enforced in the word suggestion model training as well, because the user data or the trained model is synchronized with those of other users. In keyboard word suggestion applications, the keyboard service providers cannot deny users their services because the users refuse to share data. So, incentive mechanisms need to be adopted in the FLS, such that more users are willing to participate and improve the word suggestion model. Finally, the training of the word suggestion model is performed in a public FLS, since the user data is distributed globally.

Next, we take a concrete example of word suggestion applications, Google Keyboard (GBoard), and identify which FLS is well suited to it. According to the analysis above and the existing federated learning systems shown in Table II, Google TensorFlow Federated (TFF) may be the underlying FLS for GBoard. The communication architecture of federated learning for GBoard is centralized, since the Google cloud collects information (e.g., model updates) from Android phones all around the world and then provides a prediction service to users. TFF can handle horizontally partitioned data and supports neural networks for training models. Meanwhile, TFF provides differential privacy to ensure privacy guarantees. TFF is centralized and can sustain millions of users in the public FL setting. Finally, Google can encourage users to share their data by granting them higher predictive accuracy as an incentive. In summary, based on our analysis, TFF is likely the underlying FLS for GBoard.

6.2 Nationwide Learning Health System

Hybrid Partitioning
No specific Models
Private FL
Distributed Communication
Privacy Protection
Policy Motivated
TABLE IV: Nationwide Learning Health System Requirements

In this case study, we discuss the application of FL in building health care systems. The vision of a nationwide learning health system was introduced in 2010 [50]. This system aims to exploit data from research institutes, hospitals, federal agencies and many other parties to improve the health care of the nation. In such a scenario, the health care data is partitioned both horizontally and vertically: each party holds health care data of residents for a specific purpose (e.g., patient treatment), but the features used in each party are different. Moreover, the health care data is strictly protected by laws. In health care systems, besides neural networks, a wide range of machine learning algorithms are commonly used [9, 82]. The learning process should be distributed, because each party shares data, trains the model on the data cooperatively, and gets the final learned model. Due to the strict privacy protection and the nature of health care systems, it is a private federated learning system where a few parties possess a huge amount of data, while the others may hold only a small amount. Finally, once the system is established, data sharing can be guaranteed by policies of the national government (i.e., nationwide learning health systems can be motivated by governmental policy). Based on the analysis above, Federated AI Technology Enabler (FATE) is a potential choice for this application. It supports both horizontal and vertical data partitioning. FATE provides multiple machine learning algorithms such as decision trees and linear models. Meanwhile, FATE has cryptographic techniques to guarantee the protection of user data, and supports a distributed communication architecture as well. These properties of FATE match the requirements of this application well.

7 Vision and Conclusion

In this section, we show interesting directions to work on in the future, and conclude this paper.

7.1 Future Research Directions

7.1.1 (Re)-Invent Federated Learning Models

When designing an FLS, many existing studies have tried to support more machine learning models and come up with new efficient approaches to protect privacy while not sacrificing the accuracy of the learned model much.

7.1.2 Dynamic scheduling

As we discussed in Section 3.2, the number of parties may not be fixed during the learning process. However, many existing systems assume a fixed number of parties and do not consider situations where new parties enter or current parties leave. The system should support dynamic scheduling and be able to adjust its strategy when the number of parties changes. There are some studies addressing this issue. For example, Google TensorFlow Federated [16] can tolerate drop-outs of devices. Also, blockchain [177] can serve as an ideal and transparent platform for multi-party learning. More effort is needed in this direction.

7.1.3 Diverse privacy restrictions

Little work has considered the privacy heterogeneity of FLSs, as shown in Section 3.1.2. The existing systems adopt techniques that protect the model parameters or gradients of all the parties at the same level. However, the privacy restrictions of the parties usually differ in reality. It would be interesting to design an FLS which treats the parties differently according to their privacy restrictions. The learned model should perform better if we can maximize the utilization of each party's data while not violating its privacy restrictions. Heterogeneous differential privacy [4] may be useful in such settings.
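As a purely illustrative sketch of what treating parties differently could look like, suppose each party releases the mean of its values (clipped to [0, 1]) through the Laplace mechanism with its own privacy budget, and the server combines the reports with inverse-variance weights so that less noisy reports count more. This is our own toy construction under those assumptions, not the mechanism of [4]; all names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values: Sequence[float],
                 epsilon: float,
                 rng: random.Random) -> Tuple[float, float]:
    """Release an epsilon-DP mean of values clipped to [0, 1].

    Changing one record moves the mean by at most 1/n, so the Laplace
    noise scale is (1/n) / epsilon. Returns (noisy mean, noise scale).
    """
    clipped = [min(1.0, max(0.0, v)) for v in values]
    scale = (1.0 / len(clipped)) / epsilon
    return sum(clipped) / len(clipped) + laplace_noise(scale, rng), scale


def combine(reports: Sequence[Tuple[float, float]]) -> float:
    """Server side: weight each report by the inverse of its noise variance."""
    weights: List[float] = [1.0 / (scale * scale) for _, scale in reports]
    total = sum(weights)
    return sum(w * mean for w, (mean, _) in zip(weights, reports)) / total


if __name__ == "__main__":
    rng = random.Random(0)
    # Two hypothetical parties with a loose and a strict privacy budget.
    loose = private_mean([0.4, 0.6] * 500, epsilon=1.0, rng=rng)
    strict = private_mean([0.4, 0.6] * 500, epsilon=0.1, rng=rng)
    print(combine([loose, strict]))  # close to the true mean 0.5
```

A party with a strict budget still contributes, but its noisier report receives a smaller weight, so the parties with looser restrictions are utilized more fully.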

7.1.4 Intelligent benefits

Intuitively, a party can gain more from the FLS if it contributes more information. A simple solution is to make agreements among the parties such that some parties pay the other parties that contribute more information. Representative incentive mechanisms need to be developed.

7.1.5 Benchmark

As more FLSs are being developed, a benchmark with representative data sets and workloads is quite important to evaluate the existing systems and direct future development. Caldas et al. [22] proposed LEAF, which is a benchmark including federated datasets, an evaluation framework, and reference implementations. Hao et al. [61] presented a computing testbed named Edge AIBench with federated learning support, and discussed four typical scenarios and six components for measurement included in the benchmark suite. Still, more applications and scenarios are the key to the success of FLSs.

7.1.6 System architecture

Like the parameter server in deep learning, which controls parameter synchronization, some common system architectures need to be investigated for FL. Although Yang et al. [163] presented three architectures for different data partitioning methods, we need more complete architectures in terms of learning models or privacy levels. Communication costs can be a significant issue for the performance of training a federated learning model [154, 65].

7.1.7 Data life cycles

Learning is simply one aspect of a federated system. A data life cycle consists of multiple stages including data creation, storage, use, sharing, archiving and destruction. For the data security and privacy of the entire application, we need to invent new data life cycles under the federated learning context. Although data sharing is clearly one of the stages in focus, the design of a federated learning system also affects the other stages. For example, data creation may help to prepare data and features that are suitable for federated learning.

7.1.8 Data labels

Most existing studies have focused on labelled data sets. However, in practice, training data sets may not have labels, or may have poisoned or mistaken labels, which can lead to runtime mispredictions. Poisoned and mistaken labels can come from unreliable data collection processes, such as those in mobile and edge environments, and from malicious parties. There are still many challenges in addressing data poisoning and backdoor attacks. Along this line, CalTrain [59] presents a multi-party collaborative learning system that achieves model accountability in Trusted Execution Environments (TEEs). Ghosh et al. [55] consider model robustness against Byzantine parties (i.e., abnormal and adversarial parties). Another potential approach is blockchain [118, 78]. Preuveneers et al. [118] proposed a permissioned blockchain-based federated learning method to monitor the incremental updates to an anomaly detection machine learning model.

7.1.9 Federated learning in domains

Internet-of-Things: Security and privacy issues have been a hot research area in fog computing and edge computing, due to the increasing deployment of Internet-of-Things applications. For more details, readers can refer to some recent surveys [138, 166, 105]. Federated learning can be one potential approach to addressing the data privacy issues while still offering reasonably good machine learning models [91, 108]. The additional key challenges come from computation and energy constraints, since privacy and security mechanisms introduce runtime overhead. For example, Jiang et al. [70] apply independent Gaussian random projection to improve data privacy, after which the training of a deep network can be too costly. The authors therefore need to develop a new resource scheduling algorithm that moves the workload to the nodes with more computation power. Similar issues arise in other environments such as vehicle-to-vehicle networks [124, 127].

7.2 Conclusion

Many efforts have been devoted to developing federated learning systems (FLSs). A complete overview and summary of existing FLSs is important and meaningful. Inspired by previous federated systems, we have shown that heterogeneity and autonomy are two important factors in the design of practical FLSs. Moreover, we provide a comprehensive categorization of FLSs according to six different aspects. Based on these aspects, we also present a comparison of features and designs among existing FLSs. More importantly, we have pointed out a number of opportunities, ranging from more benchmarks to the integration of emerging platforms such as blockchain. Federated learning systems will be an exciting research journey, which calls for efforts from the machine learning, systems and data privacy communities.


This work is supported by a MoE AcRF Tier 1 grant (T1 251RES1824), a SenseTime Young Scholars Research Fund, and a MoE Tier 2 grant (MOE2017-T2-1-122) in Singapore.


  • [1] WeBank (2018) Federated learning white paper v1.0. Cited by: §1.
  • [2] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al. (2016) Tensorflow: a system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283. Cited by: §1.
  • [3] M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar, and L. Zhang (2016) Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pp. 308–318. Cited by: §4.3, §5.3.
  • [4] M. Alaggan, S. Gambs, and A. Kermarrec (2015) Heterogeneous differential privacy. arXiv preprint arXiv:1504.06998. Cited by: §7.1.3.
  • [5] J. P. Albrecht (2016) How the gdpr will change the world. Eur. Data Prot. L. Rev. 2, pp. 287. Cited by: §1, §3.1.2.
  • [6] M. M. Amiri and D. Gunduz (2019) Federated learning over wireless fading channels. arXiv preprint arXiv:1907.09769. Cited by: §5.2.2.
  • [7] M. Ammad-ud-din, E. Ivannikova, S. A. Khan, W. Oyomno, Q. Fu, K. E. Tan, and A. Flanagan (2019) Federated collaborative filtering for privacy-preserving personalized recommendation system. arXiv preprint arXiv:1901.09888. Cited by: §5.2.1, TABLE I.
  • [8] Y. Aono, T. Hayashi, L. Wang, S. Moriai, et al. (2018) Privacy-preserving deep learning via additively homomorphic encryption. IEEE Transactions on Information Forensics and Security 13 (5), pp. 1333–1345. Cited by: §4.3.
  • [9] H. Asri, H. Mousannif, H. Al Moatassime, and T. Noel (2016) Using machine learning algorithms for breast cancer risk prediction and diagnosis. Procedia Computer Science 83, pp. 1064–1069. Cited by: §6.2.
  • [10] E. Bagdasaryan, A. Veit, Y. Hua, D. Estrin, and V. Shmatikov (2018) How to backdoor federated learning. arXiv preprint arXiv:1807.00459. Cited by: §4.3.
  • [11] R. Bahmani, M. Barbosa, F. Brasser, B. Portela, A. Sadeghi, G. Scerri, and B. Warinschi (2017) Secure multiparty computation from sgx. In International Conference on Financial Cryptography and Data Security, pp. 477–497. Cited by: §4.3.
  • [12] R. Bassily, A. Smith, and A. Thakurta (2014) Private empirical risk minimization: efficient algorithms and tight error bounds. In 2014 IEEE 55th Annual Symposium on Foundations of Computer Science, pp. 464–473. Cited by: §4.3.
  • [13] A. N. Bhagoji, S. Chakraborty, P. Mittal, and S. Calo (2018) Analyzing federated learning through an adversarial lens. External Links: 1811.12470 Cited by: §4.3.
  • [14] A. Bhowmick, J. Duchi, J. Freudiger, G. Kapoor, and R. Rogers (2018) Protection against reconstruction and its applications in private federated learning. arXiv preprint arXiv:1812.00984. Cited by: §5.2.1, TABLE I.
  • [15] P. Blanchard, R. Guerraoui, J. Stainer, et al. (2017) Machine learning with adversaries: byzantine tolerant gradient descent. In Advances in Neural Information Processing Systems, pp. 119–129. Cited by: §4.3, §5.2.1, TABLE I.
  • [16] K. Bonawitz, H. Eichner, W. Grieskamp, D. Huba, A. Ingerman, V. Ivanov, C. Kiddon, J. Konecny, S. Mazzocchi, H. B. McMahan, et al. (2019) Towards federated learning at scale: system design. arXiv preprint arXiv:1902.01046. Cited by: §1, §4.2, §4.4, §5.3, TABLE II, §7.1.2.
  • [17] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth (2016) Practical secure aggregation for federated learning on user-held data. arXiv preprint arXiv:1611.04482. Cited by: §4.3.
  • [18] K. Bonawitz, V. Ivanov, B. Kreuter, A. Marcedone, H. B. McMahan, S. Patel, D. Ramage, A. Segal, and K. Seth (2017) Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 1175–1191. Cited by: §1, §4.3, §5.2.1, TABLE I.
  • [19] F. Bourse, M. Minelli, M. Minihold, and P. Paillier (2018) Fast homomorphic evaluation of deep discretized neural networks. In Annual International Cryptology Conference, pp. 483–512. Cited by: §4.3.
  • [20] T. S. Brisimi, R. Chen, T. Mela, A. Olshevsky, I. C. Paschalidis, and W. Shi (2018) Federated learning of predictive models from federated electronic health records. International journal of medical informatics 112, pp. 59–67. Cited by: §5.2.3.
  • [21] S. Caldas, J. Konečny, H. B. McMahan, and A. Talwalkar (2018) Expanding the reach of federated learning by reducing client resource requirements. arXiv preprint arXiv:1812.07210. Cited by: §5.2.2.
  • [22] S. Caldas, P. Wu, T. Li, J. Konečnỳ, H. B. McMahan, V. Smith, and A. Talwalkar (2018) LEAF: a benchmark for federated settings. arXiv preprint arXiv:1812.01097. Cited by: §5.2.4, TABLE I, §7.1.5.
  • [23] California Consumer Privacy Act Home Page. Cited by: §6.1.
  • [24] N. Carlini, C. Liu, J. Kos, Ú. Erlingsson, and D. Song (2018) The secret sharer: measuring unintended neural network memorization & extracting secrets. arXiv preprint arXiv:1802.08232. Cited by: §4.3.
  • [25] R. Caruana (1997) Multitask learning. Machine learning 28 (1), pp. 41–75. Cited by: §5.2.1.
  • [26] M. Castro, B. Liskov, et al. (1999) Practical byzantine fault tolerance. In OSDI, Vol. 99, pp. 173–186. Cited by: §4.3.
  • [27] H. Chabanne, A. de Wargny, J. Milgram, C. Morel, and E. Prouff (2017) Privacy-preserving classification on deep neural network. IACR Cryptology ePrint Archive 2017, pp. 35. Cited by: §4.3.
  • [28] K. Chaudhuri, C. Monteleoni, and A. D. Sarwate (2011) Differentially private empirical risk minimization. Journal of Machine Learning Research 12 (Mar), pp. 1069–1109. Cited by: §4.3.
  • [29] D. Chaum (1988) The dining cryptographers problem: unconditional sender and recipient untraceability. Journal of cryptology 1 (1), pp. 65–75. Cited by: §4.3.
  • [30] F. Chen, Z. Dong, Z. Li, and X. He (2018) Federated meta-learning for recommendation. arXiv preprint arXiv:1802.07876. Cited by: §5.2.1, TABLE I.
  • [31] T. Chen and C. Guestrin (2016) Xgboost: a scalable tree boosting system. In KDD, pp. 785–794. Cited by: §1.
  • [32] V. Chen, V. Pastro, and M. Raykova (2019) Secure computation for machine learning with spdz. arXiv preprint arXiv:1901.00329. Cited by: §4.3.
  • [33] Y. Chen, A. Rezapour, and W. Tzeng (2018) Privacy-preserving ridge regression on distributed data. Information Sciences 451, pp. 34–49. Cited by: §1, §3.1.2, §4.2, §4.3, §5.2.1, TABLE I.
  • [34] Y. Chen, L. Su, and J. Xu (2017) Distributed statistical machine learning in adversarial settings: byzantine gradient descent. Proceedings of the ACM on Measurement and Analysis of Computing Systems 1 (2), pp. 44. Cited by: §4.3, §5.2.1, TABLE I.
  • [35] K. Cheng, T. Fan, Y. Jin, Y. Liu, T. Chen, and Q. Yang (2019) SecureBoost: a lossless federated learning framework. arXiv preprint arXiv:1901.08755. Cited by: §1, §3.1.2, §4.1, §4.2, §5.2.1, TABLE I.
  • [36] W. B. Chik (2013) The singapore personal data protection act and an assessment of future trends in data privacy reform. Computer Law & Security Review 29 (5), pp. 554–575. Cited by: §6.1.
  • [37] L. Corinzia and J. M. Buhmann (2019) Variational federated multi-task learning. arXiv preprint arXiv:1906.06268. Cited by: §5.2.1, TABLE I.
  • [38] B. Custers, A. Sears, F. Dechesne, I. Georgieva, T. Tani, and S. van der Hof (2019) EU personal data protection in policy and practice. Springer. Cited by: §6.1.
  • [39] I. Damgård, V. Pastro, N. Smart, and S. Zakarias (2012) Multiparty computation from somewhat homomorphic encryption. In Annual Cryptology Conference, pp. 643–662. Cited by: §5.3.
  • [40] M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni (2004) Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the twentieth annual symposium on Computational geometry, pp. 253–262. Cited by: §5.2.1.
  • [41] W. Du and M. J. Atallah (2001) Privacy-preserving cooperative statistical analysis. In Seventeenth Annual Computer Security Applications Conference, pp. 102–110. Cited by: §4.3.
  • [42] W. Du, Y. S. Han, and S. Chen (2004) Privacy-preserving multivariate statistical analysis: linear regression and classification. In SDM, pp. 222–233. Cited by: §4.3.
  • [43] W. Du and Z. Zhan (2002) Building decision tree classifier on private data. In Proceedings of the IEEE international conference on Privacy, security and data mining-Volume 14, pp. 1–8. Cited by: §4.3.
  • [44] M. Duan (2019) Astraea: self-balancing federated learning for improving classification accuracy of mobile deep learning applications. arXiv preprint arXiv:1907.01132. Cited by: §5.2.3.
  • [45] C. Dwork, F. McSherry, K. Nissim, and A. Smith (2006) Calibrating noise to sensitivity in private data analysis. In Theory of cryptography conference, pp. 265–284. Cited by: §4.3.
  • [46] C. Dwork, A. Roth, et al. (2014) The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science 9 (3–4), pp. 211–407. Cited by: §4.3.
  • [47] K. El Emam and F. K. Dankar (2008) Protecting privacy using k-anonymity. Journal of the American Medical Informatics Association 15 (5), pp. 627–637. Cited by: §4.3.
  • [48] I. Eyal, A. E. Gencer, E. G. Sirer, and R. V. Renesse (2016-03) Bitcoin-ng: a scalable blockchain protocol. In 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16), Santa Clara, CA, pp. 45–59. External Links: ISBN 978-1-931971-29-4, Link Cited by: §4.6.
  • [49] M. Fredrikson, S. Jha, and T. Ristenpart (2015) Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1322–1333. Cited by: §4.3.
  • [50] C. P. Friedman, A. K. Wong, and D. Blumenthal (2010) Achieving a nationwide learning health system. Science translational medicine 2 (57), pp. 57cm29–57cm29. Cited by: §6.2.
  • [51] C. Fung, C. J. M. Yoon, and I. Beschastnikh (2018) Mitigating sybils in federated learning poisoning. External Links: 1808.04866 Cited by: §4.3.
  • [52] A. Gascón, P. Schoppmann, B. Balle, M. Raykova, J. Doerner, S. Zahur, and D. Evans (2016) Secure linear regression on vertically partitioned datasets. IACR Cryptology ePrint Archive 2016, pp. 892. Cited by: §4.3.
  • [53] R. C. Geyer, T. Klein, and M. Nabi (2017) Differentially private federated learning: a client level perspective. arXiv preprint arXiv:1712.07557. Cited by: §5.2.1, TABLE I.
  • [54] B. Ghazi, R. Pagh, and A. Velingker (2019) Scalable and differentially private distributed aggregation in the shuffled model. arXiv preprint arXiv:1906.08320. Cited by: §4.3.
  • [55] A. Ghosh, J. Hong, D. Yin, and K. Ramchandran (2019) Robust federated learning in a heterogeneous environment. External Links: 1906.06629 Cited by: §7.1.8.
  • [56] O. Goldreich (1998) Secure multi-party computation. Manuscript. Preliminary version 78. Cited by: §4.3.
  • [57] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio (2014) Generative adversarial networks. External Links: 1406.2661 Cited by: §5.2.1.
  • [58] S. Goryczka and L. Xiong (2015) A comprehensive comparison of multiparty secure additions with differential privacy. IEEE transactions on dependable and secure computing 14 (5), pp. 463–477. Cited by: §4.3.
  • [59] Z. Gu, H. Jamjoom, D. Su, H. Huang, J. Zhang, T. Ma, D. Pendarakis, and I. Molloy (2018) Reaching data confidentiality and model accountability on the caltrain. External Links: 1812.03230 Cited by: §7.1.8.
  • [60] R. Hall, S. E. Fienberg, and Y. Nardi (2011) Secure multiple linear regression based on homomorphic encryption. Journal of Official Statistics 27 (4), pp. 669. Cited by: §4.3.
  • [61] T. Hao, Y. Huang, X. Wen, W. Gao, F. Zhang, C. Zheng, L. Wang, H. Ye, K. Hwang, Z. Ren, et al. (2019) Edge aibench: towards comprehensive end-to-end edge computing benchmarking. arXiv preprint arXiv:1908.01924. Cited by: §7.1.5.
  • [62] A. Hard, K. Rao, R. Mathews, F. Beaufays, S. Augenstein, H. Eichner, C. Kiddon, and D. Ramage (2018) Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604. Cited by: §5.2.3, TABLE I.
  • [63] S. Hardy, W. Henecka, H. Ivey-Law, R. Nock, G. Patrini, G. Smith, and B. Thorne (2017) Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. arXiv preprint arXiv:1711.10677. Cited by: §1, §4.2, §4.3, §5.2.1, TABLE I.
  • [64] B. Hitaj, G. Ateniese, and F. Perez-Cruz (2017) Deep models under the gan: information leakage from collaborative deep learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 603–618. Cited by: §4.3.
  • [65] K. Hsieh, A. Harlap, N. Vijaykumar, D. Konomis, G. R. Ganger, P. B. Gibbons, and O. Mutlu (2017-03) Gaia: geo-distributed machine learning approaching LAN speeds. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), Boston, MA, pp. 629–647. External Links: ISBN 978-1-931971-37-9, Link Cited by: §7.1.6.
  • [66] L. Huang, Y. Yin, Z. Fu, S. Zhang, H. Deng, and D. Liu (2018) LoAdaBoost: loss-based adaboost federated machine learning on medical data. arXiv preprint arXiv:1811.12629. Cited by: §5.2.3.
  • [67] R. Iyengar, J. P. Near, D. Song, O. Thakkar, A. Thakurta, and L. Wang (2019) Towards practical differentially private convex optimization. Cited by: §4.3.
  • [68] E. Jeong, S. Oh, H. Kim, J. Park, M. Bennis, and S. Kim (2018) Communication-efficient on-device machine learning: federated distillation and augmentation under non-iid private data. arXiv preprint arXiv:1811.11479. Cited by: §5.2.2, TABLE I.
  • [69] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell (2014) Caffe: convolutional architecture for fast feature embedding. In Proceedings of the 22nd ACM international conference on Multimedia, pp. 675–678. Cited by: §1.
  • [70] L. Jiang, R. Tan, X. Lou, and G. Lin (2019) On lightweight privacy-preserving collaborative learning for internet-of-things objects. In Proceedings of the International Conference on Internet of Things Design and Implementation, IoTDI ’19, New York, NY, USA, pp. 70–81. External Links: ISBN 978-1-4503-6283-2, Link, Document Cited by: §7.1.9.
  • [71] R. Johnson and T. Zhang (2013) Accelerating stochastic gradient descent using predictive variance reduction. In Advances in neural information processing systems, pp. 315–323. Cited by: §5.2.1.
  • [72] R. Jurca and B. Faltings (2003-06) An incentive compatible reputation mechanism. In IEEE International Conference on E-Commerce, 2003. CEC 2003., pp. 285–292. Cited by: §4.6.
  • [73] J. Kang, Z. Xiong, D. Niyato, S. Xie, and J. Zhang (2019) Incentive mechanism for reliable federated learning: a joint optimization approach to combining reputation and contract theory. IEEE Internet of Things Journal. Cited by: §4.6.
  • [74] J. Kang, Z. Xiong, D. Niyato, H. Yu, Y. Liang, and D. I. Kim (2019) Incentive design for efficient federated learning in mobile networks: a contract theory approach. arXiv preprint arXiv:1905.07479. Cited by: §4.6.
  • [75] M. Kantarcioglu and C. Clifton (2004) Privacy-preserving distributed mining of association rules on horizontally partitioned data. IEEE Transactions on Knowledge & Data Engineering (9), pp. 1026–1037. Cited by: §4.3.
  • [76] A. F. Karr, X. Lin, A. P. Sanil, and J. P. Reiter (2009) Privacy-preserving analysis of vertically partitioned data using secure matrix products. Journal of Official Statistics 25 (1), pp. 125. Cited by: §4.3.
  • [77] N. Kilbertus, A. Gascón, M. J. Kusner, M. Veale, K. P. Gummadi, and A. Weller (2018) Blind justice: fairness with encrypted sensitive attributes. arXiv preprint arXiv:1806.03281. Cited by: §4.3.
  • [78] H. Kim, J. Park, M. Bennis, and S. Kim (2019) Blockchained on-device federated learning. IEEE Communications Letters, pp. 1–1. Cited by: §7.1.8.
  • [79] H. Kim, J. Park, M. Bennis, and S. Kim (2018) On-device federated learning via blockchain and its latency analysis. arXiv preprint arXiv:1808.03949. Cited by: §5.2.1, TABLE I.
  • [80] J. Konečnỳ, H. B. McMahan, D. Ramage, and P. Richtárik (2016) Federated optimization: distributed machine learning for on-device intelligence. arXiv preprint arXiv:1610.02527. Cited by: §5.2.1, TABLE I.
  • [81] J. Konečnỳ, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, and D. Bacon (2016) Federated learning: strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492. Cited by: §5.2.2, §5.2.4, TABLE I.
  • [82] K. Kourou, T. P. Exarchos, K. P. Exarchos, M. V. Karamouzis, and D. I. Fotiadis (2015) Machine learning applications in cancer prognosis and prediction. Computational and structural biotechnology journal 13, pp. 8–17. Cited by: §6.2.
  • [83] T. Kurze, M. Klems, D. Bermbach, A. Lenk, S. Tai, and M. Kunze (2011) Cloud federation. Cloud Computing 2011, pp. 32–38. Cited by: §1, §2.1.
  • [84] D. Leroy, A. Coucke, T. Lavril, T. Gisselbrecht, and J. Dureau (2019) Federated learning for keyword spotting. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6341–6345. Cited by: §4.1.
  • [85] P. Li, J. Li, Z. Huang, T. Li, C. Gao, S. Yiu, and K. Chen (2017) Multi-key privacy-preserving deep learning in cloud computing. Future Generation Computer Systems 74, pp. 76–85. Cited by: §4.3.
  • [86] Q. Li, Z. Wen, and B. He (2019) Practical federated gradient boosting decision trees. arXiv preprint arXiv:1911.04206. Cited by: §5.2.1, TABLE I.
  • [87] Q. Li, Z. Wu, Z. Wen, and B. He (2019) Privacy-preserving gradient boosting decision trees. arXiv preprint arXiv:1911.04209. Cited by: §4.3.
  • [88] T. Li, A. K. Sahu, A. Talwalkar, and V. Smith (2019) Federated learning: challenges, methods, and future directions. External Links: 1908.07873 Cited by: §1.
  • [89] T. Li, M. Sanjabi, and V. Smith (2019) Fair resource allocation in federated learning. arXiv preprint arXiv:1905.10497. Cited by: §5.2.1, TABLE I.
  • [90] X. Li, K. Huang, W. Yang, S. Wang, and Z. Zhang (2019) On the convergence of fedavg on non-iid data. arXiv preprint arXiv:1907.02189. Cited by: §5.2.2.
  • [91] W. Y. B. Lim, N. C. Luong, D. T. Hoang, Y. Jiao, Y. Liang, Q. Yang, D. Niyato, and C. Miao (2019) Federated learning in mobile edge networks: a comprehensive survey. External Links: 1909.11875 Cited by: §1, §7.1.9.
  • [92] Y. Lin, S. Han, H. Mao, Y. Wang, and W. J. Dally (2017) Deep gradient compression: reducing the communication bandwidth for distributed training. arXiv preprint arXiv:1712.01887. Cited by: §4.4.
  • [93] B. Liu, L. Wang, M. Liu, and C. Xu (2019) Lifelong federated reinforcement learning: a learning architecture for navigation in cloud robotic systems. arXiv preprint arXiv:1901.06455. Cited by: §5.2.1, TABLE I.
  • [94] J. Liu, M. Juuti, Y. Lu, and N. Asokan (2017) Oblivious neural network predictions via minionn transformations. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 619–631. Cited by: §4.3.
  • [95] L. Liu, J. Zhang, S. Song, and K. B. Letaief (2019) Edge-assisted hierarchical federated learning with non-iid data. arXiv preprint arXiv:1905.06641. Cited by: §5.2.2.
  • [96] Y. Liu, T. Chen, and Q. Yang (2018) Secure federated transfer learning. arXiv preprint arXiv:1812.03337. Cited by: §1, §4.1, §4.3, §5.2.1, TABLE I.
  • [97] Y. Liu, Z. Ma, X. Liu, S. Ma, S. Nepal, and R. Deng (2019) Boosting privately: privacy-preserving federated extreme boosting for mobile crowdsensing. arXiv preprint arXiv:1907.10218. Cited by: §5.2.1, TABLE I.
  • [98] C. Ma, J. Konečnỳ, M. Jaggi, V. Smith, M. I. Jordan, P. Richtárik, and M. Takáč (2017) Distributed optimization with arbitrary local solvers. optimization Methods and Software 32 (4), pp. 813–848. Cited by: §5.2.1.
  • [99] H. B. McMahan, E. Moore, D. Ramage, S. Hampson, et al. (2016) Communication-efficient learning of deep networks from decentralized data. arXiv preprint arXiv:1602.05629. Cited by: §1, §4.2, §4.3, §4.4, §5.2.1, §5.2.4, TABLE I.
  • [100] H. B. McMahan, D. Ramage, K. Talwar, and L. Zhang (2017) Learning differentially private recurrent language models. arXiv preprint arXiv:1710.06963. Cited by: §4.1, §5.2.1, §5.3, TABLE I.
  • [101] H. B. McMahan, E. Moore, D. Ramage, and B. A. y Arcas (2016) Federated learning of deep networks using model averaging. CoRR abs/1602.05629. Cited by: §4.3.
  • [102] L. Melis, C. Song, E. De Cristofaro, and V. Shmatikov (2019) Exploiting unintended feature leakage in collaborative learning. In 2019 IEEE Symposium on Security and Privacy (SP), pp. 691–706. Cited by: §4.3.
  • [103] P. Mohassel and P. Rindal (2018) ABY3: a mixed protocol framework for machine learning. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 35–52. Cited by: §1.
  • [104] M. Mohri, G. Sivek, and A. T. Suresh (2019) Agnostic federated learning. arXiv preprint arXiv:1902.00146. Cited by: §5.2.1, TABLE I.
  • [105] M. Mukherjee, R. Matam, L. Shu, L. Maglaras, M. A. Ferrag, N. Choudhury, and V. Kumar (2017) Security and privacy in fog computing: challenges. IEEE Access 5, pp. 19293–19304. Cited by: §7.1.9.
  • [106] M. Naor, B. Pinkas, and R. Sumner (1999) Privacy preserving auctions and mechanism design. In Proceedings of the 1st ACM Conference on Electronic Commerce, EC ’99, New York, NY, USA, pp. 129–139. External Links: ISBN 1-58113-176-3, Link, Document Cited by: §4.6.
  • [107] M. Nasr, R. Shokri, and A. Houmansadr (2019) Comprehensive privacy analysis of deep learning: passive and active white-box inference attacks against centralized and federated learning. Cited by: §4.3.
  • [108] T. D. Nguyen, S. Marchal, M. Miettinen, H. Fereidooni, N. Asokan, and A. Sadeghi (2018) DÏoT: a federated self-learning anomaly detection system for iot. External Links: 1804.07474 Cited by: §7.1.9.
  • [109] S. Niknam, H. S. Dhillon, and J. H. Reed (2019) Federated learning for wireless communications: motivation, opportunities and challenges. arXiv preprint arXiv:1908.06847. Cited by: §5.2.3.
  • [110] V. Nikolaenko, U. Weinsberg, S. Ioannidis, M. Joye, D. Boneh, and N. Taft (2013) Privacy-preserving ridge regression on hundreds of millions of records. In 2013 IEEE Symposium on Security and Privacy, pp. 334–348. Cited by: §1, §4.2, §5.2.1, TABLE I.
  • [111] A. Nilsson, S. Smith, G. Ulm, E. Gustavsson, and M. Jirstrand (2018) A performance evaluation of federated learning algorithms. In Proceedings of the Second Workshop on Distributed Infrastructures for Deep Learning, pp. 1–8. Cited by: §5.2.4, TABLE I.
  • [112] T. Nishio and R. Yonetani (2019) Client selection for federated learning with heterogeneous resources in mobile edge. In ICC 2019-2019 IEEE International Conference on Communications (ICC), pp. 1–7. Cited by: §5.2.3, TABLE I.
  • [113] R. Nock, S. Hardy, W. Henecka, H. Ivey-Law, G. Patrini, G. Smith, and B. Thorne (2018) Entity resolution and federated learning get a federated resolution. arXiv preprint arXiv:1803.04035. Cited by: §4.3.
  • [114] P. Paillier (1999) Public-key cryptosystems based on composite degree residuosity classes. In International Conference on the Theory and Applications of Cryptographic Techniques, pp. 223–238. Cited by: §5.2.1.
  • [115] S. J. Pan and Q. Yang (2010) A survey on transfer learning. IEEE Transactions on knowledge and data engineering 22 (10), pp. 1345–1359. Cited by: §4.1.
  • [116] N. Papernot, M. Abadi, U. Erlingsson, I. Goodfellow, and K. Talwar (2016) Semi-supervised knowledge transfer for deep learning from private training data. arXiv preprint arXiv:1610.05755. Cited by: §4.3, §5.2.1, TABLE I.
  • [117] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer (2017) Automatic differentiation in pytorch. Cited by: §1.
  • [118] D. Preuveneers, V. Rimmer, I. Tsingenopoulos, J. Spooren, W. Joosen, and E. Ilie-Zudor (2018-12) Chained anomaly detection models for federated learning: an intrusion detection case study. Applied Sciences 8, pp. 2663. External Links: Document Cited by: §7.1.8.
  • [119] Y. Qian, L. Hu, J. Chen, X. Guan, M. M. Hassan, and A. Alelaiwi (2019) Privacy-aware service placement for mobile edge computing via federated learning. Information Sciences 505, pp. 562–570. Cited by: §5.2.3.
  • [120] M. Ranzato, S. Chopra, M. Auli, and W. Zaremba (2015) Sequence level training with recurrent neural networks. arXiv preprint arXiv:1511.06732. Cited by: §6.1.
  • [121] M. S. Riazi, C. Weinert, O. Tkachenko, E. M. Songhori, T. Schneider, and F. Koushanfar (2018) Chameleon: a hybrid secure computation framework for machine learning applications. In Proceedings of the 2018 on Asia Conference on Computer and Communications Security, pp. 707–721. Cited by: §4.3.
  • [122] B. D. Rouhani, M. S. Riazi, and F. Koushanfar (2018) Deepsecure: scalable provably-secure deep learning. In Proceedings of the 55th Annual Design Automation Conference, pp. 2. Cited by: §4.3.
  • [123] T. Ryffel, A. Trask, M. Dahl, B. Wagner, J. Mancuso, D. Rueckert, and J. Passerat-Palmbach (2018) A generic framework for privacy preserving deep learning. arXiv preprint arXiv:1811.04017. Cited by: §1, §5.3, TABLE II.
  • [124] S. Samarakoon, M. Bennis, W. Saad, and M. Debbah (2018) Federated learning for ultra-reliable low-latency v2v communications. External Links: 1805.09253 Cited by: §7.1.9.
  • [125] S. Samarakoon, M. Bennis, W. Saad, and M. Debbah (2018) Distributed federated learning for ultra-reliable low-latency vehicular communications. arXiv preprint arXiv:1807.08127. Cited by: §5.2.3, TABLE I.
  • [126] A. P. Sanil, A. F. Karr, X. Lin, and J. P. Reiter (2004) Privacy preserving regression modelling via distributed computation. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 677–682. Cited by: §5.2.1, TABLE I.
  • [127] Y. M. Saputra, D. T. Hoang, D. N. Nguyen, E. Dutkiewicz, M. D. Mueck, and S. Srikanteswara (2019) Energy demand prediction with federated learning for electric vehicle networks. External Links: 1909.00907 Cited by: §7.1.9.
  • [128] Y. Sarikaya and O. Ercetin (2019) Motivating workers in federated learning: a stackelberg game perspective. External Links: 1908.03092 Cited by: §4.6.
  • [129] F. Sattler, S. Wiedemann, K. Müller, and W. Samek (2019) Robust and communication-efficient federated learning from non-iid data. arXiv preprint arXiv:1903.02891. Cited by: §5.2.2, TABLE I.
  • [130] A. Shamir (1979) How to share a secret. Communications of the ACM 22 (11), pp. 612–613. Cited by: §4.3.
  • [131] A. P. Sheth and J. A. Larson (1990) Federated database systems for managing distributed, heterogeneous, and autonomous databases. ACM Computing Surveys (CSUR) 22 (3), pp. 183–236. Cited by: §1, §2.1.
  • [132] E. Shi, T. H. Chan, E. Rieffel, and D. Song (2017) Distributed private data analysis: lower bounds and practical constructions. ACM Transactions on Algorithms (TALG) 13 (4), pp. 50. Cited by: §1.
  • [133] R. Shokri and V. Shmatikov (2015) Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC conference on computer and communications security, pp. 1310–1321. Cited by: §5.2.1, TABLE I.
  • [134] R. Shokri, M. Stronati, C. Song, and V. Shmatikov (2017) Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP), pp. 3–18. Cited by: §4.3.
  • [135] V. Smith, C. Chiang, M. Sanjabi, and A. S. Talwalkar (2017) Federated multi-task learning. In Advances in Neural Information Processing Systems, pp. 4424–4434. Cited by: §1, §3.1.3, §4.3, §5.2.1, TABLE I.
  • [136] S. Song, K. Chaudhuri, and A. D. Sarwate (2013) Stochastic gradient descent with differentially private updates. In 2013 IEEE Global Conference on Signal and Information Processing, pp. 245–248. Cited by: §4.3.
  • [137] M. R. Sprague, A. Jalalirad, M. Scavuzzo, C. Capota, M. Neun, L. Do, and M. Kopp (2018) Asynchronous federated learning for geospatial applications. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 21–28. Cited by: §4.4.
  • [138] I. Stojmenovic, S. Wen, X. Huang, and H. Luan (2016-07) An overview of fog computing and its security issues. Concurr. Comput. : Pract. Exper. 28 (10), pp. 2991–3005. External Links: ISSN 1532-0626, Link, Document Cited by: §7.1.9.
  • [139] L. Su and J. Xu (2018) Securing distributed machine learning in high dimensions. arXiv preprint arXiv:1804.10140. Cited by: §4.3.
  • [140] TensorFlow federated: machine learning on decentralized data. Cited by: TABLE II.
  • [141] O. Thakkar, G. Andrew, and H. B. McMahan (2019) Differentially private learning with adaptive clipping. arXiv preprint arXiv:1905.03871. Cited by: §4.3.
  • [142] A. Triastcyn and B. Faltings (2019) Federated generative privacy. External Links: 1910.08385 Cited by: §5.2.1, TABLE I.
  • [143] S. Truex, N. Baracaldo, A. Anwar, T. Steinke, H. Ludwig, and R. Zhang (2018) A hybrid approach to privacy-preserving federated learning. arXiv preprint arXiv:1812.03224. Cited by: §5.2.1, TABLE I.
  • [144] G. Ulm, E. Gustavsson, and M. Jirstrand (2018) Functional federated learning in erlang (ffl-erl). In International Workshop on Functional and Constraint Logic Programming, pp. 162–178. Cited by: §5.2.3, TABLE I.
  • [145] J. Vaidya and C. Clifton (2002) Privacy preserving association rule mining in vertically partitioned data. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 639–644. Cited by: §4.1.
  • [146] J. Vaidya and C. Clifton (2003) Privacy-preserving k-means clustering over vertically partitioned data. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 206–215. Cited by: §4.1.
  • [147] J. Vaidya and C. Clifton (2004) Privacy preserving naive bayes classifier for vertically partitioned data. In Proceedings of the 2004 SIAM International Conference on Data Mining, pp. 522–526. Cited by: §4.1.
  • [148] J. Vaidya and C. Clifton (2005) Privacy-preserving decision trees over vertically partitioned data. In IFIP Annual Conference on Data and Applications Security and Privacy, pp. 139–152. Cited by: §4.1.
  • [149] D. Verma, S. Julier, and G. Cirincione (2018) Federated ai for building ai solutions across multiple agencies. arXiv preprint arXiv:1809.10036. Cited by: §5.2.3.
  • [150] P. Voigt and A. Von dem Bussche (2017) The eu general data protection regulation (gdpr). A Practical Guide, 1st Ed., Cham: Springer International Publishing. Cited by: §6.1.
  • [151] I. Wagner and D. Eckhoff (2018) Technical privacy metrics: a systematic survey. ACM Computing Surveys (CSUR) 51 (3), pp. 57. Cited by: §4.3.
  • [152] M. J. Wainwright, M. I. Jordan, and J. C. Duchi (2012) Privacy aware learning. In Advances in Neural Information Processing Systems, pp. 1430–1438. Cited by: §4.3.
  • [153] L. Wan, W. K. Ng, S. Han, and V. Lee (2007) Privacy-preservation for gradient descent methods. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 775–783. Cited by: §4.3.
  • [154] L. Wang, W. Wang, and B. Li (2019) CMFL: mitigating communication overhead for federated learning. In IEEE 39th International Conference on Distributed Computing Systems (ICDCS). Cited by: §7.1.6.
  • [155] S. Wang, T. Tuor, T. Salonidis, K. K. Leung, C. Makaya, T. He, and K. Chan (2018) When edge meets learning: adaptive control for resource-constrained distributed machine learning. In IEEE INFOCOM 2018-IEEE Conference on Computer Communications, pp. 63–71. Cited by: §4.5.
  • [156] S. Wang, T. Tuor, T. Salonidis, K. K. Leung, C. Makaya, T. He, and K. Chan (2019) Adaptive federated learning in resource constrained edge computing systems. IEEE Journal on Selected Areas in Communications 37 (6), pp. 1205–1221. Cited by: §5.2.3, TABLE I.
  • [157] X. Wang, Y. Han, C. Wang, Q. Zhao, X. Chen, and M. Chen (2019) In-edge AI: intelligentizing mobile edge computing, caching and communication by federated learning. IEEE Network. Cited by: §5.2.3, TABLE I.
  • [158] Y. Wang (2017) Co-op: cooperative machine learning from mobile devices. Cited by: §5.2.4.
  • [159] Z. Wen, J. Shi, B. He, J. Chen, K. Ramamohanarao, and Q. Li (2019) Exploiting GPUs for efficient gradient boosting decision tree training. IEEE Transactions on Parallel and Distributed Systems. Cited by: §1.
  • [160] X. Wu, F. Li, A. Kumar, K. Chaudhuri, S. Jha, and J. Naughton (2017) Bolt-on differential privacy for scalable stochastic gradient descent-based analytics. In Proceedings of the 2017 ACM International Conference on Management of Data, pp. 1307–1322. Cited by: §4.3.
  • [161] C. Xie, S. Koyejo, and I. Gupta (2019) Asynchronous federated optimization. arXiv preprint arXiv:1903.03934. Cited by: §4.4.
  • [162] K. Yang, T. Jiang, Y. Shi, and Z. Ding (2018) Federated learning via over-the-air computation. arXiv preprint arXiv:1812.11750. Cited by: §5.2.2.
  • [163] Q. Yang, Y. Liu, T. Chen, and Y. Tong (2019) Federated machine learning: concept and applications. ACM Transactions on Intelligent Systems and Technology (TIST) 10 (2), pp. 12. Cited by: §1, §4.1, §4.3, §7.1.6.
  • [164] T. Yang, G. Andrew, H. Eichner, H. Sun, W. Li, N. Kong, D. Ramage, and F. Beaufays (2018) Applied federated learning: improving google keyboard query suggestions. arXiv preprint arXiv:1812.02903. Cited by: §4.4, §4.5, §4.6, §6.1, §6.1.
  • [165] X. Yao, C. Huang, and L. Sun (2018) Two-stream federated learning: reduce the communication costs. In 2018 IEEE Visual Communications and Image Processing (VCIP), pp. 1–4. Cited by: §5.2.2.
  • [166] S. Yi, Z. Qin, and Q. Li (2015) Security and privacy issues of fog computing: a survey. In WASA. Cited by: §7.1.9.
  • [167] N. Yoshida, T. Nishio, M. Morikura, K. Yamamoto, and R. Yonetani (2019) Hybrid-FL: cooperative learning mechanism using non-IID data in wireless networks. arXiv preprint arXiv:1905.07210. Cited by: §5.2.2.
  • [168] H. Yu, X. Jiang, and J. Vaidya (2006) Privacy-preserving SVM using nonlinear kernels on horizontally partitioned data. In Proceedings of the 2006 ACM symposium on Applied computing, pp. 603–610. Cited by: §4.3.
  • [169] H. Yu, J. Vaidya, and X. Jiang (2006) Privacy-preserving SVM classification on vertically partitioned data. In Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 647–656. Cited by: §4.3.
  • [170] Z. Yu, J. Hu, G. Min, H. Lu, Z. Zhao, H. Wang, and N. Georgalas (2018) Federated learning based proactive content caching in edge computing. In 2018 IEEE Global Communications Conference (GLOBECOM), pp. 1–6. Cited by: §5.2.3.
  • [171] J. Yuan and S. Yu (2013) Privacy preserving back-propagation neural network learning made practical with cloud computing. IEEE Transactions on Parallel and Distributed Systems 25 (1), pp. 212–221. Cited by: §4.3.
  • [172] M. Yurochkin, M. Agarwal, S. Ghosh, K. Greenewald, T. N. Hoang, and Y. Khazaeni (2019) Bayesian nonparametric federated learning of neural networks. arXiv preprint arXiv:1905.12022. Cited by: §1, §4.3, §5.2.1, TABLE I.
  • [173] Q. Zhang, L. T. Yang, and Z. Chen (2015) Privacy preserving deep computation model on cloud for big data feature learning. IEEE Transactions on Computers 65 (5), pp. 1351–1362. Cited by: §4.3.
  • [174] Y. Zhang and Q. Yang (2017) A survey on multi-task learning. arXiv preprint arXiv:1707.08114. Cited by: §5.2.1.
  • [175] L. Zhao, L. Ni, S. Hu, Y. Chen, P. Zhou, F. Xiao, and L. Wu (2018) InPrivate digging: enabling tree-based distributed data mining with differential privacy. In INFOCOM, pp. 2087–2095. Cited by: §1, §4.2, §4.3, §4.4, §5.2.1, TABLE I.
  • [176] Y. Zhao, M. Li, L. Lai, N. Suda, D. Civin, and V. Chandra (2018) Federated learning with non-IID data. arXiv preprint arXiv:1806.00582. Cited by: §5.2.2.
  • [177] Z. Zheng, S. Xie, H. Dai, X. Chen, and H. Wang (2018) Blockchain challenges and opportunities: a survey. International Journal of Web and Grid Services 14 (4), pp. 352–375. Cited by: §4.4, §7.1.2.
  • [178] A. C. Zhou, Y. Xiao, B. He, J. Zhai, R. Mao, et al. (2019) Privacy regulation aware process mapping in geo-distributed cloud data centers. IEEE Transactions on Parallel and Distributed Systems. Cited by: §4.5.
  • [179] G. Zhu, Y. Wang, and K. Huang (2018) Low-latency broadband analog aggregation for federated edge learning. arXiv preprint arXiv:1812.11494. Cited by: §5.2.2.
  • [180] H. Zhu and Y. Jin (2019) Multi-objective evolutionary federated learning. IEEE Transactions on Neural Networks and Learning Systems. Cited by: §5.2.2, TABLE I.
  • [181] G. Zyskind, O. Nathan, and A. Pentland (2015) Decentralizing privacy: using blockchain to protect personal data. In 2015 IEEE Security and Privacy Workshops, pp. 180–184. Cited by: §4.6.