The emergence of federated learning (FL) enables multiple devices to learn a common model while keeping all the training data on the devices themselves. It consumes fewer resources on the cloud and ensures privacy at the same time. Multiple applications have benefited from FL, including mobile phones [1, 2, 3], wearable devices [4, 5], and autonomous vehicles [6, 7]. In standard federated learning, all participants train their local models; a random subset of clients is selected each round to upload their gradient updates to the central server. Similar FL architectures can be found in [8, 9, 10, 11, 12, 13].
One interesting question concerns the security and privacy implications of the FL training process. Any characteristic of clients' private data needs to be protected carefully, since it may reveal important private information about the training data; e.g., the distribution of labels might show the diversity of participants. Similarly, the composition of the training data is also something attackers want to explore, i.e., can they determine the quantity proportions of different labels in the whole training dataset during the training process? This problem may pose serious threats to FL security. For instance, an attacker could learn the morbidity of a particular disease if a government is training an online disease diagnosis system. A malicious store could figure out the relation between the supply and demand of a certain product when a new commodity registration system is trained with an FL approach, and then adjust its prices accordingly to gain an unfair advantage.
In the literature, there are mainly two areas of research on attacking FL models, corresponding to the two main roles in FL (the distributed devices and the central server). The former assumes that the attacker compromises some participating devices and uses them for malicious purposes, e.g., implanting backdoors into FL [14, 15], adversarial poisoning [16, 17, 18], membership or property inference [19, 20, 21, 22], and reconstruction attacks [23, 24, 25]. Attacks in the latter area are relatively less studied: some prior works assume that the central server is malicious and train a GAN to reproduce data samples resembling the clients' private data, and similar ideas appear elsewhere in the literature. Please see Section VI for a more comprehensive discussion of related works.
These previously studied attacks on FL, e.g., membership inference or reconstruction attacks, did not place much emphasis on quantity information, as they usually focus on existential information, i.e., whether a certain sample exists in the training data. Another drawback of these approaches is that they all need the individual updates that clients send to the server. However, under a secure aggregation protocol or differential privacy techniques [29, 30], neither the participants nor the server can acquire the individual updates in plain form, which makes most of these attacks difficult. Therefore, more practical and applicable attacks should assume that individual updates cannot be observed.
The aforementioned issues motivated us to consider attacks that do not require individual updates. In this paper, we propose three new inference attacks that achieve high success rates without needing any gradient updates from individual clients. In addition, our attacks concentrate on the quantity information of training data in FL, which could lead to serious consequences but, to the best of our knowledge, has never been studied in prior works. We conducted extensive experiments to evaluate the effectiveness and generality of our approaches, and the results demonstrate a real vulnerability to quantity privacy leakage.
In this paper, we take the first step towards quantity estimation attacks in federated learning. Specifically:
We propose a new attack surface in the context of federated learning, i.e., inferring the quantity composition proportions of different labels during the training process. For instance, an attacker may learn how many data samples of each label are used in training a certain learning model, which may pose considerable privacy threats to practical FL applications.
We design three general attacks on FL that do not need to observe any individual updates. This enables adversaries to launch our attacks successfully even under secure aggregation protocols or the protection of differential privacy. Our attacks are passive, meaning they exert no influence on the training process, and can thus work covertly without being detected by many intrusion detection techniques.
Our technique can infer the labels' quantity composition proportions for a single training round or for the whole training process. The former aims at stealing the quantity information of the training data owned by the selected clients, while the latter targets the quantity proportions of all participants at different training stages.
II Threat Model
Deep learning techniques have received significant interest in recent years and enjoy a wide range of applications on different types of devices, giving FL a natural stage to show its merits in convenience, privacy protection, and resource utilization. In fact, FL has shown great promise not only in smartphone-based applications (e.g., human activity recognition, heart-rate monitoring, and keyboard prediction [2, 3]), but also in other fields such as the healthcare industry (e.g., online disease-diagnosis expert systems and medical insurance registration) and transportation systems (vehicular networking technology [6, 7]).
FL is designed to preserve the private data of individuals, and any information or characteristic of that data should be protected carefully. In an FL architecture, the training data owned by clients comes from various sources, so the number of samples can be unbalanced across labels, which reflects the clients' overall characteristics. These quantities are a potential source of information leakage if they are illegally obtained by malicious attackers. For instance, suppose an organization builds an online disease prediction system with an FL structure across thousands of hospitals. Each hospital trains its local model with its own data, and the organization obtains a global model able to predict the trend of many diseases, not just the small group of diseases seen at a single hospital. Now suppose a malicious player wants to know how many hospitals have treated a particular disease, so that it can raise the corresponding treatment expenses and even estimate the approximate geographic scope of the disease. This is a simple example of how attackers may try to learn the composition proportions of training data for their own advantage, and many other applications raise similar concerns.
Thus, in this work, the attacker's goal is to infer the quantity information of particular training labels, especially the composition proportions of training labels in a single training round and over the whole training process.
Unlike prior inference attacks, the application setting here is based on more realistic scenarios: in every training epoch, the central aggregation server chooses a set of clients randomly from thousands of participants, which we call the selection process, and collects the gradient updates they generate by training local models on their own data. After such collection, we assume a secure aggregation algorithm, an important characteristic of FL, is executed, so that the server cannot observe the individual updates sent by clients in plain-text form but can only acquire the aggregated value.
From prior studies of property inference, we know that particular batches, or particular properties of training data, change the gradients on the corresponding neurons while having little effect on other neurons. Since each training label is composed of a set of features, a natural question arises: if we sum up all property inferences over the feature set of a particular label, is it possible to infer information about the label itself rather than just its properties? As we discovered, the answer is yes: an adversary can infer important information about a training label by analyzing gradient changes during the training process. Here, without loss of generality, we assume that the same label possessed by different clients results in similar local gradient changes; if we can determine how many such local changes the global update consists of, then the number of clients who own that label can be obtained.
II-C Attacker Capacity
One of the key features of our attacks is that they do not require observing any individual's gradient updates, which makes them much easier to launch than previous attack models. The other basic prerequisites are similar to those of other attacks, as discussed below.
The attacker should obtain some control over a legitimate participant in FL; specifically, he should acquire full privileges to read the content of messages from the aggregation server, understand the structure of the local model, and modify or replace the training data freely. He also needs some prior knowledge about the training process, i.e., the average number of labels owned by each participant and the probable number of data samples per label; such information can be estimated by collecting the data of a few participants and performing simple statistical analysis. Finally, the attacker should know the approximate number of clients selected by the server in a single training round.
II-D Attack Overview
We propose three original label inference attacks in the FL environment:
Class Sniffing. In a single training round, the adversary is able to infer whether a particular class of training data appears.
Quantity Inference. In a single training round, the attacker can judge whether a certain training label is owned by a small or large group of clients, and predict how many clients own this label.
Whole Determination. The malicious participant aims to obtain the composition proportions of the dataset labels behind the current global model.
The inferred information about training labels can be applied to many fields. We list three possible scenarios here:
Use rare 'labels' to identify clients, since such labels are usually owned by extremely few people. Specifically, if such labels are detected in training by our attacks, the attacker learns who is participating in the training.
Apply this approach to detect the intrusion of malicious participants. Intrusions such as backdoor and poisoning attacks in FL have been studied in prior works. For instance, Fung et al. applied cosine similarity to detect Sybils, and Bagdasaryan et al. proposed a powerful backdoor attack in the FL scenario. Both mention that the updates provided by malicious attackers differ from those of benign clients. Thus, we can regard adversarial data as a type of unique label owned only by malicious clients, and detect it in the training process.
Obtain the composition proportions of labels in the training process. If we find that the training labels are unbalanced, we may use other techniques (such as data augmentation or focal loss) to train the learning model better.
Here, the loss function can take distinct forms under different scenarios, e.g., Mean Square Error (MSE) or Mean Absolute Error (MAE). The remaining symbols denote the number of label classes, the mapping from inputs to the target label, and the learning model that maps the inputs to the predicted label.
The objective of the training process is to minimize the loss of the network, and here we choose the popular stochastic gradient descent (SGD) method as the network's optimizer. SGD decides how to modify the network parameters in each training iteration. Specifically, it computes the gradient of the loss function with respect to every parameter, combines its opposite direction with the learning rate, and updates the parameters to the next state. When the value of the loss function shrinks to a relative lower bound, the training process stops. The gradients are computed by back-propagation from the last to the first layer of the whole network, and the standard SGD update is w ← w − η∇_w L(w), where η is the learning rate.
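The update rule above can be sketched in a toy setting; the quadratic loss, the step count, and all numeric values below are illustrative, not taken from the paper:

```python
# Minimal sketch of one SGD step: w <- w - eta * grad.
# The "model" here is a single scalar parameter and the loss is
# L(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
def sgd_step(w, grad, eta):
    """Apply one stochastic-gradient-descent update."""
    return w - eta * grad

w = 0.0
eta = 0.1
for _ in range(100):
    grad = 2.0 * (w - 3.0)   # d/dw (w - 3)^2
    w = sgd_step(w, grad, eta)
# w converges toward the loss minimizer 3.0
```

Each iteration shrinks the distance to the minimizer by a constant factor (1 − 2η), so the loop converges geometrically.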
The basic process of our attack is presented in Figure 1. There are one or more observers, i.e., adversarial attackers, in the training process. At each training iteration, they download the current global model, which contains the detailed parameters of the network (hence FL problems are always in the white-box form). Next, the attackers train a local model with an auxiliary dataset to obtain relatively standard gradient changes, and then compare the global updates against these local changes to determine whether a particular label appears in that training round. Furthermore, based on a much deeper comparison of the magnitudes of the two, the quantity information, i.e., how many clients own a particular training label, can also be acquired. Note that if these observers/attackers are themselves selected as training clients, their contribution to the global model must be removed before comparing the magnitudes. We name the former type of attack (determining whether a particular label appears) Class Sniffing, and the latter (acquiring quantity information) Quantity Inference. These two attacks operate on a single training round. We also propose a third label inference attack, Whole Determination, which determines the composition proportions of training labels over the whole training process.
III-C Class Sniffing
Like most prior work, we build these attack models on a supervised classification task. We utilize a feed-forward neural network with output size equal to the number of classes; the positions of the output neurons for each training label are shown in Figure 2. We discovered a phenomenon similar to the basis of property inference attacks. More specifically, in our experiments we observed that using a particular label in training makes the incoming weights of the corresponding output neuron (the network connection weights shown in Figure 2) grow significantly, while the weight vectors of the other neurons decrease slightly. This observation motivated our design of the Class Sniffing attack.
We use a vector to denote the updates of this incoming-weight set; its size equals the number of neurons in the layer before the output layer. For example, when we train a model on the MNIST dataset, the average increase on the target label's weights is markedly larger in magnitude than the average decrease on the others. The worst case happens when there is no sample of a particular label in the training data: its corresponding incoming weights then absorb all the negative impact without any positive benefit. This case can be simulated with our auxiliary data by withholding the particular label from training, so that the weight updates of its corresponding neuron match the worst case. The incoming-weight update vector in this worst case can be regarded as a threshold. In a particular round, if the observed updates of the weights corresponding to a label are clearly higher than this threshold, the label appears in training; if the weight changes are approximately equal to the threshold, the label can be considered absent in that round. The detailed procedure for acquiring such thresholds is shown in Algorithm 1.
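The decision rule can be sketched as follows; the function name, the mean-gap comparison, and all numeric values are our own illustrative assumptions, while Algorithm 1 in the paper specifies the exact procedure:

```python
# Sketch of the Class Sniffing decision rule: compare the aggregated
# update of a label's incoming weights against the worst-case
# threshold tau obtained by training on auxiliary data with that
# label withheld. Vectors are plain Python lists here.

def label_present(observed_update, tau, margin=0.0):
    """Return True if the label is judged present in this round.

    observed_update: aggregated incoming-weight updates of the
                     label's output neuron.
    tau:             worst-case (label-absent) update vector.
    margin:          slack to absorb noise (illustrative).
    """
    # Average elementwise gap between observed updates and threshold.
    gap = sum(o - t for o, t in zip(observed_update, tau)) / len(tau)
    return gap > margin

tau = [-0.02, -0.03, -0.01]        # simulated worst-case decreases
present = [0.05, 0.04, 0.06]       # label participates: weights grow
absent = [-0.02, -0.025, -0.015]   # close to the threshold

assert label_present(present, tau)
assert not label_present(absent, tau)
```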
III-D Quantity Inference
Similar to the workflow of Class Sniffing, in Quantity Inference the malicious attacker trains his local model using auxiliary data, in particular using data samples of a single label at a time, and thereby obtains several local updates, one per label. We denote the increase observed on a label's incoming weights when the local model is trained with samples of that label, and the decreases observed on the same weights when the local model is fed samples of other labels; both are vectors. Their magnitudes differ, i.e., the extent of the increase is much higher than that of the decrease. The specific magnitudes of the weight updates may vary across training rounds. Nevertheless, the magnitudes in each round can be obtained by training local models on the current global model, just as the attacker does in Class Sniffing.
As it happens, the positive effect of the increase can be offset by the accumulated impact of the other decreases; this appears when a label is possessed by only a small number of clients. Nevertheless, we can still launch the following attack in the presence of this phenomenon. The details of the Quantity Inference attack are described in Algorithm 2 and explained below.
The changes of the incoming neuron weights do reflect the quantity information about the training data, but not all of them reflect it clearly: some weights increase less than the rest, and they can even decrease when the corresponding label appears in training. This subset of weights is prone to the aforementioned 'offset' phenomenon, which could make the attack fail. Hence, the first question from the attacker's perspective is how to remove them from the original set of incoming neuron weights. When we train the network with data of a certain label, its presence makes the corresponding incoming neuron weights grow while the incoming neuron weights of other labels decrease. Following simple superposition, the higher the ratio between the magnitudes of the decrease and the increase, the more easily the offset phenomenon emerges. Thus, we can set a threshold on this ratio and compare, for each incoming weight in the vector, the increase against the decreases on the same weight in the other local updates. If there is an outlier whose corresponding ratio is higher than the threshold, we delete it from the original set and obtain a new set, as shown in Algorithm 2 from Line 9 to Line 19.
Next, let us take one label as an example. After the local training process using auxiliary data, the attacker obtains the local updates described above. The original updates of the label's incoming weights form a vector whose size equals the number of neurons in the layer before the output. We regard the members of this vector under the label's own local update as the increases, and the update vectors of the same weights under the other local updates as the decreases, then delete the aforementioned outliers to obtain the filtered set. Each remaining member of the label's own update denotes the increase contributed when the label is owned by a single client, and the averaged value of the corresponding members of the other updates indicates the negative impact of the other labels. With these two quantities, we can calculate a candidate number of clients owning the label from each incoming weight change in the filtered set. The client-number calculation formula is (5), which is derived from the simple average aggregation shown in (4).
In (5), one term is the average number of labels owned by the selected clients, the same as in Algorithm 1, and the result is the predicted number of clients; each weight change yields one such prediction. However, there are still abnormal weight changes whose corresponding estimates are unreasonable. For instance, supposing there are m clients in a particular round, some estimates could be larger than m or less than 1 (an estimate below 1 is regarded as an outlier, since we assume the label has already been shown by Class Sniffing to be present in training). Thus, we also remove these from the current weight-change set and obtain a final version, which is illustrated in Figure 3; the detailed steps can be seen in Algorithm 2 from Line 20 to Line 27. The final number of clients who own the label is determined by the mean of the remaining estimates. Another point worth mentioning is that the standard deviation of these estimates should ideally be small; occasionally it is large, in which case we abort the Quantity Inference attack. Such a scenario happens at an extremely low frequency (below 1% of the whole training process).
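Since formulas (4) and (5) are based on simple average aggregation, the estimation step can be sketched as follows; the symbols `inc`, `dec`, `g`, and `m` and the synthetic numbers are illustrative assumptions, not the paper's exact notation:

```python
# Sketch of the Quantity Inference estimate. Assume the aggregated
# update g on one weight is the average of m local updates: k clients
# owning the label contribute roughly `inc` each, and the remaining
# m - k clients contribute roughly `dec` each:
#     g = (k * inc + (m - k) * dec) / m
# Solving for k gives the per-weight estimate below.

def estimate_owners(g, inc, dec, m):
    """Estimate how many of m selected clients own the label."""
    return m * (g - dec) / (inc - dec)

def quantity_inference(gs, inc, dec, m):
    """Average per-weight estimates, discarding out-of-range outliers."""
    ks = [estimate_owners(g, inc, dec, m) for g in gs]
    # Keep only plausible estimates: at least 1 (the label was already
    # confirmed present by Class Sniffing) and at most m.
    ks = [k for k in ks if 1.0 <= k <= m]
    return sum(ks) / len(ks)

# Synthetic check: 4 of 10 clients own the label.
m, inc, dec, k_true = 10, 0.05, -0.01, 4
g = (k_true * inc + (m - k_true) * dec) / m
k_hat = quantity_inference([g, g, g], inc, dec, m)
assert abs(k_hat - k_true) < 1e-9
```

In the real attack, `gs` would contain one aggregated change per surviving incoming weight after the outlier filtering of Algorithm 2.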
III-E Whole Determination
If the attacker is not time-sensitive and is patient enough, i.e., he cares about the composition proportions of the entire training data over a long training span rather than a single round or a few rounds, we can propose another new attack. This attack exploits the overfitting behavior of a learning model under sustained training, which suits FL application scenarios particularly well.
Let us describe an example case. In the training of a deep neural network, some labels appear frequently (their sample counts are large) and others appear only occasionally (their sample counts are small). As in the former attacks, the attacker downloads the current global model and trains his local one with auxiliary data fed in label by label, eventually obtaining the local gradient updates corresponding to each label. When we investigated the incoming-weight changes of one particular frequent label and one occasional label, we observed an interesting phenomenon: the corresponding absolute values of their weight changes on the other labels' updates present a huge difference. That is, from the frequent label's perspective, the absolute values of its weights in the other frequent gradient updates are much higher than those of the occasional label's weights in the same gradient updates; a similarly large difference can be seen from the occasional label's perspective. This phenomenon is easily observed in various experiments; for instance, it appears after 10 epochs of the MNIST classifier training process, as shown in Figure 4 and Figure 5. The attacker can then analyze this phenomenon to access information about the composition proportions of the training labels.
Let us give a possible explanation of this phenomenon. The connection between the incoming-weight updates of the output layer and the corresponding labels is strong enough that we can regard those neuron weights as the feature set of each label. Then, (2) becomes
Here, we assume that the features embedded in the neurons of the output layer are highly independent of each other after the extraction and filtering performed by the front layers. We regard this independence as meaning the features are irrelevant to each other, hence their derivatives with respect to each other are zero:
The output of a classifier is a probability vector, and the predicted result for a particular input is the label corresponding to the highest dimension of that vector. Since the quantities of data samples differ across labels, their proportions in the target label differ as well. We thus hypothesize that this proportion can be measured by calculating the derivative with respect to the features of the output layer: if a label has more samples, its proportion is greater, and vice versa. This is presented as
The specific forms of the updates then follow, which correspond to the aforementioned phenomenon from one label's perspective.
Here, the model term denotes the next version of the ideal model when the current global model accepts the input of one label's training samples; hence it is natural that, to some extent, the only difference between these two versions lies in the derivative with respect to that label. In other words, the differences of the other derivatives can be neglected:
Since the first two labels are frequent, the third is occasional, and the current global model is the same in both cases, we can obtain that
Combining the above results, we can conclude that
III-E2 Attack Approach
Based on the phenomenon above (13), we can conclude that the ratio for occasional labels is larger than that for frequent labels, and leverage this conclusion to determine the quantity relation between different labels. As in the former attacks, when the adversary launches the Whole Determination attack, he trains his local model on the basis of the current global model with auxiliary data, and correspondingly obtains the local updates. He then calculates all the ratios from these updates and forms them into vectors, one vector per label. Finally, these vectors are clustered into different groups by an unsupervised algorithm; vectors falling into the same group indicate that their corresponding labels have approximately the same number of data samples in training, while the quantities can differ greatly across clusters. The unsupervised algorithm we adopt is hierarchical clustering, which groups the given data using the Euclidean distance metric; attackers may also choose other clustering approaches.
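As a stand-in for the hierarchical-clustering step, here is a minimal grouping sketch; the single-linkage threshold rule and all numeric values are our own illustrative assumptions, and an attacker would more likely use a library implementation such as SciPy's hierarchical clustering:

```python
# Sketch of the Whole Determination grouping step: group ratio
# vectors by Euclidean distance, so that labels with similar sample
# counts land in the same cluster. This greedy single-linkage rule
# behaves like hierarchical clustering for well-separated groups.
import math

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cluster(vectors, threshold):
    """Group vectors; members of one group are mutually close."""
    groups = []
    for v in vectors:
        for g in groups:
            if any(euclidean(v, w) < threshold for w in g):
                g.append(v)
                break
        else:
            groups.append([v])
    return groups

# Ratio vectors for two frequent labels and one occasional label
# (values are illustrative).
frequent_a = [0.10, 0.12, 0.11]
frequent_b = [0.11, 0.10, 0.12]
occasional = [0.90, 0.85, 0.95]
groups = cluster([frequent_a, frequent_b, occasional], threshold=0.2)
# Expect two groups: the frequent labels together, the occasional apart.
assert len(groups) == 2
```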
All experiments were conducted on a workstation running Ubuntu 18.04 LTS, equipped with a 2.10GHz Intel Xeon(R) Gold 6130 CPU, 64GB RAM, and an NVIDIA TITAN RTX GPU. We build the models mainly on PyTorch, and use scikit-learn to implement some machine learning components.
IV-A Experiment Setting
IV-A1 Auxiliary Data
Like many other inference attacks on machine learning, the attacker needs auxiliary data, which consists of some data samples of the labels he wants to infer. In practice, such samples are often not difficult to acquire. The number of data samples should be close to the average quantity owned by participants; if the samples are insufficient, the attacker may use generative techniques such as GANs to construct more similar samples.
IV-A2 Network Structure
The main structure is based on the standard construction of federated learning, with some modifications for practical purposes, e.g., participating clients may run several local epochs rather than just a single epoch before sending their updates to the aggregation server. The symbols of the major hyper-parameters are defined in Table I.
|Local model training batch size|
|Local model learning rate|
|Local model training epochs|
|Selection proportion of clients in a training round|
|Selected models to accomplish learning task|
|Approximate number of whole participants|
The dataset information (number of training labels and the corresponding training models) is presented in Table II. We chose datasets that are close to our privacy concerns in daily life. For instance, Fer2013 is relevant to face recognition, while HAM10000 aims at diagnosing several skin cancers; both contain private information belonging to different people.
|Dataset|Labels|Model|
|MNIST|10|MLP & CNN|
|CIFAR10|10|LeNet5 & ResNet18|
|Fer2013|7|ResNet18|
|HAM10000|7|ResNet18|
MNIST. As one of the most popular and classical datasets in machine learning, MNIST includes 10 labels, each of which corresponds to approximately 6,000 gray 32x32 handwritten digit images, of which 5,000 are training data and 1,000 are for testing. Because of its simplicity and the small amount of training data, deep and complicated models do not easily achieve high performance on it. Hence we choose a standard but simple MLP (multi-layer perceptron) and a CNN model, both of which achieve high accuracy.
The MLP model contains an input layer followed by two fully-connected hidden layers of size 256 and 64, with a dropout operation between the hidden layers, and finally an output layer of size 64. We use the rectified linear unit (ReLU) as the activation function for all layers. The CNN model consists of two spatial convolution layers with 10 and 20 filters (kernel size 5x5), max-pooling layers of size 2, a dropout layer, a fully-connected layer of size 320, and finally an output layer of size 50. The activation function is ReLU, and the other settings are the same as for the MLP.
CIFAR10. CIFAR10 consists of 10 classes, each containing 6,000 32x32 RGB images, again split into 5,000 for training and 1,000 for testing. The training labels cover common objects in daily life, making the dataset suitable for object-identification tasks on smartphones. For the learning models, we select two commonly used networks, LeNet5 and ResNet18, both of which achieve competitive accuracy on this dataset. LeNet5 consists of two convolution layers with 6 and 16 filters respectively (kernel size 5x5), pooling layers of size 2, two fully-connected linear layers of size 400 and 120, and an output layer followed by a softmax layer. The specific network structure of ResNet18 can be found in its original paper.
Fer2013. Fer2013 originates from a Kaggle competition, the Facial Expression Recognition Challenge 2013, which aims to build a learning model that recognizes human expressions automatically. It contains approximately 30,000 facial RGB images of different expressions with size restricted to 48x48, and its main labels are divided into 7 types: 0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral. The Disgust expression has the smallest number of images (about 600), while the other labels have nearly 5,000 samples each. We randomly select a fraction of the samples of each label as training data and use the rest for testing. We choose ResNet18 as the learning model for Fer2013.
HAM10000. HAM10000 is a large collection of multi-source dermatoscopic images of pigmented skin lesions. There are nearly 37,000 records of skin lesions, classified into 7 labels: 0=Melanocytic nevi, 1=Melanoma, 2=Benign keratosis-like lesions, 3=Basal cell carcinoma, 4=Actinic keratoses, 5=Vascular lesions, 6=Dermatofibroma. Each label corresponds to approximately 5,000 images. Similarly, we randomly divide each label's data into 4,500 samples for training and 500 for testing. HAM10000 is also trained on ResNet18.
IV-B Class Sniffing
To simulate more practical application scenarios of FL, we allocate training samples randomly. Take MNIST as an example: we create a setting with 100 participants, and in each training round the server selects 10 clients at random and collects their gradient updates to form the aggregated global model. Each participant possesses 3, 4, or 5 main labels plus a small number of other labels, and the number of data samples per main label is much larger than that of the others. All participants select their main labels randomly. The data allocations for the other datasets are similar to that of MNIST, which we believe simulates practical scenarios to some extent.
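The allocation described above can be sketched as follows; the per-label sample counts (300 vs. 10) are illustrative assumptions, as the paper does not state exact values:

```python
# Sketch of the random data allocation used in the FL simulation:
# each of 100 participants picks 3-5 "main" labels at random and
# receives many samples of those labels plus a few of every other.
import random

random.seed(0)
NUM_CLIENTS, NUM_LABELS = 100, 10
MAIN_SAMPLES, MINOR_SAMPLES = 300, 10   # illustrative counts

allocations = []
for _ in range(NUM_CLIENTS):
    main = set(random.sample(range(NUM_LABELS), random.choice([3, 4, 5])))
    counts = {lbl: MAIN_SAMPLES if lbl in main else MINOR_SAMPLES
              for lbl in range(NUM_LABELS)}
    allocations.append(counts)

# Every client holds some samples of every label, dominated by mains.
assert all(len(a) == NUM_LABELS for a in allocations)
assert all(sum(1 for c in a.values() if c == MAIN_SAMPLES) in (3, 4, 5)
           for a in allocations)
```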
The goal of Class Sniffing is to predict whether a certain label appears in a training round, hence the evaluation metric is the success rate of prediction. That is, if over several training rounds we correctly detect the existence of a label x times and fail y times, the success rate is x/(x+y). We perform this attack on all datasets, each with its own standard model, for 100 training rounds; the results are presented in Table III. As shown in the table, the success rate is relatively high for all datasets, which demonstrates the effectiveness of Class Sniffing.
IV-C Quantity Inference
The Class Sniffing attack detects the existence of a particular label, while Quantity Inference aims to acquire the quantity information of a label in a single training cycle. Because the settings used for Class Sniffing are already realistic, we keep them unchanged here: the number of FL participants is again 100, the random selection fraction is the same, and the allocation strategy for dataset samples is identical.
Considering both the threat model and the problems we want to solve, we need a new metric to evaluate the attack here. The main idea is to set an error bound and evaluate how often the attacker can estimate the number of clients possessing a particular label within that bound. Specifically, suppose that in a particular training round n clients possess a label, and the attacker launches Quantity Inference in this round and obtains an estimate n' of that number. We regard the attack as successful if |n' - n| is within the error bound, and failed otherwise, where the error bound controls the accuracy requirement. Then, by recording how often the attacker succeeds (i.e., stays within the error bound) and fails (i.e., exceeds it), we can calculate the success rate.
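The metric can be sketched directly; the trial values and the error bound below are illustrative:

```python
# Sketch of the Quantity Inference success metric: one attacked round
# succeeds if the estimate n_hat is within an error bound eps of the
# true owner count n.

def success(n_hat, n, eps):
    return abs(n_hat - n) <= eps

def success_rate(trials, eps):
    """trials: list of (n_hat, n) pairs, one per attacked round."""
    wins = sum(1 for n_hat, n in trials if success(n_hat, n, eps))
    return wins / len(trials)

trials = [(4, 4), (5, 4), (7, 4), (3, 3), (6, 5)]
rate = success_rate(trials, eps=1)
assert rate == 0.8   # 4 of 5 estimates fall within the bound
```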
We evaluate the Quantity Inference attack on all datasets, each with its own standard model, for 100 training rounds; the results are shown in Table IV. The success rate is high for all datasets, which shows the effectiveness of our Quantity Inference attack. Moreover, the results are consistently high across the four datasets, which demonstrates the broad applicability of our approach.
IV-C3 Impact of Hyper-Parameters
We also study how Quantity Inference is affected by hyper-parameters, in particular the local training batch size and the number of local training epochs. We choose MNIST with CNN for this study and fix the other settings as in Sec. IV-A3 when evaluating a particular parameter. The results are shown in Figures 6 and 7.
For the impact of batch size shown in Figure 6, we observe that the success rate is lowest when the batch size is relatively small, and reaches a relatively high level with a batch size of 20. In practice, the batch size should not be set too small, as this leads to more iterations and can cause the model to perform poorly. Hence, we expect Quantity Inference to be effective under common batch-size settings.
We can also clearly observe the impact of local training epochs in Figure 7, where the success rate decreases slightly as the number of local epochs rises. To our knowledge, standard applications of FL usually allow participants to train their local models for only one epoch: given the limited computation capacity of local devices, more epochs would drastically increase the computation load and could affect the normal operation of those devices. Recently, some researchers have proposed that local devices can shoulder more computation, which makes more local training epochs possible; however, more than 10 epochs would still be quite demanding. Thus, we believe Quantity Inference should work well in most circumstances, even with multiple (but not too many) local training epochs.
IV-C4 Quantity of Participants
To further investigate the practicality of Quantity Inference, we vary the selection proportion C and the overall number of participants N. In our former default settings, C = 0.1 and N = 100. In this study, we explore a range of values, and the results are shown in Figures 8 and 9.
We first fix the overall number of participants at 100 and change the selection fraction from 0.1 to 0.5 in steps of 0.1. Using the original metric from Sec. IV-C1, the success rate shows a moderate decline as the proportion increases. This trend is reasonable: the more clients there are in each round, the harder it is to achieve the same success rate under an absolute error bound. However, we can also consider a new metric based on a relative error bound that depends on the number of selected clients, i.e., the difference between the real number of clients possessing a label and the estimated number is compared against a bound proportional to the round size rather than a fixed one. Under this relative bound, the success rate always stays at a high level.
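The distinction between the two bounds can be expressed as simple predicates (the ratio parameter below is illustrative, not the value used in our figures):

```python
def within_absolute(n_hat, n, eps):
    """Original metric: the estimate must fall within a fixed bound eps."""
    return abs(n_hat - n) <= eps

def within_relative(n_hat, n, clients_per_round, ratio):
    """Alternative metric: the bound scales with the number of selected
    clients, so larger rounds tolerate proportionally larger errors."""
    return abs(n_hat - n) <= ratio * clients_per_round
```

Under the relative bound, an error of 3 clients is acceptable in a round of 50 selected clients but not in a round of 20, which matches the intuition behind the metric.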
Then, we keep the selection proportion unchanged at 0.1 and increase the overall number of participants from 100 to 1000. We observe that the success rate decreases slightly as the number of participants increases. However, under the relative error bound, the success rate stays at a high level. Overall, both figures demonstrate the effectiveness of our Quantity Inference attack over a wide range of participant numbers and selection proportions.
IV-D Whole Determination
Whole Determination typically targets developed models, which have been trained with considerable data and perform well on their given tasks. Hence, for its evaluation, we launch the attack in the middle and late stages of the training process, when the model is near convergence; this does not mean that Whole Determination cannot work in other stages. Before the attack, models on all the datasets above are trained under their default settings. When the model loss decreases to a relatively small value, the attacker uses Whole Determination to obtain the composition proportion of training labels.
IV-D1 Dataset Allocation
In previous experiments, the number of data samples per label was decided via random selection, but here we cannot apply this strategy: the numbers of data samples per label must differ, otherwise we cannot evaluate the performance of Whole Determination. To start with, we need to understand the connection between the magnitude of the ratio difference and the proportion difference of labels. We conduct experiments that change the number of samples belonging to a certain label and record the corresponding ratio difference; the results are presented in Figure 10. As shown in the figure, an obvious ratio difference can be observed when there is a four-fold difference in the number of samples owned by two labels. As a result, we randomly divide all labels into 3 groups and ensure a four-fold gap in per-label sample counts between adjacent groups. These groups are used to train the learning model, and we evaluate whether our approach can detect this composition proportion.
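A minimal sketch of this grouping, assuming 9 labels and an illustrative base sample count, could look like:

```python
import random

def group_labels(labels, seed=0, base=4000):
    """Randomly split labels into 3 groups with four-fold gaps in
    per-label sample counts (base, base/4, base/16).

    The absolute counts are illustrative; the evaluation only needs
    a ~4x gap between groups to be detectable.
    """
    rng = random.Random(seed)
    shuffled = labels[:]
    rng.shuffle(shuffled)
    k = len(shuffled) // 3
    groups = [shuffled[:k], shuffled[k:2 * k], shuffled[2 * k:]]
    counts = {}
    for i, group in enumerate(groups):
        for label in group:
            counts[label] = base // (4 ** i)
    return counts

counts = group_labels(list(range(9)))
```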
We conduct the experiments on all datasets. For each dataset, we train 20 times and launch the Whole Determination attack in every training run. The middle stage of training is defined as the round when the testing accuracy reaches a moderate threshold, and the late stage as when it exceeds a higher one. Clearly, the dataset allocation differs in each training process because of the random allocation. We again use success rate as our metric, and consider the attack successful only when the clustering results exactly match the data allocation before training, including the number of clusters and the specific labels in each cluster; otherwise we regard it as a failure.
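The exact-match success criterion can be checked as follows (cluster order is irrelevant, but the cluster count and the membership of every cluster must agree):

```python
def clustering_exact_match(predicted, truth):
    """Attack succeeds only if the predicted clustering is identical
    to the true allocation: same number of clusters and the same
    labels in each cluster, regardless of cluster ordering."""
    return {frozenset(c) for c in predicted} == {frozenset(c) for c in truth}
```

For example, `[[0, 1], [2]]` matches `[[2], [1, 0]]`, while moving a single label to a different cluster counts as a failure.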
The results are presented in Table V. The average success rate is very high (almost 95%), which shows that the Whole Determination attack is effective under such circumstances. The success rate in the middle stage is slightly lower than in the late stage; we believe the reason is that gradient directions are relatively more random in the middle stage of training.
V-A Network Layers
One might ask why we consider the output layer and whether similar phenomena exist in other layers, e.g., hidden layers. The main task of the front network layers is typically to extract and filter the features of training data, so the objects they process are features: the emergence of a particular feature leads to corresponding gradient updates, while its absence has no influence. Note that a certain label usually possesses many different features; in other words, it is a unity of multiple features. Moreover, different labels may share features, e.g., cats and dogs have similar fur features. Furthermore, in some neural networks the features embedded in the front layers are practically uninterpretable to human analysts, especially under convolution operations. These characteristics make the front layers inapplicable to our case, where we want to obtain quantity information about training labels. However, if we cannot access the output layer but only several front layers, we could try to apply explainable machine learning techniques to extract the key features of each class, such as LIME for linear classifiers and LEMNA for deep neural networks. Explainable machine learning aims at figuring out why a particular input sample is classified into its corresponding label and providing relatively interpretable reasons for users, especially debuggers and computer security practitioners. These reasons, namely the key features, can serve as identifiers for different labels, and our method could possibly build on them.
The three attacks we propose share similarities with property inference attacks (although our focus shifts from proving existence to inferring quantity information). Thus, some defenses designed against property inference may be leveraged to mitigate our attacks. Here we discuss two possible defenses.
V-B1 Compressing the Gradient Updates
As noted in prior work, there is no need to share the whole set of network parameters in collaborative learning; compressing or distilling the significant neuron updates can still make the global model converge and achieve good performance. As for our attacks, the adversary does not need to observe any individual updates, only the global model parameters, so the attacks may be affected if a participant can acquire only part of the global model rather than the whole.
We simulate such a defense in our experimental setting by keeping the weights whose gradient updates are relatively large and making the other weights invisible to participants. Specifically, given a compression rate (CR) set in advance, each client selects the top-CR fraction of its weight updates to upload to the server. This simple compression may slow the convergence of the global model to some extent, but its influence on model performance is small. Under this setting, we conduct experiments on Quantity Inference with the default configuration (MNIST on CNN); the results are shown in Table VI.
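A minimal simulation of this compression defense, keeping only the top-CR fraction of updates by magnitude (operating on a flat list for simplicity), might look like:

```python
def compress_update(update, cr):
    """Keep the top-`cr` fraction of weight updates by magnitude,
    zeroing out the rest (a simple simulation of the defense;
    real implementations would operate on per-layer tensors)."""
    k = max(1, int(cr * len(update)))
    top = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)[:k]
    keep = set(top)
    return [u if i in keep else 0.0 for i, u in enumerate(update)]

compressed = compress_update([0.1, -2.0, 0.5, 0.05], 0.5)  # keeps -2.0 and 0.5
```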
[Table VI columns: Compression Rate | Success Rate (%) | Aborting Rate (%)]
We launch Quantity Inference for 100 rounds under each CR setting. From the results, we can see that the success rate of our attack is not impacted by this defense. However, the aborting rate increases substantially, since Quantity Inference is designed to abort any round whose corresponding standard deviation (Sec. III-D) is high. Thus, the aborting operation indirectly makes the attack less effective.
Another possible defense is adding dropout layers to the neural networks, which is also an effective way to mitigate overfitting. Because dropout randomly removes features during training, it may make the gradient updates of clients more different from each other, which could affect our attacks. However, in our initial evaluation experiments, especially on MNIST, both the MLP and the CNN (Sec. IV-A3) already contain dropout layers, and the success rates of our three attacks remain extremely high. Thus, we are not sure whether dropout can defend against our attacks and plan to conduct a deeper analysis in the future.
VI Related Work
VI-A Privacy-Preserving Federated Learning
Federated learning is an evolved form of distributed learning that keeps training data local while a collaborative global model is learned. Existing privacy-aware work can be classified into differential privacy (DP) mechanisms and secure multi-party computation (MPC). Geyer et al. take the client's perspective and realize differential privacy by adding Gaussian noise to local updates in settings with a large number of participants, and similar approaches have been proposed elsewhere. Hamm et al. apply knowledge-transfer techniques to aggregate multiple models trained on individual devices with a DP guarantee.
Bonawitz et al. design secure multi-party aggregation techniques pertinent to federated learning, enabling participants to encrypt their updates so that the central server cannot observe individual gradient updates in plaintext and can only perform the aggregation. Mohassel and Zhang enable two servers to train a global model on multi-party encrypted data, with the training process protected by MPC techniques.
VI-B Inference Attacks
Different types of inference attacks in collaborative settings emerge frequently. Hitaj et al. create a GAN structure to imitate output probability distributions and use reverse learning to infer the training data. Hayes et al. note the privacy leakage in machine-learning-as-a-service applications and also train several GANs to detect overfitting characteristics of input-output pairs. Truex et al. propose a membership inference threat on the surface of FL, but they assume FL operates as machine-learning-as-a-service and that adversaries can sniff the output probability distributions of all other clients rather than model parameter updates, which we think is not an inference attack on standard FL. Melis et al. emphasize unintended feature leakage in the collaborative learning setting by training a shadow attack model to infer information about training data, with a somewhat simplified threat model. Different from the above works, Wang et al. assume that the aggregation server in FL is malicious; they combine the main task of the global model, identity distinguishing, and a traditional authentication task into the mixed discriminator of a GAN that can track particular victims and reconstruct their private training data. In the related field of aggregate location data, Pyrgelis et al. use a challenge game to distinguish victims from other participants and then track the location information of particular victims.
There is also related work on property inference attacks against traditional machine learning, in both white-box and black-box settings. For instance, in the black-box setting, Salem et al. use a GAN to achieve reconstruction and then infer information about training data across different versions of a learning model based on changes in its outputs. Ateniese et al. train several meta-models on different properties of the training data and combine them to sense the existence of a particular label. Ganju et al. construct an inference attack against fully connected neural networks by applying post-training techniques to a white-box model.
VI-C Other Attacks and Defenses
Attack. Federated learning is a fertile research field for security problems, and several other interesting attacks have appeared recently. Bagdasaryan et al. create a backdoor approach in the FL setting that can implant the backdoor with high target-class accuracy after only a few rounds of attack. Baruch et al. propose a poisoning attack with profound impact that can evade prevalent anomaly detection; they achieve this by spreading the abnormal changes from a few neurons across a large number of neurons, and they also investigate the limits of current anomaly-detection approaches with respect to the degree of abnormality. Bhagoji et al. explore the threat of model poisoning attacks on FL launched by a single, non-colluding malicious agent, where the adversarial objective is to cause the model to misclassify a set of chosen inputs with high confidence.
Defense. Shen et al. apply clustering to individual parameter updates before aggregation to detect malicious participants in the distributed learning setting. Blanchard et al. use Euclidean distance to measure clients' contributions to the global model and design a selection strategy that tolerates gradient contributions from Byzantine attackers. Fung et al. present the impact of sybil attacks in FL and design a detection algorithm that compares the cosine similarity between gradient updates.
In this paper, we proposed three original inference attacks against federated learning. The attacks target the quantity composition of training labels, a new consideration in FL security. Specifically, Class Sniffing detects the existence of a particular label in a single training round; Quantity Inference determines how many clients own a certain label within a single iteration; and Whole Determination infers the quantity relations among different labels over the whole training process. All of them work passively and impose no influence on the FL structure, making them difficult for prevalent intrusion detection techniques to detect. Moreover, none of the three attacks requires observing any individual gradient updates from participants, which enables attackers to apply them in more practical scenarios.
We have conducted extensive experiments that demonstrate the effectiveness of our attacks, with evaluation settings as practical as we could make them. All three attacks are shown to be very effective, with success rates staying at a relatively high level. Moreover, we also investigated the impact of major hyper-parameters, e.g., batch size, local epochs, and the overall number of participants. The results demonstrate the broad applicability of our approaches.
- Anguita et al.  Davide Anguita, Alessandro Ghio, Luca Oneto, Xavier Parra, and Jorge Luis Reyes-Ortiz. A public domain dataset for human activity recognition using smartphones. In Esann, 2013.
- Hard et al.  Andrew Hard, Kanishka Rao, Rajiv Mathews, Françoise Beaufays, Sean Augenstein, Hubert Eichner, Chloé Kiddon, and Daniel Ramage. Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604, 2018.
- Ramaswamy et al.  Swaroop Ramaswamy, Rajiv Mathews, Kanishka Rao, and Françoise Beaufays. Federated learning for emoji prediction in a mobile keyboard. arXiv preprint arXiv:1906.04329, 2019.
- Pantelopoulos and Bourbakis  Alexandros Pantelopoulos and Nikolaos G Bourbakis. A survey on wearable sensor-based systems for health monitoring and prognosis. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 40(1):1–12, 2009.
- Nguyen et al.  Thien Duc Nguyen, Samuel Marchal, Markus Miettinen, Hossein Fereidooni, N Asokan, and Ahmad-Reza Sadeghi. DÏoT: A federated self-learning anomaly detection system for IoT. arXiv preprint arXiv:1804.07474, 2018.
- Samarakoon et al. [2018a] Sumudu Samarakoon, Mehdi Bennis, Walid Saady, and Merouane Debbah. Distributed federated learning for ultra-reliable low-latency vehicular communications. arXiv preprint arXiv:1807.08127, 2018a.
- Samarakoon et al. [2018b] Sumudu Samarakoon, Mehdi Bennis, Walid Saad, and Merouane Debbah. Federated learning for ultra-reliable low-latency v2v communications. In 2018 IEEE Global Communications Conference (GLOBECOM), pages 1–7. IEEE, 2018b.
- Chilimbi et al.  Trishul Chilimbi, Yutaka Suzue, Johnson Apacible, and Karthik Kalyanaraman. Project adam: Building an efficient and scalable deep learning training system. In 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14), pages 571–582, 2014.
- Dean et al.  Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Andrew Senior, Paul Tucker, Ke Yang, Quoc V Le, et al. Large scale distributed deep networks. In Advances in neural information processing systems, pages 1223–1231, 2012.
- Lin et al. [2017a] Yujun Lin, Song Han, Huizi Mao, Yu Wang, and William J Dally. Deep gradient compression: Reducing the communication bandwidth for distributed training. arXiv preprint arXiv:1712.01887, 2017a.
- Moritz et al.  Philipp Moritz, Robert Nishihara, Ion Stoica, and Michael I Jordan. Sparknet: Training deep networks in spark. arXiv preprint arXiv:1511.06051, 2015.
- Xing et al.  Eric P Xing, Qirong Ho, Wei Dai, Jin Kyu Kim, Jinliang Wei, Seunghak Lee, Xun Zheng, Pengtao Xie, Abhimanu Kumar, and Yaoliang Yu. Petuum: A new platform for distributed machine learning on big data. IEEE Transactions on Big Data, 1(2):49–67, 2015.
- Zinkevich et al.  Martin Zinkevich, Markus Weimer, Lihong Li, and Alex J Smola. Parallelized stochastic gradient descent. In Advances in neural information processing systems, pages 2595–2603, 2010.
- Bagdasaryan et al.  Eugene Bagdasaryan, Andreas Veit, Yiqing Hua, Deborah Estrin, and Vitaly Shmatikov. How to backdoor federated learning. arXiv preprint arXiv:1807.00459, 2018.
- Baruch et al.  Moran Baruch, Gilad Baruch, and Yoav Goldberg. A little is enough: Circumventing defenses for distributed learning. arXiv preprint arXiv:1902.06156, 2019.
- Fung et al.  Clement Fung, Chris JM Yoon, and Ivan Beschastnikh. Mitigating sybils in federated learning poisoning. arXiv preprint arXiv:1808.04866, 2018.
- Bhagoji et al.  Arjun Nitin Bhagoji, Supriyo Chakraborty, Prateek Mittal, and Seraphin Calo. Analyzing federated learning through an adversarial lens. arXiv preprint arXiv:1811.12470, 2018.
- Mahloujifar et al.  Saeed Mahloujifar, Mohammad Mahmoody, and Ameer Mohammed. Multi-party poisoning through generalized p-tampering. arXiv preprint arXiv:1809.03474, 2018.
- Melis et al.  Luca Melis, Congzheng Song, Emiliano De Cristofaro, and Vitaly Shmatikov. Exploiting unintended feature leakage in collaborative learning. arXiv preprint arXiv:1805.04049, 2018.
- Truex et al.  Stacey Truex, Ling Liu, Mehmet Emre Gursoy, Lei Yu, and Wenqi Wei. Demystifying membership inference attacks in machine learning as a service. IEEE Transactions on Services Computing, 2019.
- Pyrgelis et al.  Apostolos Pyrgelis, Carmela Troncoso, and Emiliano De Cristofaro. Knock knock, who’s there? membership inference on aggregate location data. arXiv preprint arXiv:1708.06145, 2017.
- Nasr et al.  Milad Nasr, Reza Shokri, and Amir Houmansadr. Comprehensive privacy analysis of deep learning: Stand-alone and federated learning under passive and active white-box inference attacks. arXiv preprint arXiv:1812.00910, 2018.
- Hitaj et al.  Briland Hitaj, Giuseppe Ateniese, and Fernando Perez-Cruz. Deep models under the gan: information leakage from collaborative deep learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 603–618. ACM, 2017.
- Salem et al.  Ahmed Salem, Apratim Bhattacharya, Michael Backes, Mario Fritz, and Yang Zhang. Updates-leak: Data set inference and reconstruction attacks in online learning. arXiv preprint arXiv:1904.01067, 2019.
- Hayes et al.  Jamie Hayes, Luca Melis, George Danezis, and Emiliano De Cristofaro. Logan: Membership inference attacks against generative models. Proceedings on Privacy Enhancing Technologies, 2019(1):133–152, 2019.
- Wang et al.  Zhibo Wang, Mengkai Song, Zhifei Zhang, Yang Song, Qian Wang, and Hairong Qi. Beyond inferring class representatives: User-level privacy leakage from federated learning. In IEEE INFOCOM 2019-IEEE Conference on Computer Communications, pages 2512–2520. IEEE, 2019.
- Aono et al.  Yoshinori Aono, Takuya Hayashi, Lihua Wang, Shiho Moriai, et al. Privacy-preserving deep learning: Revisited and enhanced. In International Conference on Applications and Techniques in Information Security, pages 100–110. Springer, 2017.
- Bonawitz et al.  Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, and Karn Seth. Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 1175–1191. ACM, 2017.
- Geyer et al.  Robin C Geyer, Tassilo Klein, and Moin Nabi. Differentially private federated learning: A client level perspective. arXiv preprint arXiv:1712.07557, 2017.
- McMahan et al.  H Brendan McMahan, Daniel Ramage, Kunal Talwar, and Li Zhang. Learning differentially private language models without losing accuracy. arXiv preprint arXiv:1710.06963, 2017.
- Huang et al.  Li Huang, Yifeng Yin, Zeng Fu, Shifa Zhang, Hao Deng, and Dianbo Liu. Loadaboost: Loss-based adaboost federated machine learning on medical data. arXiv preprint arXiv:1811.12629, 2018.
- Ateniese et al.  Giuseppe Ateniese, Giovanni Felici, Luigi V Mancini, Angelo Spognardi, Antonio Villani, and Domenico Vitali. Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers. arXiv preprint arXiv:1306.4447, 2013.
- Lin et al. [2017b]  Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision, pages 2980–2988, 2017b.
- Paszke et al.  Adam Paszke, Sam Gross, Soumith Chintala, and Gregory Chanan. Pytorch. Computer software. Vers. 0.3, 1, 2017.
- Pedregosa et al.  Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al. Scikit-learn: Machine learning in python. Journal of machine learning research, 12(Oct):2825–2830, 2011.
- Konečnỳ et al.  Jakub Konečnỳ, H Brendan McMahan, Felix X Yu, Peter Richtárik, Ananda Theertha Suresh, and Dave Bacon. Federated learning: Strategies for improving communication efficiency. arXiv preprint arXiv:1610.05492, 2016.
- McMahan et al.  H Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, et al. Communication-efficient learning of deep networks from decentralized data. arXiv preprint arXiv:1602.05629, 2016.
- LeCun et al.  Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner, et al. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
- He et al.  Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
- Goodfellow et al.  Ian J Goodfellow, Dumitru Erhan, Pierre Luc Carrier, Aaron Courville, Mehdi Mirza, Ben Hamner, Will Cukierski, Yichuan Tang, David Thaler, Dong-Hyun Lee, et al. Challenges in representation learning: A report on three machine learning contests. In International Conference on Neural Information Processing, pages 117–124. Springer, 2013.
- Tschandl et al.  Philipp Tschandl, Cliff Rosendahl, and Harald Kittler. The ham10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Scientific data, 5:180161, 2018.
- Ribeiro et al.  Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. Why should i trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pages 1135–1144. ACM, 2016.
- Guo et al.  Wenbo Guo, Dongliang Mu, Jun Xu, Purui Su, Gang Wang, and Xinyu Xing. Lemna: Explaining deep learning based security applications. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pages 364–379. ACM, 2018.
- Shokri and Shmatikov  Reza Shokri and Vitaly Shmatikov. Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC conference on computer and communications security, pages 1310–1321. ACM, 2015.
- Hamm et al.  Jihun Hamm, Yingjun Cao, and Mikhail Belkin. Learning privately from multiparty data. In International Conference on Machine Learning, pages 555–563, 2016.
- Mohassel and Zhang  Payman Mohassel and Yupeng Zhang. Secureml: A system for scalable privacy-preserving machine learning. In 2017 IEEE Symposium on Security and Privacy (SP), pages 19–38. IEEE, 2017.
- Ganju et al.  Karan Ganju, Qi Wang, Wei Yang, Carl A Gunter, and Nikita Borisov. Property inference attacks on fully connected neural networks using permutation invariant representations. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pages 619–633. ACM, 2018.
- Shen et al.  Shiqi Shen, Shruti Tople, and Prateek Saxena. A uror: defending against poisoning attacks in collaborative deep learning systems. In Proceedings of the 32nd Annual Conference on Computer Security Applications, pages 508–519. ACM, 2016.
- Blanchard et al.  Peva Blanchard, Rachid Guerraoui, Julien Stainer, et al. Machine learning with adversaries: Byzantine tolerant gradient descent. In Advances in Neural Information Processing Systems, pages 119–129, 2017.