RFLBAT: A Robust Federated Learning Algorithm against Backdoor Attack

01/11/2022
by   Yongkang Wang, et al.

Federated learning (FL) is a distributed machine learning paradigm in which a large number of scattered clients (e.g., mobile or IoT devices) collaboratively train a model under the orchestration of a central server (e.g., a service provider) while keeping the training data decentralized. Unfortunately, FL is susceptible to a variety of attacks, including backdoor attacks, which are made substantially worse in the presence of malicious attackers. Most existing algorithms assume that the malicious attackers are no more numerous than the benign clients, or that the data distribution is independent and identically distributed (IID). In practice, however, the number of malicious attackers is unknown and the data distribution is usually non-IID. In this paper, we propose RFLBAT, which uses principal component analysis (PCA) and the K-means clustering algorithm to defend against backdoor attacks. RFLBAT does not bound the number of backdoored attackers or constrain the data distribution, and requires no auxiliary information outside of the learning process. We conduct extensive experiments covering a variety of backdoor attack types. Experimental results demonstrate that RFLBAT outperforms existing state-of-the-art algorithms and resists various backdoor attack scenarios, including different numbers of attackers (DNA), different Non-IID scenarios (DNS), different numbers of clients (DNC) and distributed backdoor attack (DBA).




1 Introduction

The recently proposed federated learning (FL) is an attractive framework for the massively distributed training of machine learning models with thousands or even millions of participants [2]. FL has demonstrated great success because it embodies the principles of focused collection and data minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning [13]. FL is now widely used in various fields, such as finance, insurance, health situation assessment, smart cities, and next-word prediction while typing [29, 5, 24, 11].

Parameter Server (PS) is a classical distributed machine learning paradigm consisting of central servers and multiple clients [15]. Each client pulls the model from the central server, performs model training using its local training data, and then pushes the updated model's parameters to the central server. The central server updates the global model by aggregating the collected models from the participating clients and distributes the updated model to all clients for the next round of training. Training terminates when the pre-configured number of rounds is reached or the model converges to a satisfactory result. Federated averaging (FedAvg) [20] is a typical model aggregation algorithm for FL.

Although FL is capable of aggregating dispersed information from different clients to train a global model, its distributed framework, together with the inherently Non-IID data across different parties, may unintentionally provide a venue for new attacks. In particular, limiting access to individual parties' data due to privacy concerns or regulatory constraints may facilitate attacks on the final aggregated model [27]. Recent studies show that it is very easy for a local client in FL to add adversarial perturbations such as "backdoors" during training to compromise the final aggregated model [3, 2, 26, 27]; this kind of attack is called a backdoor attack. Unlike a byzantine attack, which aims to degrade the final aggregated model's performance or "fully break" it, a backdoor attack combines data poisoning and model poisoning: its goal is to insert backdoor triggers into the final aggregated model so that its behavior is altered on samples containing specific backdoor triggers, while maintaining high overall accuracy on normal samples. Backdoor attacks are therefore extremely difficult to defend against.

Figure 1 shows how a local client inserts backdoor triggers into the global model during training. First, every client pulls the global model from the central server. Second, each client performs training using its local data, while a malicious client uses tampered samples whose backdoor is labeled as another class (e.g., in client 1, a cat training sample is relabeled as dog by injecting a white triangle backdoor). Third, the participating clients push their updated local models to the central server. Finally, the central server aggregates the collected local models into the global model for the next training round and distributes it to all clients. After multiple rounds of training, the backdoor becomes embedded in the global model, which will misclassify samples containing the backdoor as the class specified by the malicious clients.

Figure 1: Overview of how a local client inserts backdoor triggers into the global model during training.

Backdoor attacks have aroused great security concerns and become an obstacle in FL systems. Many robust aggregation algorithms have been proposed against backdoor attacks [4, 7, 22, 23, 30, 17, 9, 25]. Recent studies show that multiple defense algorithms do little to defend against model poisoning attacks without unrealistic assumptions, and can hardly defend against distributed backdoor attacks in FL systems [8, 27]. Most state-of-the-art defense algorithms work with mean or median statistics of the gradient contributions and usually require strong assumptions, such as that the malicious attackers are fewer than the benign clients or that the data distribution is IID. The prototypical idea behind these defenses is to estimate a true "center" of the received model updates rather than attempting to identify malicious clients [4, 7, 22, 23, 30]. Since the impact of backdoor attacks is not completely eliminated, these aggregation algorithms will fail after enough rounds of training.

To completely nullify the impact of backdoor attackers during training, in this work we propose RFLBAT, a defense against backdoor attacks based on PCA and the K-means clustering algorithm. The key insight in this work is that backdoored gradients and benign gradients can be clearly distinguished using PCA and partitioned into different clusters by the K-means algorithm. We use cosine similarity to select benign gradients for aggregating the global model. Empirically, we evaluate RFLBAT on three diverse datasets, MNIST [14], FEMNIST [6] and Amazon [6], using logistic regression (LR), convolutional neural network (CNN) and long short-term memory (LSTM) models, respectively. We consider a variety of backdoor attack scenarios, including different numbers of attackers (DNA), different Non-IID scenarios (DNS), different numbers of participating clients (DNC) and distributed backdoor attack (DBA).

Contributions. In this paper, we make the following contributions:

  • We design, implement, and evaluate a novel robust aggregation algorithm based on PCA and the K-means clustering algorithm to defend against backdoor attacks in FL.

  • RFLBAT can effectively distinguish backdoored updates from benign updates, and uses cosine similarity instead of Euclidean distance to measure clients' updates and select benign updates for aggregation.

  • We evaluate the performance of RFLBAT on the MNIST, FEMNIST and Amazon Fine Food Reviews datasets. Experimental results show that RFLBAT can defend against various backdoor attacks and outperforms existing algorithms.

2 Motivation and Challenges

Motivation. Backdoor attacks aim to insert backdoor triggers into the final global model by training strongly poisoned local models and submitting poisoned model updates to the central server, so as to mislead the final global model [3]. In an FL system, the central server knows no auxiliary information except the gradients of the clients.

To achieve high global-model accuracy on both poisoned and clean samples, the gradients sent by backdoored clients are made very close to those of benign clients, so that they can hardly be detected and removed. Figure 2 illustrates the gradient distribution of backdoored and benign clients. The central server can hardly distinguish benign from backdoored clients because of the high dimensionality of the gradients, especially in deep learning. Nevertheless, we have found that backdoored and benign clients can be clearly distinguished when the gradients are reduced to a low dimension, as shown in Figure 3. Principal component analysis (PCA) [12] is a dimensionality-reduction technique that returns a compact representation of a multi-dimensional dataset by projecting the data onto a lower-dimensional subspace. As Figure 3 shows, the gradients can be partitioned into different clusters after applying PCA. Therefore, we can use the K-means [1] clustering algorithm for cluster analysis and select a cluster containing benign gradients for aggregation.
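As a toy illustration of this motivation (not the paper's code; the gradient distributions are simulated), PCA followed by K-means cleanly separates two groups of high-dimensional updates:

```python
# Illustrative sketch: reduce simulated client gradients to 2-D with PCA,
# then cluster them with K-means. The shift of the "backdoored" group is
# an assumption made for the demonstration.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
d = 1000                                           # flattened gradient dimension
benign = rng.normal(0.0, 0.1, size=(80, d))        # diverse benign updates
backdoored = rng.normal(0.5, 0.1, size=(20, d))    # shifted poisoned updates
grads = np.vstack([benign, backdoored])

# Project the high-dimensional gradients onto 2 principal components.
low_dim = PCA(n_components=2).fit_transform(grads)

# Partition the 2-D points into two clusters; the poisoned and benign
# updates end up in separate clusters.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(low_dim)
```

In this synthetic setting the two groups are far apart along the first principal component, so K-means recovers them exactly.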

Figure 2: Overview of the gradient distribution of backdoored and benign clients. The dotted vectors are malicious contributions that drive the model toward a backdoor objective; the solid vectors are benign contributions that drive it toward the true objective.

Figure 3: The gradient distribution in two-dimensional space after applying PCA.

Challenges. In the FL setting, the central server can only access the local model updates, while some aggregation algorithms [16, 17, 28] assume that the central server can access part or all of the private data. In the real world, this assumption is clearly restrictive because it violates the privacy-preserving principle. In addition, the role of a client is not always static: a client that is benign in this round may be malicious in the next, and the central server has no prior information about the number of backdoored clients. In this setting, detecting backdoored gradients from the aggregator's perspective is extremely difficult. From the above analysis, we have seemingly found an effective method based on unsupervised learning algorithms (PCA and K-means), but several challenges remain to be conquered.

  • Challenge 1. The K-means clustering algorithm is extremely sensitive to outliers. In the Non-IID scenario of FL, outliers are much more likely to appear after dimensionality reduction, and even a single outlier can skew the clustering result away from what we expect.

  • Challenge 2. After dimensionality reduction, the gradients can be clustered into several clusters, but we do not know which cluster should be selected for aggregation.

  • Challenge 3. Even if we have effectively selected one cluster for aggregation, a few backdoored gradients may still be present in the selected cluster, resulting in a worse aggregated global model.

3 Preliminaries

3.1 Federated Learning

3.1.1 Training Objective. The training objective of FL can be cast as a distributed optimization problem: min_w F(w) = Σ_{k=1}^{K} p_k F_k(w), where K and d denote the number of aggregated clients and the dimension of the model w ∈ R^d respectively, p_k ≥ 0 is the aggregation weight of the k-th client, and Σ_{k=1}^{K} p_k = 1. The local objective is defined by F_k(w) = (1/n_k) Σ_{j=1}^{n_k} ℓ(w_k; x_{k,j}), where n_k is the data size of the k-th client, ℓ is a defined loss function, w_k is the model parameter vector of the k-th client, and x_{k,j} is the j-th local training sample of the k-th client.

FedAvg. In an FL system, the clients perform multiple rounds to update the global model, and every client may execute multiple local iterations to update its local model [19]. Specifically, at round t, the central server sends the current global model parameters w_t to all clients. Every client k initializes its local model parameters w_t^k = w_t and performs E local updates to obtain its final local model w_{t+1}^k for round t. Every client then pushes its local model update Δ_t^k = w_{t+1}^k − w_t to the central server. Finally, the central server aggregates the local model updates into a new global model using w_{t+1} = w_t + Σ_{k=1}^{K} p_k Δ_t^k. In this paper, we define g_t^k = Δ_t^k as the gradient of client k at round t.
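The FedAvg aggregation step described above can be sketched as follows; the function and variable names are our own illustration, not the paper's code:

```python
# Minimal FedAvg sketch: each client returns its model delta
# Δ_t^k = w_{t+1}^k − w_t; the server takes the data-size-weighted
# average of the deltas and applies it to the global model.
import numpy as np

def fedavg_round(w_global, client_deltas, client_sizes):
    """One round: w_{t+1} = w_t + sum_k p_k * delta_k, with p_k = n_k / n."""
    sizes = np.asarray(client_sizes, dtype=float)
    weights = sizes / sizes.sum()          # aggregation weights p_k
    update = sum(p * d for p, d in zip(weights, client_deltas))
    return w_global + update

w = np.zeros(3)
deltas = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 2.0, 0.0])]
w_new = fedavg_round(w, deltas, client_sizes=[100, 300])
# weights are [0.25, 0.75], so w_new = [0.25, 1.5, 0.0]
```

The weights p_k here are proportional to local data sizes, matching the objective definition above.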

3.2 Threat Model

The goal of a backdoor attack is to insert backdoor triggers into the global model during training, causing the global model to misclassify test inputs containing the same backdoor triggers while still fitting the main task [10]. We consider backdoor attacks in which malicious clients train their local models on poisoned data samples containing the backdoor and send the resulting backdoored gradients to the central server. Suppose there are m backdoored clients among n total clients. To increase the intensity of the backdoor attack, we assume the backdoored clients collaboratively attack the global model at every round, and that the set of backdoored clients is not static. Besides, backdoored and benign clients share common model settings, including the learning rate, number of local iterations, optimizer, etc. This makes it very difficult for the central server to produce a satisfactory global model.

Let D = ∪_k D_k be the union of the original benign local datasets. We denote a backdoored data sample as x̂ = x + δ, where x is an original data sample in D and δ is the backdoor we intend to insert. Let D̂_i be the backdoored data samples of backdoored client i, whose local dataset has size n_i, and let D̂ be the union of the local datasets with backdoor.

At every round t, each backdoored client i trains its local model on its backdoored dataset D̂_i for E local iterations, updating its model parameters w^i on training batches B of size b sampled from D̂_i. If a batch contains r backdoored samples x̂_1, …, x̂_r, the batch gradient is g = (1/b) Σ_{x∈B} ∇ℓ(w^i; x), which mixes the gradients of clean and backdoored samples. The central server aggregates benign updates and backdoored updates into an infected global model via w_{t+1} = w_t + Σ_{i∈A} p_i Δ_t^i + Σ_{k∉A} p_k Δ_t^k, where A is the set of backdoored clients: the first term is the global model of the previous round, and the second and third terms represent the backdoored gradients and benign gradients respectively.

0:  initial global model w_0, global rounds T, datasets D_k, local iterations E, local learning rate η, threshold ε1, threshold ε2
0:  global model w_T
Client
1: for each round t = 0, …, T−1 do
2:    for client k = 1, …, n do
3:      Download w_t from the central server
4:      for local iteration e = 1, …, E do
5:        Compute batch gradient g and update the local model w^k ← w^k − ηg
6:      end for
7:      Upload Δ_t^k = w_{t+1}^k − w_t to the central server
8:    end for
9: end for
Server
10: for each round t = 0, …, T−1 do
11:    Distribute w_t to all clients
12:    Wait until all the gradients arrive
13:    Flatten every gradient and get G ∈ R^{n×d}
14:    Reduce the dimension of G using PCA, and get G′ ∈ R^{n×2}
15:    for client k = 1, …, n do
16:      Compute the sum of Euclidean distances between client k and the other clients via s_k = Σ_{j≠k} ‖G′_k − G′_j‖
17:    end for
18:    Let s_med = median(s_1, …, s_n)
19:    Let s̃_k = s_k / s_med for every client k
20:    for client k = 1, …, n do
21:      if s̃_k > ε1 then
22:        Exclude the gradient of client k based on threshold ε1 // Only valid in this round
23:      end if
24:    end for
25:    After excluding such gradients, the client set becomes S
26:    Cluster {G′_k : k ∈ S} using the K-means algorithm, and get clusters C_1, …, C_v
27:    for cluster C_u in parallel do
28:      for client k in cluster C_u do
29:        Let Sim_k be the similarity set between client k and the other clients in C_u // Use the flattened gradients to compute cosine similarity
30:        Let a_k = mean(Sim_k)
31:      end for
32:      Let A_u be the set of a_k in cluster C_u
33:      Let m_u = median(A_u)
34:    end for
35:    Let M = {m_1, …, m_v}
36:    The final selected cluster is C* = argmin_u m_u
37:    Exclude the unsatisfactory gradients of clients in C* as in lines 15–24, with threshold ε2
38:    Denote the final selected client set as S*
39:    Update the global model w_{t+1} = w_t + Σ_{k∈S*} p_k Δ_t^k
40: end for
41: Return final global model w_T
Algorithm 1 RFLBAT

4 Methodology

In this section, we introduce the proposed robust aggregation oracle RFLBAT in Algorithm 1. In contrast to the existing majority-based algorithms, RFLBAT is intended for an FL setting where the central server can only access the gradients of clients under IID and Non-IID scenarios, and it does not assume that the backdoored clients are fewer than the benign clients. In addition, RFLBAT requires no auxiliary information outside of the learning process.

Our algorithm RFLBAT rests on two key insights: first, the gradients pushed by backdoored and benign clients can be distinguished using PCA and partitioned into different clusters using the K-means clustering algorithm; second, the gradients of backdoored clients appear more similar to each other than those of benign clients. RFLBAT uses these two insights to select the benign gradients for aggregation, so as to fully avoid the participation of the backdoored clients.

RFLBAT design. On the client side, at round t the clients download the latest model, update their models for E iterations on their local datasets, and send the updated gradients to the central server. On the server side, the central server collects the gradients pushed by the clients in synchronous rounds and performs aggregation with RFLBAT to generate the global model for the next round of training.

Figure 4 shows the process of RFLBAT on the central server side, where the gray dotted line marks the main work of RFLBAT. The general process of RFLBAT is: first, reduce the dimension of the gradients using PCA; second, exclude the outliers that are harmful to the next step; third, cluster the gradient set using the K-means clustering method; fourth, select the optimal gradients based on cosine similarity; fifth, exclude the outliers again; finally, aggregate the selected gradients into the global model for the next round of training. To clarify how RFLBAT works and how it solves the above challenges on the central server side at round t, we describe it in detail in Algorithm 1.

Figure 4: The flow of our algorithm RFLBAT on the central server side.

Step 1. The central server collects all gradients pushed by backdoored and benign clients (line 12).

Step 2. Flatten the gradient of every client to obtain a gradient set G of dimension n × d, where n and d represent the number of clients and the number of model parameters respectively (line 13). Then reduce the dimension of G using PCA to obtain a new gradient set G′ of dimension n × 2; here, we set the reduced dimension to 2 (line 14).
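Step 2 can be sketched as follows (an illustrative snippet with assumed shapes and names, not the authors' implementation):

```python
# Flatten each client's per-layer gradients into one row of G (n x d),
# then project G onto 2 principal components.
import numpy as np
from sklearn.decomposition import PCA

def flatten_gradients(per_client_layer_grads):
    """per_client_layer_grads: list (one entry per client) of lists of ndarrays."""
    return np.stack([
        np.concatenate([g.ravel() for g in layers])
        for layers in per_client_layer_grads
    ])

rng = np.random.default_rng(1)
# 10 toy clients, each with a (4, 3) weight gradient and a (3,) bias gradient
clients = [[rng.normal(size=(4, 3)), rng.normal(size=(3,))] for _ in range(10)]
G = flatten_gradients(clients)               # shape (10, 15)
G2 = PCA(n_components=2).fit_transform(G)    # shape (10, 2)
```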

Step 3. Calculate the sum of Euclidean distances between each client and the other clients (line 16). Then make each client's distance sum dimensionless by normalizing it with the median value (line 19), and exclude the outliers larger than threshold ε1 (line 22); here, we set ε1 to 10. Note that we use the gradients after dimensionality reduction to compute the Euclidean distances. This step addresses Challenge 1.
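A minimal sketch of this outlier-exclusion step, with assumed variable names:

```python
# Score each client by its summed Euclidean distance to all other clients
# in the 2-D PCA space, normalize by the median score, and drop clients
# whose normalized score exceeds the threshold eps1 = 10.
import numpy as np

def exclude_outliers(low_dim_grads, eps1=10.0):
    diffs = low_dim_grads[:, None, :] - low_dim_grads[None, :, :]
    dist_sums = np.linalg.norm(diffs, axis=2).sum(axis=1)  # sum over other clients
    normalized = dist_sums / np.median(dist_sums)          # dimensionless scores
    return np.flatnonzero(normalized <= eps1)              # kept client indices

rng = np.random.default_rng(1)
# 20 well-behaved points near the origin plus one extreme outlier
pts = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)), [[100.0, 100.0]]])
kept = exclude_outliers(pts)
# only the outlier's normalized score exceeds 10, so it is excluded
```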

Step 4. Use the K-means clustering algorithm to cluster the reduced gradients of the clients in S, obtaining v clusters, where S represents the client set after removing the unsatisfactory clients in Step 3 (line 26).

Step 5. Select the optimal cluster for aggregation based on similarity. In RFLBAT, we use cosine similarity to measure the angular distance between clients i and j: sim(i, j) = g_i · g_j / (‖g_i‖ ‖g_j‖). Specifically, within each cluster, compute the cosine similarity between each client and the others, giving a similarity set per client (line 29). Then average each client's similarity set (line 30); these per-client averages constitute a set for the cluster (line 32). Choose the median of that set as the overall similarity of the cluster (line 33). The clusters' overall similarities constitute a set (line 35), and we finally select the cluster with the minimum value in this set (line 36). Note that we use the original gradients, not the gradients after dimensionality reduction, to compute the similarity. This step addresses Challenge 2.
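A hedged sketch of this cluster-selection rule, using synthetic gradients (the helper names are ours):

```python
# Within each cluster, compute every client's mean cosine similarity to
# the other members (on the original flattened gradients), summarize each
# cluster by the median of those means, and pick the cluster with the
# smallest median similarity: backdoored gradients look alike, benign
# gradients do not.
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def select_cluster(flat_grads, cluster_members):
    """cluster_members: list of index arrays, one per K-means cluster."""
    medians = []
    for members in cluster_members:
        avgs = []
        for i in members:
            sims = [cosine(flat_grads[i], flat_grads[j]) for j in members if j != i]
            avgs.append(np.mean(sims))
        medians.append(np.median(avgs))
    return int(np.argmin(medians))  # benign cluster: least mutually similar

rng = np.random.default_rng(2)
benign = rng.normal(0, 1, size=(5, 50))              # diverse directions
base = rng.normal(0, 1, size=50)
poisoned = base + rng.normal(0, 0.05, size=(5, 50))  # near-identical copies
grads = np.vstack([benign, poisoned])
chosen = select_cluster(grads, [np.arange(5), np.arange(5, 10)])
# the benign cluster (index 0) has much lower pairwise similarity
```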

Step 6. Although we have selected an effective cluster for aggregation, a few backdoored clients may still be present in the selected cluster. Therefore, similarly to Step 3, remove outlying clients in the selected cluster based on threshold ε2 (line 37). Here, to more completely exclude backdoored clients that may remain in the selected cluster, we set ε2 to 4, smaller than ε1. This step addresses Challenge 3.

Step 7. Use the FedAvg algorithm to aggregate the selected gradients, w_{t+1} = w_t + Σ_{k∈S*} p_k Δ_t^k, where S* represents the selected client set (line 39).

Step 8. The central server sends the updated global model to all clients for the next round of training (line 11).

Step 9. After multiple rounds of training, the central server generates the final global model (line 41).

Convergence analysis. Note that RFLBAT effectively selects a subset of clients to aggregate the global model with FedAvg. Li et al. [18] prove that the FedAvg algorithm converges with partial client participation under Non-IID scenarios. RFLBAT thus shares the same convergence property as FedAvg.

5 Experiments

In this section, we evaluate the performance of RFLBAT on image classification and sentiment analysis tasks with three common machine learning models over three well-known public datasets under various scenarios. We test RFLBAT's effectiveness by comparing it to four baseline aggregation algorithms. Our experiments are implemented in the PyTorch framework with about 3000 lines of code.

5.1 Datasets and Models

We use three public datasets and three well-known machine learning models. For the image classification task, we use the MNIST [20] and FEMNIST [6] datasets. For the sentiment analysis task, we use the Amazon Fine Food Reviews dataset from Kaggle [21]. The details of the three datasets and machine learning models are as follows, and Table 1 summarizes them.

MNIST. The MNIST dataset [20] contains 60000 training images and 10000 testing images with 10 labels, and the size of each image is 28×28. In the IID scenario, all training samples are uniformly divided into 100 parties; each client holds one party consisting of 600 training samples covering all 10 labels. To implement the Non-IID scenario, a Dirichlet distribution is used to divide the training images into 100 parties, each party holding samples that differ from those of other parties in both class and quantity.
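One common way to realize such a Dirichlet split is sketched below (an illustration with an assumed concentration parameter α; this is not the authors' partitioning code):

```python
# For each class, draw per-party proportions from Dir(alpha) and assign
# that class's samples accordingly; small alpha gives highly skewed
# parties, large alpha approaches a uniform (IID-like) split.
import numpy as np

def dirichlet_partition(labels, n_parties=100, alpha=0.5, seed=0):
    rng = np.random.default_rng(seed)
    parties = [[] for _ in range(n_parties)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        props = rng.dirichlet(alpha * np.ones(n_parties))   # party proportions
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for p, part in enumerate(np.split(idx, cuts)):
            parties[p].extend(part.tolist())
    return parties

labels = np.repeat(np.arange(10), 600)   # toy stand-in for MNIST labels
parties = dirichlet_partition(labels)
# every sample is assigned to exactly one of the 100 parties
```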

FEMNIST. The FEMNIST dataset [6] includes 801074 samples with 62 object classes distributed among 3500 writers, and the size of each image is also 28×28. Each writer represents a client, resulting in a naturally heterogeneous federated scenario.

Amazon. We download the Amazon Fine Food Reviews dataset from Kaggle [21]; it contains 568454 food reviews scored 1-5, giving 5 object classes. We sample only 20000 reviews of each class, 100000 data samples in total. Similarly to MNIST, we split the training data among 100 parties in IID and Non-IID manners. 80% of the data samples are used for training and the rest for testing.

Models. For the MNIST dataset, we train a multi-class logistic regression (LR) model consisting of one softmax layer with 784 inputs, 7850 parameters in total. For the FEMNIST dataset, we train a convolutional neural network (CNN) model with two convolutional layers (7×7×32 and 3×3×64), followed by a fully connected (FC) layer with 3136 units; the CNN model has about 0.2M parameters in total. For the Amazon dataset, we train a one-layer LSTM model with 100 hidden units, followed by one FC layer with 64 units. We first embed 56785 words into 100 dimensions; each sample has a maximum of 100 words, and samples with fewer than 100 words are padded with 0. The LSTM model has about 6M parameters in total. The datasets and models are summarized in Table 1.



Dataset  | Classes | Data size | Features | Model
---------|---------|-----------|----------|------
MNIST    | 10      | 60000     | 784      | LR
FEMNIST  | 62      | 801074    | 784      | CNN
Amazon   | 5       | 80000     | 100      | LSTM

Table 1: The description of datasets and models

5.2 Experiment Setup

We use the PS structure, which consists of a central server and multiple clients, and run our experiments until convergence. The training process was illustrated in Section 1. Next, we describe the backdoor attack patterns for the two tasks and the various experiment scenarios.

Backdoor attack. The goal of a backdoor attack is to change the global model's behavior on data samples containing certain backdoor triggers, while maintaining high performance on normal data samples. For the image classification task, we use certain pixels as backdoor triggers. Specifically, the backdoored client poisons its local training samples using the backdoor pattern in the lower left corner of Figure 5(a) and swaps the original label to label "0". For the sentiment analysis task, the backdoored client inserts the backdoor sentence "The weather is so good, I want to eat noodles." into its local training data and swaps the original label to label "0", as illustrated in Figure 5(b). The backdoored attackers force the model to classify test data containing the backdoor as a certain pre-configured label.

(a) Image backdoor

(b) Sentiment backdoor
Figure 5: The backdoor triggers in image and sentiment.

Different number of attackers (DNA). In each round, we randomly select 100 clients to train a model, where a certain number of clients are backdoored attackers. We evaluate RFLBAT under both IID and Non-IID scenarios with 10%, 50% and 90% backdoored clients, respectively, and verify the superiority of RFLBAT over four state-of-the-art robust algorithms under different numbers of attackers.

Different Non-IID scenarios (DNS). Due to differences in class and data size, there are various Non-IID scenarios. To evaluate the effectiveness of RFLBAT under different Non-IID scenarios, we use different datasets to realize them. In addition, we simulate different Non-IID scenarios by changing the parameter α of the Dirichlet distribution on MNIST: increasing α from 0.1 to 2 yields a variety of Non-IID scenarios.

Different number of clients (DNC). We also evaluate the effectiveness of RFLBAT with different numbers of clients. On the FEMNIST dataset, we select 50, 100, 200, 400, 600, 800 and 1600 of the 3500 clients, respectively, to conduct a series of experiments.

Distributed backdoor attack (DBA). Xie et al. [27] show that the distributed backdoor attack (DBA) is much stronger than the centralized backdoor attack and can beat most state-of-the-art robust algorithms. Similarly to [27], the centralized backdoor pattern is divided into four distributed backdoor patterns. In our experiment, we poison 40 of 100 total clients, and every 10 clients form a group performing one local backdoor attack.

To be fair, the only difference between benign clients and backdoored clients is the training data. Table 2 summarizes all the experiment scenarios.



Scenario | Description                                                      | Dataset
---------|------------------------------------------------------------------|--------
DNA      | Poisoning 10%, 50%, 90% of clients                               | All
DNS      | Changing the Dirichlet parameter α from 0.1 to 2                 | MNIST
DNC      | 50, 100, 200, 400, 600, 800, 1600 clients                        | FEMNIST
DBA      | 40 poisoned clients, every 10 performing a local backdoor attack | All

Table 2: The description of experiment scenarios

5.3 Comparison Algorithms

We compare our algorithm RFLBAT with four typical robust aggregation algorithms: FoolsGold, Multi-Krum, GeoMed and RFA.

FoolsGold. FoolsGold [9] reduces the aggregation weights of participating parties that repeatedly contribute similar gradient updates, while retaining the weights of parties that provide different gradient updates.

Multi-Krum. Multi-Krum [4] calculates, for each local update, the total Euclidean distance to its nearest neighbors. The local updates with the highest distances are excluded, and the remaining updates are averaged. Nevertheless, Multi-Krum relies on the parameter f, the assumed number of attackers, and prior knowledge of f is an unrealistic assumption when defending against backdoor attacks.
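Multi-Krum's scoring rule can be sketched roughly as follows (a simplified illustration with our own names; see [4] for the exact definition):

```python
# Simplified Multi-Krum-style sketch: score each update by the sum of
# squared distances to its n - f - 2 closest neighbors; exclude the f
# highest-scoring updates and average the rest.
import numpy as np

def multi_krum(updates, f):
    n = len(updates)
    m = n - f - 2                       # neighbors counted per score
    dists = np.linalg.norm(updates[:, None] - updates[None, :], axis=2) ** 2
    scores = np.sort(dists, axis=1)[:, 1:m + 1].sum(axis=1)  # drop self (dist 0)
    selected = np.argsort(scores)[: n - f]                   # keep lowest scores
    return updates[selected].mean(axis=0), selected

rng = np.random.default_rng(3)
ups = np.vstack([rng.normal(0, 0.1, size=(8, 4)),   # 8 benign updates
                 np.full((2, 4), 10.0)])            # 2 far-away poisoned updates
agg, sel = multi_krum(ups, f=2)
# the two far-away updates receive the highest scores and are excluded
```

Note how the defense hinges on knowing f in advance, which is the unrealistic assumption discussed above.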

GeoMed. GeoMed [7] generates a global model update using the geometric median of the local model updates, including the local model updates pushed by backdoored clients.

RFA. Similarly to GeoMed, RFA [23] aggregates a global model update robustly to outliers by replacing the weighted arithmetic mean with an approximate geometric median, so as to reduce the impact of contaminated updates.

5.4 Experiment Results

5.4.1 The Experiment Results of DNA

For all datasets, we run the four existing typical robust algorithms FoolsGold, Multi-Krum, GeoMed and RFA as well as our proposed RFLBAT, and train the three machine learning models above until convergence. For MNIST and Amazon we consider both IID and Non-IID scenarios, while for FEMNIST we consider only the Non-IID scenario due to its inherent heterogeneity.

MNIST under IID scenario. For MNIST under the IID scenario, Figure 6 shows the performance of the four typical robust aggregation algorithms and RFLBAT with an increasing number of backdoored clients: RFLBAT outperforms the other four existing algorithms. All five algorithms reach fine training accuracy on normal testing samples, but their performance in the face of the backdoor attack is quite different.

(a) 10% backdoored clients
(b) 50% backdoored clients
(c) 90% backdoored clients
Figure 6: The performance of Multi-Krum, GeoMed, RFA algorithms and RFLBAT on MNIST under IID scenario with different number of backdoored clients.
(a) 10% backdoored clients
(b) 50% backdoored clients
(c) 90% backdoored clients
Figure 7: The performance of Multi-Krum, GeoMed, RFA algorithms and RFLBAT on MNIST under Non-IID scenario with different number of backdoored clients.
(a) 10% backdoored clients
(b) 50% backdoored clients
(c) 90% backdoored clients
Figure 8: The performance of Multi-Krum, GeoMed, RFA algorithms and RFLBAT on FEMNIST under Non-IID scenario with different number of backdoored clients.
(a) 10% backdoored clients
(b) 50% backdoored clients
(c) 90% backdoored clients
Figure 9: The performance of FoolsGold, Multi-Krum, GeoMed, RFA algorithms and RFLBAT on Amazon under IID scenario with different number of backdoored clients.

Specifically, as the number of backdoored clients increases, the four existing typical algorithms gradually fail against the backdoor attack, but RFLBAT always works. When 10% of the clients are backdoored, as shown in Figure 6(a), FoolsGold, Multi-Krum and RFLBAT can effectively prevent the attack while maintaining high training accuracy. This indicates that the three algorithms can completely nullify all backdoored gradients when aggregating the global model. Compared to FoolsGold, Multi-Krum and RFLBAT, GeoMed and RFA perform worst, with an attack rate of about 26%, which means the backdoor triggers have been moderately inserted into the global model. Because both GeoMed and RFA are based on the geometric median, their attack rates are similar.

As the number of backdoored clients increases to 50%, Figure 6(b) shows the results of the five algorithms against the attack. We can see that RFLBAT still resists the attack while maintaining high training accuracy. Although FoolsGold can defend against the attack, its attack rate reaches 9.5%, higher than RFLBAT's; this demonstrates that FoolsGold performs worse than RFLBAT. By contrast, Multi-Krum, GeoMed and RFA can hardly defend against the attack, with the attack rate reaching nearly 100%, indicating that the backdoor triggers have been fully inserted into the global model under these three algorithms.

When the four existing typical algorithms and RFLBAT face larger groups of backdoored clients, for example the 90% backdoored clients scenario, the results are shown in Figure 6(c). RFLBAT remains robust to this attack, with an attack rate of nearly 0.5% that may be caused by model error. In contrast, FoolsGold, Multi-Krum, GeoMed and RFA perform worst against this attack, with attack rates reaching nearly 100%, meaning that these four algorithms are completely ineffective against it.

MNIST under Non-IID scenario. Figure 7 shows the performance of FoolsGold, Multi-Krum, GeoMed, RFA and RFLBAT on MNIST under the Non-IID setting with an increasing number of backdoored clients: RFLBAT outperforms the other four existing typical algorithms against these backdoor attacks. Comparing with Figure 6, we can draw similar conclusions from Figure 7. Nevertheless, there are some differences between the IID and Non-IID scenarios.

Specifically, when the number of backdoored clients is 10%, RFLBAT and Multi-Krum can still defend against this attack, but the attack rate reaches nearly 27.5%, 68.9% and 71.4% using FoolsGold, GeoMed and RFA respectively, which differs from Figure 6(a). Obviously, this is related to the data distribution. In the IID scenario, the training data of each client is uniformly sampled from the whole dataset, so every client shares a common data size and set of data classes, whereas clients share little or no data in the Non-IID scenario. Model updates are affected not only by data features but also by data size and data classes, which seriously degrades the performance of FoolsGold, GeoMed and RFA.

As the number of backdoored clients increases to 50% and 90%, similar to the IID scenario, RFLBAT remains robust against these attacks, while the FoolsGold, Multi-Krum, GeoMed and RFA algorithms fail. This demonstrates that RFLBAT can remove all malicious gradients even in the Non-IID scenario. It is worth noting that although RFLBAT achieves high training accuracy, it is slightly lower than that of the other four algorithms in the 90% backdoored clients scenario. The reason is simple: RFLBAT tries to select the best cluster, the one containing the most benign gradients, to aggregate, but in the Non-IID scenario this cluster may not include all benign gradients. RFLBAT therefore loses some normal gradients from unselected clients, resulting in slightly lower training accuracy than the four contrasting algorithms.

FEMNIST under Non-IID scenario. Figure 8 shows the performance of FoolsGold, Multi-Krum, GeoMed, RFA and RFLBAT on FEMNIST under the Non-IID scenario with an increasing number of backdoored clients. Similar to Figures 6 and 7, as the number of backdoored clients grows, RFLBAT can still effectively defend against backdoor attacks at the cost of less than 3% of training accuracy, while Multi-Krum, GeoMed and RFA are completely ineffective even when the number of backdoored clients is only 10%, and FoolsGold incurs a 9.1% attack rate, which is worse than that of RFLBAT. Unlike MNIST under the 10% backdoored clients scenario, the attack rate using Multi-Krum, GeoMed and RFA can reach nearly 99%, indicating that FEMNIST is more heterogeneous than MNIST in our experimental settings. When the number of backdoored clients increases to 50% and 90%, the four existing algorithms fail completely. It should be noted that the training accuracy of FoolsGold is the worst among the five algorithms, which differs from Figure 7. This may be related to a peculiarity of FoolsGold, which reduces the aggregation weights of participating parties based on similarity: during aggregation, FoolsGold may severely reduce the weights of benign clients, degrading their contribution.

Amazon under IID scenario. Figure 9 shows the performance of FoolsGold, Multi-Krum, GeoMed, RFA and RFLBAT on the Amazon dataset. For sentiment classification, the results do not differ much from image classification: RFLBAT effectively defends against backdoor attack and outperforms the four existing typical algorithms. The attack rate using RFLBAT deserves attention: the attack rate with 90% backdoored clients is higher than with 10% or 50% backdoored clients, which goes against intuition. The reason may be that the number of backdoored clients participating in the aggregation under 10% and 50% backdoored clients is larger than that under 90% backdoored clients. This also suggests that, under RFLBAT, the intensity of a backdoor attack is not necessarily related to the number of backdoored attackers.

Amazon under Non-IID scenario. For Amazon under the IID scenario, all robust aggregation algorithms fail except RFLBAT. For the Amazon dataset under Non-IID, we only evaluate the performance of the five algorithms against 90% backdoored clients, and the result is shown in Figure 10. RFLBAT still performs best among the five algorithms, and the other four algorithms are completely defeated by this attack. RFLBAT performs worse than under the IID scenario due to the data distribution; even so, it remains effective against this attack.

Figure 10: The performance of FoolsGold, Multi-Krum, GeoMed, RFA algorithms and RFLBAT on Amazon under Non-IID scenario with 90% backdoored clients.

5.4.2 The Experiment Results of DNS

RFLBAT relies on the key insight that backdoored clients behave more similarly to one another than benign clients do. However, similarity is greatly influenced by the data distribution. To test the effectiveness of RFLBAT under diverse data distributions, we implement a DNS attack by changing the Dirichlet distribution on MNIST.
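A common way to realize such Dirichlet-based Non-IID splits is to draw each class's per-client proportions from Dirichlet(alpha); the sketch below assumes this standard construction (the paper's exact partitioning details are not given here):

```python
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices among clients so that each class's samples are
    divided according to a Dirichlet(alpha) draw; a smaller alpha yields a
    more heterogeneous (Non-IID) split."""
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        props = rng.dirichlet(alpha * np.ones(n_clients))   # client shares of class c
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_idx[client].extend(part.tolist())
    return client_idx

labels = np.repeat(np.arange(10), 100)            # toy 10-class label vector
parts = dirichlet_partition(labels, n_clients=20, alpha=0.5)
assert sum(len(p) for p in parts) == len(labels)  # every sample assigned once
```

Sweeping alpha downward in this construction makes client datasets increasingly skewed, which matches how the DNS attack varies heterogeneity.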

Figure 11 shows the performance of RFLBAT against the DNS attack. RFLBAT can effectively defend against this attack with a low attack rate and satisfactory training accuracy, indicating that it can distinguish backdoored gradients from benign gradients and select benign gradients to aggregate under various Non-IID scenarios. Note that RFLBAT performs worst at the smallest Dirichlet distribution parameter. This is because the data distribution is more heterogeneous at a smaller parameter, as shown in Figure 12. Specifically, from Figure 12, we can see that the data sizes of the clients are more diverse at the smaller parameter, resulting in a more heterogeneous data distribution. Due to this heterogeneity, RFLBAT clusters the gradients into more clusters, each containing fewer clients. Although RFLBAT can still effectively select the optimal cluster to aggregate, this cluster contains only some of the benign clients. Therefore, RFLBAT performs worst at the smallest Dirichlet distribution parameter. However, RFLBAT performs similarly across the larger Dirichlet distribution parameters, indicating that a backdoor attack cannot subvert RFLBAT by manipulating the data distribution.

Figure 11: The performance of RFLBAT against DNS attack on MNIST.
Figure 12: The data sizes of all clients under different Dirichlet distributions.

5.4.3 The Experiment Results of DNC

To test the performance of RFLBAT with different numbers of participating clients, we implement the DNC attack with 50% backdoored clients, evaluating RFLBAT with 50, 100, 200, 400, 600, 800, 1200 and 1600 clients, respectively. Figure 13 shows the results of RFLBAT under the DNC scenario. RFLBAT is also effective against the DNC attack: the attack rate is only 1%, while the training accuracy is about 80% and is not affected by the number of clients. This indicates that the benign gradients can be clustered together by the PCA technique and the Kmeans clustering algorithm, and then selected based on cosine similarity by RFLBAT. Therefore, RFLBAT is robust against the DNC attack.
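The clustering-and-selection step described above can be sketched as follows. The component count, cluster count, and the exact selection rule (keep the least mutually similar cluster, following the paper's insight that backdoored updates look more alike) are simplifying assumptions, not RFLBAT's full algorithm:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def mean_pairwise_cos(X):
    """Average pairwise cosine similarity within a set of gradient vectors."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T
    n = len(X)
    return 1.0 if n < 2 else (S.sum() - np.trace(S)) / (n * (n - 1))

def select_benign(gradients, n_components=2, n_clusters=2, seed=0):
    """Reduce flattened gradients with PCA, cluster with KMeans, then keep
    the cluster whose members are LEAST mutually similar (presumed benign)."""
    low = PCA(n_components=n_components, random_state=seed).fit_transform(gradients)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(low)
    scores = [mean_pairwise_cos(gradients[labels == k]) for k in range(n_clusters)]
    keep = int(np.argmin(scores))     # most diverse cluster is treated as benign
    return np.where(labels == keep)[0]

rng = np.random.default_rng(0)
benign = rng.normal(0.0, 1.0, (8, 50))               # diverse benign updates
backdoored = 5.0 + 0.01 * rng.normal(size=(4, 50))   # near-identical attackers
selected = select_benign(np.vstack([benign, backdoored]))
```

On this toy data the near-identical backdoored updates form a tight, highly self-similar cluster, so only the eight benign indices are kept for aggregation.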

Figure 13: The performance of RFLBAT against DNC on FEMNIST.

5.4.4 The Experiment Results of DBA

Xie et al. [27] propose that distributed backdoor attack (DBA) is much stronger than centralized backdoor attack and verify this through a series of experiments. To evaluate the effectiveness of RFLBAT under the DBA scenario, we also conduct a series of DBA experiments using RFLBAT on the MNIST, FEMNIST and Amazon datasets. Specifically, similar to [27], we separate the centralized backdoor into four parts: backdoor0, backdoor1, backdoor2 and backdoor3, and use each local backdoor to poison 10% of the 100 clients. Hence there are 40% backdoored clients in the DBA scenario, and each group of 10% backdoored clients performs a single local backdoor attack. We also consider two scenarios: (1) the backdoored clients push their original gradients to the central server; (2) the backdoored clients amplify their gradients before pushing them to the central server. Accordingly, we set the scale factor to 1 and 100 in the DBA scenario.
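The trigger decomposition can be illustrated with a toy sketch; the pixel positions, trigger size, and stamp value here are assumptions for illustration rather than the exact patterns of [27]:

```python
import numpy as np

def make_local_triggers():
    """Illustrative DBA-style setup: a 4x4 global trigger in the image corner
    is split into four 2x2 local triggers, one per attacker group."""
    full = [(r, c) for r in range(4) for c in range(4)]  # global trigger pixels
    quads = {
        "backdoor0": [(r, c) for r, c in full if r < 2 and c < 2],
        "backdoor1": [(r, c) for r, c in full if r < 2 and c >= 2],
        "backdoor2": [(r, c) for r, c in full if r >= 2 and c < 2],
        "backdoor3": [(r, c) for r, c in full if r >= 2 and c >= 2],
    }
    return full, quads

def stamp(image, pixels, value=1.0):
    """Stamp a trigger pattern onto a copy of an image."""
    out = image.copy()
    for r, c in pixels:
        out[r, c] = value
    return out

full, quads = make_local_triggers()
# the four local triggers together cover exactly the global trigger
assert sorted(p for q in quads.values() for p in q) == sorted(full)
poisoned = stamp(np.zeros((28, 28)), quads["backdoor0"])
```

Each attacker group stamps only its own sub-pattern during local training; the scale-factor-100 scenario then corresponds to multiplying the resulting malicious update by 100 before submission.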

Table 3 shows the training accuracy and attack rate using RFLBAT under the DBA scenario. RFLBAT can resist DBA while maintaining high training accuracy. Specifically, on MNIST, when the scale factor is 1, the attack rates of the backdoor attacks, covering both the full backdoor and the local backdoors, are all about 0.8%, and the training accuracy is 89.1%. When the scale factor is 100, the attack rates barely change, but the training accuracy is 90.2%, slightly higher than that at scale factor 1. Similar results can be found on FEMNIST and Amazon.



| Dataset | Scale factor | Training accuracy | Full backdoor | Backdoor0 | Backdoor1 | Backdoor2 | Backdoor3 |
|---------|--------------|-------------------|---------------|-----------|-----------|-----------|-----------|
| MNIST   | 1            | 89.1%             | 0.8%          | 0.8%      | 0.8%      | 0.8%      | 0.8%      |
| MNIST   | 100          | 90.2%             | 0.7%          | 0.7%      | 0.6%      | 0.7%      | 0.6%      |
| FEMNIST | 1            | 66.1%             | 4.1%          | 2.6%      | 2.7%      | 3.4%      | 4.4%      |
| FEMNIST | 100          | 68.6%             | 2.4%          | 1.6%      | 1.7%      | 1.9%      | 2.3%      |
| Amazon  | 1            | 41.8%             | 5.1%          | 4.5%      | 4.8%      | 4.3%      | 4.6%      |
| Amazon  | 100          | 43.3%             | 3.2%          | 2.8%      | 3.1%      | 3.4%      | 3.1%      |

Table 3: The training accuracy and attack rates using RFLBAT under the DBA scenario on MNIST, FEMNIST and Amazon.

To understand how RFLBAT handles these two scenarios, we visualize the gradients after PCA for MNIST in Figure 14. Figure 14(a) shows the gradient distribution at scale factor 1. Since the gradients are scattered, the benign gradients cannot be grouped into a single cluster, and RFLBAT can only select some of the benign gradients to aggregate. In contrast, at scale factor 100, the benign and backdoored gradients can be clearly distinguished, and the benign gradients fall into a single cluster. Hence RFLBAT can select all benign gradients to aggregate, resulting in higher training accuracy.

Figure 14: A visualization of the clients' gradients after PCA at scale factors 1 and 100 on MNIST.

Note that the attack rates of the full backdoor and local backdoors are lower at scale factor 100 than at scale factor 1. The reason is simple: taking MNIST as an example again, in Figure 14(a) (scale factor 1) the backdoored gradients are very close to the benign gradients, leaving a few backdoored gradients in the final selected cluster; in contrast, in Figure 14(b) (scale factor 100) the backdoored gradients can be fully detected and removed during aggregation. It is worth mentioning that the point in Figure 14(b) where many benign gradients converge is not a single gradient.

As the results in the DBA scenario demonstrate, RFLBAT is effective at defending against DBA regardless of whether the backdoored clients amplify their gradients.

6 Related Works

Backdoor Attacks on Federated Learning. Backdoor attacks aim to insert a backdoor into the final global model by training strongly poisoned local models and submitting their model updates to the central server, so as to mislead the final global model [3]. [2] studies the model-replacement approach, where the attacker scales malicious model updates so as to replace the global model with a locally backdoored one. [27] experimentally shows that distributed backdoor attack is stronger than the centralized attack.

Robust Aggregation Algorithms in Federated Learning. To nullify the impact of attacks when aggregating local model updates, many robust aggregation algorithms have been proposed [4, 7, 22, 23, 30, 17, 9, 25]. Krum [4] selects a representative worker among multiple workers and uses its update to estimate the true update of the global model. Bulyan [22] uses Krum to iteratively select benign workers and then aggregates these workers with a variant of TrimmedMean [30]. Because median-based algorithms are more resistant to outliers than mean-based ones, other algorithms employ the coordinate-wise median [30], the geometric median [7], and an approximate geometric median [23] to aggregate the global model. [16, 17] require a pre-trained model to detect and remove malicious model updates during aggregation; the malicious-worker detection model can be trained using an autoencoder and test data.
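The median-based rules above (coordinate-wise median [30] and a trimmed mean in the style of TrimmedMean [30]) can be sketched minimally; the trim fraction and toy data are illustrative:

```python
import numpy as np

def coordinate_median(updates):
    """Coordinate-wise median: take the median of each parameter separately."""
    return np.median(updates, axis=0)

def trimmed_mean(updates, beta=0.2):
    """Trimmed mean: per coordinate, drop the beta-fraction of largest and
    smallest values, then average what remains."""
    n = len(updates)
    k = int(beta * n)
    s = np.sort(updates, axis=0)
    return s[k:n - k].mean(axis=0)

# Eight benign zero updates and two extreme outliers:
updates = np.vstack([np.zeros((8, 3)), np.full((2, 3), 100.0)])
```

Both rules ignore the two outliers here, which is why such aggregators tolerate a minority of malicious updates but break down once attackers form a majority.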

[25] proposes AUROR to address backdoor attacks in collaborative machine learning. [23] proposes a robust aggregation algorithm named RFA that replaces the weighted arithmetic mean with an approximate geometric median, so as to reduce the impact of contaminated updates. [9] proposes FoolsGold, which calculates the cosine similarity of clients' gradient updates and reduces the aggregation weights of clients that contribute similar updates, thus promoting contribution diversity.
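FoolsGold's similarity-based down-weighting can be sketched roughly as follows; the pardoning and logit-rescaling steps of the actual algorithm are omitted, and the synthetic data is an assumption for illustration:

```python
import numpy as np

def foolsgold_weights(grads, eps=1e-8):
    """Simplified FoolsGold-style re-weighting: a client whose update is
    nearly identical to some other client's (high max cosine similarity)
    gets a weight near zero, while diverse clients keep high weight."""
    G = grads / (np.linalg.norm(grads, axis=1, keepdims=True) + eps)
    S = G @ G.T
    np.fill_diagonal(S, -np.inf)          # ignore self-similarity
    v = S.max(axis=1)                     # max similarity to any other client
    w = 1.0 - np.clip(v, 0.0, 1.0)        # similar -> down-weighted
    return w / (w.max() + eps)            # normalize to [0, 1]

rng = np.random.default_rng(1)
sybils = np.tile(rng.normal(size=(1, 100)), (3, 1))  # three identical updates
benign = rng.normal(size=(3, 100))                   # three diverse updates
weights = foolsgold_weights(np.vstack([sybils, benign]))
```

On this toy input the three coordinated updates receive near-zero weight while the diverse ones keep most of theirs; the flip side, noted in the FEMNIST results, is that benign clients who happen to look similar can also be down-weighted.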

7 Conclusion

In this work, we propose a robust aggregation algorithm named RFLBAT, based on unsupervised learning, to defend against backdoor attack: the central server detects and removes backdoored gradients using the PCA technique and the Kmeans clustering algorithm. Our algorithm requires neither prior knowledge of the expected number of backdoored attackers nor access to the training and test data. We have conducted extensive experiments using the MNIST, FEMNIST and Amazon datasets with LR, CNN and LSTM models, respectively. We consider four backdoor attack scenarios: different number of attackers (DNA), different Non-IID scenarios (DNS), different number of clients (DNC) and distributed backdoor attack (DBA). The experimental results indicate that RFLBAT outperforms the existing robust aggregation algorithms and can mitigate various backdoor attack scenarios. RFLBAT remains effective even when backdoored clients outnumber benign clients.

References

  • [1] D. Arthur and S. Vassilvitskii (2006) K-means++: the advantages of careful seeding. Technical report Stanford. Cited by: §2.
  • [2] E. Bagdasaryan, A. Veit, Y. Hua, D. Estrin, and V. Shmatikov (2020) How to backdoor federated learning. In International Conference on Artificial Intelligence and Statistics, pp. 2938–2948. Cited by: §1, §1, §6.
  • [3] A. N. Bhagoji, S. Chakraborty, P. Mittal, and S. Calo (2019) Analyzing federated learning through an adversarial lens. In International Conference on Machine Learning, pp. 634–643. Cited by: §1, §2, §6.
  • [4] P. Blanchard, R. Guerraoui, and et al (2017) Machine learning with adversaries: byzantine tolerant gradient descent. In NIPS, pp. 119–129. Cited by: §1, §5.3, §6.
  • [5] K. Bonawitz, H. Eichner, W. Grieskamp, D. Huba, A. Ingerman, V. Ivanov, C. Kiddon, J. Konečnỳ, S. Mazzocchi, H. B. McMahan, et al. (2019) Towards federated learning at scale: system design. arXiv preprint arXiv:1902.01046. Cited by: §1.
  • [6] S. Caldas, S. M. K. Duddu, P. Wu, T. Li, J. Konečnỳ, H. B. McMahan, V. Smith, and A. Talwalkar (2018) Leaf: a benchmark for federated settings. arXiv preprint arXiv:1812.01097. Cited by: §1, §5.1, §5.1.
  • [7] Y. Chen, L. Su, and et al (2018) Distributed statistical machine learning in adversarial settings: byzantine gradient descent. In ACM SIGMETRICS, pp. 96–96. Cited by: §1, §5.3, §6.
  • [8] M. Fang, X. Cao, J. Jia, and N. Gong (2020) Local model poisoning attacks to byzantine-robust federated learning. In 29th USENIX Security Symposium (USENIX Security 20), pp. 1605–1622. Cited by: §1.
  • [9] C. Fung, C. J. M. Yoon, and et al (2018) Mitigating sybils in federated learning poisoning. Cited by: §1, §5.3, §6.
  • [10] T. Gu, K. Liu, B. Dolan-Gavitt, and S. Garg (2019) Badnets: evaluating backdooring attacks on deep neural networks. IEEE Access 7, pp. 47230–47244. Cited by: §3.2.
  • [11] A. Hard, K. Rao, R. Mathews, S. Ramaswamy, F. Beaufays, S. Augenstein, H. Eichner, C. Kiddon, and D. Ramage (2018) Federated learning for mobile keyboard prediction. arXiv preprint arXiv:1811.03604. Cited by: §1.
  • [12] H. Hotelling (1933) Analysis of a complex of statistical variables into principal components.. Journal of educational psychology 24 (6), pp. 417. Cited by: §2.
  • [13] P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis, A. N. Bhagoji, K. Bonawitz, Z. Charles, G. Cormode, R. Cummings, et al. (2019) Advances and open problems in federated learning. arXiv preprint arXiv:1912.04977. Cited by: §1.
  • [14] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner (1998) Gradient-based learning applied to document recognition. Proceedings of the IEEE 86 (11), pp. 2278–2324. Cited by: §1.
  • [15] M. Li, D. G. Andersen, and et al (2014) Scaling distributed machine learning with the parameter server. In OSDI, pp. 583–598. Cited by: §1.
  • [16] S. Li, Y. Cheng, and et al (2019) Abnormal client behavior detection in federated learning. Cited by: §2, §6.
  • [17] S. Li, Y. Cheng, and et al (2020) Learning to detect malicious clients for robust federated learning. arXiv preprint arXiv:2002.00211. Cited by: §1, §2, §6.
  • [18] X. Li, K. Huang, W. Yang, S. Wang, and Z. Zhang (2019) On the convergence of fedavg on non-iid data. arXiv preprint arXiv:1907.02189. Cited by: §4.
  • [19] B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas (2017) Communication-efficient learning of deep networks from decentralized data. In Artificial intelligence and statistics, pp. 1273–1282. Cited by: §3.1.1.
  • [20] H. B. Mcmahan, E. Moore, and et al (2016) Communication-efficient learning of deep networks from decentralized data. Cited by: §1, §5.1, §5.1.
  • [21] R. Mehrotra (2017) A complete text classfication guide(word2vec+lstm). Note: [Online]https://www.kaggle.com/rajmehra03/a-complete-text-classfication-guide-word2vec-lstm/ Cited by: §5.1, §5.1.
  • [22] E. M. E. Mhamdi, R. Guerraoui, and et al (2018) The hidden vulnerability of distributed learning in byzantium. arXiv preprint arXiv:1802.07927. Cited by: §1, §6.
  • [23] K. Pillutla, S. M. Kakade, and Z. Harchaoui (2019) Robust aggregation for federated learning. arXiv preprint arXiv:1912.13445. Cited by: §1, §5.3, §6.
  • [24] M. J. Sheller, G. A. Reina, and et al (2018) Multi-institutional deep learning modeling without sharing patient data: a feasibility study on brain tumor segmentation. Cited by: §1.
  • [25] S. Shen, S. Tople, and P. Saxena (2016) Auror: defending against poisoning attacks in collaborative deep learning systems. In ACSAC, pp. 508–519. Cited by: §1, §6.
  • [26] H. Wang, K. Sreenivasan, S. Rajput, H. Vishwakarma, S. Agarwal, J. Sohn, K. Lee, and D. Papailiopoulos (2020) Attack of the tails: yes, you really can backdoor federated learning. arXiv preprint arXiv:2007.05084. Cited by: §1.
  • [27] C. Xie, K. Huang, P. Chen, and B. Li (2019) Dba: distributed backdoor attacks against federated learning. In International Conference on Learning Representations, Cited by: §1, §1, §5.2, §5.4.4, §6.
  • [28] C. Xie, S. Koyejo, and I. Gupta (2019) Zeno: distributed stochastic gradient descent with suspicion-based fault-tolerance. In International Conference on Machine Learning, pp. 6893–6901. Cited by: §2.
  • [29] Q. Yang, Y. Liu, and et al (2019) Federated machine learning: concept and applications. ACM Transactions on Intelligent Systems and Technology 10 (2), pp. 1–19. Cited by: §1.
  • [30] D. Yin, Y. Chen, and et al (2018) Byzantine-robust distributed learning: towards optimal statistical rates. arXiv preprint arXiv:1803.01498. Cited by: §1, §6.