DeSMP: Differential Privacy-exploited Stealthy Model Poisoning Attacks in Federated Learning

by   Md Tamjid Hossain, et al.
University of Nevada, Reno

Federated learning (FL) has recently emerged as a popular machine learning technique due to its efficacy in safeguarding clients' confidential information. Nevertheless, despite inherent and additional privacy-preserving mechanisms (e.g., differential privacy, secure multi-party computation, etc.), FL models remain vulnerable to various privacy-violating and security-compromising attacks (e.g., data or model poisoning) because of their numerous attack vectors, which in turn make the models either ineffective or sub-optimal. Existing adversarial models focusing on untargeted model poisoning attacks are not sufficiently stealthy and persistent at the same time because these two goals conflict (large-scale attacks are easier to detect and vice versa); achieving both thus remains an unsolved research problem in this adversarial learning paradigm. Considering this, in this paper, we analyze this adversarial learning process in an FL setting and show that a stealthy and persistent model poisoning attack can be conducted by exploiting the differential noise. More specifically, we develop an unprecedented DP-exploited stealthy model poisoning (DeSMP) attack for FL models. Our empirical analysis on both classification and regression tasks using two popular datasets reflects the effectiveness of the proposed DeSMP attack. Moreover, we develop a novel reinforcement learning (RL)-based defense strategy against such model poisoning attacks, which can intelligently and dynamically select the privacy level of the FL models to minimize the DeSMP attack surface and facilitate attack detection.



I Introduction

Federated Learning (FL), also known as collaborative learning, has attracted considerable attention from the research community since it was first introduced in 2016 by McMahan et al. [mcmahan2017communication], mostly because of the inherent privacy protection that FL offers to its users. In the FL process, a model is trained on a diffuse network of edge nodes using their local data, rather than in the traditional centralized fashion. This provides a level of data privacy assurance to the users since the confidential data do not leave the edge nodes.

However, the FL process can be vulnerable to differential attacks (e.g., membership inference attacks (MIA)), which aim to reveal the sensitive information of a node by analyzing the distributed model parameters [geyer2017differentially] or gradients [zhu2020deep]. To alleviate this privacy issue, extensive research has been carried out lately, focusing on developing secure multi-party computation (SMPC) [li2020privacy], trusted execution environments (TEEs) [mo2019efficient], cryptographic encryption [zhang2020batchcrypt, sadique2021cybersecurity, sadique2019system], and differential privacy (DP)-based privacy-preservation techniques [dwork2006calibrating, kairouz2019advances, bhagoji2019analyzing] for FL. Among these, DP is considered a very promising technique to preserve data privacy and prevent MIA [shokri2017membership]. Existing works along this research line include DP-based distributed SGD [abadi2016deep], local DP (LDP) [wang2019local],

Although DP provides a level of privacy guarantee, an adversary can exploit the DP noise to inject false data into the original data and hide the attack within the noise range [giraldo2020adversarial]. In this paper, we investigate this vulnerability of DP-based applications and show that in a differentially private FL setting (we call it ‘DPFL’), a malicious actor can inject false data either into the differentially private training data (i.e., data poisoning attack [biggio2012poisoning]) or into the model parameters (i.e., model poisoning attack [bagdasaryan2020backdoor, bhagoji2019analyzing]). More specifically, we demonstrate a stealthy model poisoning attack on the FL model that exploits the noise of the DP mechanism and that (1) reduces the overall accuracy of the global federated model, and (2) deceives traditional anomaly detection mechanisms by hiding the false data in the DP noise. The results in this paper reveal a new backdoor for stealthy and untargeted model poisoning attacks in FL through the exploitation of the DP mechanism.

I-A Motivations

Poisoning attacks in any machine learning (ML) setting can be broadly divided into two major categories: targeted and untargeted attacks [kairouz2019advances]. Targeted poisoning attacks [bagdasaryan2020backdoor, bhagoji2019analyzing] aim to change the outcome or behavior of the model on particular inputs while maintaining good overall accuracy on all other inputs, which makes both the attack and the defense more difficult. In contrast, untargeted model poisoning attacks [biggio2012poisoning, fang2020local] have the power to make a model unusable and eventually lead to a denial-of-service attack [fang2020local]. For instance, an adversary may perform untargeted attacks on a competitor’s FL model with the intention of making the model unusable.

However, traditional untargeted poisoning attacks mainly utilize the hyperparameters of the targeted model to scale up the effectiveness of the malicious model [bhagoji2019analyzing]. To attain the goal of poisoning, the adversary may use explicit boosting that deforms the weights’ distribution; however, this can easily be detected by the server through simple server-side model checking [pillutla2019robust]. Hence, conducting untargeted model poisoning attacks in a stealthy manner remains an open problem in FL [zhou2021deep]. Moreover, since an FL system usually consists of a huge number of clients and only a portion of the clients is chosen for any particular round [sun2019can], the odds of significantly impacting the global model accuracy with a single malicious contribution are very low. This leads us to the question: “How can the adversary perform an untargeted model poisoning attack in a stealthy but persistent fashion?” Motivated by this, in this paper, we investigate the DP mechanism as a tool to conduct such adversarial poisoning attacks in FL. In the rest of the paper, the terms ‘false data injection (FDI)’ attack and ‘model poisoning’ attack are used interchangeably.

I-B Contributions

In this paper, we show that the DP mechanism creates a new attack avenue for stealthy false data injection (FDI) or model poisoning attacks in a DPFL environment. We name this attack model the ‘DP-exploited stealthy model poisoning’ (in short, DeSMP) attack. In particular, we make the following contributions:

  • We demonstrate that DP, as a privacy-preserving tool, is opening a new backdoor for untargeted model poisoning attacks in the FL setting. Our proposed attack strategy (DeSMP) is stealthy and persistent in nature.

  • To tackle the proposed DeSMP attack, we develop a reinforcement learning (RL)-based defense strategy. The proposed RL-based defense approach intelligently selects the differential privacy level for the clients’ model update. It also minimizes the attack vectors and facilitates attack disclosure.

Section II of this paper covers preliminaries of FL and a brief review of the related works while section III outlines the research problem and threat model. Section IV formulates the proposed DeSMP attack and defense model and their working principle. In section V, we analyze and evaluate the effectiveness of our proposed model. Finally, in section VI, we conclude the paper with some future research directions.

II Preliminaries and Literature Review

Here, we discuss the basic mechanism of FL while pointing out some significant contrasting contributions between this work and existing notable research work in adversarial FL. Table I describes the major symbols used in this paper.

II-A Mechanism of Federated Learning with DP

FL introduces a collaborative zone for training a model among a set of workers. Here, each participating node maintains a local model for its local training dataset. Additionally, FL incorporates a server that aggregates all the local models to form a global model [mcmahan2017communication]. Furthermore, to counter MIA that analyze the model weights, the FL server generally includes a privacy-preserving mechanism such as DP [geyer2017differentially]. Here, DP adds random Laplacian or Gaussian noise to the model weights.

Quantities listed in Table I:
Accuracy or loss threshold; Mean
Attack impact; Measurement data
Attacker’s tolerance; Norm of model updates
Batch; Participating clients in each round
Communication round; PDF of attack distribution
DP parameters; PDF of benign Gaussian distribution
Final global model; Privacy budget
FL parameters; Privacy loss
Global model; Privacy spent in each round
Gradient descent; RL parameters
Input data; Sensitivity
Kullback-Leibler divergence; Standard deviation
Learning rate; Total clients
Local model; Weights
Attacker loss; Federated loss
Agent reward; Agent state
Action; Learning rate of RL agent
Optimal policy; Discount factor
Converged Q-table; Reward balancing parameter

TABLE I: List of major symbols and their description

Nonetheless, when deploying the DP mechanism, researchers [geyer2017differentially, sun2019can] have suggested using norm clipping or early stopping to compensate for the high level of random differential noise and prevent the model from becoming completely unusable. Once a pre-defined testing criterion is met (e.g., the model accuracy exceeds a threshold or the privacy budget is exhausted), the server finalizes the global model and stops the training procedure; otherwise, the training process starts another round.
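The stopping rule described above can be sketched in a few lines of Python. This is only an illustration: the helper name `should_stop` and its four arguments are hypothetical and do not come from the paper.

```python
def should_stop(accuracy, accuracy_threshold, privacy_spent, privacy_budget):
    """Return True when the server should finalize the global model.

    Hypothetical helper: training ends either when the model is accurate
    enough or when the accumulated privacy spending reaches the budget.
    """
    accurate_enough = accuracy > accuracy_threshold
    budget_exhausted = privacy_spent >= privacy_budget
    return accurate_enough or budget_exhausted
```

If neither condition holds, the server would simply run another communication round.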

II-B Adversarial Federated Learning

Although the DP-based FL models do not expose the client’s training data to the rest of the world, there exist several attack vectors that an adversary can exploit to perform malicious modification or gain unauthorized access to confidential information. For instance, there could be some malicious clients who might inspect all messages received from the server and then, in the training phases, selectively poison the local models to reduce the efficiency of the global model [kairouz2019advances]. Other examples of the adversarial FL include the targeted and untargeted model poisoning attacks [bhagoji2019analyzing, fang2020local]. However, unlike the centralized ML schemes, the FL systems may employ a large number of untrusted devices which may facilitate the training-time attacks and inference-time attacks [kairouz2019advances]. In this paper, we focus on one of the powerful attack classes which is an untargeted model poisoning attack [fang2020local]. The adversary can conduct this model poisoning attack either by directly manipulating a client’s model or through the widely known man-in-the-middle attack formation leveraging the network and system vulnerabilities [kairouz2019advances].

II-C Related Research Work

In this part, we discuss some notable prior research related to the untargeted model poisoning attacks and defenses in FL while outlining some contrasting points with ours.

II-C1 Byzantine-robust Aggregation in Adversarial Setting

Byzantine threat models [guerraoui2018hidden] produce arbitrary outputs for any wrong inputs (whether from an honest participant or a malicious actor). These arbitrary outputs can cause the model to converge to a sub-optimal model. Moreover, the Byzantine clients may need white-box access or the non-Byzantine client updates to make their attack stealthy [kairouz2019advances]. Nonetheless, to the best of our knowledge, none of the existing works explores the vulnerabilities of DP-based applications in tailoring such stealthy attacks. In contrast, we demonstrate that the Byzantine clients or the server can conduct stealthy and persistent untargeted model poisoning attacks by hiding behind the DP mechanism. In particular, we demonstrate the DP-exploited stealthy model poisoning (DeSMP) attacks in an untargeted manner for FL models.

II-C2 DP-assisted FL Frameworks in CPSs

Another related line of research focuses on developing novel FL frameworks for cyber-physical systems (CPSs) such as power IoT [cao2020ifed], internet of vehicles (IoV) [zhao2020local], smart grids [taik2020electrical], etc. These works pave the way for adopting FL in the CPS domain. In particular, [taik2020electrical] shows that FL models, coupled with edge computing, perform very efficiently in short-term load forecasting while significantly reducing the networking load compared to a centralized model. Nevertheless, they do not cover the adversarial analysis of FL systems under model update poisoning attacks in CPSs. Since many mission-critical operations in CPSs such as smart grids depend on the model accuracy, DP-assisted poisoning attacks may have devastating consequences through the failure of physical layer devices. Therefore, it is important to investigate the attack surfaces of a DPFL model in CPSs. In this context, we focus on the adversarial analysis of the DP technique in the CPS domain, which will facilitate the future development of novel and effective defense strategies.

II-C3 Attack Mitigation Strategies in Adversarial DP

Although some recent works [farokhi2018security, giraldo2020adversarial] consider active attacks (e.g., FDI attacks, poisoning attacks, etc.) in DP-based CPSs (e.g., smart grids, transportation systems, etc.), they neither discuss stealthy model poisoning attacks nor develop defense strategies based on intelligent, RL-driven selection of the differential privacy level. In particular, they discuss and subsequently solve the optimal FDI attack problems by developing defense mechanisms based on anomaly detection schemes for the post-attack phases, instead of taking any initiative to reduce the attack surface beforehand.

In contrast, we analyze the correlation of the DP and FL parameters under adversarial settings; then, leveraging this correlation, we facilitate deployment of the desired level of privacy, utility, and security among the participating nodes in a DPFL system through RL. With the adversarial analysis and our proposed RL-assisted defense strategy, large-scale poisoning attacks can be detected and the attack surface can be minimized, i.e., the incentive of the attacker is reduced, which in turn reduces attack motivation while assisting attack prevention. In short, we develop our RL-assisted defense strategy as a part of the design process (pre-attack phase) to prevent untargeted model poisoning attacks. To the best of our knowledge, this is the first work that addresses the DP-exploitation issue in the FL setting and subsequently develops an RL-based defense strategy.

III Problem Formulation and Threat Model

Suppose we have a pool of clients, of which a subset is selected by the server in each communication round. The global model update for the next communication round is obtained by aggregating the local model updates that the selected clients compute in that round. Alternatively, the loss of the prediction can be calculated over the example set using the weights of the global model. According to the FederatedAveraging algorithm [mcmahan2017communication], the objective of the federated server is to minimize the federated loss. The server continues the process until the objective is met.
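As a rough illustration of one aggregation step, the sketch below samples a subset of clients and averages their weight vectors. It assumes a plain unweighted mean over the selected clients; the function names `select_clients` and `federated_average` are hypothetical, and the paper's exact update rule may differ (for instance, FederatedAveraging weights clients by their local dataset sizes).

```python
import random


def select_clients(total_clients, per_round, rng=random):
    """Pick `per_round` distinct client indices uniformly at random."""
    return rng.sample(range(total_clients), per_round)


def federated_average(local_weights):
    """Element-wise mean of the selected clients' weight vectors.

    Assumption: every client contributes equally (unweighted mean).
    """
    if not local_weights:
        raise ValueError("need at least one client weight vector")
    dim = len(local_weights[0])
    if any(len(w) != dim for w in local_weights):
        raise ValueError("all weight vectors must have the same length")
    k = len(local_weights)
    return [sum(w[j] for w in local_weights) / k for j in range(dim)]
```

For example, averaging the two vectors `[1.0, 2.0]` and `[3.0, 4.0]` gives `[2.0, 3.0]`.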

To introduce DP for preventing model privacy leakage while keeping the model usable, we need to (a) clip the local model updates using the median norm of the unclipped contributions so that the norm is bounded and learning keeps progressing, and (b) add noise from a DP-preserving randomized mechanism (e.g., the Laplace or Gaussian mechanism). The new global model update therefore includes Gaussian noise whose scale is determined by the variance $\sigma^2$ and by the sensitivity of the dataset with respect to the aggregation operation. The sensitivity needs to be selected in an optimal way so that the noise variance stays sufficient while the aggregated weights' distribution remains as close as possible to the original distribution. Following the related previous research [abadi2016deep, geyer2017differentially], we set it to the median norm of the unclipped contributions. We draw the noise from a Gaussian distribution with mean $\mu$, variance $\sigma^2$, and PDF (probability density function):

$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \qquad (1) $$

However, a malicious actor (if present) may modify (increase or decrease) the randomized noise in a way that (a) causes maximum damage and (b) avoids detection. To perform such a stealthy but strong malicious modification, the adversary needs to craft a fake noise profile from the same distribution function as (1) or, at least, a similar one. Earlier research on adversarial differential privacy [giraldo2020adversarial, hossain2021PSU] provides such an optimal attack distribution and attack impact, referred to below as (2).


Here, a high value of the attacker’s tolerance means that the adversary does not care about being detected, whereas a low value means that the adversary wants to keep a low profile to avoid detection and thus sacrifices attack impact. More specifically, the adversarial objective of stealthiness can be formulated as keeping the Kullback-Leibler divergence between the PDFs of the attack distribution and the benign distribution within the attacker's tolerance; the formulation also involves a term that indicates the classifier’s ability to correctly identify the inputs. Moreover, it can be inferred from (2) that during a data or model poisoning attack, the optimal attack impact shifts the benign mean to an attack mean. However, the shifted mean is equal to the actual mean when the attack impact is zero. In short, this implies that when there is no attack or no DP mechanism, the results (in this case, the model weights) remain intact. The optimal attack distribution of (2) has been obtained by solving the functional multi-criteria optimization problem of the attacker (i.e., maximum attack with minimum disclosure) and the defender (i.e., maximum privacy with maximum utility). Therefore, if the adversary deviates from the strategy given by (2), he could end up with even lower payoffs [giraldo2020adversarial].
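To make the trade-off between tolerance and mean shift concrete, the following sketch works through the equal-variance Gaussian case. It relies on the standard closed form for the KL divergence between two Gaussians; the function names and the assumption that the attack distribution keeps the benign variance are illustrative choices and are not claimed to reproduce equation (2) of the paper.

```python
import math


def kl_gaussian(mu_a, sigma_a, mu_0, sigma_0):
    """KL( N(mu_a, sigma_a^2) || N(mu_0, sigma_0^2) ) in nats."""
    return (math.log(sigma_0 / sigma_a)
            + (sigma_a ** 2 + (mu_a - mu_0) ** 2) / (2.0 * sigma_0 ** 2)
            - 0.5)


def max_mean_shift(sigma, tolerance):
    """Largest mean shift d such that the KL divergence between
    N(mu + d, sigma^2) and N(mu, sigma^2) stays within `tolerance`.

    With equal variances the divergence is d^2 / (2 sigma^2), so the
    bound is d = sigma * sqrt(2 * tolerance).
    """
    return sigma * math.sqrt(2.0 * tolerance)
```

Under these assumptions the shift vanishes when either the tolerance or the noise scale is zero, which matches the observation that the weights remain intact without an attack or without DP noise.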

This observation on adversarial DP analysis motivates us to first raise the question “what would be the adversarial impact on the DPFL system if the adversary follows the optimal attack strategy?” and then answer it through theoretical and empirical analysis. Moreover, this research problem motivates us to develop a novel and effective defense strategy against such attacks using RL-based intelligent selection of the differential privacy level.

III-A Threat Model

Our proposed threat model is depicted in Fig. 1. Here, we consider a simplified smart grid data transmission architecture that consists of edge devices (e.g., distributed energy resources (DERs), intelligent electronic devices (IEDs), phasor measurement units (PMUs), etc.), data aggregators (e.g., phasor data concentrators (PDCs)), and a central server. The adversary can be present at (1) the edge nodes (i.e., disguised as an edge device), (2) the communication pathway between the clients and the server, and (3) the server side. For data poisoning attacks, it is convenient for the adversary to compromise some edge devices (i.e., position 1), manipulate the local training data, and disguise the devices as honest edge nodes. For model poisoning attacks, however, the suitable positions for the attacker are positions 2 and 3, since from those positions the adversary can directly manipulate the FL models by compromising the communication path, seizing the model parameters, and then injecting fake noise into the parameters.

In the proposed setting, we assume that the adversary can manipulate the model updates regardless of the attack vector (i.e., through a man-in-the-middle or server-side attack formation). However, the adversary cannot directly change the models that are already on the server. He has white-box access (i.e., full knowledge of the global and local model parameters). The adversary might have partial knowledge of the training and testing data (i.e., the distributions of the data); however, this is not a strict requirement in our threat model. In addition, we assume that the adversary knows the imposed DP mechanism and the privacy budget. This assumption is particularly important and realistic, as many researchers including Dwork et al. [dwork2019differential] emphasize the necessity of publishing the privacy budget in order to increase the trustworthiness of the system.

Fig. 1: Threat model: The adversary is exploiting DP to inject false data into the model weights by compromising either the communication path or acting as a server.

IV Modeling DeSMP Attack and Defense in DPFL

In this section, we first describe the methodology of our proposed system development from an algorithmic point of view, and then, we model the proposed DeSMP attack and RL-assisted defense strategy.

IV-A Development of DPFL Systems

As discussed in section II-A, in a DPFL system, the global model is first constructed by aggregating all the local models from the randomly selected clients, and then DP noise is added to the model parameters to obfuscate the individual contributions of the clients. The working principle of a DPFL system with RL-based privacy selection is described by the pseudocode of Algorithm 1. The algorithm takes the measured data and the parameters of FL, RL, and DP as input. Then, through some intermediary functions (i.e., the local model, reinforcement learning model, and global model functions), the global model is computed. If the computed global model passes the accuracy test (i.e., its accuracy exceeds a pre-defined threshold), the global model is finalized and the DPFL process completes.

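Input: Measurement data; FL, RL, and DP parameters
Output: Final global FL model
Function LocalModel(measurement data, FL parameters):
       for each local epoch do
             for each batch do
                   update the local weights by gradient descent;
             end for
       end for
      return local model and norm of the model update;
End Function
Function RLModel(state):
       Choose action using epsilon-greedy policy;
       Observe reward and next state;
       return privacy loss selected by the action;
End Function
Function GlobalModel(local models, norms, privacy loss):
       clip the local updates, aggregate them, and add Gaussian noise;
       if the privacy budget is exceeded then  return the current global model;
       else update the privacy spent;
       return the global model;
End Function
while the accuracy threshold is not met do
       if a selected client is not available then
             wait for the client to be available
       end if
       run LocalModel, RLModel, and GlobalModel for this communication round
end while
Algorithm 1 DP- and RL- assisted FL process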

IV-A1 Local Model

This function takes the measurement data and learning parameters as inputs. Each client uses a portion of its data (i.e., a mini-batch) at a time and trains the global model on its local data. Finally, the local model and the norm of its update are calculated and sent back to the server.
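A minimal sketch of such a local training step is given below. For the sake of a self-contained example it assumes a linear model trained with mini-batch gradient descent on a squared loss, whereas the paper's experiments use a three-layer neural network; the function name `local_update` and its parameters are hypothetical.

```python
import math


def local_update(global_w, data, lr=0.01, epochs=1, batch_size=2):
    """Train a copy of the global weights on local data.

    data: list of (features, target) pairs; the model is linear and the
    loss is the mean squared error (an assumption made for brevity).
    Returns the weight update (delta) and its L2 norm.
    """
    w = list(global_w)
    for _ in range(epochs):
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad = [0.0] * len(w)
            for x, y in batch:
                err = sum(wj * xj for wj, xj in zip(w, x)) - y
                for j, xj in enumerate(x):
                    grad[j] += 2.0 * err * xj / len(batch)
            # One gradient descent step per mini-batch.
            w = [wj - lr * gj for wj, gj in zip(w, grad)]
    delta = [wj - gj for wj, gj in zip(w, global_w)]
    norm = math.sqrt(sum(d * d for d in delta))
    return delta, norm
```

The returned norm is what a server could later use to choose a clipping bound.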

IV-A2 Reinforcement Learning Model

The purpose of this function is to generate the optimal policy for determining the privacy level, considering the trade-off among privacy, utility, and security in a DPFL system. The input of this function is the state of the system. The function exploits the converged Q-table to determine the optimal action (i.e., the value of the privacy loss) at each state of the learning process.
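The sketch below shows how an agent of this kind could pick actions from a Q-table and update it with standard tabular Q-learning. The dictionary-based table, the state and action labels, and the default learning rate and discount factor are all assumptions made for illustration; the paper's state encoding is not reproduced here.

```python
import random


def choose_action(q_table, state, actions, epsilon, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon,
    otherwise take the action with the highest Q-value."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_table.get((state, a), 0.0))


def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.1, discount=0.9):
    """Standard tabular Q-learning update; unseen entries default to 0."""
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + discount * best_next - old)
    return q_table[(state, action)]
```

Once the table has converged, calling `choose_action` with `epsilon=0.0` returns the greedy (learned) action for a state.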

IV-A3 Global Model

The main purpose of this function is to produce the global model after each communication round through the FederatedAveraging procedure [mcmahan2017communication] until the model finally converges around a pre-defined threshold value. The function also checks whether the privacy budget has been exhausted. Another important task of this function is to clip the gradients, to avoid over-fitting or exploding gradients, and to add Gaussian noise accordingly.
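A hedged sketch of the clip, aggregate, and add-noise step follows. It assumes that the clipping bound is the median of the update norms and that the Gaussian noise has standard deviation equal to `sigma` times that bound, in the spirit of the client-level DP scheme of [geyer2017differentially]; the function name `dp_aggregate` and these scaling choices are assumptions rather than the paper's exact procedure.

```python
import math
import random
import statistics


def dp_aggregate(global_w, deltas, sigma, rng=random):
    """Clip client updates, average them, and add Gaussian noise.

    deltas: list of per-client weight updates (lists of floats).
    Assumptions: clipping bound = median update norm; noise standard
    deviation = sigma * clipping bound, added once per coordinate.
    """
    if not deltas:
        raise ValueError("need at least one client update")
    norms = [math.sqrt(sum(d * d for d in delta)) for delta in deltas]
    clip = statistics.median(norms)
    k = len(deltas)
    new_w = []
    for j, wj in enumerate(global_w):
        total = 0.0
        for delta, norm in zip(deltas, norms):
            # Scale down only those updates whose norm exceeds the bound.
            scale = max(1.0, norm / clip) if clip > 0 else 1.0
            total += delta[j] / scale
        noise = rng.gauss(0.0, sigma * clip)
        new_w.append(wj + (total + noise) / k)
    return new_w
```

Setting `sigma=0.0` disables the noise and leaves plain clipped averaging, which is convenient for checking the clipping logic in isolation.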

IV-B Modeling DeSMP Attack

To perform the proposed DeSMP attack, the adversary needs to choose his level of stealthiness, i.e., the attacker’s tolerance. Here, the adversarial goal is to perform the attack so that the model becomes unusable and ineffective (i.e., converges to a bad minimum or causes denial of service) while the attack stays stealthy. For instance, in a classification problem with given test inputs, output labels, a global weight vector, a global model, and benign and attack distributions, the adversarial objective is to maximize the number of misclassifications while keeping the divergence between the attack and benign distributions below the attacker's tolerance level.

Fig. 2: Evaluation of DeSMP attack model on MNIST dataset [deng2012mnist]: (a) training vs validation loss for three random clients (b) test accuracy for non-DP, DP, and FDI-DP data with varying privacy loss and attacker’s tolerance (c) DPFL model prediction (d) generating incorrect prediction due to DeSMP attack

To achieve this goal, the adversary carefully selects the tolerance value and draws noise from the optimal attack distribution, as represented by (2). In other words, the adversary replaces the benign Gaussian noise mechanism with a malicious noise-adding mechanism that follows (2), in which one parameter is the mean (location parameter) of the Gaussian distribution and the other is its scaling factor. By controlling the tolerance level, the adversary can control the attack impact and shift the mean further from its actual value. In short, increasing (decreasing) the tolerance increases (decreases) the level of injected noise. Nevertheless, since the attack distribution has the same statistical properties as the benign distribution, the adversarial noise, and hence the poisoned weights, will not differ much statistically from the other weights. More specifically, unless the adversary chooses a very large tolerance, the proposed DeSMP attack will achieve stealthiness while remaining persistent. We empirically observe and evaluate the proposed DeSMP attack on the FL models in section V.
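The noise swap can be illustrated with a small sampling experiment. The sketch assumes that the attack distribution is the benign Gaussian with its mean shifted by `sigma * sqrt(2 * tolerance)`, which is the largest shift that keeps the equal-variance KL divergence within the tolerance; this choice and the function names are assumptions for illustration, not the paper's equation (2).

```python
import math
import random


def benign_noise(rng, mu, sigma):
    """Draw one sample from the benign Gaussian mechanism."""
    return rng.gauss(mu, sigma)


def attack_noise(rng, mu, sigma, tolerance):
    """Draw one sample from a mean-shifted Gaussian (assumed attack
    distribution); the variance is kept equal to the benign one."""
    shift = sigma * math.sqrt(2.0 * tolerance)
    return rng.gauss(mu + shift, sigma)


def sample_mean(draw, n):
    """Average of n calls to the zero-argument sampler `draw`."""
    return sum(draw() for _ in range(n)) / n


rng = random.Random(0)
benign_mean = sample_mean(lambda: benign_noise(rng, 0.0, 1.0), 20000)
attack_mean = sample_mean(lambda: attack_noise(rng, 0.0, 1.0, 0.5), 20000)
```

With a tolerance of 0.5 and unit standard deviation, the two sample means differ by roughly 1.0 while the spread of individual samples is unchanged, which is why a per-sample check has difficulty telling the two apart.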

IV-C Modeling RL-assisted Defense Strategy:

RL [sutton2018reinforcement] is an adaptive ML approach that can equip conventional mechanisms with intelligence without the need for any supervision. The distinguishing attributes of RL are a feedback loop (trial and error) in the search for an optimal action set, and delayed rewards. These attributes motivate researchers to deploy RL in diverse sectors, e.g., mmWave communications, smart grids, IoV [9469488], etc.
The addition of DP during the training process enables the adversary to launch stealthy FDI or poisoning attacks. Moreover, DP degrades the federated accuracy, and the trade-off between privacy and model performance is difficult to understand and balance, both theoretically and empirically [9084352]. On top of this, the FDI attack vector extends the requirement to a trade-off among three different aspects: privacy, utility, and security. Therefore, selecting the privacy loss level optimally is a crucial requirement in a DPFL system. Our proposed RL-based model assists this optimal privacy policy selection process. Moreover, it defends the learning process against DeSMP attacks by reducing the incentive of the adversary, which in turn reduces attack motivation while assisting attack prevention.

Action Space: We assume that the agent makes decisions in an event-driven manner. By observing the current state of the federated environment, the agent chooses one of the decisions in the action set, which defines the action space. To make the agent’s decision process fine-grained, we assume that the agent can increase or decrease the privacy loss by multiple steps (i.e., by a single unit or a double unit at any state).
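One way to picture such an action space is sketched below. The four actions (down or up by one or two units), the step size, and the bounds on the privacy loss are hypothetical values chosen for illustration; the paper's exact action set is not reproduced here.

```python
# Hypothetical action set: change the privacy loss by one or two units,
# in either direction.
ACTIONS = (-2, -1, 1, 2)


def apply_action(privacy_loss, action, step=0.1, lower=0.1, upper=20.0):
    """Apply an action and keep the privacy loss inside [lower, upper].

    `step`, `lower`, and `upper` are illustrative defaults.
    """
    if action not in ACTIONS:
        raise ValueError("unknown action")
    new_value = privacy_loss + action * step
    return min(upper, max(lower, new_value))
```

Clamping keeps the agent from proposing a privacy loss outside the range that the system designer allows.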

Reward Function: The reward motivates an agent to make decisions towards the learning objectives. For defense against the DeSMP attack, the objective of the agent is to minimize the maximum attack accuracy as well as to maximize the federated accuracy. We assume that the maximum and minimum thresholds are set and regulated by the DPFL system designer. The reward function of the agent, equation (4), is defined in terms of the maximum values of the FDI attack loss and the federated loss, weighted by three balancing parameters.
Here, we use the epsilon-greedy policy [wunder2010classes] to determine the trade-off between exploration and exploitation. We start from an initial exploration probability and gradually reduce it over episodes until it reaches the minimum exploration probability assumed in this paper.
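A decaying exploration schedule of this kind can be sketched as follows. The multiplicative decay and the default values for the initial probability, the minimum probability, and the decay rate are assumptions for illustration, since the paper's exact values are not available here.

```python
def exploration_probability(episode, start=1.0, minimum=0.01, decay=0.995):
    """Exploration probability after `episode` episodes.

    Multiplicative decay from `start`, never dropping below `minimum`.
    All default values are hypothetical.
    """
    return max(minimum, start * decay ** episode)
```

Early episodes are then dominated by exploration, while later episodes mostly exploit the learned Q-values.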

V Experimental Analysis

We simulate an FL environment in order to test our proposed algorithm. Moreover, for a comprehensive evaluation of our proposed DeSMP attack model, we focus on the persistence, effectiveness, and stealthiness of the proposed attack under different scenarios for two well-known datasets.

V-A Dataset Description and Experimental Setup:

We utilize the benchmark dataset MNIST (with non-I.I.D. distributions) [deng2012mnist] and the Individual household electric power consumption dataset [hebrail2012individual] to evaluate our proposed DeSMP attack. For MNIST, we use 10,000 test images to evaluate the performance of the attack model. In all of the experiments using these two datasets, following the standard FL setup, each selected participant uses the SGD (stochastic gradient descent) optimizer to train its local model for the internal epochs with the local learning rate. All of the experiments are done on a server with Intel(R) Core(TM) i7-9700F CPU @ 3.00GHz, 4 NVIDIA GeForce RTX 2060 GPUs with 16 GB RAM each, and Windows 10 (64-bit) OS, with Python 3.8.8 and PyTorch 1.5.1.

V-B Deployment and Evaluation of DPFL Model:

To simulate the DPFL environment, we follow some notable prior works [geyer2017differentially, bhagoji2019analyzing, fang2020local] and select the values of the major parameters according to Table II. Moreover, for simplicity, we conduct the experiments with a neural network of three layers. For the classification problem (i.e., MNIST), the Log_Softmax activation function is used on top of the ReLU function, whereas for the regression problem, only the ReLU function is used. To add DP-generated noise to the model weights, we modify the FederatedAveraging [mcmahan2017communication] procedure according to Algorithm 1. In each experiment, when the privacy budget is exhausted, the learning stops and the server finalizes the global model.

MNIST 100 30 32 0.001 0.1-20 10 0.01
Consumption 100 30 7 0.001 0.1-20 10 0.1
TABLE II: Parameters for FL simulation

Fig. 3: Evaluation of DeSMP attack model on Individual household electric power consumption dataset [hebrail2012individual]: (a) test loss converges even when DP is applied (b) test loss increases as the attacker’s tolerance increases (c) more privacy (i.e., small privacy loss) leads to more attack opportunity (d) high privacy and high attacker’s tolerance initiates denial-of-service.

For MNIST, the training (Tr) and validation (Val) losses of three random clients (C1, C2, and C3) in an arbitrary communication round are depicted in Fig. 2(a). In each successive epoch, both the training and validation losses decrease. Also, from Fig. 2(b), we can see that the DPFL algorithm converges after a few communication rounds. The final accuracy value after round is around . Another important observation is that the privacy budget () is spent very quickly if the is small, and the model cannot converge properly. Therefore, it is important to select the privacy loss and budget level (i.e., and ) intelligently and optimally so that the model attains the desired level of privacy and utility. Likewise, for the Power consumption dataset, we verify the DPFL approach and find similar results. The cost of applying DP (i.e., the ‘privacy cost’) on the global model loss varies with . Thus, more privacy leads to more loss for both the classification and regression problems.

V-C Implementation and Evaluation of DeSMP Model:

To demonstrate the proposed DeSMP attack, we replace the benign noise addition mechanism of the DP technique with the adversarial noise addition scheme. More specifically, to simulate the behavior of an actual adversary, instead of drawing noise from the benign Gaussian distribution (), we now draw noise from the attack distribution (). We can see the impact of this model poisoning action in the DP-FDI curves of Fig. 2(b). Due to the addition of malicious noise, the overall accuracy decreases. However, the degree of accuracy degradation largely depends on the attacker’s tolerance level (). An attacker who prefers a more devastating attack over stealthiness would select a large (i.e., ) and thereby reduce the accuracy considerably. Conversely, selecting a small (i.e., ) yields a smaller payoff in terms of attack impact.
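The swap from benign to adversarial noise can be illustrated with a short sketch. The exact attack distribution is not reproduced in this section, so the form below is purely an assumption for illustration: a Gaussian whose mean is shifted by `tolerance * sigma`, where `tolerance` is a hypothetical stand-in for the attacker's tolerance level. The function names are likewise hypothetical.

```python
import random


def benign_noise(n, sigma, rng):
    """Zero-mean Gaussian noise, as a benign DP mechanism would add."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def adversarial_noise(n, sigma, tolerance, rng):
    """Noise from an illustrative attack distribution.

    ASSUMPTION: the mean is shifted by tolerance * sigma; the source's
    actual attack distribution may be defined differently.
    """
    return [rng.gauss(tolerance * sigma, sigma) for _ in range(n)]


def perturb(weights, noise):
    """Add a noise vector to a flat list of model weights."""
    return [w + z for w, z in zip(weights, noise)]


def mean_shift(clean, noisy):
    """Average signed deviation introduced by the noise."""
    return sum(b - a for a, b in zip(clean, noisy)) / len(clean)
```

Under this assumption a small `tolerance` keeps the poisoned noise close to the benign draw, which matches the stealth argument in the text, while a large `tolerance` biases every weight in the same direction and trades stealth for impact.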

In Fig. 2(c) and (d), we can further observe the impact of our proposed approach. Fig. 2(c) shows the outcome of the proposed DPFL model on some randomly selected MNIST image samples, whereas Fig. 2(d) depicts the adversarial outcomes of our proposed DeSMP model for the same samples. Due to the stealthy adversarial noise with , only the image of digit ‘4’ is wrongly predicted as digit ‘7’, while the other digits are predicted correctly. Since we consider the untargeted model poisoning attack, the adversarial action of the proposed DeSMP model may alter the image label differently each time. However, as the overall accuracy does not degrade much with a low , the malicious action remains stealthy and goes unnoticed by the anomaly detectors.

Likewise, the impact of adversarial action exploiting the DP noise on the Power consumption dataset is illustrated in Fig. 3. It can be observed that the DeSMP attack also increases the loss as the and values increase. However, for the regression problem, if the raw training data across all the clients are similar and identically distributed, then the attack requires adding more noise (i.e., small and large ) in order to achieve the desired level of attack impact. For instance, we can see from Fig. 3(a) that even after adding the DP mechanism with different , the model converges after a sufficient number of communication rounds. This is desirable, since privacy is preserved and the utility remains satisfactory. However, in the presence of an adversary, the loss starts to increase. This phenomenon can be observed in Fig. 3(b)-(d). Moreover, as starts to decrease (i.e., privacy increases) from to , the attacker obtains more attack opportunities. From Fig. 3(b), it can be inferred that shifting from to increases the loss by times (i.e., to ), whereas in Fig. 3(c), shifting from to increases the loss by more than times (i.e., from to more than ). Moreover, comparing the red FDI-DP curves of Fig. 3(b) and (c), it can be seen that decreasing by half (from to ) increases the test loss by almost times ( to ) when the tolerance level is relatively high ().

Therefore, the attack impact () increases significantly as the attacker’s tolerance level, , increases, and the model becomes sub-optimal. Eventually, through the DeSMP attack, at a very low and high , the DPFL model becomes unusable, resulting in denial-of-service. Nevertheless, the proposed DeSMP model can also be tailored to conduct more devastating attacks while maintaining stealthiness through hyper-parameter tuning and selective choice of the FL parameters listed in Table II.

V-D Implementing RL-assisted Privacy Selection

Fig. 4 illustrates the accumulated reward of the defending agent for learning rate and discount factor for two distinct datasets (MNIST and Power consumption). The trend in the figure shows that the agent learns an optimal policy over episodes and converges after sufficient episodes are executed. Since we define the reward function such that it accounts for the federated loss, attacker loss, and privacy loss, this convergence finds the optimal trade-off policy for the privacy, security, and utility of the system. Since the agent outputs an action (or ) for each state, we can calculate the standard value of the federated loss for that state. Therefore, by comparing the practical (real-time observed) federated loss with the expected (or standard) one, we can infer whether an attack has been launched. Specifically, if the is less than , we can infer that a large-scale (large ) FDI attack has been launched; otherwise, the system is not compromised or the scale of the FDI attack (low ) is very low.
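The privacy-selection agent and the loss-based detection check can be sketched as below. This section does not state which RL algorithm or reward formula is used, so the sketch assumes tabular Q-learning, a reward equal to the negative weighted sum of the three losses, and a symmetric deviation margin for detection; all of these, along with every function and parameter name, are illustrative assumptions rather than the authors' definitions.

```python
import random


def reward(fed_loss, atk_loss, priv_loss, weights=(1.0, 1.0, 1.0)):
    """ASSUMPTION: reward is the negative weighted sum of the three losses."""
    a, b, c = weights
    return -(a * fed_loss + b * atk_loss + c * priv_loss)


def select_action(q_row, explore_prob, rng):
    """Epsilon-greedy choice over candidate privacy levels (action indices)."""
    if rng.random() < explore_prob:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def q_update(q, state, action, r, next_state, alpha, gamma):
    """Standard tabular Q-learning update; returns the new Q-value."""
    target = r + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def attack_suspected(observed_loss, expected_loss, margin):
    """Flag an attack when the observed federated loss deviates from the
    expected loss for the chosen privacy level by more than a margin.

    ASSUMPTION: a symmetric margin is used here for illustration only.
    """
    return abs(observed_loss - expected_loss) > margin
```

In such a setup the agent would call `select_action` once per state, run a communication round at the chosen privacy level, and feed the measured losses through `reward` and `q_update`; `attack_suspected` would then compare the round's observed loss against the loss expected for that action.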

VI Conclusion and Future Works

Federated learning (FL) can be vulnerable to privacy-violating and security-compromising attacks despite having privacy-preserving tools like DP. Model update poisoning is one such attack. However, stealthy and persistent model poisoning attacks are difficult to achieve. Motivated by this, in this paper, we analyze the adversarial learning process in an FL setting and show that a stealthy and persistent model poisoning attack can be conducted by exploiting the differential noise. More specifically, we develop an unprecedented DP-exploited stealthy model poisoning (DeSMP) attack for FL models. Our empirical analysis on both classification and regression tasks using two popular datasets reflects the effectiveness of the proposed DeSMP attack. Moreover, we develop a novel reinforcement learning (RL)-based defense strategy against such poisoning attacks, which can intelligently and dynamically select the privacy policy of the FL models to minimize the DeSMP attack surface, optimize privacy, security, and utility, and facilitate attack detection.

In the future, we will extend our defense model to a collaborative multi-agent setting where a team of clients can exploit the learned policy to collaboratively provision privacy during the training phase. Although we focus on untargeted model poisoning attacks in a DPFL system in this paper, it would also be interesting to investigate the adversarial impact of targeted model poisoning with our proposed DeSMP attack model. We leave this for our future work on adversarial federated learning.

Fig. 4: No. of episodes vs. accumulated rewards for RL assisted privacy selection agent