I Introduction
Federated Learning (FL), also known as collaborative learning, has attracted considerable attention from the research community since it was first introduced in 2016 by McMahan et al. [mcmahan2017communication]. This is mostly due to the inherent privacy protection that FL offers to its users. In the FL process, a model is trained on a distributed network of edge nodes using their local data, rather than in the traditional centralized training fashion. This provides a level of data privacy assurance to the users, since the confidential data never leave the edge nodes.
However, the process of FL can be vulnerable to differential attacks (e.g., membership inference attacks (MIA)) which aim to reveal the sensitive information of a node by analyzing the distributed model parameters [geyer2017differentially] or gradients [zhu2020deep]. To alleviate this privacy issue, extensive research has been carried out lately, focusing on developing secure multiparty computation (SMPC) [li2020privacy], trusted execution environments (TEEs) [mo2019efficient], cryptographic encryption [zhang2020batchcrypt, sadique2021cybersecurity, sadique2019system], and differential privacy (DP)-based privacy-preservation techniques [dwork2006calibrating, kairouz2019advances, bhagoji2019analyzing] for FL. Among these, DP is considered a very promising technique to preserve data privacy and prevent MIA [shokri2017membership]. Existing works along this research line include DP-based distributed SGD [abadi2016deep] and local DP (LDP) [wang2019local], among others.
Although DP provides a level of privacy guarantee, an adversary can exploit the DP noise to inject false data into the original data and hide the attack identity within the noise range [giraldo2020adversarial]. In this paper, we investigate this vulnerability of DP-based applications and show that in a differentially private FL setting (which we call 'DP-FL'), a malicious actor can inject false data either into the differentially private training data (i.e., a data poisoning attack [biggio2012poisoning]) or into the model parameters (i.e., a model poisoning attack [bagdasaryan2020backdoor, bhagoji2019analyzing]). More specifically, we demonstrate a stealthy model poisoning attack on the FL model that exploits the noise of the DP mechanism to (1) reduce the overall accuracy of the global federated model, and (2) deceive traditional anomaly detection mechanisms by hiding the false data in the DP noise. The results in this paper reveal a new backdoor for stealthy and untargeted model poisoning attacks in FL through the exploitation of the DP mechanism.
I-A Motivations
Poisoning attacks in any machine learning (ML) setting can be broadly divided into two major categories: targeted and untargeted attacks [kairouz2019advances]. Targeted poisoning attacks [bagdasaryan2020backdoor, bhagoji2019analyzing] aim to change the outcome or behavior of the model on particular inputs while maintaining good overall accuracy on all other inputs, thus making both the attack and defense processes more difficult. On the contrary, untargeted model poisoning attacks [biggio2012poisoning, fang2020local] have the power to make a model unusable and eventually lead to a denial-of-service attack [fang2020local]. For instance, an adversary may perform untargeted attacks on its competitor's FL model with the intention of making the model infeasible.
However, traditional untargeted poisoning attacks mainly utilize the hyperparameters of the targeted model to scale up the effectiveness of the malicious model [bhagoji2019analyzing]. To attain the goal of poisoning, the adversary may use explicit boosting, which deforms the weights' distribution; however, this can be easily detected by the server through simple server-side model checking [pillutla2019robust]. Hence, conducting untargeted model poisoning attacks in a stealthy manner remains an open problem in FL [zhou2021deep]. Moreover, since an FL system usually consists of a huge number of clients and only a portion of the clients is chosen in any particular round [sun2019can], the odds of significantly impacting the global model accuracy with a single malicious contribution are very low. This leads us to the question:
"How can the adversary perform an untargeted model poisoning attack in a stealthy but persistent fashion?" Motivated by this question, in this paper, we investigate the DP mechanism as a tool to conduct such adversarial poisoning attacks in FL. In the rest of the paper, the terms 'false data injection (FDI)' attack and 'model poisoning' attack are used interchangeably.

I-B Contributions
In this paper, we show that the DP mechanism creates a new attack avenue for stealthy false data injection (FDI) or model poisoning attacks in a DP-FL environment. We name this attack model the 'DP-exploited stealthy model poisoning' (in short, DeSMP) attack. In particular, we make the following contributions:

- We demonstrate that DP, as a privacy-preserving tool, opens a new backdoor for untargeted model poisoning attacks in the FL setting. Our proposed attack strategy (DeSMP) is stealthy and persistent in nature.

- To tackle the proposed DeSMP attack, we develop a reinforcement learning (RL)-based defense strategy. The proposed RL-based defense approach intelligently selects the differential privacy level for the clients' model updates. It also minimizes the attack vectors and facilitates attack disclosure.
Section II of this paper covers the preliminaries of FL and briefly reviews the related work, while Section III outlines the research problem and threat model. Section IV formulates the proposed DeSMP attack and defense model and their working principles. In Section V, we analyze and evaluate the effectiveness of our proposed model. Finally, in Section VI, we conclude the paper with some future research directions.
II Preliminaries and Literature Review
Here, we discuss the basic mechanism of FL while pointing out significant contrasts between this work and notable existing research in adversarial FL. Table I describes the major symbols used in this paper.
II-A Mechanism of Federated Learning with DP
FL introduces a collaborative zone for training a model among a set of workers. Here, each participating node maintains a local model for its local training dataset. Additionally, FL incorporates a server that aggregates all the local models to form a global model [mcmahan2017communication]. Furthermore, to tackle MIA through analysis of the model weights, the FL server generally includes a privacy-preserving mechanism such as DP [geyer2017differentially]. Here, DP adds random Laplacian or Gaussian noise to the model weights.
Nonetheless, while deploying the DP mechanism, researchers [geyer2017differentially, sun2019can] have suggested using norm clipping or early stopping methods to compensate for the high level of random differential noise and prevent the model from becoming completely unusable. Once a predefined testing criterion is met (e.g., model accuracy exceeds a threshold, or the privacy budget is exhausted), the server finalizes the global model and stops the training procedure; otherwise, the training process reinitiates.
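As a concrete illustration, the clip-then-add-noise aggregation step can be sketched as follows. This is a minimal NumPy sketch; the function names and the fixed clipping norm are our own illustrative choices, not the exact implementation of [mcmahan2017communication] or [geyer2017differentially]:

```python
import numpy as np

def clip_update(update, clip_norm):
    # Scale the update so its L2 norm is at most clip_norm.
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_norm / max(norm, 1e-12))

def dp_aggregate(updates, clip_norm, sigma, rng):
    # Average the clipped client updates, then add Gaussian DP noise
    # whose scale is calibrated to the clipping norm (the sensitivity).
    clipped = [clip_update(u, clip_norm) for u in updates]
    noise = rng.normal(0.0, sigma * clip_norm / len(updates),
                       size=clipped[0].shape)
    return np.mean(clipped, axis=0) + noise
```

Clipping bounds each client's influence (the sensitivity), which is what lets the Gaussian noise scale stay fixed regardless of how large any single malicious update is.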
II-B Adversarial Federated Learning
Although the DP-based FL models do not expose a client's training data to the rest of the world, there exist several attack vectors that an adversary can exploit to perform malicious modifications or gain unauthorized access to confidential information. For instance, some malicious clients might inspect all messages received from the server and then, in the training phases, selectively poison the local models to reduce the efficiency of the global model [kairouz2019advances]. Other examples of adversarial FL include the targeted and untargeted model poisoning attacks [bhagoji2019analyzing, fang2020local]. Moreover, unlike centralized ML schemes, FL systems may employ a large number of untrusted devices, which may facilitate training-time and inference-time attacks [kairouz2019advances]. In this paper, we focus on one of the most powerful attack classes, the untargeted model poisoning attack [fang2020local]. The adversary can conduct this model poisoning attack either by directly manipulating a client's model or through the widely known man-in-the-middle attack formation, leveraging network and system vulnerabilities [kairouz2019advances].
II-C Related Research Work
In this part, we discuss some notable prior research related to the untargeted model poisoning attacks and defenses in FL while outlining some contrasting points with ours.
II-C1 Byzantine-robust Aggregation in an Adversarial Setting
Byzantine threat models [guerraoui2018hidden] produce arbitrary outputs for any wrong inputs (whether from an honest participant or a malicious actor). These arbitrary outputs can cause the model to converge to a suboptimal state. Moreover, the Byzantine clients may need white-box access or access to the non-Byzantine clients' updates to make their attack stealthy [kairouz2019advances]. Nonetheless, to the best of our knowledge, none of the existing works explore the vulnerabilities of DP-based applications in tailoring such stealthy attacks. In contrast, we demonstrate that the Byzantine clients or the server can conduct stealthy and persistent untargeted model poisoning attacks by hiding behind the DP mechanism. In particular, we demonstrate the DP-exploited stealthy model poisoning (DeSMP) attack in an untargeted manner for FL models.
II-C2 DP-assisted FL Frameworks in CPSs
Another related line of research focuses on developing novel FL frameworks for cyber-physical systems (CPSs) such as power IoT [cao2020ifed], the internet of vehicles (IoV) [zhao2020local], and smart grids [taik2020electrical]. These works pave the way for adopting FL in the CPS domain. In particular, [taik2020electrical] shows that FL models, coupled with edge computing, perform very efficiently in short-term load forecasting while significantly reducing the networking load compared to a centralized model. Nevertheless, they do not cover the adversarial analysis of FL systems for model update poisoning attacks in CPSs. Since, in CPSs like smart grids, many mission-critical operations depend on model accuracy, DP-assisted poisoning attacks may create devastating consequences through the failure of physical-layer devices. Therefore, it is non-trivial to investigate the attack surface of a DP-FL model in CPSs. In this context, we focus on the adversarial analysis of the DP technique in the CPS domain, which will facilitate the future development of novel and effective defense strategies.
II-C3 Attack Mitigation Strategies in Adversarial DP
Although some recent works [farokhi2018security, giraldo2020adversarial] consider active attacks (e.g., FDI attacks, poisoning attacks, etc.) in DP-based CPSs (e.g., smart grids, transportation systems, etc.), they neither discuss stealthy model poisoning attacks nor develop any defense strategy based on intelligent selection of the differential privacy level through RL. In particular, they discuss and solve optimal FDI attack problems by developing defense mechanisms based on anomaly detection schemes for the post-attack phase, instead of taking any initiative to reduce the attack surface beforehand.
In contrast, we analyze the correlation of the DP and FL parameters under adversarial settings; then, leveraging this correlation, we facilitate the deployment of the desired level of privacy, utility, and security among the participating nodes in a DP-FL system through RL. Following the adversarial analysis and our proposed RL-assisted defense strategy, large-scale poisoning attacks can be detected and the attack surface can be minimized, i.e., the incentive of the attacker can be reduced, which in turn reduces attack motivation while assisting attack prevention. In short, we develop our RL-assisted defense strategy as part of the design process (pre-attack phase) to prohibit untargeted model poisoning attacks. To the best of our knowledge, this is the first work that addresses the DP-exploitation issue in the FL setting and develops a corresponding RL-based defense strategy.
III Problem Formulation and Threat Model
Suppose we have $K$ clients, among which $m$ clients are selected in each communication round by the server. If the local model updates are $\Delta w_{t+1}^{k}$, then the global model update at the $(t+1)$-th communication round is $w_{t+1} = w_t + \frac{1}{m}\sum_{k=1}^{m}\Delta w_{t+1}^{k}$, where the local model update at the $(t+1)$-th round is $\Delta w_{t+1}^{k} = w_{t+1}^{k} - w_t$. Alternatively, the loss of the prediction can be calculated as $\ell(w_G; x_i)$, where $\{x_i\}$ is the example set and $w_G$ represents the weights of the global model. Now, according to the FederatedAveraging algorithm [mcmahan2017communication], the objective of the federated server is to minimize the function $F(w_G) = \sum_{k=1}^{K}\frac{n_k}{n}F_k(w_G)$, where $F_k$ is the local objective of client $k$ and $n_k$ is its number of samples. The server continues the process until the objective is met.
To introduce DP for preventing model privacy leakage while keeping the model usable, we need to (a) clip the local model updates using the median norm of the unclipped contributions, $S = \operatorname{median}_k\{\|\Delta w_{t+1}^{k}\|\}$, so that the norm is limited and learning progresses, and (b) add noise from a DP-preserving randomized mechanism (e.g., the Laplace or Gaussian mechanism). Therefore, the new global model update with Gaussian noise at the $(t+1)$-th round becomes $w_{t+1} = w_t + \frac{1}{m}\big(\sum_{k=1}^{m}\Delta w_{t+1}^{k}/\max(1, \|\Delta w_{t+1}^{k}\|/S) + \mathcal{N}(0, \sigma^2 S^2)\big)$. Here, $\sigma^2$ is the variance and $S$ is the sensitivity of the dataset with respect to the aggregation operation. The value of $\sigma$ needs to be selected in an optimal way so that the noise variance stays sufficient while the aggregated weights' distribution remains as close as possible to the original distribution; we set $\sigma$ following the related previous research [abadi2016deep, geyer2017differentially]. We draw the noise from a Gaussian distribution with mean $\mu$, variance $\sigma^2$, and PDF (probability density function):

$$f(x \mid \mu, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \tag{1}$$
However, a malicious actor (if present) may modify (increase or decrease) the randomized noise in a fashion that (a) maximizes the damage and (b) avoids detection. To perform such a stealthy but strong malicious modification, the adversary needs to craft a fake noise profile from either the same or at least a similar distribution function as (1). Earlier research on adversarial differential privacy [giraldo2020adversarial, hossain2021PSU] provides such an optimal attack distribution ($f_a$) and impact ($I$) as follows:

$$f_a(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu-I)^2}{2\sigma^2}}, \qquad I = \sigma\sqrt{2\gamma} \tag{2}$$

Here, a high value of the attacker's tolerance ($\gamma$) represents that the adversary does not care about being detected, whereas a low value means the adversary wants to keep a low profile to avoid detection, and thus sacrifices the attack impact ($I$). More specifically, the adversarial objective of stealthiness can be formulated as $D_{KL}(f_a \,\|\, f) \le \gamma$, where $D_{KL}$ is the Kullback-Leibler divergence between the PDF of the attack distribution ($f_a$) and the benign distribution ($f$), which bounds the classifier's ability to correctly identify the inputs. Moreover, it can be inferred from (2) that during a data or model poisoning attack, the optimal attack impact ($I$) shifts the benign mean from $\mu$ to $\mu + I$. However, $\mu + I$ is equal to the actual mean ($\mu$) when $\sigma$ is zero. In short, this implies that when there is no attack or no DP mechanism, the results (in this case, the model weights) remain intact. The optimal attack distribution of (2) is obtained by solving the functional multi-criteria optimization problem of the attacker (i.e., maximum attack with minimum disclosure) and the defender (i.e., maximum privacy with maximum utility). Therefore, if the adversary deviates from the strategy given by (2), he could end up with even lower payoffs [giraldo2020adversarial]. This observation on adversarial DP analysis motivates us to first raise the question "What would be the adversarial impact on the DP-FL system if the adversary follows the optimal attack strategy $f_a$?" and then answer it through theoretical and empirical analysis. Moreover, this potential research problem motivates us to develop a novel and effective defense strategy against such attacks using RL-based intelligent differential privacy level selection.
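Under this reading of (2), that is, a shifted Gaussian with the same variance whose mean shift $I = \sigma\sqrt{2\gamma}$ saturates the KL budget, the relation between tolerance and impact can be checked numerically. The function names below are our own:

```python
import math

def attack_impact(sigma, gamma):
    # Largest mean shift I the adversary can inject while keeping
    # KL(f_a || f) <= gamma for equal-variance Gaussians.
    return sigma * math.sqrt(2.0 * gamma)

def kl_same_var_gaussians(mean_shift, sigma):
    # KL( N(mu + I, sigma^2) || N(mu, sigma^2) ) = I^2 / (2 sigma^2).
    return mean_shift ** 2 / (2.0 * sigma ** 2)
```

For any $\sigma > 0$, plugging the impact back into the divergence recovers the tolerance exactly, i.e., the optimal attack sits on the boundary of the stealthiness constraint; and $I \to 0$ as $\sigma \to 0$, matching the observation that without DP noise the weights remain intact.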
III-A Threat Model
Our proposed threat model is depicted in Fig. 1. Here, we consider a simplified smart grid data transmission architecture which consists of some edge devices (e.g., distributed energy resources (DERs), intelligent electronic devices (IEDs), phasor measurement units (PMUs), etc.), data aggregators (e.g., phasor data concentrators (PDCs)), and a central server. The adversary can mark his presence at (1) the edge nodes (i.e., disguised as an edge device), (2) the communication pathway between the clients and the server, and (3) the server side. In the case of data poisoning attacks, it is convenient for the adversary to compromise some edge devices (i.e., position 1), manipulate their local training data, and disguise them as honest edge nodes. However, for model poisoning attacks, the suitable positions for the attacker are positions 2 and 3, since from those positions the adversary can directly manipulate the FL models by compromising the communication path, seizing the model parameters, and then injecting fake noise into the parameters.
In the proposed setting, we assume that the adversary can manipulate the model updates regardless of the attack vector (i.e., through a man-in-the-middle or server-side attack formation). However, the adversary cannot directly change the models that are already on the server. He has white-box access (i.e., full knowledge of the global and local model parameters). The adversary might have partial knowledge of the training and testing data (i.e., the distributions of the data); however, this is not a strict requirement in our threat model. In addition, we assume that the adversary has knowledge of the imposed DP mechanism and privacy budget ($\epsilon$). This assumption is particularly important and realistic, as many researchers, including Dwork et al. [dwork2019differential], emphasize the necessity of publishing the privacy budget in order to increase the trustworthiness of the system.
IV Modeling the DeSMP Attack and Defense in DP-FL
In this section, we first describe the methodology of our proposed system from an algorithmic point of view, and then we model the proposed DeSMP attack and the RL-assisted defense strategy.
IV-A Development of DP-FL Systems
As discussed in Section II-A, in a DP-FL system, the global model is first constructed by aggregating all the local models from the randomly selected clients, and then DP noise is added to the model parameters to obfuscate the individual contributions of the clients. The working principle of a DP-FL system with RL-based privacy selection is described through the pseudocode of Algorithm 1. The algorithm takes the measured data and the parameters of FL, RL, and DP as input. Then, through some intermediary functions (i.e., the local model, reinforcement learning, and global model functions), the global model is computed. If the computed global model passes the accuracy test (i.e., the accuracy exceeds a predefined threshold), the global model is finalized and the DP-FL process completes.
IV-A1 Local Model
The local model function takes the measurement data and the learning parameters as inputs. Each client shares a portion of data (i.e., a mini-batch) and trains the global model on its local data. Finally, the local model and norm updates are calculated and sent back to the server.
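A minimal sketch of such a client-side step is given below, using mini-batch SGD on a linear least-squares model. The model and names are illustrative simplifications of ours, not the paper's three-layer network:

```python
import numpy as np

def local_update(w_global, X, y, lr=0.05, epochs=5, batch_size=8, seed=0):
    # Train locally from the current global weights and return the
    # model update (delta) together with its L2 norm for clipping.
    rng = np.random.default_rng(seed)
    w = w_global.copy()
    for _ in range(epochs):
        idx = rng.permutation(len(X))
        for s in range(0, len(X), batch_size):
            b = idx[s:s + batch_size]
            # Gradient of the mean squared loss on this mini-batch.
            grad = 2.0 * X[b].T @ (X[b] @ w - y[b]) / len(b)
            w -= lr * grad
    delta = w - w_global
    return delta, np.linalg.norm(delta)
```

Returning the update norm alongside the update lets the server compute the median norm $S$ for clipping without a second round trip.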
IV-A2 Reinforcement Learning Model
The purpose of the reinforcement learning function is to generate the optimal policy for determining the privacy budget ($\epsilon$), considering the trade-off among privacy, utility, and security in a DP-FL system. The input of this function is the state of the system, which comprises the federated loss, the attack loss, and the current privacy loss. The function exploits the converged Q-table to determine the optimal action (i.e., the value of $\epsilon$) at each state of the learning process.
IV-A3 Global Model
The sole purpose of the global model function is to produce the global model after each communication round through the FederatedAveraging procedure [mcmahan2017communication] until the model finally converges around a predefined threshold value. The function also checks whether the privacy budget has been exhausted. Another important task of this function is to clip the gradients, to avoid overfitting or gradient explosion, and to add Gaussian noise accordingly.
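Putting the pieces together, the server-side loop of Algorithm 1 can be sketched as follows. This is a NumPy sketch with illustrative names; clients are modeled as callables returning their local update, and the per-round privacy accounting is simplified to a fixed cost:

```python
import numpy as np

def run_dp_fl(clients, w0, m, sigma, eps_round, eps_budget,
              accuracy_fn, theta, max_rounds=100, seed=0):
    # DP-FL loop: sample m clients, clip their updates to the median
    # norm S, average with Gaussian noise, and stop when the accuracy
    # threshold theta is reached or the privacy budget is spent.
    rng = np.random.default_rng(seed)
    w, spent = w0.copy(), 0.0
    for _ in range(max_rounds):
        chosen = rng.choice(len(clients), size=m, replace=False)
        updates = [clients[k](w) for k in chosen]
        S = np.median([np.linalg.norm(u) for u in updates])
        clipped = [u * min(1.0, S / max(np.linalg.norm(u), 1e-12))
                   for u in updates]
        w = w + np.mean(clipped, axis=0) \
              + rng.normal(0.0, sigma * S / m, size=w.shape)
        spent += eps_round
        if accuracy_fn(w) >= theta or spent >= eps_budget:
            break
    return w
```

The median-norm clip adapts the sensitivity to the typical client contribution each round, which is why the same loop works for both the MNIST and power consumption experiments without retuning the clip bound.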
IV-B Modeling the DeSMP Attack
To perform the proposed DeSMP attack, the adversary needs to choose the level of his stealthiness, or attacker's tolerance ($\gamma$). Here, the adversarial goal is to perform the attack so that the model becomes unusable and ineffective (i.e., converges to a bad minimum or starts a denial-of-service) while the attack remains stealthy. For instance, in a classification problem, if the test inputs are $x_i$, the output labels are $y_i$, the global weight vector is $w_G$, the global model is $f_G$, and the benign and attack distributions are $f$ and $f_a$ respectively, then the adversarial objective is

$$\max_{f_a}\; \sum_{i} \mathbb{1}\big[f_G(x_i; w_G) \ne y_i\big] \quad \text{s.t.} \quad D_{KL}(f_a \,\|\, f) \le \gamma \tag{3}$$

It means the adversary wants to maximize the number of misclassifications while keeping the divergence value below his tolerance level ($\gamma$).
To achieve this goal, the adversary carefully selects the tolerance value ($\gamma$) and draws noise from the optimal attack distribution $f_a$, as represented by (2). In other words, the adversary replaces the benign Gaussian noise mechanism $f$ by the malicious noise-adding mechanism following (2). Here, $\mu$ represents the mean value or location parameter of the Gaussian distribution, while $\sigma$ indicates the scaling factor of the same distribution. By controlling the value of the tolerance level ($\gamma$), the adversary can control the attack impact level ($I$) and shift the mean value further from the actual value (i.e., from $\mu$ to $\mu + I$). In short, increasing or decreasing the value of $\gamma$ increases or decreases the level of malicious noise accordingly. Nevertheless, since the attack distribution $f_a$ follows the same statistical properties as the benign distribution $f$, the adversarial noise, as well as the poisonous weights, will not differ much statistically from the other weights. More specifically, unless the adversary chooses a very large $\gamma$, the proposed DeSMP attack will achieve stealthiness while remaining persistent. We empirically observe and evaluate the proposed DeSMP attack on the FL models in Section V.
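Concretely, the only change the adversary needs in the aggregation pipeline is the noise draw. A sketch of the swap under our Gaussian reading of (2) follows (the function names are illustrative):

```python
import numpy as np

def benign_noise(shape, sigma, rng):
    # Benign DP mechanism: zero-mean Gaussian noise.
    return rng.normal(0.0, sigma, size=shape)

def desmp_noise(shape, sigma, gamma, rng):
    # DeSMP: same variance, but the mean is shifted by the attack
    # impact I = sigma * sqrt(2 * gamma), so the second-order
    # statistics stay close to the benign mechanism.
    impact = sigma * np.sqrt(2.0 * gamma)
    return rng.normal(impact, sigma, size=shape)
```

Because only the location parameter changes, per-round samples from desmp_noise are statistically hard to distinguish from benign_noise for small $\gamma$, which is what makes the attack stealthy yet persistent across rounds.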
IV-C Modeling the RL-assisted Defense Strategy
RL [sutton2018reinforcement] is an adaptive ML paradigm that can endow conventional mechanisms with intelligence without the need for any supervision. Distinguishing attributes of RL are a feedback-loop (trial-and-error) based search for the optimal action set and delayed rewards. These attributes have motivated researchers to deploy RL in divergent sectors, e.g., mmWave communications, smart grids, and IoV [9469488].
The addition of DP during the training process enables the adversary to launch stealthy FDI or poisoning attacks. Moreover, DP causes a degradation in federated accuracy, and the trade-off between privacy and model performance is difficult to understand and balance, both theoretically and empirically [9084352]. On top of this, the FDI attack vector extends the requirement to a trade-off among three different parameters: privacy, utility, and security. Therefore, selecting the privacy loss ($\epsilon$) level optimally is a crucial requirement in a DP-FL system considering the privacy, utility, and security aspects. Our proposed RL-based model assists this optimal privacy policy selection process. Moreover, it defends the learning process against DeSMP attacks by reducing the incentive of the adversary, which in turn reduces attack motivation while assisting attack prevention. In short, in this paper, we cast the privacy-level selection as an RL problem in which a defending agent observes the DP-FL system and adjusts the privacy loss accordingly.

State Space: The state observed by the agent comprises the federated loss, the attack loss, and the current privacy loss, i.e., $s = \{L_f, L_a, \epsilon\}$.
Action Space: We assume that the agent makes decisions in an event-driven manner. By observing the federated environment's current state, the agent makes one of the decisions described in the action set. To fine-grain the agent's decision-making process, we assume that the agent can increase or decrease the privacy loss by multiple steps (alternatively, a single unit or a double unit at any state).
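For illustration, one way to realize this action space, together with the epsilon-greedy selection used later in this section, is sketched below. The step set and names are our own illustrative choices:

```python
import random

# Privacy-loss adjustments: decrease or increase by one or two
# units, or hold the current level.
ACTIONS = (-2, -1, 0, +1, +2)

def select_action(q_row, explore_prob, rng=random):
    # Epsilon-greedy: explore a random adjustment with probability
    # explore_prob, otherwise exploit the best-known Q-value.
    if rng.random() < explore_prob:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q_row[a])
```

The double-unit steps let the agent react quickly when the observed losses move far from their thresholds, while the single-unit steps allow fine adjustment near the optimum.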
Reward Function: The reward motivates the agent to make decisions toward the learning objectives. For defense against the DeSMP attack, the agent's objective is to minimize the maximum attack accuracy as well as to maximize the federated accuracy. We assume that the maximum and minimum thresholds are set and regulated by the DP-FL system designer. We define the reward function for the agent as in equation (4),

$$R = \lambda_1\big(L_a^{max} - L_a\big) + \lambda_2\big(L_f^{max} - L_f\big) - \lambda_3\,\epsilon \tag{4}$$

where $L_a^{max}$ and $L_f^{max}$ denote the maximum values of the FDI attack loss and the federated loss, whereas $\lambda_1$, $\lambda_2$, and $\lambda_3$ denote the balancing parameters.
Here, we use an epsilon-greedy policy [wunder2010classes] to balance the trade-off between exploration and exploitation. We set an initial exploration probability and gradually reduce it over the episodes until it matches the minimum exploration probability assumed in this paper.

V Experimental Analysis
We simulate an FL environment to test our proposed algorithm. Moreover, for a comprehensive evaluation of our proposed DeSMP attack model, we focus on the persistence, effectiveness, and stealthiness of the proposed attack under different scenarios on two well-known datasets.
V-A Dataset Description and Experimental Setup
We utilize the benchmark MNIST dataset (with Non-I.I.D. distributions) [deng2012mnist] and the Individual household electric power consumption dataset [hebrail2012individual] to evaluate our proposed DeSMP attack. For MNIST, we use 10,000 test images to evaluate the performance of the attack model. In all of the experiments using these two datasets, following the standard FL setup, each selected participant uses the SGD (stochastic gradient descent) optimizer to train its local model for a number of internal epochs with a local learning rate.
All of the experiments are run on a server with an Intel(R) Core(TM) i7-9700F CPU @ 3.00GHz, 4 NVIDIA GeForce RTX 2060 GPUs with 16 GB RAM each, and Windows 10 (64-bit), with Python 3.8.8 and PyTorch 1.5.1.
V-B Deployment and Evaluation of the DP-FL Model
To simulate the DP-FL environment, we follow notable prior works [geyer2017differentially, bhagoji2019analyzing, fang2020local] and select the values of the major parameters according to Table II. Moreover, for simplicity, we conduct the experiments with a three-layer neural network. For the classification problem (i.e., MNIST), the Log_Softmax activation function is used on top of the ReLU function, whereas for the regression problem, only the ReLU function is used. To add DP-generated noise to the model weights, we modify the FederatedAveraging [mcmahan2017communication] procedure according to Algorithm 1. For each experiment, when the privacy budget ($\epsilon$) is exceeded, the learning stops and the server finalizes the global model.



Dataset      | Parameter values (Table II)
MNIST        | 100, 30, 32, 0.001, 0.120, 10, 0.01
Consumption  | 100, 30, 7, 0.001, 0.120, 10, 0.1
For MNIST, the training (Tr) and validation (Val) losses of three random clients (C1, C2, and C3) in an arbitrary communication round are depicted in Fig. 2(a). It can be inferred that, with each incremental epoch, the training and validation losses decrease. Also, from Fig. 2(b), we can see that the DP-FL algorithm converges to its final accuracy after a few communication rounds. Another important observation is that the privacy budget ($\epsilon$) can be spent very quickly before the model converges properly. Therefore, it is significantly important to select the privacy loss and the total budget intelligently and optimally so that the model possesses the desired level of privacy and utility. Likewise, for the Power consumption dataset, we verify the DP-FL approach and find similar results. The cost of applying DP (i.e., the 'privacy cost') in terms of the global model loss varies with $\epsilon$. Thus, more privacy leads to more loss for both the classification and regression problems.
V-C Implementation and Evaluation of the DeSMP Model
To demonstrate the proposed DeSMP attack, we replace the benign noise-addition mechanism of the DP technique with the adversarial noise-addition scheme. More specifically, to simulate the behavior of an actual adversary, instead of drawing noise from the benign Gaussian distribution ($f$), we now draw noise from the attack distribution ($f_a$). We can see the impact of such model poisoning through the DP-FDI curves of Fig. 2(b). Due to the addition of malicious noise, the overall accuracy decreases. However, the degree of accuracy degradation largely depends on the attacker's tolerance level ($\gamma$). If the attacker chooses to perform a more devastating attack without paying much attention to stealthiness, he would select a large $\gamma$ and thereby be able to reduce the accuracy greatly. Conversely, selecting a small $\gamma$ gives him less payoff in terms of attack impact ($I$).
In Fig. 2(c) and (d), we can further observe the impact of our proposed approach. Fig. 2(c) reflects the outcome of the proposed DP-FL model on some randomly selected MNIST image samples, whereas Fig. 2(d) depicts the adversarial outcomes of our proposed DeSMP model for the same samples. Due to the stealthy adversarial noise, only the image of digit '4' is predicted wrongly as digit '7', while the other digits are predicted correctly. Since we consider an untargeted model poisoning attack, the adversarial action of the proposed DeSMP model may alter the image labels differently each time. However, as the overall accuracy does not degrade much with a low $\gamma$, the malicious action becomes stealthy and goes unnoticed by the anomaly detectors.
Likewise, the impacts of the adversarial action exploiting the DP noise for the Power consumption dataset are illustrated in Fig. 3. It can be observed that the DeSMP attack also increases the loss as $\gamma$ and the noise level increase. However, for the regression problem, if the raw training data across all the clients are similar and identically distributed, the attack requires adding more noise (i.e., a small $\epsilon$ and a large $\gamma$) in order to achieve the desired level of attack impact. For instance, we can see from Fig. 3(a) that even after applying the DP mechanism with different $\epsilon$, the model converges after a sufficient number of communication rounds. This is desirable, since privacy is preserved and the utility remains satisfactory. However, in the presence of an adversary, the loss starts to increase; this phenomenon can be observed in Fig. 3(b)-(d). Moreover, as $\epsilon$ decreases (i.e., privacy increases), the attacker obtains more attack opportunities: from Fig. 3(b) and 3(c), it can be inferred that each reduction of $\epsilon$ increases the loss severalfold. Furthermore, comparing the red FDI-DP curves of Fig. 3(b) and (c), it can be perceived that halving $\epsilon$ increases the test loss several times over when the tolerance level $\gamma$ is relatively high.
Therefore, the attack impact ($I$) increases significantly with the attacker's tolerance level $\gamma$, and the model turns into a suboptimal model. Eventually, through the DeSMP attack, at a very low $\epsilon$ and a high $\gamma$, the DP-FL model becomes unusable and a denial-of-service begins. Nevertheless, the proposed DeSMP model can also be tailored to conduct more devastating attacks while maintaining stealthiness through hyperparameter tuning and by selectively choosing the FL parameters mentioned in Table II.
V-D Implementing RL-assisted Privacy Selection
Fig. 4 illustrates the accumulated reward of the defending agent for a fixed learning rate and discount factor on the two distinct datasets (MNIST and Power consumption). The trend in the figure illustrates that the agent learns the optimal policy over the episodes and converges after sufficient episodes are executed. Since we define the reward function such that it accounts for the federated loss, the attack loss, and the privacy loss, this convergence finds the optimal trade-off policy for the privacy, security, and utility of the system. Since the agent outputs an action (i.e., a value of $\epsilon$) for each state, we can calculate the standard (expected) value of the federated loss for that state. Therefore, if the practically observed federated loss differs from the expected one, we can infer whether an attack has been launched. Specifically, if the observed federated accuracy is less than the expected one, we can infer that a large-scale (large $\gamma$) FDI attack has been launched; otherwise, the system is not compromised or the scale of the FDI attack (low $\gamma$) is very low.
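The detection rule described above can be written as a one-line check. The function name and slack margin are illustrative assumptions of ours:

```python
def fdi_detected(observed_fed_loss, expected_fed_loss, slack=0.05):
    # Flag a large-scale FDI attack when the observed federated loss
    # exceeds the policy's expected (standard) loss by more than the
    # slack margin; smaller deviations are treated as benign noise.
    return observed_fed_loss > expected_fed_loss * (1.0 + slack)
```

The slack margin absorbs the benign round-to-round variation introduced by the DP noise itself, so only deviations beyond what the learned policy predicts are flagged.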
VI Conclusion and Future Works
Federated learning (FL) can be vulnerable to privacy-violating and security-compromising attacks despite having privacy-preserving tools like DP. Model update poisoning is one such attack. However, stealthy and persistent model poisoning attacks are difficult to achieve. Motivated by this, in this paper, we analyze the adversarial learning process in an FL setting and show that a stealthy and persistent model poisoning attack can be conducted by exploiting the differential noise. More specifically, we develop a novel DP-exploited stealthy model poisoning (DeSMP) attack for FL models. Our empirical analysis on both classification and regression tasks using two popular datasets demonstrates the effectiveness of the proposed DeSMP attack. Moreover, we develop a novel reinforcement learning (RL)-based defense strategy against such poisoning attacks which can intelligently and dynamically select the privacy policy of the FL models to minimize the DeSMP attack surface, optimize privacy, security, and utility, and facilitate attack detection.
In the future, we will extend our defense model to a collaborative multi-agent setting where a team of clients can exploit the learned policy to collaboratively provision privacy during the training phase. Although this paper focuses on untargeted model poisoning attacks in a DP-FL system, it would also be interesting to investigate the adversarial impact of targeted model poisoning with our proposed DeSMP attack model. We leave this for future work on adversarial federated learning.