Privacy, Security, and Utility Analysis of Differentially Private CPES Data

09/21/2021
by   Md Tamjid Hossain, et al.
University of Nevada, Reno

Differential privacy (DP) has been widely used to protect the privacy of confidential cyber physical energy systems (CPES) data. However, applying DP without analyzing the utility, privacy, and security requirements can degrade data utility and can also help an attacker conduct integrity attacks (e.g., False Data Injection (FDI)) that leverage the differentially private data. Existing anomaly-detection-based defense strategies against data integrity attacks in DP-based smart grids fail to minimize the attack impact while maximizing data privacy and utility. Addressing this challenge by applying a defensive approach during the design process is nontrivial. In this paper, we formulate and develop the defense strategy as a part of the design process to investigate data privacy, security, and utility in a DP-based smart grid network. We propose a provable relationship among the DP-parameters that enables the defender to design a fault-tolerant system against FDI attacks. To experimentally evaluate the effectiveness of our proposed design approach, we simulate the FDI attack in a DP-based grid. The evaluation indicates that the attack impact can be minimized if the designer calibrates the privacy level according to the proposed correlation of the DP-parameters. Moreover, we analyze the feasibility of the DP mechanism and the QoS of the smart grid network in an adversarial setting. Our analysis suggests that the DP mechanism is feasible compared with existing privacy-preserving mechanisms in the smart grid domain. Also, the QoS of the differentially private grid applications is found to be satisfactory in the presence of an adversary.


I Introduction

Data from cyber physical energy systems (CPESs) such as smart grids are used for various mission-critical applications (e.g., state estimation, microgrid islanding and synchronization, load balancing, etc.) and non-mission-critical applications (e.g., energy consumption prediction, power outage forecasting, etc.) [sadeghi2015security, sun2016cyber, berriel2017monthly, liu2020study]. Granular grid data with numerous features pave the way for state-of-the-art security and privacy research. However, grid data also carry the personal and confidential information of the customers. In most cases, releasing such data is restricted due to security and legal issues [fioretto2019differential]. Therefore, the data need privacy preservation, especially during data sharing or data exchange operations [giraldo2017security].

Over the years, several research works have been conducted on the data privacy-preservation mechanisms (e.g., secure multi-party communication, k-anonymization, l-diversity, etc.) [kelarev2019multistage] and the privacy-violating attacks on those mechanisms [narayanan2008robust, sweeney2000simple, ohm2009broken]. To overcome such privacy-violating attacks while preserving data confidentiality and maintaining a level of data utility, the concept of DP (Differential Privacy) has been introduced by Dwork et al. [dwork2006calibrating]. The DP mechanism adds sufficient randomized noise to the aggregation result. This randomized noise prevents the attacker from revealing the identity of an entity by obscuring the contribution of a single record. This privacy assurance can motivate the data providers to provide open access to their differentially private databases to the research community and others.

I-A Motivations

Although the DP mechanism achieves a privacy guarantee, a well-known drawback is that data utility degrades as data privacy increases [dwork2006calibrating]. Moreover, recent research by Giraldo et al. shows that differentially private data (i.e., DP-data) can be exploited by an attacker to conduct False Data Injection (FDI) attacks on a DP-based system in the smart grid domain [giraldo2020adversarial]. These data integrity attacks on differentially private data lead to the question: “What should be the design approach to defend a DP-based cyber physical energy system (CPES) against FDI attacks?”. To answer this question, it is essential to find the factors that facilitate the FDI attack. Likewise, to design a fault-tolerant, resilient, privacy-preserving, and well-secured defense mechanism, a provable relationship among these factors, outcomes, and parameters of the DP method needs to be developed.

Besides, any active or passive attack degrades the overall QoS of a system. So, it is also essential to raise and subsequently answer the question- “What are the impacts of the FDI attacks on the QoS of a DP-based system?”. Based on this impact analysis of the FDI attack, the feasibility analysis of the DP technique in the CPES domain needs to be carried out in an adversarial setting. Successively, the feasibility analysis would help the CPES designers to determine the proper applicable areas of the DP mechanism.

As DP can be a tool for the attacker to conduct the FDI attack, it should not be applied in the mission-critical operations of any CPES merely to enhance data privacy. Rather, the method must be applied with proper security measures. Otherwise, the entire CPES can be compromised by a malicious actor. Our motivation behind this research comes from these defense perspectives.

Earlier research has focused on building robust anomaly detection algorithms either to defend a DP-based CPES or to improve the accuracy of ML-based prediction algorithms [giraldo2020adversarial, lecuyer2019certified, cohen2019certified]. However, if the DP technique is applied in several layers of the CPES network, then identifying the false data is very difficult (if not impossible) for traditional anomaly detectors. Therefore, novel defense strategies need to be devised against FDI attacks in DP-based smart grid applications, and to that end more research should be carried out to analyze the correlation among DP-parameters.

I-B Contributions

In this paper, we address and analyze the concern of the FDI attack exploiting the differentially private data through adversarial analysis of the DP-based mechanisms into the smart grid domain. Our main contributions in this research work can be summarized as follows:

  • We have demonstrated the formulation of a provable relationship between the attack impact and the parameters (data sensitivity, privacy loss, and attackers’ tolerance to be detected) associated with DP mechanism. This relationship enables the designer to design a robust and fault-tolerant DP-based privacy-preserving system.

  • We have experimentally evaluated as well as proved that the maximum attack impact of the FDI attack could be minimized if the designer uses our proposed approach to design the process in the first place.

  • We have rigorously analyzed the feasibility of the DP mechanism in CPES data under an adversarial setting from a QoS point of view. We further evaluated the usability of the data considering an attack scenario where the attacker manipulates the data despite all the privacy and security measures. The evaluation shows that the manipulated data are usable for many non-mission-critical operations of CPESs like smart grids.

Section II of this paper covers a brief review of the related works while section III outlines the research problem and threat model. Section IV points out the objectives of the attacker and the defender as well as develops correlation among DP-parameters as a part of a defense against optimal FDI attack. Section V analyzes the effectiveness of our proposed model and correlation through QoS and feasibility analysis. Finally, in section VI, we conclude the paper with some future research directions.

II Literature Review

To protect data confidentiality, numerous privacy-preserving techniques have been suggested and applied in the smart grid domain over the last decades [liu2018practical, ferrag2016survey, lu2012eppa]. Some of them use cryptographic encryption techniques (AES, RSA, etc.), while others incorporate homomorphic encryption schemes, pseudonymization, l-diversity, t-closeness, differential privacy, etc.[kelarev2019multistage]. Each of them has its advantages and shortcomings. The advantage of encryption is that it can transfer data without noises while the drawbacks are the high resource utilization, cryptographic key generation, and distribution complexity, latency, etc. [lu2014toward, barbosa2016technique]. Likewise, the data anonymization techniques also suffer from several drawbacks which include the high possibility of being compromised by revealing actual identities and impracticability of the method for data containing millions of records [narayanan2008robust].

II-A Countermeasures of Passive Attacks with DP technique

Unlike other privacy-preserving techniques, DP can preserve data privacy more effectively by adding a small amount of noise to the query result while making sure not to overly degrade the data utility for many application scenarios. As a consequence, researchers have proposed using the DP mechanism in CPESs like smart grids, IoT networks, automated vehicles, etc. to protect data privacy [barbosa2016technique, sandberg2015differentially]. In the smart grid domain, the DP technique has been proposed for smart meters [barbosa2016technique], state estimation [sandberg2015differentially], power grid obfuscation [fioretto2019differential], and PMUs (Phasor Measurement Units) [pinte2015low]. All of these works have proposed DP as a solution to passive attacks (e.g., eavesdropping). However, none of them have considered the impact of FDI attacks in their smart grid schemes.

II-B Countermeasures of Active Attacks in Smart Grid

Another related line of works focuses on the data integrity attacks (e.g., FDI attack) on state estimation, future consumption prediction, billing, and pricing, etc. in the smart grid domain [liu2011false]. Several techniques have been proposed over the years to prevent data integrity attacks using methods such as bad data filtration, blockchain, encryption, etc. [rahman2013false]. In particular, the integration of blockchain to achieve security and integrity has been found effective in the smart grid sector [bhattacharjee2020block, bhattacharjee2020blockchain, hossain2020porch]. Although these works have considered the active attacks in their schemes, they have not formulated these attacks in any DP-based system.

Symbol | Description | Symbol | Description
$D$ | Dataset containing all the measurement values | $\mu_a^*$ | Optimal attack impact
$D'$ | Dataset that differs from $D$ by a single record | | Query result deviation threshold
$t$ | Transcripts | | Measured voltage by PMU
$\epsilon$ | Privacy loss | $f_0$ | PDF of Laplace distribution
$\Delta f$ | Data sensitivity | $f_a$ | PDF of attack distribution
$\mathcal{M}(D)$ | Differentially private data | $f_a^*$ | PDF of optimal attack distribution
$\eta_0$ | Noise drawn from Laplace distribution | $\theta$ | Mean or location parameter of Laplace distribution
$\eta_a$ | Noise drawn from optimal attack distribution | $D_{KL}$ | Kullback-Leibler divergence
$b$ | Scale parameter of Laplace distribution | $f(D)$ | Query result
$\mu_a$ | Attack impact | $\gamma$ | Attackers’ tolerance to be detected
TABLE I: List of symbols and their description

II-C Adversarial Classification of the DP technique

Several studies [giraldo2017security_2, farokhi2018security] have considered active attacks (particularly, the FDI attack) in a DP-based CPES (e.g., smart grid, transportation systems, etc.). In [giraldo2017security_2], Giraldo et al. have analyzed how an attacker can take advantage of the added noise in DP to design stealthy attacks that maximize the physical impact on the system. Following their research, [farokhi2018security] shows that the product of the level of guaranteed privacy and the level of security is equal to or upper bounded by a constant. Nonetheless, these works have mostly considered defense mechanisms based on anomaly detection schemes.

The closest work to our own is [giraldo2020adversarial], which has proposed optimal FDI attacks that degrade the anomaly detection capabilities of the system while allowing the attacker to remain undetected by “hiding” false data in the DP noise. The authors also proposed a detection-based defense mechanism to minimize the impact of such attacks. In contrast to post-attack detection, we have developed a correlation among the DP-parameters that facilitates the defense against FDI attacks. Our proposed scheme will assist the system designer in designing a robust DP-based architecture that accounts for adversarial presence.

In addition to work on the defense mechanism as a part of the design process, our work is related to the feasibility and QoS analysis of DP-based systems in an adversarial setting. To date, little attention has been paid to this line of research due to the provable privacy guarantee of the DP mechanism. In [pokhrel2020qos], a QoS-aware personalized privacy protection model has been proposed. However, their work considered privacy attacks only. [jeon2011qos] has outlined the QoS requirements of smart grid in terms of low latency, seamless data availability, high data utility, and low complexity.

Nevertheless, their architecture does not cover the DP technique and the active attack scenarios. As DP can be a tool for the attacker to conduct active attacks in the smart grid domain, it is necessary to analyze the feasibility of applying the DP mechanism in smart grid operations. At the same time, it is also essential to carry out the QoS analysis of a DP-based smart grid in the adversarial setting to determine the most suitable applications where DP can be used as a secured privacy-preserving tool. This work also focuses on this particular issue and carries out the required QoS analysis under an active attack scenario.

III Problem Statement and Threat Model

In this section, the main research problem, as well as the objectives of this research, are formulated. Also, a threat model of the research problem is developed. Table I provides the symbols and their description used in this paper.

III-A Problem Statement

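Fig. 1 depicts the basic operational principle of a DP-based system. A mechanism $\mathcal{M}$ is $\epsilon$-indistinguishable if for all pairs of datasets $D, D'$ which differ in only one entry, for all adversaries $A$, and for all transcripts $t$ [dwork2006calibrating]:

$$ \left| \ln \frac{\Pr[T_A(D) = t]}{\Pr[T_A(D') = t]} \right| \le \epsilon \tag{1} $$

where $T_A(\cdot)$ denotes the transcript of the interaction between $A$ and $\mathcal{M}$.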

Equation (1) provides the privacy guarantee of the DP mechanism. Applying DP to a sum query over a database $D$ containing $x_1, \dots, x_n$ as input, we get

$$ \mathcal{M}(D) = \sum_{i=1}^{n} x_i + \mathrm{Lap}\left(\frac{\Delta f}{\epsilon}\right) \tag{2} $$

Here, $\mathrm{Lap}(\cdot)$ represents the Laplace mechanism and $\Delta f$ is the sensitivity of the data, which is inherent in the dataset. For the desired privacy, the noise is calibrated according to the sensitivity of the data. Larger noise yields a smaller value of the privacy loss $\epsilon$ and vice versa. However, more noise leads to less data utility. Besides, applying large noise to the data to ensure better privacy gives an attacker the opportunity to inject false data into it. So, if the DP technique is applied in several layers of a CPES network and an attacker injects a small amount of false data into the DP-data (differentially private data), then traditional anomaly detectors cannot detect the manipulation.
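The noise and utility trade-off described above can be illustrated with a minimal sketch. It assumes the standard Laplace mechanism with noise scale equal to sensitivity divided by privacy loss; the function names, the sample readings, and the sensitivity bound of 1.0 are hypothetical and are not taken from the paper's experiments.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_sum(values, sensitivity, epsilon, rng):
    """Sum query perturbed with Laplace noise of scale sensitivity / epsilon."""
    return sum(values) + laplace_noise(sensitivity / epsilon, rng)

def mean_abs_error(values, sensitivity, epsilon, trials=5000, seed=0):
    """Average |DP sum - true sum| over repeated releases (a simple utility proxy)."""
    rng = random.Random(seed)
    true_sum = sum(values)
    total = 0.0
    for _ in range(trials):
        total += abs(dp_sum(values, sensitivity, epsilon, rng) - true_sum)
    return total / trials

# Hypothetical PMU-style readings; the sensitivity of 1.0 is an assumed bound.
readings = [229.8, 230.4, 231.1, 230.0, 229.5]
for eps in (0.1, 1.0, 10.0):
    print(eps, round(mean_abs_error(readings, 1.0, eps), 3))
```

In this sketch the average error shrinks as the privacy loss grows, which is the utility side of the trade-off; the same shrinking noise also leaves less room for injected false data to hide.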

An alternative defense strategy that can minimize the attack impact is to design and apply the DP technique targeting minimum attack impact and maximum privacy. For this, a verifiable correlation between the outcome (attack impact) and the parameters of the DP method needs to be developed that can enable the designer to design a resilient and privacy-preserved system as a part of the defense mechanism.

The primary objective of our analysis is to enable the defender to design a privacy-preserved and secured DP-based system (e.g., synchrophasor network) against FDI attacks. The secondary objective of our research is to analyze the feasibility of the DP mechanism in synchrophasor networks considering the vulnerability of the DP-based system against FDI attacks. We also analyze the QoS of the DP-based synchrophasor network in an adversarial setting. More precisely, we want to raise and subsequently answer the following questions.

III-A1 Design objective

Given a dataset $D$, what should the values of the scale parameter $b$ and the privacy loss $\epsilon$ be such that the attack impact becomes equal or close to the actual query result (i.e., $\mu_a^* \approx \theta$)?

III-A2 Feasibility analysis objective

Would the DP technique remain feasible over other privacy-preserving mechanisms (e.g., encryption, masking, anonymization, etc.) considering its security challenges (i.e., the chance of FDI attacks exploiting DP-noise)?

III-A3 QoS analysis objective

What would be the QoS of the DP-based synchrophasor network in terms of computational overhead and data utility under the FDI attack?

III-B Threat Model

The proposed threat model, as depicted in Fig. 1, considers the FDI attack on the data packets while transferring from one node to another node through the spoofing technique. Throughout the attack, the attacker pretends to be an honest node to all other nodes and injects false measurement as a form of noise into DP-data. The model is developed based on an attack scenario, where the attacker can conduct the FDI attack despite all the privacy and security measures. More formally, for a DP-result of any aggregated query over a database consisting of grid measurement, the FDI attack is expressed as follows:

$$ \tilde{\mathcal{M}}(D) = \mathcal{M}(D) + \eta_a, \qquad \mathcal{M}(D) = f(D) + \eta_0 \tag{3} $$

where $D$ is the dataset containing $n$ measurement values obtained from a PMU dataset, $f(D)$ is the true query result, $\eta_0$ is the Laplace noise, $\mathcal{M}(D)$ is the non-manipulated differentially private query result over database $D$, $\eta_a$ is the false measurement injected by the attacker as a form of noise, and $\tilde{\mathcal{M}}(D)$ is the manipulated query result. Moreover, the Laplace noise $\eta_0$ is a random variable with a probability distribution that satisfies the condition of differential privacy (as stated by (1)). Besides the Gaussian and the Exponential distributions, the Laplace distribution is also a good choice for drawing random noise as it satisfies $(\epsilon, 0)$- or $\epsilon$-differential privacy. Also, the Probability Density Function (PDF) of the Laplace distribution has a fatter tail, which is why Laplacian noise provides better privacy. The Laplace distribution with mean $\theta$, variance $2b^2$, and PDF $f_0$ can be described by (4).

$$ f_0(x) = \frac{1}{2b} \exp\left(-\frac{|x - \theta|}{b}\right) \tag{4} $$

Here, $b$ is the scale parameter and can be represented as the ratio between the sensitivity of the data $\Delta f$ and the privacy loss $\epsilon$ (i.e., $b = \Delta f / \epsilon$). We consider that the attacker varies the value of $\eta_a$ according to her attack tolerance level (i.e., the willingness to be detected or remain undetected). Now, if $\eta_a$ is sufficiently small, then it is unlikely for the recipient to detect the manipulation without a well-designed and robust anomaly detector. On the other hand, anomaly detectors also add extra latency to the network and slow down the system. This is a potential threat to the system’s performance that needs to be minimized.
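To make the role of the scale parameter concrete, the following sketch evaluates the Laplace density and the probability that honest Laplace noise alone deviates by at least a given amount. It assumes the scale is the ratio of sensitivity to privacy loss, as stated above; the function names and the example bias of 0.5 units are hypothetical.

```python
import math

def scale_from_privacy(sensitivity, epsilon):
    """Laplace scale b as the ratio of data sensitivity to privacy loss."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return sensitivity / epsilon

def laplace_pdf(x, theta, b):
    """Density of the Laplace distribution with location theta and scale b."""
    return math.exp(-abs(x - theta) / b) / (2.0 * b)

def honest_tail_probability(deviation, b):
    """P(|noise| >= deviation) for honest Laplace(0, b) noise."""
    return math.exp(-abs(deviation) / b)

# Hypothetical setting: sensitivity 1.0 and an injected bias of 0.5 units.
for eps in (0.1, 1.0, 10.0):
    b = scale_from_privacy(1.0, eps)
    print(eps, b, round(honest_tail_probability(0.5, b), 4))
```

When the privacy loss is small, a deviation of 0.5 is something honest noise produces most of the time, so a bias of that size is hard to tell apart from noise; when the privacy loss is large, the same deviation becomes rare and therefore conspicuous.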

Fig. 1: Differential privacy in an adversarial setting. Differentially private query results are manipulated by a malicious actor and sent to the analyst. Small amount of manipulation is difficult to detect

IV Optimal Attack and Defense Strategy

In this section, we first describe the attackers’ and the defenders’ objectives. We then compute the correlation among the DP-parameters to facilitate defense against the optimal attack.

IV-A Attackers’ Objective

An attacker can be a passive attacker or an active attacker. A passive attacker eavesdrops on the communication path while an active attacker directly interferes with the data (e.g., masquerades, modification of messages, denial of services, etc.). The DP mechanism gives protection against passive attacks. However, the attacker can still modify the DP-data by injecting false data into them, which becomes a concerning security issue.

Generally, the attacker of our threat model has two major objectives while carrying out the attack– “(a) maximum damage (b) avoid detection”. These two goals are contradictory with each other in the sense that it becomes difficult for an attacker to hide the attack or hide her identity if she wants to carry out maximum devastation to the system.

IV-B Defenders’ Objective

The defenders’ objective is to defend the system against passive and active attacks. The defense can be carried out from various fronts of the system. One technique is to set up anomaly detectors to minimize the attack impact. However, this would add extra latency to the system. Another way is to design the system to be fault-tolerant in the first place, considering possible future FDI attacks. In our model, the defender has two major objectives: “(a) design the process as fault-tolerant against FDI attacks (b) preserve the data privacy and QoS of the system”.

These specific design requirements have led us to the quest of finding a provable relationship among the DP-parameters to achieve desired privacy, utility, security as well as the feasibility of the DP mechanism.

IV-C Optimal Attack Strategy

An optimal attack takes place when the objectives of the attacker are achieved with the maximum possible payoffs. In [giraldo2020adversarial], Giraldo et al. have discussed this optimization problem. They show that an optimistic attacker would draw noise $\eta_a$ from a probability density function $f_a^*$, which can be represented as follows:

$$ f_a^*(x) = \frac{\kappa^2 - b^2}{2 b \kappa^2} \exp\left(-\frac{|x - \theta|}{b} + \frac{x - \theta}{\kappa}\right) \tag{5} $$

Here, $b$ is the scale factor and $\theta$ is the location parameter. $\kappa$ is determined by the KL (Kullback–Leibler) divergence between the two probability distributions $f_a^*$ and $f_0$ and can be found by solving the equation below:

$$ D_{KL}(f_a^* \,\|\, f_0) = \frac{2 b^2}{\kappa^2 - b^2} + \ln\left(1 - \frac{b^2}{\kappa^2}\right) = \gamma \tag{6} $$

Here, $\gamma$ is the attackers’ tolerance to be detected. A large $\gamma$ means that the attacker does not care about being detected, whereas a small $\gamma$ means that hiding her identity is crucial for the attacker. To achieve her second objective, which is to remain undetected as long as possible, the value of $\gamma$ should be as small as zero. However, the attackers’ probability distribution would then be the same as the original probability distribution of the Laplace mechanism (i.e., $f_a^* = f_0$). So, the attacker must make a trade-off between the two objectives. If the attacker chooses a small value of $\gamma$ (i.e., $\gamma \to 0$) to remain undetected and fulfill the second objective, $\kappa$ approaches infinity. Again, as $\kappa$ approaches infinity, the PDF $f_a^*$ described by (5) approaches $f_0$. The optimal attack impact is given by the following formula [giraldo2020adversarial].

$$ \mu_a^* = \theta + \frac{2 b^2 \kappa}{\kappa^2 - b^2}, \qquad b = \frac{\Delta f}{\epsilon} \tag{7} $$

Here, $\mu_a^*$ is the amount of optimal bias introduced by the attacker. It depends on four parameters (mean, $\theta$; attacker’s tolerance, $\gamma$; data sensitivity, $\Delta f$; and privacy loss, $\epsilon$). The optimal attack impact varies largely with the privacy loss and the attackers’ tolerance. More specifically, a low privacy loss (or a large amount of noise) and a high attackers’ tolerance help the attacker to conduct a more devastating attack (i.e., a high value of $\mu_a^*$) by injecting large false data and hiding behind the large noise of differential privacy.
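The dependence of the attack impact on the privacy loss and the attackers' tolerance can be explored numerically. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes the optimal attack density is an exponentially tilted Laplace density with tilt parameter `kappa`, for which the KL divergence to Laplace(theta, b) is 2b²/(κ² − b²) + ln(1 − b²/κ²) and the mean is θ + 2b²κ/(κ² − b²). The names `kl_to_laplace`, `solve_kappa`, and `optimal_attack_impact` are hypothetical.

```python
import math

def kl_to_laplace(kappa, b):
    """KL divergence between the tilted attack density and Laplace(theta, b)."""
    r = (b / kappa) ** 2
    return 2.0 * r / (1.0 - r) + math.log(1.0 - r)

def solve_kappa(b, gamma, iterations=200):
    """Find kappa > b whose KL divergence equals gamma (bisection on r = (b/kappa)^2)."""
    if b <= 0 or gamma <= 0:
        raise ValueError("b and gamma must be positive")
    lo, hi = 0.0, 1.0 - 1e-12
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # The divergence is increasing in r, so bisection applies.
        if 2.0 * mid / (1.0 - mid) + math.log(1.0 - mid) < gamma:
            lo = mid
        else:
            hi = mid
    return b / math.sqrt(0.5 * (lo + hi))

def optimal_attack_impact(theta, b, kappa):
    """Mean of the tilted density: theta plus the bias gained by the attacker."""
    return theta + 2.0 * b * b * kappa / (kappa * kappa - b * b)

# Hypothetical values: theta = 0, sensitivity 1.0, three privacy levels.
for eps in (0.5, 1.0, 2.0):
    b = 1.0 / eps
    kappa = solve_kappa(b, gamma=0.5)
    print(eps, round(optimal_attack_impact(0.0, b, kappa), 4))
```

Under these assumed formulas the bias scales linearly with b, so halving the privacy loss doubles the reachable attack impact for a fixed tolerance, which matches the qualitative statement above.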

IV-D Parameter Design for Effective Defense Mechanism

Among the four parameters, the mean ($\theta$, also called the ‘location’ in the PDF) and the data sensitivity ($\Delta f$) can be calculated from the targeted dataset. Hence, these two parameters are not adjustable by the user as per the design requirements. However, these parameters are crucial for tuning the noise while applying differential privacy so that either the attack impact cannot be large or the attack can be easily detected.

During the design process, the attackers’ tolerance ($\gamma$) should be selected in a way that considers the maximum possible attack impact or deviation from the actual data. The possible incidents that can take place are given below.

  • The attacker chooses a very large $\gamma$ to maximize the attack impact without paying much attention to detection. In that case, a carefully designed anomaly detector can easily identify the attack.

  • The attacker chooses a very small $\gamma$ to minimize the disclosure of her identity. In that case, the attack impact or deviation would be negligible (as $f_a^*$ becomes equal to $f_0$) and can be ignored.

  • The attacker chooses an optimal value of $\gamma$ for an optimal attack impact and identity disclosure. In that case, the attack impact is not only tough (if not impossible) to detect but also impractical to neglect totally. We elaborate and analyze this point in section V.

We are now ready to present one of the main contributions (i.e., correlation among the DP-parameters considering adversarial presence) of the paper.

Theorem 4.1: For any optimal attack strategy (stealthiness: $\gamma$, attack impact: $\mu_a^*$), the correlation among the DP-parameters that solves the design problem stated by III-A1 can be represented as follows:

(8)

Proof: The scale factor $b$ needs to be greater than zero, otherwise (4) becomes undefined. Moreover, according to the optimal attack strategy described in IV-C, the optimal attack follows the Laplace distribution. Therefore, the absolute deviation of the attack impact from the mean of the query data (alternatively, Laplace density) cannot be less than zero (i.e., $|\mu_a^* - \theta| \ge 0$). Under these conditions, we solve for the scale parameter $b$ from (7) following simple algebraic rules and get:

(9)

Since it is an optimization problem in a constrained environment, we model $\kappa$ as a Lagrange multiplier and compute it numerically using (6) and (9). Thus, the general expression of (6) can be rewritten as (10). Moreover, we can solve for $\kappa$ numerically from (10) and subsequently compute $\epsilon$ from (11).

(10)
(11)
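One way to carry out the algebra described in the proof is sketched below. It is a reconstruction under assumptions rather than the paper's exact expressions: it assumes the optimal attack impact has the closed form $\mu_a^* = \theta + 2b^2\kappa/(\kappa^2 - b^2)$, that the KL constraint reads $2b^2/(\kappa^2 - b^2) + \ln(1 - b^2/\kappa^2) = \gamma$, and that $b = \Delta f/\epsilon$. The shorthand $\delta$ for the deviation is introduced here for convenience.

```latex
\begin{align*}
\delta &:= \mu_a^* - \theta = \frac{2 b^2 \kappa}{\kappa^2 - b^2}
  && \text{(deviation of the attack impact)} \\
\delta\,(\kappa^2 - b^2) &= 2 b^2 \kappa
  \;\Longrightarrow\; b^2\,(2\kappa + \delta) = \delta\,\kappa^2 \\
b &= \kappa \sqrt{\frac{\delta}{2\kappa + \delta}}
  && \text{(scale parameter, positive root)} \\
\gamma &= \frac{2 b^2}{\kappa^2 - b^2} + \ln\!\left(1 - \frac{b^2}{\kappa^2}\right)
        = \frac{\delta}{\kappa} + \ln\frac{2\kappa}{2\kappa + \delta}
  && \text{(constraint with } b \text{ eliminated)} \\
\epsilon &= \frac{\Delta f}{b}
          = \frac{\Delta f}{\kappa}\sqrt{\frac{2\kappa + \delta}{\delta}}
  && \text{(privacy loss for the chosen } \delta, \gamma\text{)}
\end{align*}
```

With $b$ eliminated, the constraint depends only on the ratio $\delta/\kappa$, so $\kappa$ can be found with a one-dimensional root search for any given deviation and tolerance.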

We can further evaluate the proposed theorem through boundary conditions.

Case-1 (lower bound, ): When (i.e., when we want to minimize the attack impact and make the deviation zero), (10) becomes-

(12)

Here, indicates that the attacker does not want to be detected at all and hence the payoff of the attacker (i.e., ) approaches to zero. After rewriting (9), we can apply limit () as follows:

(13)

Remark 4.1: When , then since Equation (13) and Remark 4.1 indicates that if the privacy loss, approaches to infinity (alternatively, noise approaches to zero) and the attacker does not want to be detected (), then the attack impact is minimum and the deviation from the mean is close to zero (i.e., ).

Case-2 (upper bound, ): If we consider the maximum attack impact (i.e., ), from (10), it can be inferred that , which means the attacker does not care about to be exposed and goes for maximum attack impact. From IV-C, we know that, when , then . Now, when is very small and , then the scale factor () is very large.

Remark 4.2: When , then since . Therefore, if the privacy loss, approaches to zero (0) (alternatively, noise approaches to infinity) and the attacker does not care to be exposed (), then the attack impact is maximum and the deviation from the mean very large (i.e., ).

In summary, (9) – (11) can be used to adjust the privacy loss ($\epsilon$) for a given sensitivity $\Delta f$, an optimal attack impact $\mu_a^*$, and a level of attackers’ tolerance $\gamma$. The defender can use the correlation of (9) and (11) to design a robust, fault-tolerant, privacy-preserved, and well-secured DP-based system. We discuss the effectiveness of our proposed design approach analytically in section V.
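The design step summarized above can be sketched as a small routine that returns the privacy loss for a chosen maximum deviation and attackers' tolerance. It is a sketch under assumptions, not the authors' procedure: it assumes the attack bias has the form 2b²κ/(κ² − b²) and the KL constraint 2b²/(κ² − b²) + ln(1 − b²/κ²) = γ, which together reduce to s − ln(1 + s/2) = γ with s = deviation/κ. The function name `design_privacy_loss` and the example inputs are hypothetical.

```python
import math

def design_privacy_loss(sensitivity, max_deviation, gamma):
    """Return (epsilon, b, kappa) that keep the optimal attack bias at max_deviation."""
    if min(sensitivity, max_deviation, gamma) <= 0:
        raise ValueError("all inputs must be positive")
    # Solve s - ln(1 + s/2) = gamma for s = max_deviation / kappa by bisection.
    # The left side is increasing in s and at least s/2, so 2*gamma + 1 is an upper bracket.
    lo, hi = 0.0, 2.0 * gamma + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid - math.log1p(0.5 * mid) < gamma:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    kappa = max_deviation / s
    b = max_deviation / math.sqrt(s * (2.0 + s))   # equals kappa * sqrt(d / (2*kappa + d))
    return sensitivity / b, b, kappa

# Hypothetical design targets: sensitivity 1.0, tolerance gamma = 0.2.
for deviation in (1.0, 0.5, 0.25):
    eps, b, kappa = design_privacy_loss(1.0, deviation, 0.2)
    print(deviation, round(eps, 4), round(b, 4), round(kappa, 4))
```

In this sketch, a tighter bound on the tolerated deviation forces a larger privacy loss (less noise), which is the privacy and security trade-off the designer has to balance.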

IV-E Modelling the Criterion for Feasibility and QoS Analysis

Analyzing QoS (Quality of Service) of the DP-based system under attack scenario is another key objective of our research. Here, QoS refers to the measurement of the overall performance of the system. To quantitatively measure the QoS of the system, several parameters of the network are considered, such as overall latency, congestion, data availability, data utility, system complexity, network speed, operational overhead, data redundancy, system resiliency, etc. As we have not considered any data availability attacks (e.g., DDoS attack, delay attack, etc.) in our model, we have eliminated the data availability from our QoS criterion list. Moreover, the DP technique is a better choice than other privacy-preserving techniques in terms of system complexity, system resiliency, and operational overhead. The cryptographic encryption techniques incorporate key generation, key distribution, and third-party involvement while the anonymization techniques require huge time for a large dataset. Consequently, we measure the QoS of the attack-prone DP-based synchrophasor network through overall data utility along with the privacy and security cost of DP.

As for the feasibility analysis, it is an assessment of the practicability of a proposed plan. The analysis is carried out by asking and subsequently answering the question- “Is the proposed method feasible?”. We analyze this feasibility question under an adversarial scenario. A method is labeled as feasible if it preserves the specific requirements (e.g., privacy, security, etc.) as well as the QoS of the system. From this perspective, the DP technique is feasible for the synchrophasor network if it provides a substantial amount of data privacy, does not violate the security of the system, and does not degrade the QoS of the system below a satisfactory level. At the same time, it is also necessary to compare the DP technique with other existing privacy-preserving techniques in terms of computational overhead. We carry out such a comparison in section V of this paper.

V Experimental Analysis

For the experiments, we have used the same dataset as [pignati2015real]. The dataset includes real-time measurements from dedicated PMUs connected on the medium voltage side of the network secondary substations in a smart grid on the EPFL campus. We have selected this data because similar data are used in various applications in the smart grid domain, e.g., state estimation, power network modeling, predicting future needs, etc. The data contains historical measurement values from 5 PMUs from 2014 to 2019, with a high percentage of missing data. However, since measurements are recorded at sub-second (millisecond-scale) intervals, the size of the dataset is sufficient for demonstrating our proposed model.

V-A FDI Attack Simulation in a Synchrophasor Network

The FDI attack can take place in different nodes (master nodes, fog nodes, edge nodes, etc.) of a synchrophasor network. The attacker can conduct the attack either by compromising nodes (or sensors) through direct access or by compromising the communication channel through the traditional MITM (Man-in-the-Middle) attack approach [yang2012man]. As physical access to a grid network is difficult for an attacker to achieve, an FDI attack by physically accessing the node is unlikely to occur. Therefore, for our proposed model we have considered the FDI attacks conducted by compromising the communication channel of the synchrophasor network.

In a synchrophasor network, the queries among the master nodes (controllers), fog nodes (PDCs, i.e., Phasor Data Concentrators), and edge nodes (PMUs) flow from the master nodes towards the edge nodes.

Fig. 2: Hourly average user consumption (kWh) recorded by a PMU. A small privacy loss (ε) leads to a higher deviation. The DP-noise has been drawn from a Laplace distribution with zero mean and variance

Fig. 3: Correlation between the attack impact and the DP-parameters. Attack impact decreases when- (i) privacy loss increases, (ii) attackers’ tolerance decreases, and (iii) data sensitivity decreases.

When DP is applied hierarchically over the entire synchrophasor network, the receiving node (e.g., a PDC) gets the DP-data from the lower-level nodes (e.g., PMUs). However, as there are multiple layers in the synchrophasor network, the DP technique adds random noise multiple times to the same query result. As a result, it becomes difficult for anomaly detectors to identify the false data.
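The accumulation of noise across layers can be illustrated with a minimal sketch. It assumes that each layer adds independent Laplace noise with the same scale, which the paper does not state explicitly; the function names and the number of layers are hypothetical. Under this assumption, the variance of the accumulated noise grows linearly with the number of layers (2kb² for k layers with scale b), which widens the band in which injected false data can hide.

```python
import random
import statistics


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_after_layers(true_value, scale, layers, rng):
    """Add independent Laplace noise once per layer (e.g., PMU, PDC, controller)."""
    value = true_value
    for _ in range(layers):
        value += laplace_noise(scale, rng)
    return value


def empirical_variance(scale, layers, trials=20000, seed=0):
    """Estimate the variance of the accumulated noise by simulation."""
    rng = random.Random(seed)
    samples = [noisy_after_layers(0.0, scale, layers, rng) for _ in range(trials)]
    return statistics.pvariance(samples)


if __name__ == "__main__":
    # Assumed setting: unit noise scale, one to three layers.
    for k in (1, 2, 3):
        print(k, round(empirical_variance(scale=1.0, layers=k), 2))
```

The estimates should lie close to 2k for unit scale, so a detector tuned to single-layer noise would face a considerably wider spread at the top of the hierarchy.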

In our experiment, the query computes the hourly average of the user consumption, P (in kWh), recorded by a PMU on a particular day. Fig. 2 depicts the differentially private results (DP-results) of the query along with the actual result for varying privacy loss (ε). The DP-results differ from the actual query results due to the random Laplacian noise added through the DP process. To demonstrate the FDI attack, false measurements are deliberately injected as a form of noise into the DP-result. The false measurements are drawn from the attacker's probability distribution following the optimal attack strategy of Section IV-C. The corresponding attack impact is illustrated in Fig. 3 for various levels of . The attack impact decreases in the following cases: (i) the privacy loss increases, (ii) the attacker's tolerance decreases, and (iii) the data sensitivity decreases.
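The experiment described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the readings, the sensitivity value, and all function names are hypothetical, and the injected offset is drawn from a bounded uniform distribution as a placeholder rather than from the optimal attack distribution of Section IV-C.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def hourly_average(readings):
    """True query result: mean consumption (kWh) over one hour of readings."""
    return sum(readings) / len(readings)


def dp_hourly_average(readings, epsilon, sensitivity, rng):
    """Laplace mechanism: true average plus Laplace(0, sensitivity / epsilon) noise."""
    return hourly_average(readings) + laplace_noise(sensitivity / epsilon, rng)


def inject_false_data(dp_value, max_injection, rng):
    """Placeholder FDI: bounded uniform offset (NOT the paper's optimal attack)."""
    return dp_value + rng.uniform(-max_injection, max_injection)


def mean_abs_deviation(readings, epsilon, sensitivity, trials=2000, seed=1):
    """Average distance between the DP-result and the true result."""
    rng = random.Random(seed)
    truth = hourly_average(readings)
    total = 0.0
    for _ in range(trials):
        total += abs(dp_hourly_average(readings, epsilon, sensitivity, rng) - truth)
    return total / trials


if __name__ == "__main__":
    readings = [1.2, 1.4, 1.1, 1.3]  # hypothetical kWh values, not from the dataset
    for eps in (0.1, 0.5, 1.0):
        print(eps, round(mean_abs_deviation(readings, eps, sensitivity=1.0), 3))
```

A smaller ε yields a larger average deviation from the true result, which is the behavior reported for Fig. 2.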

V-B Privacy, Security, and Utility Analysis

According to the ε-indistinguishability condition, a DP-result differs from the actual query result (Fig. 2). When the noise is large (alternatively, ε is lower), the DP-result of average user consumption deviates significantly from the true result, and vice versa. The amount of random noise also depends on the data sensitivity. Highly sensitive data require a large amount of noise to preserve data privacy. In contrast, data with low sensitivity need only a small amount of noise.

Consequently, for a small amount of noise (alternatively, a high value of ε), the DP-result differs from the true result only by small fractions (curve without DP and with in Fig. 2). So, for non-critical applications requiring less sensitive data in the smart grid domain, the amount of noise required to preserve data privacy is also low.
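For reference, the relationship between noise, privacy loss, and sensitivity discussed above can be written in the textbook form of the Laplace mechanism. The notation here ($\mathcal{M}$ for the mechanism, $f$ for the query, $D, D'$ for neighboring datasets, $\Delta f$ for the sensitivity, $b$ for the noise scale) is assumed for illustration and may differ from the symbols used elsewhere in the paper.

```latex
% Textbook Laplace mechanism; notation assumed, not taken from the paper.
\begin{align}
  \Pr[\mathcal{M}(D) \in S] &\le e^{\epsilon}\, \Pr[\mathcal{M}(D') \in S], \\
  \mathcal{M}(D) &= f(D) + \eta, \qquad \eta \sim \mathrm{Lap}(0, b), \qquad b = \frac{\Delta f}{\epsilon}, \\
  \mathrm{Var}(\eta) &= 2b^{2} = \frac{2\,(\Delta f)^{2}}{\epsilon^{2}}.
\end{align}
```

In this form, halving ε doubles the noise scale and quadruples the noise variance, while a smaller sensitivity reduces the required noise proportionally.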

V-B1 Correlation between the Attack Impact and the DP-parameters

The data sensitivity level plays an important role in the DP mechanism. If the defender chooses to add large noise to highly sensitive data, then the attacker can conduct devastating attacks. For example, in Fig. 3, the privacy is very large when . Also, for highly sensitive data (e.g., ), if the defender keeps the privacy loss low () to preserve privacy, the attack impact is as large as . However, the attacker may not launch a large attack, so as to avoid detection; rather, she would follow the optimal attack distribution described by (5) for an optimal payoff.

In our experiment, the true query result (true average user consumption) is . Following the optimal attack distribution interpreted by (5), the attacker can add maximum noise of if her tolerance level () is . Now, if is increased by (e.g., to ) while remains same (e.g., ), the maximum possible attack impact is increased by approximately (e.g., to ). On the contrary, if rises by (e.g., to ) while remains same (e.g., ), the attack impact increases by approximately (e.g., to ). So, it is perceived that the attack impact depends more on than .

Moreover, we see that the actual mean () has no significant effect on the attack impact. For any mean , the attack impact follows the same principle. That means if the mean is lower than , the attack impact also decreases by the same amount following the DP principle. Based on this correlation between the data sensitivity , the attackers’ tolerance , and privacy loss , the designer can design his system targeting minimal attack impact, high data privacy, and high data utility. More specifically, the designer can calibrate the noise following (9)-(11) and adjust the tolerance of the system up to the calculated attack impact value.

V-B2 QoS of the DP-based Synchrophasor Network in Adversarial Setting

We use the defense cost and the data utility as indicators to evaluate the QoS of the DP-based system in an adversarial setting. The defense cost is the overall cost that arises from the defensive measures against cyber-attacks. It represents the distance of the DP-data from the actual data. It consists of two separate costs: the privacy cost and the security cost. The privacy cost (i.e., the distance of the DP-data from the actual data due to privacy measures) arises when the DP technique is applied to the query result. It increases with the number of times the DP mechanism has been applied in the nodes. On the other hand, the security cost is added to the defense cost when FDI attacks occur. Besides analyzing the defense cost of a DP-based synchrophasor network under an attack framework, we also measure the data utility in terms of prediction accuracy. The data utility measurement is performed with both the DP-data and the FDI-DP-data (i.e., the DP-data that is manipulated by the FDI attack).
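The decomposition of the defense cost can be illustrated with a short sketch. The distance metric is not specified in this section, so mean absolute difference is assumed here; measuring the security cost as the distance between the FDI-DP-data and the DP-data is likewise an assumption, and the function names are hypothetical.

```python
def mean_abs_distance(a, b):
    """Mean absolute difference between two equally long series (assumed metric)."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def defense_cost(actual, dp, fdi_dp):
    """Split the defense cost into its privacy and security components."""
    privacy = mean_abs_distance(dp, actual)   # deviation caused by DP noise
    security = mean_abs_distance(fdi_dp, dp)  # extra deviation caused by the FDI attack
    return {"privacy": privacy, "security": security, "defense": privacy + security}


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    actual = [1.0, 2.0, 3.0]
    dp = [1.5, 1.5, 3.5]
    fdi_dp = [1.75, 1.75, 3.75]
    print(defense_cost(actual, dp, fdi_dp))
```

With this split, a stealthy attack that stays close to the DP-data produces a security cost well below the privacy cost.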

Motivated by [yu2015towards], we build a prediction model to evaluate the above-mentioned properties. The model executes a time-series prediction algorithm on the PMU dataset for different levels of privacy loss (ε). The experiment is conducted with the original data, the DP-data, and the FDI-DP-data. After executing the experiment with the DP-data, the defense cost (i.e., the privacy and the security cost) is measured on the predicted values. Next, the FDI attack is conducted deliberately on both datasets (with and without the DP mechanism). After that, the manipulated datasets are fed into the prediction model. Finally, the security cost (i.e., the prediction deviation) due to the FDI attack on the DP-data is measured.

We first study the defense cost (i.e., the sum of privacy cost and security cost) and find that the privacy cost is higher than the security cost when the attacker conducts a stealthy attack (as shown in Fig. 4). However, the overall defense cost reduces significantly when the privacy loss (ε) is increased.

Fig. 4: Utility cost analysis of the DP method for varying privacy loss (ε). The privacy cost is higher than the security cost. Moreover, the overall cost is small when ε is large (i.e., )

Then, we investigate the utility of both the DP-data and the FDI-DP-data through a time-series prediction algorithm. We find that the “DP: predicted” value is higher than the “Original: predicted” value due to the privacy cost of the DP mechanism, as shown in Fig. 8(a). After that, we deliberately conduct the FDI attack on the original data (from “2018-10-14” to “2018-11-14” in Fig. 8(b)). Note that the FDI attack has a significant impact on the predicted values. Finally, the same degree of FDI attack is conducted intentionally on the DP-data to find the changes in the predicted values. The findings indicate that the difference in prediction accuracy between the DP-data and the FDI-DP-data is nominal, as the security cost is very low (Fig. 8(c)). We find that the impact of a short-duration FDI attack on the predicted values is low and can be negligible for non-critical applications of the smart grid domain.

V-B3 Feasibility of DP mechanism in Synchrophasor Network

The criterion for our feasibility analysis of the DP mechanism is the comparison of overall latency. The latency comparison is carried out between the DP technique and the AES-256 encryption technique under an attack scenario. AES-256 is chosen because it is computationally faster than most other encryption techniques (e.g., RSA). Also, anonymization techniques incur more latency than AES-256 encryption on large datasets. So, for the feasibility analysis of the DP technique under an attack scenario, the latency comparison between the DP and AES-256 techniques should suffice.

Fig. 8: QoS analysis of a DP-based system under attack scenario. (a) ’DP-predicted’ vary from the ’original-predicted’ due to privacy cost (b) The attack impact on the prediction accuracy is negligible for a short-time FDI attack (circled in the graph) (c) The ’DP-FDI-predicted’ values are very close to the ’DP-predicted’ values as the security cost is low.

Both passive and active attacks add a latency burden to the overall system performance. In the case of the FDI attack, the attacker manipulates the data by adding false information. This process takes some time and adds latency to the data flow. With any encryption technique, to avoid detection, the attacker needs to decrypt the data, inject false data, and re-encrypt the falsified data with the cryptographic key. With the DP technique, however, the attacker only injects false data and remains undetected. As a consequence, the DP mechanism does not add as much latency as AES-256. Our experimental evaluation also supports this latency comparison under the FDI attack scenario. We find that the DP technique is approximately times faster (on average) than AES-256 in the same adversarial setting (i.e., the computational time of DP is only seconds whereas the computational time of AES-256 is ).

VI Conclusion and Future Work

The DP mechanism can be exploited as a tool for conducting the FDI attack. To overcome this challenge, numerous studies have developed defense mechanisms based on anomaly detection schemes in a DP-based CPES. However, traditional anomaly detectors cannot detect the bad data added by exploiting the DP mechanism in multiple layers of the synchrophasor network. To address this, in this paper, we formulate and develop the defense strategy as a part of the design process. We have proposed a provable correlation among the factors affecting the DP mechanism, which enables the defender to design a DP-based system that is fault-tolerant against FDI attacks. To experimentally evaluate the effectiveness of the proposed correlation, we have simulated the FDI attack in a DP-based synchrophasor network. The evaluation indicates that the attack impact can be minimized by using the proposed correlation in the design process. The QoS analysis of the DP-based CPES indicates the applicability of the DP mechanism in many non-critical operations. Furthermore, the feasibility analysis of the DP mechanism in the CPES domain suggests that the DP technique is feasible over other privacy-preserving mechanisms in terms of computational overhead.

DP also has the potential to preserve data privacy in other fields, such as edge computing and cloud computing [du2018differential, wang2020edge]. We hope that this research will contribute to those fields as well. Continuing this line of research, we plan to analyze the impact of distributed active attacks on the DP-based system. In particular, we want to investigate the impact of distributed FDI attacks on a personalized-privacy-aware grid architecture. Additionally, we hope to extend our model from a game-theoretic viewpoint in which multiple defenders and attackers play dynamic strategies repeatedly.

References