The power grid is evolving with increasing dependency on Information and Communication Technologies (ICT). Today, ICT is realized in energy control centers through Supervisory Control and Data Acquisition (SCADA) systems and Energy Management Systems (EMS). While EMSs issue commands for power grid operation, SCADA systems serve as the gateway between EMS and field networks by passing measurements and control commands. Present-day SCADA is in its fourth generation of architectures, which bring innovative and cost-efficient solutions, such as cloud computing and the Internet of Things, while opening up a much wider scope of cyber-security concerns among utilities. Since the notorious Stuxnet attack on the Siemens SIMATIC WinCC SCADA system in July 2010, approximately 45,000 cases of SCADA infection around the world have been reported, including the Iranian nuclear facilities and the Ukrainian power grid, according to Symantec’s statistics. These attacks, if successful, would lead to massive power outages, resulting in severe physical, economic, and social impacts.
Intrusion Detection Systems (IDS) are deemed critical to protecting SCADA from cyber-attacks. In contrast to methods that aim to strengthen the perimeter surrounding SCADA, IDSs generate ‘burglar alarms’ whenever the security of the system is compromised. To increase the chances of mounting a successful defense, the Department of Homeland Security recommends a combination of firewalls, De-militarized Zones, and IDSs grounded on the principle of defense-in-depth.
While IDSs for traditional ICT systems are mature, implementing IDSs in industrial control systems, such as power grids’ SCADA, faces two unprecedented challenges. First, the power grid is a cyber-physical system, wherein continuity of operation is critical. Unlike traditional ICT systems, in which the effects of false alarms are limited to computer operations, false alarms in power grids would disrupt dependent vital physical processes and inflict severe consequences. Therefore, false positives (alarms falsely generated for normal actions) are unacceptable, and a low false negative rate is also desired. Second, the power grid is a real-time dynamical system. Any delay of control actions could lead to instabilities ranging from local plant angle instability to inter-area oscillation. In the extreme, delayed response of protective devices will cause cascading blackouts over a large scale. For this reason, the propagation latency of control and measurement signals introduced by IDS auditing and processing must be minimized.
To address the first challenge, recent works develop IDSs by integrating contextual information of power grids [6, 7, 8, 9, 10, 11, 12]. The most common approach is to identify attacks based on their impact on power grids. For example, Bayesian network models for the whole cyber-infrastructure and the underlying power grid have been constructed based on SCADA logs along with power network topological information. Power contingencies are then simulated on the Bayesian model to rank the severity of a detected cyber-intrusion. In [10, 13, 6], IDSs audit and select packets that contain control commands that (dis)connect grid components, e.g., generators, transmission lines, and substations. Cyber-attacks are identified if the power flow diverges in simulation under those control commands.
Another approach is to calibrate the detection results in cyber-space with historical data of power grid operation, wherein data mining techniques are often applied. For example, deviations between current and historical Area Control Errors are used as indicators of cyber-attacks on Automatic Generation Control in EMS. A hybrid IDS has also been developed that learns temporal state-based specifications for power grid scenarios of physical disturbances, cyber-attacks, and normal operations.
However, both of these approaches share the common deficiency of requiring a long runtime, which exacerbates the second challenge. The former approach simulates the power grid’s response, which is a non-trivial task given the enormous size of power networks and the number of grid devices. The latter approach relies on frequent auditing and processing of historical data over a sufficiently long period in order to ensure the desired accuracy. Both place heavy demands on IDS accounting resources and could significantly impair an IDS’s ability to process and propagate information to grid functions and responsible defense authorities in a timely manner.
Although [6, 14] make initial attempts at reducing IDS runtime, they are restricted to certain attack groups, wherein attacks are aimed at individual grid components and assume a single step in the cyber-physical causal chain (i.e., adversaries directly disconnect grid devices through remote control); they cannot handle more sophisticated attacks that are coordinated and launched through EMS. These attacks are defined as Monitoring-Control Attacks (MCA) and are considered highly threatening, because they are (i) more likely to happen, with a greater attack surface and lower attack cost, (ii) difficult to detect, since they hide in measurement signals and masquerade through EMS, and (iii) capable of inflicting much more severe consequences at a greater scale by coordinating attack resources that target multiple grid components. Although MCAs’ attack mechanisms and physical impacts have been studied in a few works [16, 17, 18, 19], there is no effective IDS solution available to defend against MCAs.
To bridge this gap, this paper presents a semantic analysis framework for IDSs in power grids, which detects MCAs with promising runtime and detection accuracy. The framework is implemented as two parts running in parallel in IDS: a Correlation Index Generator (CIG), which indexes correlated attacks, and a Correlation Knowledge-Base (CKB), which is updated aperiodically with attacks’ Correlation Indices (CI). In addition, this paper makes the following contributions:
A theoretical basis for CIG. We formulate MCAs as a bi-level mixed-integer optimization program and solve it to provide CI solutions.
A suite of detection rules for CKB. Derived from set theory, these rules characterize the relation between adversaries’ goals and coordinated attacks, thus enabling CKB to detect MCAs at runtime.
Defense strategies against MCAs. While most IDSs are passive, that is, they only generate “burglar alarms”, our proposed method actively derives defense strategies against MCAs using a set-theoretic approach.
The rest of the paper is organized as follows. Section II introduces the threat model, MCA mechanisms, and the IDS implementation of the proposed semantic framework. Section III presents the mathematical model of power grids and MCAs. The theoretical basis for CIG and the detection rules for CKB are derived in Sections IV and V, respectively. In Section VI, the performance of the proposed semantic framework is demonstrated with numerical experiments. Finally, Section VII concludes the paper. While the proposed framework is capable of defending against less sophisticated attacks, such as control attacks, we explain the framework’s working principle mainly in terms of MCAs in this paper.
In order to develop the semantic analysis framework for IDSs in power grids, we need to consider three factors: the environment in which intrusions occur (the threat model), the intrusions we wish to detect (MCAs), and the intrusion detector (IDS implementation).
II-A Threat Model
In previous generations, SCADA activities were largely confined to proprietary networks. In contrast, the current fourth generation of SCADA is mostly internet-based, as illustrated in Fig. 1. In particular, a large number of measurement signals from transducers of grid equipment (e.g., relays, generators, and switchgear) are transmitted with raw data protocols in field networks. This widens the cyber-attack surface through the following attack entry points, as numbered in Fig. 1:
Directly hack into field devices, including transducers, actuators and meters.
Attack field network links between devices and from devices to Energy Control Centers (ECC).
Attack from inside the ECC. This could happen within or outside the security enclaves, whose boundaries are defined by the trust nodes (e.g., firewall and IDS).
Attack from inside enterprise functions or at their perimeter networks.
Through these channels, adversaries can install malware and sniff, inject, and modify host files and network traffic [21, 1, 22]. Based on the above, we make the following assumptions about the threat model:
Adversaries can remotely penetrate the Local Area Network (LAN) and Wide Area Network (WAN). Though insider attacks outside security enclaves are allowed under the proposed framework, they are not our focus. We do not consider insider attacks within the security enclaves.
In ECC, we trust EMS. In other words, attacks are only executed on packets containing control and measurement signals that are transmitted over the network; they do not damage the EMS functions nor alter its encoded working principles.
IDSs are secure (i.e., not compromised). In addition, we assume there are separate computing machines dedicated to IDSs that implement the proposed semantic analysis framework. Therefore, IDSs do not introduce extra vulnerabilities into power grids.
IDS communication is secure. In other words, IDSs can safely exchange data.
We do not consider attacks through enterprise functions. Launching MCAs through this path, though theoretically possible, is much more likely to fail due to extra layers of trust nodes.
II-B Monitoring-Control Attacks
There are two classes of attack mechanisms in power grids: control attacks and monitoring attacks. They are illustrated with a generic control diagram in Fig. 2. Control attacks refer to attacks that directly hijack and falsify control commands in power grids, such as disconnecting transmission lines and changing the power output of generators [21, 6, 14]. While able to inflict immediate physical consequences, they are less likely to occur in practice due to the restricted communication channels and ease of detection. For example, in conventional substations, relay commands, which trigger circuit breakers, are usually transmitted over proprietary communication channels or hard-wired connections; generator power adjustments are requested through the Human Machine Interface (HMI), where operators would block and report suspicious actions.
Monitoring attacks contaminate or eavesdrop on measurements collected from transducers. In contrast to control commands, measurement signals are more often transmitted over open communication channels (i.e., without any available authentication method) due to their large transmission volume and high transmission rate. This opens a wider cyber-attack surface. An important subset of monitoring attacks is Monitoring-Control Attacks, in which adversaries manipulate control decisions by fabricating measurement signals in the feedback loop. On one hand, MCAs are difficult to detect, since the attack goals are hidden behind measurements and the control mechanisms. Thus, they cannot be inspected or intercepted by human operators. On the other hand, they can inflict severe consequences by coordinating attack resources that target many measurements simultaneously; they differ from non-disruptive monitoring attacks that only exploit private information. Therefore, MCAs are considered highly threatening.
MCAs’ mechanism in power grids is briefly described next. The main control functions of the power grid are realized through EMS, which consists of four blocks: network model-building (including topology processor and state estimation), security assessment, automatic generation control, and dispatch. Information flows within EMS are shown in Fig. 3. In path 1, contaminated measurements drive control decisions in automatic generation control and dispatch after going through network model-building. While state estimation can effectively identify and correct bad data, a rich body of literature has demonstrated that contaminated measurements can still pass through when the measurement errors are within the tolerance and/or the measurements are structure-wise conforming [23, 24, 25]. In path 2, contaminated measurements directly drive control decisions, as it is common for system operators to make a decision based on raw measurements in security constrained dispatch. Through both paths, adversaries may realize goals, such as depriving profit in electricity markets, disturbing power grid frequency and overloading grid equipment, causing tremendous financial losses, sabotage, or even interruption of continuous grid operation.
II-C IDS Implementation
II-C1 Proposed Framework in IDS Architecture
A general IDS architecture is defined with four modules, Event (E-blocks), Analysis (A-blocks), Database (D-blocks), and Response (R-blocks), as shown in Fig. 4. The proposed semantic analysis framework has two parts: Correlation Index Generator (CIG) and Correlation Knowledge Base (CKB). They aim to provide contextual information about the power grid in addition to the traits that the IDS senses in cyber-space (e.g., host syslog and network traffic).
CIG, depicted in Fig. 5, belongs to A-blocks. It analyzes the correlation of the potential hostile behaviors sensed by E-blocks, and indexes these behaviors with inductive-deductive patterns. For example, if a set of measurements are suspected to be contaminated, CIG first induces their consequence on the power grid with optimal power flow. If a transmission line is overloaded, then these measurements are weakly correlated. Next, CIG deduces the critical measurements required to overload the transmission line. These critical measurements are strongly correlated and will be represented by a set of Correlation Indices (CI). The inductive-deductive patterns ensure minimal false negative rates that might be caused by normal deviations, such as noises and faults. In addition, CIG can be used to protect critical grid assets from MCAs, in which case CIs can be directly deduced from the predicted failures of these assets. Details about CIG are provided in Section IV.
CKB, depicted in Fig. 6, belongs to D-blocks. It is updated with the CIs generated from CIG at an adaptive rate, which is determined by (i) configuration changes of power networks, (ii) power grid stress level, (iii) detection rate of potential hostile events of E-blocks, and (iv) human operator’s settings. At runtime, measurements detected by E-blocks are compared with the CIs in CKB. If the comparison is positive, then these measurements are considered to form an MCA. This information is passed to other A-blocks and R-blocks for further response. Since CKB does not contain any computation function, apart from arithmetic operations for CI comparison, it allows fast contextual information integration in IDSs. Details about CKB are provided in Section V.
Derived from set theory, defense strategies are proposed for R-blocks. The design of E-blocks is out of the scope of this paper.
II-C2 IDS Dimensions
We consider two dimensions of IDS implementation related to the proposed semantic analysis framework. The proposed framework is flexible in implementation in the other dimensions, such as audit source (i.e., host- or network-based detection), audit frequency, and continuity, whose definitions are given in the survey.
Detection Approach. There are two main detection approaches in IDS development: signature- and anomaly-based. In between these approaches lie the probabilistic- and specification-based methods [28, 1, 27]. All of these approaches are based on direct knowledge of cyber-activities (i.e., host syslog and network traffic). Complementarily, behavioral detection approaches capture patterns that are not necessarily illegitimate in a direct setting but are wrong in a contextual setting, serving as secondary evidence. The proposed analysis framework belongs to this last class and will be implemented alongside other direct knowledge-based approaches in IDS.
Distributed vs. Centralized. The proposed analysis framework can be implemented under a centralized, distributed, or hierarchical IDS structure. Given the cost and communication constraints in power grids, we consider IDSs to be installed only at the substation level and above, but not at individual Intelligent Electronic Devices or Remote Terminal Units. Thus, under a centralized structure, the proposed framework will allow the IDS at a substation to detect and identify MCAs within its service area. For MCAs across service areas under multiple substations, a distributed structure is needed, wherein IDSs at substations have peer-to-peer communication so that detected events can be exchanged. Alternatively, a hierarchical structure can be formed. The proposed analysis framework is then integrated at a master IDS, which supervises all the substation IDSs by collecting and analyzing their detected events and sending instructions for detected MCAs.
III Mathematical Models
In this section, we model the power grid, dispatch applications, and Monitoring Control Attacks.
III-A Mathematical Notation
Throughout this paper, we use the following mathematical notation. Let $\mathbb{R}$ and $\mathbb{R}_{\geq 0}$ (resp. $\mathbb{R}_{>0}$) denote the set of real numbers and the set of non-negative (resp. positive) real numbers. We let $\mathbf{1}$ and $\mathbf{0}$ denote, respectively, the vectors or matrices with all components equal to one and zero. Given a finite set $\mathcal{S}$, we let $|\mathcal{S}|$ denote its cardinality, i.e., the number of elements of $\mathcal{S}$, and $2^{\mathcal{S}}$ the power set of $\mathcal{S}$, i.e., the set of all subsets of $\mathcal{S}$.
For a matrix $A$, we let $A_i$ denote its $i$th row. For a vector $x$, $x_i$ denotes its $i$th element, $\operatorname{diag}(x)$ the diagonal matrix of $x$, and $\|x\|_0$ the zero-norm of $x$, i.e., the number of non-zero elements of $x$.
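The set and vector operations named in this notation can be illustrated with a few lines of Python. This is a minimal sketch using only the standard library; the helper names `power_set`, `zero_norm`, and `diag` are introduced here for illustration and do not come from the paper.

```python
from itertools import chain, combinations


def power_set(s):
    """Return all subsets of the finite set s as frozensets."""
    items = sorted(s)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(c) for c in subsets}


def zero_norm(x):
    """Count the non-zero elements of the vector x."""
    return sum(1 for v in x if v != 0)


def diag(x):
    """Build the diagonal matrix of the vector x as a list of rows."""
    n = len(x)
    return [[x[i] if i == j else 0 for j in range(n)] for i in range(n)]


buses = {1, 2, 3}
print(len(buses))                         # cardinality of the set
print(len(power_set(buses)))              # number of subsets
print(zero_norm([0.0, 1.5, 0.0, -2.0]))   # number of non-zero entries
print(diag([1, 2]))
```

The zero-norm is the quantity that later counts how many substations an adversary attacks.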
III-B Power Grid Model
We model the power grid as the graph , where is the set of buses and is the set of transmission lines. To each bus , we associate the demand (or consumption) . In addition, let denote the set of buses with dispatchable generation. To each generator bus , we associate the power generated . Similarly, to each transmission line connecting buses and , we associate the power flow . In vector form, the demand, generation, and power flows are, respectively, , , and .
The power grid is assumed to have a set of substations, i.e., . We model the power grid within substation ’s service area as the sub-graph with the following properties:
All substations’ service areas compose the power grid, i.e., .
Substations’ service areas might overlap, i.e., for some , we may have .
The overlapped areas do not contain generator buses.
Each substation collects demand measurements, denoted as , within its service area, i.e., all such that .
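The service-area properties listed above can be checked mechanically. The sketch below assumes a hypothetical 5-bus grid with two substations; the bus numbers, area names, and demand values are illustrative only, and each service area is represented simply by its set of buses rather than a full sub-graph.

```python
# Hypothetical 5-bus example; all identifiers and values are illustrative.
buses = {1, 2, 3, 4, 5}
generator_buses = {1, 5}

# Service area of each substation, given as the set of buses it covers.
service_areas = {
    "S1": {1, 2, 3},
    "S2": {3, 4, 5},
}


def covers_grid(areas, all_buses):
    """Check that the service areas together contain every bus."""
    return set().union(*areas.values()) == all_buses


def overlaps_free_of_generators(areas, gens):
    """Check that no generator bus lies in the overlap of two areas."""
    names = sorted(areas)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if areas[a] & areas[b] & gens:
                return False
    return True


def demand_measurements(area, demand):
    """Demand measurements a substation collects: those of buses in its area."""
    return {bus: demand[bus] for bus in sorted(area)}


demand = {1: 0.0, 2: 40.0, 3: 60.0, 4: 30.0, 5: 0.0}
print(covers_grid(service_areas, buses))
print(overlaps_free_of_generators(service_areas, generator_buses))
print(demand_measurements(service_areas["S1"], demand))
```

Bus 3 sits in both areas here, which is permitted because it carries no dispatchable generation.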
III-C Dispatch Application Model
Dispatch applications in EMS compute the generation output for the grid, denoted as , by observing demand measurements and using security constrained optimal power flows. These applications are triggered based on a guard condition (i.e., a boolean condition). This guard condition is enabled by a security assessment algorithm (which usually involves network model-building), or by a system operator during real-time and contingency dispatch. Examples include generation dispatch in Real-Time Markets and Ancillary Services (see Fig. 3).
Dispatch applications are based on the active and reactive power flow model, which describes how power balances on buses and flows on transmission lines. However, computing this coupled power flow may become computationally intractable for large-scale power grids. For this reason, the decoupled DC power flow is commonly adopted by operators when the power grid is in normal status. The linearity and sparsity of the DC power flow allow much faster computation.
We formulate the security constrained DC optimal power flow as a convex optimization problem that minimizes the generation cost (1a), balances generation and demand (1b), and keeps the generation (1c) and power flows (1d) within operational limits, i.e.,
where are the cost coefficients for generators, , is the rated power from generators, is the thermal capacity of transmission lines, is the generator shift matrix, and is a matrix that maps generator buses to buses.
Thus, given the demand measurements , an optimal solution corresponds to the new generation output for the grid.
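For orientation, one plausible form of problem (1) that matches the verbal description is sketched below. Every symbol is an assumption introduced here for illustration: $c$ for the cost coefficients, $g$ for the generation output, $d$ for the demand measurements, $\bar{g}$ for the rated power of generators, $\bar{f}$ for the thermal capacity of lines, $F$ for the generator shift matrix, and $M$ for the matrix that maps generator buses to buses. The linear cost is also an assumption; the original objective may take a different convex form, for example a quadratic one.

```latex
\begin{align}
\min_{g} \quad & c^{\top} g \tag{1a} \\
\text{s.t.} \quad & \mathbf{1}^{\top} g = \mathbf{1}^{\top} d, \tag{1b} \\
& 0 \le g \le \bar{g}, \tag{1c} \\
& -\bar{f} \le F\,(M g - d) \le \bar{f}. \tag{1d}
\end{align}
```

Under these assumptions the problem is a linear program in $g$, and the demand vector $d$ enters only through the right-hand sides of (1b) and (1d). This is why corrupting demand measurements is enough to steer the dispatch.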
III-D Attack Model
In this subsection, we define MCAs, attack goals, and attack constraints. We also describe two types of MCAs: strongly and weakly correlated.
Monitoring Control Attacks
MCAs aim to manipulate dispatch applications in EMS. In an MCA, adversaries hack into substations’ ICT. The corrupted measurements are modeled as follows:
where denotes the difference between the attack signal and the actual signal .
The adversary uses these MCAs to manipulate (1), so the new (deceived) generation output increases the power flows on a set of target lines . Therefore, the attack goal is denoted as,
where is the attack goal, is the th row of the generation shifting matrix, denotes the power flow on line before the MCA, and quantifies the flow increase on . We choose this flow increase with semantics, including the flow increase that congests a transmission line or trips the line’s protection.
MCAs are constrained based on the path they take on EMS. If the attack takes path 1 (see Fig. 3), the MCA gets through state estimation and its data screening method. If the attack takes path 2 (see Fig. 3), the MCA must take any value that deceives the operator. In any case, we can model this constraint as
In the above, is the vector of max values allowed for the attack signal. We can use this vector to design different attack scenarios.
The constraint for path 1 can take a form that explicitly describes the condition under which measurement attacks get through state estimation and its data screening methods. These methods, however, are not used during real-time and contingency dispatch (path 2).
MCAs are also constrained by defense at substations. If the grid’s operator deploys defense at substation , the adversary cannot corrupt its measurements. We model this constraint as
where if measurements at substation are corruptible and if not. The vector describes target and safe substations during MCAs. Using , we can identify the set of target/attacked substations as follows
Note that we can also use (5) to model the desire (for the adversary) to attack substation .
Finally, MCAs are constrained by the adversary’s resources. If the adversary has limited resources, (s)he can only attack (hack) a limited number of substations. We model this constraint as
In the worst case scenario for the operator, the adversary minimizes .
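The attack constraints described in this subsection can be sketched as simple checks on an attack vector. The code below is an illustration under stated assumptions: the bus-to-substation mapping, the numeric values, and all function names are hypothetical, and the magnitude constraint is assumed to be an element-wise bound on the absolute attack signal.

```python
# Hypothetical mapping from measured bus to the substation that reports it.
SUBSTATION_OF_BUS = {2: "S1", 3: "S1", 4: "S2", 6: "S3"}


def corrupt(actual, attack):
    """Corrupted measurement = actual signal + attack signal (zero if absent)."""
    return {bus: actual[bus] + attack.get(bus, 0.0) for bus in actual}


def within_bounds(attack, max_attack):
    """Magnitude constraint (assumed form): |attack| <= allowed maximum."""
    return all(abs(a) <= max_attack[bus] for bus, a in attack.items())


def attacked_substations(attack, owner):
    """Target set: substations with at least one non-zero attack signal."""
    return {owner[bus] for bus, a in attack.items() if a != 0.0}


def respects_defense(attack, owner, corruptible):
    """Defense constraint: protected substations carry no attack signal."""
    return all(corruptible[s] for s in attacked_substations(attack, owner))


def within_budget(attack, owner, budget):
    """Resource constraint: at most `budget` substations are attacked."""
    return len(attacked_substations(attack, owner)) <= budget


actual = {2: 40.0, 3: 60.0, 4: 30.0, 6: 20.0}
attack = {2: 5.0, 4: -3.0}  # attack signals; other buses are untouched
max_attack = {2: 10.0, 3: 10.0, 4: 10.0, 6: 10.0}
corruptible = {"S1": True, "S2": True, "S3": False}  # S3 is defended

print(corrupt(actual, attack))
print(sorted(attacked_substations(attack, SUBSTATION_OF_BUS)))
print(within_bounds(attack, max_attack))
print(respects_defense(attack, SUBSTATION_OF_BUS, corruptible))
print(within_budget(attack, SUBSTATION_OF_BUS, budget=2))
```

An adversary seeking the worst case for the operator would look for the smallest budget for which an attack satisfying all checks still reaches the goal.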
Types of Coordinated MCAs
Since the power grid is built with redundant measurements, attacking measurements in a single substation may not induce any consequence. In other words, effective MCAs are usually launched as a coordinated effort, which consists of temporally and spatially correlated events. Given the attack goal, we classify coordinated MCAs as strongly and weakly correlated. Strongly Correlated MCAs, denoted as , achieve the attack goal by attacking the fewest substations. Strongly correlated MCAs describe attacks with minimum resources and allow us to predict attack consequences and derive defense implications. In Section IV, we will introduce a formal method to model and study strongly correlated MCAs. On the other hand, Weakly Correlated MCAs, denoted as , achieve the attack goal by attacking more substations than needed. Adversaries execute weakly correlated MCAs to probe defense at substations.
IV Correlation Index Generator
In this section, we describe the working principles of the Correlation Index Generator (see Fig. 5) and its components, namely the Induction Engine and the Deduction Engine.
IV-A Induction Engine
Suppose the E-blocks detected an MCA that is not in CKB and has corrupted measurements . The induction engine computes the new (deceived) generation output by solving , i.e.,
Then, using , the induction engine determines the set of attack consequences, i.e., the set . As shown in (3), the set of consequences depends on and . The flow increase is chosen with semantics and the real consumption is obtained as follows.
where is a (conservative) estimated consumption or a redundant measurement.
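The second step of the induction engine, turning a deceived dispatch into a set of consequences, can be sketched as follows. The example assumes a hypothetical 3-bus grid with equal line reactances and bus 3 as reference, takes the pre-attack and deceived dispatches as given rather than solving (1), and treats a consequence as a signed flow increase of at least a chosen threshold. The names `line_flows` and `consequences` are illustrative.

```python
from fractions import Fraction as Fr

# Shift factors for a hypothetical 3-bus grid (rows: lines 1-2, 1-3, 2-3;
# columns: buses 1, 2, 3), equal reactances, bus 3 as reference.
SHIFT = [
    [Fr(1, 3), Fr(-1, 3), 0],
    [Fr(2, 3), Fr(1, 3), 0],
    [Fr(1, 3), Fr(2, 3), 0],
]


def line_flows(shift, gen_injection, demand):
    """DC flows: shift-factor matrix times net injection (generation - demand)."""
    net = [g - d for g, d in zip(gen_injection, demand)]
    return [sum(f * p for f, p in zip(row, net)) for row in shift]


def consequences(shift, gen_before, gen_after, demand, tau):
    """Indices of lines whose flow rises by at least tau[l] after the attack."""
    before = line_flows(shift, gen_before, demand)
    after = line_flows(shift, gen_after, demand)
    return {l for l, (b, a) in enumerate(zip(before, after)) if a - b >= tau[l]}


real_demand = [0, 0, 90]   # estimate of the real consumption per bus
gen_before = [60, 30, 0]   # dispatch before the attack (per bus)
gen_after = [90, 0, 0]     # deceived dispatch (per bus)

print(line_flows(SHIFT, gen_before, real_demand))
print(line_flows(SHIFT, gen_after, real_demand))
print(consequences(SHIFT, gen_before, gen_after, real_demand, tau=[15, 15, 15]))
```

Both flow vectors are evaluated with the real consumption, since the physical consequence depends on what the grid actually draws and not on what the corrupted measurements report. A fuller implementation would also account for flow direction.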
IV-B Deduction Engine
Given the set of consequences inflicted , the deduction engine computes strongly correlated MCAs that reach using the following bilevel mixed-integer optimization program:
In our previous work , we derived a method that addresses the mathematical challenges of (7) and computes strongly correlated MCAs. The method first computes the security index, which corresponds to the optimal solution . This security index describes the minimum number of substations the adversary must attack to reach . Then, the method determines the target and safe substations during the MCA from the optimal solution . Since is not necessarily unique, we proposed in  an algorithm to determine all feasible solutions such that . All these correspond to strongly correlated MCAs associated with the attack goal .
We use a set-theoretic approach to describe all these strongly correlated MCAs, which we define as Correlation Indices.
Let denote a feasible solution of (7) associated with such that . A Correlation Index (CI), denoted as , is a strongly correlated MCA that extracts target substations from as follows
and inflicts the consequences described by .
The set of all CIs associated with the inflicted consequences is given by .
As a result, the CIG generates a CI-target tuple –i.e., the set of strongly correlated MCAs and the associated inflicted consequences– and sends this CI-target tuple to the Correlation Knowledge-Base (CKB).
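To make the notion of a Correlation Index concrete, the sketch below enumerates minimum-cardinality sets of substations by brute force. This is not the method used in the paper, which solves a bilevel optimization program; it is a stand-in that only works for small examples. The oracle `reaches_goal` is hypothetical and represents any procedure that decides whether attacking a given set of substations can inflict the consequences.

```python
from itertools import combinations


def correlation_indices(substations, reaches_goal):
    """Return (k, sets): the smallest attack size k for which the goal is
    reachable and every substation set of that size that reaches it."""
    ordered = sorted(substations)
    for k in range(1, len(ordered) + 1):
        hits = [
            frozenset(c)
            for c in combinations(ordered, k)
            if reaches_goal(frozenset(c))
        ]
        if hits:
            return k, hits
    return None, []


# Hypothetical oracle: the goal is reachable only if the attacked set
# contains one of these combinations of substations.
ENABLING = [frozenset({"S1", "S2"}), frozenset({"S2", "S3"})]


def reaches_goal(attacked):
    return any(e <= attacked for e in ENABLING)


security_index, cis = correlation_indices({"S1", "S2", "S3", "S4"}, reaches_goal)
ci_target_tuple = (cis, {"line 3"})  # CIs paired with the inflicted consequences

print(security_index)
print([sorted(ci) for ci in cis])
```

The value `k` plays the role of the security index, and the returned sets correspond to all optimal solutions of equal cardinality. The cost of enumeration grows combinatorially with the number of substations, which is one reason an optimization-based method is preferable in practice.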
V Correlation Knowledge-Base
In this section, we describe the working principles of the Correlation Knowledge-Base (CKB) (see Fig. 6) using a set-theoretic approach. The CKB has a Scanning Engine and a Reasoning Engine.
V-A Scanning Engine
Suppose the E-blocks detected a (possibly weakly correlated) MCA . The Scanning Engine verifies if is an existing MCA, i.e., if . The MCA is an existing MCA if
The MCA is a CI (or strongly correlated MCA), i.e., for some .
The MCA is a weakly correlated MCA but a superset of at least one CI, i.e., such that .
The MCA is uncorrelated, is a subset of at least one CI, i.e., such that , and has smaller cardinality than all CIs in CKB, i.e., for all .
If is an existing MCA, then CKB uses the reasoning engine to identify physical targets and derive defense strategies. Otherwise, CKB calls the CIG to analyze .
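The three scanning rules reduce to set comparisons, which is why the lookup is fast. The following sketch assumes that an MCA and each CI are represented as sets of attacked substations; the function name `scan` and the returned labels are illustrative.

```python
def scan(mca, cis):
    """Classify a detected MCA against the CIs stored in the knowledge base.

    Returns "strong", "weak", or "uncorrelated" for an existing MCA, and
    None when the MCA is unknown and should be passed to the CIG.
    """
    mca = frozenset(mca)
    if mca in cis:
        return "strong"            # rule 1: the MCA is itself a CI
    if any(ci <= mca for ci in cis):
        return "weak"              # rule 2: the MCA is a superset of a CI
    is_subset = any(mca < ci for ci in cis)
    is_smaller = all(len(mca) < len(ci) for ci in cis)
    if is_subset and is_smaller:
        return "uncorrelated"      # rule 3: subset of a CI, smaller than all CIs
    return None


# Hypothetical knowledge base with two CIs.
cis = {frozenset({"S1", "S2"}), frozenset({"S2", "S3"})}

print(scan({"S1", "S2"}, cis))
print(scan({"S1", "S2", "S4"}, cis))
print(scan({"S2"}, cis))
print(scan({"S4"}, cis))
```

Only the last case falls through to the CIG, so the more expensive analysis runs only for attacks the knowledge base has not seen.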
V-B Reasoning Engine
The reasoning engine identifies physical targets and derives defense strategies for the detected MCA . Technically, the reasoning engine is an R-block (see Fig. 4) and can also work with CIG to derive defense strategies.
To identify physical targets associated with , we proceed as follows.
If the MCA is a CI, then the physical targets are described by the set of inflicted consequences .
If the MCA is a weakly correlated MCA that contains a set of CIs, i.e., the set
then the physical targets are given by the union of the inflicted consequences associated with each CI, i.e., where is a CI-tuple of an existing MCA.
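Target identification can be sketched as a union over the CI-target tuples whose CI is contained in the detected MCA. The list-of-tuples layout for the knowledge base and the line labels below are assumptions made for illustration.

```python
def physical_targets(mca, ckb):
    """Union of the consequences of every stored CI contained in the MCA."""
    mca = frozenset(mca)
    targets = set()
    for ci, inflicted in ckb:
        if ci <= mca:
            targets |= inflicted
    return targets


# Hypothetical knowledge base: (CI, inflicted consequences) tuples.
ckb = [
    (frozenset({"S1", "S2"}), {"L3"}),
    (frozenset({"S2", "S3"}), {"L7", "L9"}),
]

print(sorted(physical_targets({"S1", "S2"}, ckb)))
print(sorted(physical_targets({"S1", "S2", "S3"}, ckb)))
```

When the MCA equals a single CI, the loop matches only that tuple and returns its consequences. A weakly correlated MCA that contains several CIs yields the union.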
To derive defense strategies against , we proceed as follows.
If the MCA is a CI, then the best defense strategy is to defend any substation.
This defense will render the attack ineffective, which we justify next.
(Defense against strongly correlated MCAs) Let denote a strongly correlated MCA. If the operator protects measurements at any substation such that , the attack becomes ineffective.
See Appendix. ∎
If the MCA is a weakly correlated MCA that contains the set of CIs , then we may have one of the following cases.
Case I: If , then the best defense strategy is to protect measurements at substation that satisfies , which we justify next.
(Defense against a set of strongly correlated MCAs with non-empty intersection) Let denote a weakly correlated MCA that contains the set of CIs . Suppose these CIs satisfy . If the operator protects measurements at a substation such that , the attack becomes ineffective.
Follows from Proposition 1. ∎
Case II: If , then the best strategy is to defend all CIs individually, which is justified by Proposition 1.
Case III: Finally, there is an intermediate case in which only some CIs have a non-empty intersection. For this case, a combination of the defense strategies described for Case I and II should be implemented.
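The case analysis above can be sketched as a small routine that selects substations to protect. This is an illustration only: the choice of the alphabetically first substation is arbitrary, and for the intermediate case the routine falls back to one substation per CI, which defeats every CI but is not necessarily the smallest such selection.

```python
def defense_strategy(contained_cis):
    """Pick substations to protect so that every contained CI is defeated.

    contained_cis must hold at least one CI (a set of substations).
    """
    cis = [frozenset(ci) for ci in contained_cis]
    common = frozenset.intersection(*cis)
    if common:
        # Single CI or Case I: one shared substation defeats all CIs.
        return {min(common)}
    # Case II (and a simple fallback for Case III): protect one per CI.
    return {min(ci) for ci in cis}


print(sorted(defense_strategy([{"S1", "S2"}])))
print(sorted(defense_strategy([{"S1", "S2"}, {"S2", "S3"}])))
print(sorted(defense_strategy([{"S1", "S2"}, {"S3", "S4"}])))
```

Each returned set contains at least one substation from every CI, which is the condition under which the corresponding strongly correlated attack becomes ineffective.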
VI Numerical Experiments
In this section, we use numerical experiments to validate our proposed framework. In particular, we compute the false alarm rates for CIG and CKB under different attack scenarios.
VI-A Experimental Setup
We describe the experimental environment, the IDS benchmark systems, and the evaluation metric next.
We model a power grid with substations using the New England 39-bus system illustrated in Fig. 7. We model the dispatch application using the DC Optimal Power Flow tool from MATPOWER. The data used for the power grid and dispatch application correspond to the MATPOWER base-case data.
In our experiments, we used the adversarial environment introduced in . This adversarial environment is characterized by a nominal attack rate (or attack intensity) , which E-blocks estimate as .
We model MCAs using a random approach, that is, we selected the corrupted measurements and target substations uniformly at random. In particular, was chosen uniformly from the interval where . This random approach allowed us to model attack events that are a threat and attack events that are not.
VI-A2 Intrusion Detection Systems
We model E-blocks (or IDS’s detector) with the following characteristics. The E-blocks have a detection rate and a false alarm rate . In our experiments, we selected the values of , . The adversary attempts to manipulate the E-blocks’ , , and by using the following parameters:
: the maximum deviation under .
: the maximum probability to launch a zero-day (i.e., undetectable) attack.
: the maximum probability to intentionally trigger a false alarm.
In the simulation, we selected the values , , , and .
We model two benchmark IDSs, a simple IDS (IDS-1) and a Bayesian IDS (IDS-2). IDS-1 has the following working principle. If the E-blocks trigger an alarm, IDS-1 will label the event as an intrusion. IDS-2, on the other hand, has the following working principle. An event is labeled as an intrusion based on , i.e., the probability of intrusion given that an alarm has been triggered. This probability is computed as follows
where denotes the alarm and intrusion. Since , , , and ; we write as
which is also known as the Bayesian detection rate .
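The Bayesian detection rate follows directly from Bayes' theorem and can be computed as below. The numeric inputs are illustrative values chosen for this sketch, not the parameters used in the experiments, and the function name is an assumption.

```python
def bayesian_detection_rate(p_intrusion, detection_rate, false_alarm_rate):
    """P(intrusion | alarm) from the prior and the detector's two rates."""
    true_alarms = detection_rate * p_intrusion
    false_alarms = false_alarm_rate * (1.0 - p_intrusion)
    return true_alarms / (true_alarms + false_alarms)


# Illustrative values: 10% of events are intrusions, 90% of intrusions
# raise an alarm, and 10% of normal events raise a false alarm.
print(bayesian_detection_rate(0.1, 0.9, 0.1))
```

With these inputs only half of the alarms correspond to real intrusions, even though the detector catches 90% of them. This base-rate effect is why a low false alarm rate matters so much when intrusions are rare.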
To model CKB and CIG, we proceed as follows. For CKB, we computed CI-tuples for each experiment using CVX and Gurobi, packages for specifying and solving convex and mixed-integer programs. CIG detects possible threats based on deviation from the pseudo-measurements, which are generated from a uniform distribution. We assume no redundant measurements are available for CIG to replace the corrupted measurements. Nevertheless, if they are available, the false alarms (for CIG) will tend to 0.
CKB and CIG will label an incoming MCA as a threat if the attack can increase the flow in any of the following target lines (see Fig. 7). This requires CKB to have CI-tuples for each line.
The performance of the benchmark IDSs and the proposed framework is measured by the false negative rate , where FN denotes the false negatives (i.e., failure of generating an alarm) and TP the true positives (i.e., success of generating an alarm correctly), and the false positive rate , where FP denotes the false positives (i.e., generating a false alarm) and TN the true negatives (i.e., stay silent when there is no event).
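The two metrics can be computed from a list of labeled events as sketched below, using the standard definitions FNR = FN / (FN + TP) and FPR = FP / (FP + TN). The event encoding as (is_intrusion, alarm_raised) pairs and the function names are assumptions made for illustration.

```python
def confusion_counts(events):
    """Count TP, FN, FP, TN from (is_intrusion, alarm_raised) pairs."""
    tp = sum(1 for intrusion, alarm in events if intrusion and alarm)
    fn = sum(1 for intrusion, alarm in events if intrusion and not alarm)
    fp = sum(1 for intrusion, alarm in events if not intrusion and alarm)
    tn = sum(1 for intrusion, alarm in events if not intrusion and not alarm)
    return tp, fn, fp, tn


def false_negative_rate(fn, tp):
    """Share of intrusions for which no alarm was generated."""
    return fn / (fn + tp)


def false_positive_rate(fp, tn):
    """Share of normal events for which an alarm was generated."""
    return fp / (fp + tn)


# Illustrative sample of five events.
events = [
    (True, True),    # intrusion, alarm raised     -> true positive
    (True, False),   # intrusion, no alarm         -> false negative
    (False, True),   # normal event, alarm raised  -> false positive
    (False, False),  # normal event, no alarm      -> true negative
    (False, False),
]

tp, fn, fp, tn = confusion_counts(events)
print(tp, fn, fp, tn)
print(false_negative_rate(fn, tp))
print(false_positive_rate(fp, tn))
```

Restricting `events` to intrusions that are threats, or to those that are not, gives the threat-specific variants of the same metrics.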
We further define these metrics for intrusions that are not a threat (i.e., ineffective attacks) and for intrusions that are a threat (denoted as and ). Since IDS-1 and IDS-2 are not capable of estimating attack consequences and determining possible threats, we compute and only for the proposed framework. All metrics are evaluated through a large sample of events using the pseudo-code algorithm described in Appendix B.
VI-B Experimental Results
False Alarm Rates. In this experiment, we computed the FNR and FPR for IDS-1, IDS-2, and CKB/CIG. We used the pseudo-code in Appendix B to simulate experiments of attack and normal events. Fig. 8 shows the FNRs and Fig. 9 the FPRs (using box plots) for the attack rates considered.
For the FNR case, the results show that, for both attack rates, CKB/CIG outperforms IDS-2 but not IDS-1. This is because CKB and CIG label an event as an intrusion if and only if the event threatens the power grid. As a result, ineffective attacks are not labeled as intrusions, which increases the number of false negatives. If, instead of computing the FNR for intrusions, we compute the FNR for threats, then CKB and CIG outperform IDSs with no contextual information, as described in Experiment II (Threat Analysis).
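The effect described here can be illustrated with a small worked example. The sketch below assumes a hypothetical detector that flags only threats and a hand-made list of events; the `Event` class, the `fnr` helper, and all counts are illustrative and do not reproduce the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    is_intrusion: bool  # an attack took place
    is_threat: bool     # the attack is effective, i.e., it threatens the grid
    flagged: bool       # the detector raised an alarm


def fnr(events, is_positive):
    """FNR over the events selected by is_positive (a predicate on Event)."""
    positives = [e for e in events if is_positive(e)]
    if not positives:
        raise ValueError("no positive events to evaluate")
    missed = sum(1 for e in positives if not e.flagged)
    return missed / len(positives)


# Hypothetical sample: the detector flags threats only, and misses one of them.
events = (
    [Event(True, True, True)] * 3       # threats that were flagged
    + [Event(True, True, False)] * 1    # threat that was missed
    + [Event(True, False, False)] * 4   # ineffective attacks, never flagged
    + [Event(False, False, False)] * 2  # normal events
)

fnr_intrusions = fnr(events, lambda e: e.is_intrusion)  # 5 missed out of 8
fnr_threats = fnr(events, lambda e: e.is_threat)        # 1 missed out of 4
print(fnr_intrusions, fnr_threats)
```

In this toy sample the unflagged ineffective attacks count as misses at the intrusion level, so the FNR for intrusions (0.625) is much higher than the FNR for threats (0.25).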
For the FPR case, the results show that, at the higher attack rate, CKB/CIG performs worse than IDS-1 and IDS-2. In a more friendly environment, i.e., at the lower attack rate, CKB/CIG outperforms IDS-1 but not IDS-2. This is because (i) the fast screening of CKB increases the number of false positives in a less friendly environment and (ii) CKB is sensitive to the number of critical targets (i.e., the cardinality of the target set), as described in Experiment III (Sensitivity Analysis).
Threat Analysis. In this experiment, we computed the FNR for threats. A false negative occurs if the random MCA was a threat to the power grid but CKB and CIG determined that it was not. Fig. 10 shows the FNRs for the attack rates considered. As expected, the contextual information used by CKB and CIG considerably decreases the FNR for threats.
Sensitivity Analysis. In this experiment, we studied the sensitivity of the false positive rate to the number of critical target lines (i.e., the cardinality of the target set). Table I shows that the average false positive rate decreases as the number of critical/target lines decreases. This is because, as the number of critical lines decreases, the number of CI-tuples stored in CKB decreases too. As a result, the fast scanning feature of CKB becomes less prone to false positives.
Note that there is a trade-off between the number of critical targets selected and the maximum false positive rate allowed, which should be adjusted based on risk assessment or experience. An alternative would be to always use CIG. This, however, would greatly increase the runtime of our proposed framework.
In this paper, we developed a semantic analysis framework for Intrusion Detection Systems (IDS) against Monitor-Control Attacks (MCA) in power grids. The framework has two parts running in parallel with the IDS: a Correlation Index Generator (CIG) that analyzes the correlation of potentially hostile behaviors and indexes these behaviors, and a Correlation Knowledge-Base (CKB) that is updated with the indices generated by CIG. The performance of the proposed framework is evaluated under different attack scenarios in a cyber-physical setting. The results show that the proposed framework is capable of detecting MCA and estimating attack consequences with promising runtime and detection accuracy. In addition, the experiments show that the detection outcome of the proposed framework is sensitive to both the size and the locations of attack goals. Future work includes developing methods that adapt CKB parameter settings to attack activities to achieve an optimal trade-off between the FNR/FPR and detection runtime.
Appendix A Proof of Proposition 1
Suppose, to get a contradiction, that is an effective MCA. Thus, is a correlated MCA with cardinality , which contradicts the fact that is a strongly correlated MCA, i.e., a CI with minimum cardinality. This proves the proposition. ∎
Appendix B Pseudo-Code
We use Algorithm 1 to compute the FNR and FPR, for both intrusions and threats, for CIG and CKB. Some remarks on Algorithm 1 follow. (i) The if-conditionals describe how the E-blocks, CKB, and CIG interact during normal/attack events. (ii) Algorithm 1 describes attacks at the grid level, that is, the grid as a whole is either under attack or not. It can, however, be easily adapted to model attacks at the substation level, where individual substations are under attack or not. (iii) Finally, with the appropriate changes, Algorithm 1 can compute the FNR and FPR for IDS-1 and IDS-2.
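As a rough illustration of remark (iii), the sketch below estimates the FNR and FPR of IDS-1 by sampling grid-level events. It is not Algorithm 1: the function name, its parameters, the assumption that the E-blocks raise alarms with fixed probabilities under attack and under normal operation, and all numeric values are hypothetical.

```python
import random


def simulate_ids1(num_events, attack_rate, alarm_tpr, alarm_fpr, seed=0):
    """Monte Carlo estimate of (FNR, FPR) for a detector that labels every alarm as an intrusion.

    attack_rate: assumed probability that an event is an attack.
    alarm_tpr / alarm_fpr: assumed E-block alarm probabilities under attack / normal operation.
    """
    rng = random.Random(seed)
    tp = fn = fp = tn = 0
    for _ in range(num_events):
        under_attack = rng.random() < attack_rate
        alarm = rng.random() < (alarm_tpr if under_attack else alarm_fpr)
        # IDS-1 rule: an alarm is always labeled as an intrusion.
        if under_attack and alarm:
            tp += 1
        elif under_attack:
            fn += 1
        elif alarm:
            fp += 1
        else:
            tn += 1
    fnr = fn / (fn + tp) if (fn + tp) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return fnr, fpr


# Illustrative parameters only.
est_fnr, est_fpr = simulate_ids1(10_000, attack_rate=0.3, alarm_tpr=0.9, alarm_fpr=0.05, seed=1)
print(f"estimated FNR = {est_fnr:.3f}, estimated FPR = {est_fpr:.3f}")
```

Modeling the substation level would amount to sampling the attack flag per substation instead of once per event.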
-  B. Zhu and S. Sastry, “SCADA-specific intrusion detection/prevention systems: a survey and taxonomy,” in Proceedings of the 1st Workshop on Secure Control Systems (SCS), vol. 11, 2010.
-  R. M. Lee, M. J. Assante, and T. Conway, “Analysis of the cyber attack on the Ukrainian power grid,” SANS Industrial Control Systems, 2016.
-  M. M. Hasan and H. T. Mouftah, “Optimal trust system placement in smart grid SCADA networks,” IEEE Access, vol. 4, pp. 2907–2919, 2016.
-  D. Kuipers and M. Fabro, “Control systems cyber security: Defense in depth strategies,” Idaho National Laboratory (INL), Tech. Rep., 2006.
-  J. Wang and C. Peng, “Analysis of time delay attacks against power grid stability,” in Proceedings of the 2nd Workshop on Cyber-Physical Security and Resilience in Smart Grids. ACM, 2017, pp. 67–72.
-  H. Lin, A. Slagell, Z. Kalbarczyk, P. Sauer, and R. Iyer, “Runtime semantic security analysis to detect and mitigate control-related attacks in power grids,” IEEE Transactions on Smart Grid, 2016.
-  S. Pan, T. Morris, and U. Adhikari, “Developing a hybrid intrusion detection system using data mining for power systems,” IEEE Transactions on Smart Grid, vol. 6, no. 6, pp. 3104–3113, 2015.
-  C. Vellaithurai, A. Srivastava, S. Zonouz, and R. Berthier, “CPIndex: cyber-physical vulnerability assessment for power-grid infrastructures,” IEEE Transactions on Smart Grid, vol. 6, no. 2, pp. 566–575, 2015.
-  S. Sridhar and M. Govindarasu, “Model-based attack detection and mitigation for automatic generation control,” IEEE Transactions on Smart Grid, vol. 5, no. 2, pp. 580–591, 2014.
-  C.-C. Sun, J. Hong, and C.-C. Liu, “A coordinated cyber attack detection system (CCADS) for multiple substations,” in Power Systems Computation Conference (PSCC), 2016, pp. 1–7.
-  N. Liu, J. Zhang, H. Zhang, and W. Liu, “Security assessment for communication networks of power control systems using attack graph and MCDM,” IEEE Transactions on Power Delivery, vol. 25, no. 3, pp. 1492–1500, 2010.
-  M. Vrakopoulou, P. M. Esfahani, K. Margellos, J. Lygeros, and G. Andersson, “Cyber-attacks in the automatic generation control,” in Cyber Physical Systems Approach to Smart Electric Power Grid. Springer, 2015, pp. 303–328.
-  C.-W. Ten, C.-C. Liu, and G. Manimaran, “Vulnerability assessment of cybersecurity for SCADA systems,” IEEE Transactions on Power Systems, vol. 23, no. 4, pp. 1836–1846, 2008.
-  C.-W. Ten, A. Ginter, and R. Bulbul, “Cyber-based contingency analysis,” IEEE Transactions on Power Systems, vol. 31, no. 4, pp. 3040–3050, 2016.
-  H. Song, R. Srinivasan, T. Sookoor, and S. Jeschke, Smart Cities: Foundations, Principles, and Applications. John Wiley & Sons, 2017.
-  L. Xie, Y. Mo, and B. Sinopoli, “False data injection attacks in electricity markets,” in Smart Grid Communications (SmartGridComm), 2010 First IEEE International Conference on. IEEE, 2010, pp. 226–231.
-  J. Liang, L. Sankar, and O. Kosut, “Vulnerability analysis and consequences of false data injection attack on power system state estimation,” IEEE Transactions on Power Systems, vol. 31, no. 5, pp. 3864–3872, 2016.
-  J. Wang and C. Moya, “Attack path reconstruction from adverse consequences on power grids with a focus on monitoring-layer attacks,” in Cyber-Physical Security and Resilience in Smart Grids (CPSR-SG), Joint Workshop on. IEEE, 2016, pp. 1–6.
-  C. Moya, C. Sun, J. Wang, and C. Liu, “Defending against measurement attacks on sub-transmission level,” in Power Energy Society General Meeting, Boston, 2016.
-  Y. Yang, K. McLaughlin, S. Sezer, T. Littler, E. G. Im, B. Pranggono, and H. Wang, “Multiattribute SCADA-specific intrusion detection system for power networks,” IEEE Transactions on Power Delivery, vol. 29, no. 3, pp. 1092–1102, 2014.
-  C.-C. Liu, C.-W. Ten, and M. Govindarasu, “Cybersecurity of SCADA systems: Vulnerability assessment and mitigation,” in 2009 IEEE/PES Power Systems Conference and Exposition, 2009.
-  C.-W. Ten, C.-C. Liu, and M. Govindarasu, “Vulnerability assessment of cybersecurity for SCADA systems using attack trees,” in Power Engineering Society General Meeting, 2007. IEEE. IEEE, 2007, pp. 1–8.
-  Y. Liu, P. Ning, and M. K. Reiter, “False data injection attacks against state estimation in electric power grids,” ACM Transactions on Information and System Security (TISSEC), vol. 14, no. 1, p. 13, 2011.
-  X. Liu and Z. Li, “False data attacks against AC state estimation with incomplete network information.”
-  J. Liang, O. Kosut, and L. Sankar, “Cyber attacks on AC state estimation: Unobservability and physical consequences,” in PES General Meeting | Conference & Exposition, 2014 IEEE. IEEE, 2014, pp. 1–5.
-  C. Kahn, P. A. Porras, S. Staniford-Chen, and B. Tung, “A common intrusion detection framework,” 1998.
-  H. Debar, M. Dacier, and A. Wespi, “Towards a taxonomy of intrusion-detection systems,” Computer Networks, vol. 31, no. 8, pp. 805–822, 1999.
-  S. Axelsson, “Intrusion detection systems: A survey and taxonomy,” Tech. Rep., 2000.
-  F. Dorfler and F. Bullo, “Novel insights into lossless AC and DC power flow,” in Power and Energy Society General Meeting (PES), 2013 IEEE. IEEE, 2013, pp. 1–5.
-  C. Moya and J. Wang, “Developing correlation indices to identify coordinated cyber-attacks on power grids,” CoRR, vol. abs/1707.00672, 2017. [Online]. Available: http://arxiv.org/abs/1707.00672
-  R. D. Zimmerman, C. E. Murillo-Sánchez, and R. J. Thomas, “MATPOWER: Steady-state operations, planning, and analysis tools for power systems research and education,” IEEE Transactions on Power Systems, vol. 26, no. 1, pp. 12–19, 2011.
-  A. A. Cárdenas, J. S. Baras, and K. Seamon, “A framework for the evaluation of intrusion detection systems,” in Security and Privacy, 2006 IEEE Symposium on. IEEE, 2006, pp. 15–pp.
-  M. Grant and S. Boyd, “CVX: Matlab software for disciplined convex programming, version 2.1,” http://cvxr.com/cvx, Mar. 2014.