Evaluating Cascading Effects of Attacks on Resilience of Industrial Control Systems: A Design-Centric Modeling Approach

A design-centric modeling approach was proposed to model the behavior of the physical process controlled by an Industrial Control System (ICS) and study the cascading effects of data-oriented attacks. A threat model was used as input to guide the construction of the model where control components which are within the adversary's intent and capabilities are extracted. The relevant control components are subsequently modeled together with their control dependencies and operational design specifications. The approach was demonstrated and validated on a water treatment testbed. Attacks were simulated on the testbed model where its resilience to attacks was evaluated using proposed metrics such as Impact Ratio and Time-to-Critical-State. From the analysis of the attacks, design strengths and weaknesses were identified and design improvements were recommended to increase the testbed's resilience to attacks.


1 Introduction

Cyber-physical systems (CPSs) comprise networked computational and communication devices that interact with the physical environment in which they are deployed. Industrial Control Systems (ICS) belong to the family of CPSs and consist of controllers, sensors and actuators which are used across various industries to monitor and control physical processes. With advances in technology, operators demand modernization of these systems, seeking improved efficiency and performance through better connectivity, monitoring and control capabilities [krotofil2014you]. However, this also increases the attack surface of these systems, making them more vulnerable to cyber attacks. Attacks on CPSs can damage the physical environment, harming infrastructure and human lives [cardenas2011attacks, krotofil2013resilience, krotofil2015process].

Acknowledging the need to address the security of CPSs, there is increasing research on securing CPSs from different aspects such as hardware security, network-level intrusion detection and, more recently, physics-based attack detection [stouffer2011guide, giraldo2018survey]. Securing CPSs serves to protect the operation of the system, but to achieve the desired disruptive results, an attacker needs to understand the workings of the system, such as its control logic and failure modes, in order to maximise damage [orojloo2017method]. Hence, there is a need to study the interaction of attackers with the system and how the attacks translate to impact on the physical process or environment.

Process systems controlled by ICS are complex, with numerous sub-processes that may span vast geographical areas and are inter-connected either physically or by the control logic of the controllers in the ICS. As such, an attack on an individual component or a subset of components can have cascading effects within the system. As these systems are safety-critical, an important aspect of security research is to study the impact and consequences of attacks on the physical environment [krotofil2015process, orojloo2017method]. A better understanding of how the adversary interacts with the physical system and how the impact of an attack propagates can help in the design of defences that improve the system’s resilience to cyber attacks.

While much research focuses on securing CPSs from attacks, there is also a need to understand and quantify the impact of attacks on these systems post-detection. To this end, this paper presents a framework to construct a process model that simulates the process dynamics and the propagation of attacks across the sub-processes of a process system. A quantitative metric to measure the resilience of the system was adopted from resilience studies of networked infrastructure that use performance degradation as a measure of impact [ouyang2014multi, reed2009methodology, o2007critical]. We also quantified the ability of the system to withstand attacks before an undesirable consequence occurs by measuring the time to critical state whilst under attack [castellanos2019AModular].

1.1 Contributions

The main contributions of this paper are as follows:

  1. A generic framework that uses a threat model to guide the construction of a model for evaluating the cascading impact of attacks on process systems.

  2. A design-driven modeling approach that combines the system’s control strategy and operating specifications to model the process dynamics.

  3. A systematic approach to evaluate the impact of attacks and measure the resilience of the system under various attack scenarios.

We applied the proposed framework on the Secure Water Treatment Testbed (SWaT) [SWaT] to validate the modeling approach. Subsequently, we attempted to quantify the impact of attacks by providing a measurement of the system’s resilience to the disruptions under the various attack scenarios.

2 Related Work

In this section, we review some of the related work on approaches undertaken to measure and analyse impact of cyber attacks on cyber-physical systems.

Cardenas et al. in [cardenas2011attacks] demonstrated the use of process models to study the interaction between the control system and the physical processes. The model of a single-stage chemical reactor process was used to determine which sensors to attack in order to drive the system to an unsafe state. The approach taken was to perform integrity and denial-of-service attacks on all sensors to evaluate each attack’s ability to drive the system to an unsafe state. Krotofil and Cardenas in [krotofil2013resilience] also investigated the impact of attacks on control systems using process models. Attacks were simulated on a single-stage chemical reactor process and an empirical analysis of the effects of attacks on physical process behavior was conducted. These works, however, do not discuss the coupling effects of attacks that result from control loop dependencies.

In [genge2015system], Genge et al. proposed a methodology of assessing the impact of attacks by measuring the cross-covariances of control variables before and after the system is perturbed. While this method provides insights on how the impact of attack propagates through the system via the relationship between control variables, it does not translate to the consequence of attacks on system performance.

Milosevic et al. in [milovsevic2018quantifying] proposed a framework to measure the impact of attacks on stochastic linear control systems using the infinity norm of critical states over a time window. The impact is measured by how much the critical states of the system deviate from the steady state over a number of time steps while the attack remains undetected by an anomaly detector. The impact metric indicates the extent to which the system is perturbed during an attack but does not give resolution on the impact on the physical process.

In [orojloo2017method], Orojloo and Azgomi proposed a modeling approach that considers the system dynamics and control dependencies between the various components in a cyber-physical system. The model built was used to perform sensitivity analysis to understand the system behaviour under various attack scenarios, providing insights into vulnerable control loops. The impact of attacks on specific components on the system’s physical parameters was subsequently used to evaluate each component’s criticality for successful attacks.

Adepu and Mathur in [adepu2016investigation] studied the response of a water treatment plant to single-point attacks. Attack propagation was analyzed in terms of the number of components in the system that were affected. System behavior, such as changes in physical process metrics during attacks, was investigated. The results of the study were used to propose attack detection mechanisms based on physical properties of the system.

Comparing the works that focus on measuring the impact of attacks on cyber-physical systems, we identified a gap: there is no approach that quantitatively assesses the cascading impact of attacks across the multiple processes within a system and measures the resilience of the system to attacks with respect to safety and performance.

3 Approach

In this section, we outline the approach taken to build a threat intent guided model of a cyber-physical system which incorporates the interaction of process’s control strategy with the physical process and the attacker’s capabilities. This allows us to conduct simulations of attacks on the physical process via the control system to analyse how the impact of attacks propagates in the system and study the system’s resilience to cyber attacks.

3.1 Threat Modeling

Threat modeling starts by describing the intention and capabilities of the attacker. Its output is the subset of physical components and related control strategies that the adversary is able to attack and affect.

3.1.1 Definition 1.

We define the intention of the attacker as adversarial goals that the attacker wants to achieve from an attack. We assume that impact of attack can be observed and translated to affected operational metrics. The operational metrics affected can then be mapped onto a set of sensors that measures these metrics.

The subset of sensors $S_I$ that measure the impact of an attack directly related to the attacker’s goal is a subset of all the sensors $S$ in the system.

$$S_I \subseteq S \tag{1}$$

With the knowledge of the design of the control strategy for normal operations, we are able to extract the control statements ($CS$) that relate $S_I$ to other components in the system.

$$CS = \{\, cs_i : f_i(s_j, a_k) \rightarrow a_k \mid i = 1, \dots, n \,\} \tag{2}$$

where :

  • $n$ is the total number of control statements

  • $s_j \in S$, and $S$ is the set of all Sensors in the system

  • $a_k \in A$, and $A$ is the set of all Actuators in the system

  • $f_i$ are conditions that relate the states of $s_j$ and $a_k$ to changes to the state of $a_k$

3.1.2 Definition 2.

We define the capability ($C$) of the attacker as the subset of sensors and actuators that the attacker has control over, meaning that the attacker is able to manipulate the data received and sent by these components.

$$C = \{\, s_j, a_k \,\} \tag{3}$$

where :

  • $s_j \in S$, and $S$ is the set of all Sensors in the system

  • $a_k \in A$, and $A$ is the set of all Actuators in the system

3.1.3 Output for model.

From the adversary’s threat intent and capabilities, we obtain the set of sensors, actuators and control statements that are relevant for modeling the interaction of the attacker with the process dynamics and control loops.
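
The extraction described in this section can be sketched in Python. This is a minimal illustration, not the authors' implementation: the `ControlStatement` structure, the `relevant_components` helper and the component names (`temp_sensor`, `heater`) are hypothetical, and the sketch assumes a single pass in which a control statement is selected whenever it touches a component named in the attacker's intent or capability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlStatement:
    """One control statement: conditions on components -> change to an actuator."""
    name: str
    sensors: frozenset      # sensors whose state the condition reads
    actuators: frozenset    # actuators whose state the condition reads
    target: str             # actuator whose state the statement changes

    def components(self):
        return self.sensors | self.actuators | {self.target}


def relevant_components(intent_sensors, cap_sensors, cap_actuators, statements):
    """Select the sensors, actuators and control statements to model.

    Assumption (not from the paper): a statement is relevant if it touches any
    component in the attacker's intent or capability; no transitive closure.
    """
    seeds = set(intent_sensors) | set(cap_sensors) | set(cap_actuators)
    selected = [cs for cs in statements if seeds & cs.components()]
    sensors = set(intent_sensors) | set(cap_sensors)
    actuators = set(cap_actuators)
    for cs in selected:
        sensors |= cs.sensors
        actuators |= cs.actuators | {cs.target}
    return sensors, actuators, selected


# Hypothetical plant: two statements regulate a tank, one regulates a heater.
STATEMENTS = [
    ControlStatement("cs1", frozenset({"level_sensor"}), frozenset(), "pump_in"),
    ControlStatement("cs2", frozenset({"level_sensor"}), frozenset(), "pump_out"),
    ControlStatement("cs3", frozenset({"temp_sensor"}), frozenset(), "heater"),
]

sensors, actuators, selected = relevant_components(
    intent_sensors={"level_sensor"},
    cap_sensors={"level_sensor"},
    cap_actuators=set(),
    statements=STATEMENTS,
)
print(sorted(sensors), sorted(actuators), [cs.name for cs in selected])
```

In this toy plant the heater loop is left out of the model because neither the attacker's intent nor capability touches it, which is the kind of reduction the threat model is meant to provide.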

3.2 Modeling Process System Dynamics

The approach to construct the model of a process system draws inspiration from System Dynamics, an area of research that aims to understand the behavior of complex systems over time. With fundamentals of control theory and theory of non-linear dynamical systems as the foundation of System Dynamics [sterman2001system], it is a suitable modeling approach for studying interaction of attacker’s input on process dynamics.

System Dynamics models the actual flow of physical entities or information, regulated by feedback loops [forrester1997industrial]. The tools used to describe the system are Stock and Flow Diagrams (SFD) and Causal Loop Diagrams (CLD). The relationship between system variables is represented by a CLD, on which qualitative analysis of the dynamics can be performed. The flow and accumulation characteristics can be studied using an SFD, on which quantitative analysis can be performed using simulations. From the output of the threat modeling phase, we have a set of control statements which relate sensors and actuators. Sensors in a dynamical system measure the stock and flow, whereas actuators change these quantities in the system. In addition, we utilize inputs from the design specifications of the process system to construct the system model, with information on how components are physically related, such as physical connections and the operating parameters that affect stock and flow.

[Tank example] [Tank example with system state representation in “cyber realm” under attack]

Figure 1: Illustrative Single Tank Example

We use a simple tank example to illustrate the modeling approach. In Figure 1, we have a system comprising a tank with an inlet pump and an outlet pump; the level of liquid in the tank is measured by a level sensor. The rates of inflow and outflow determine the rate at which the tank level increases and decreases; these are described by the design specifications of the components.

Listing 1: Tank Level Control Strategy

```python
# Control Statement 1
# tank level is high, stop inflow and start outflow
if level_sensor.get() >= level_high:
    pump_in.off()
    pump_out.on()

# Control Statement 2
# tank level is low, start inflow and stop outflow
if level_sensor.get() <= level_low:
    pump_in.on()
    pump_out.off()
```

Control statements (Listing 1) exist so that the tank level does not rise above or fall below pre-defined levels. This results in causal loops that serve to regulate the liquid level in the tank. The SFD, together with design specifications of the system, was used to build a simulation model for quantitative analysis.
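
To make the stock-and-flow view concrete, the following minimal sketch simulates the single tank regulated by the two control statements. It is an illustration under stated assumptions rather than the simulation model used in the paper: the function name `simulate_tank`, the flow rates, the thresholds and the time step are all made-up values, pump switching is instantaneous, and flows are constant.

```python
def simulate_tank(level0, steps, dt=1.0, inflow=2.0, outflow=1.0,
                  level_low=200.0, level_high=800.0):
    """Simulate tank level (the stock) driven by pump flows (hypothetical values)."""
    level = level0
    pump_in, pump_out = True, False  # assumed initial actuator states
    history = []
    for _ in range(steps):
        # Control statements: switch pumps when a threshold is reached.
        if level >= level_high:
            pump_in, pump_out = False, True
        elif level <= level_low:
            pump_in, pump_out = True, False
        # Stock update: level changes by net flow over one time step.
        net_flow = (inflow if pump_in else 0.0) - (outflow if pump_out else 0.0)
        level += dt * net_flow
        history.append(level)
    return history


history = simulate_tank(level0=500.0, steps=1000)
print(min(history), max(history))
```

With these values the level oscillates between the low and high thresholds, which is the regulating causal loop described in the text.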

3.2.1 Modeling Adversarial Interactions and Physical Impact.

Under normal operation, in the absence of attacks and faulty components, the state of the system perceived by the controller is approximately identical to the actual physical state of the process. However, during adversarial cyber attacks, the perceived states of the system can be manipulated to disrupt the process dynamics by exploiting control loops to cause the system to react to the perceived state.

During an attack, the state perceived by the controller deviates from the actual physical state, and the controller is unable to detect the actual changes to the physical state caused by its control instructions. In order to study the physical impact of the attacks, there is a need to model both the cyber and physical representations of the system. Under normal conditions, the cyber representation is a duplicate of the physical representation. Under cyber attack, the physical representation is the actual physical state whereas the cyber representation is controlled by the attacker.

Using the same tank example, in Figure 1, we duplicate the states of every component. The level sensor measures the actual physical level of the tank whereas the cyber representation of the sensor is manipulated by the attacker. The controller perceives the “cyber realm” system state and gives instructions to actuators that act on the physical process. With both cyber and physical representations of the system, we are able to study the impact of attacks on the physical process when the system state is manipulated in the cyber realm.
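
The duplication of state can be sketched for the tank example as follows. This is a hypothetical illustration: `simulate_spoofed_tank`, its parameters and all numeric values are assumptions, and only the level sensor is spoofed. The controller reads the cyber value while the pumps act on the physical level.

```python
def simulate_spoofed_tank(level0, steps, spoof_value=None, attack=(0, 0),
                          dt=1.0, inflow=2.0, outflow=1.0,
                          level_low=200.0, level_high=800.0):
    """Return a list of (physical_level, cyber_level) pairs per time step.

    attack is a (start, end) step window during which the controller sees
    spoof_value instead of the physical level (hypothetical attack model).
    """
    physical = level0
    pump_in, pump_out = True, False  # assumed initial actuator states
    trace = []
    for t in range(steps):
        attacked = spoof_value is not None and attack[0] <= t < attack[1]
        # Cyber representation: duplicate of physical state unless spoofed.
        cyber = spoof_value if attacked else physical
        # The controller only ever sees the cyber representation.
        if cyber >= level_high:
            pump_in, pump_out = False, True
        elif cyber <= level_low:
            pump_in, pump_out = True, False
        # Actuators act on the physical process.
        net_flow = (inflow if pump_in else 0.0) - (outflow if pump_out else 0.0)
        physical += dt * net_flow
        trace.append((physical, cyber))
    return trace


normal = simulate_spoofed_tank(500.0, 300)
attacked = simulate_spoofed_tank(500.0, 300, spoof_value=100.0, attack=(0, 300))
print(max(p for p, _ in normal), attacked[-1])
```

Spoofing the level to a value below the low threshold keeps the inlet pump on, so the physical level rises past the high threshold while the controller continues to see a low tank.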

3.3 Attack Impact Measurement

With the model, we are able to simulate the process dynamics under both scenarios of normal conditions and when the system is under attack. To help us study the interaction of the attacker with the system behavior and understand the resilience of the system to attacks, it would be important for us to be able to measure the impact of attack in comparison to normal conditions. In this section, we discuss two methods of quantifying impact of attack.

3.3.1 Area Between Operational Curves.

In many recent studies on the resilience of networked infrastructure such as power systems and Information Technology systems, resilience is quantified by a time-dependent metric of changes in operational levels over time [reed2009methodology, ouyang2014multi]. Resilience is defined as the ability of the system to withstand, absorb, adapt to and recover from disturbances to the system [o2007critical]. The resilience trapezoid (Figure 2) shows how the operational level changes over time as the system transits through the various phases: original state, system undergoing disruption, disrupted state, recovery state and recovered state.

Figure 2: Operational Level F(t), Transition Over Time. Figure from [henry2012generic]

In our work, since we are able to simulate system performance under both normal and attack conditions, we are able to quantify the impact of the attack by measuring the divergence of the two curves. This can be measured using the difference between the cumulative area under the normal operating curve and that under the operating curve when the system is under attack. The operational levels used to calculate the area under the curve are sensor measurements of the actual physical process (i.e. before duplication of state for the cyber representation, which can be manipulated by the attacker). The difference in cumulative area is then normalized against the area under the normal operating curve; we define the resulting value as the Impact Ratio.

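$$\text{Impact Ratio} = \frac{\int_{t_0}^{t_e} F_A(t)\,dt - \int_{t_0}^{t_e} F_N(t)\,dt}{\int_{t_0}^{t_e} F_N(t)\,dt} \tag{4}$$

where :

  • $F_N(t)$ is the operating curve (time-series of operational metric) under normal operating conditions

  • $F_A(t)$ is the operating curve when system is under attack

  • $t_0$ and $t_e$ are start and end times of analysis respectively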

In the absence of attacks, there is no deviation and the Impact Ratio is 0. During an attack, the deviation between the two operating curves is given by the magnitude of the ratio. A negative ratio indicates a decrease in the measured operational variable whilst under attack; conversely, a positive ratio indicates an increase.
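
As a worked example, the Impact Ratio can be computed from two sampled operating curves by numerical integration. The sketch below uses the trapezoidal rule and made-up flow samples; the function names and the choice of integration rule are assumptions, not details taken from the paper.

```python
def area_under_curve(series, dt):
    """Trapezoidal-rule area under a uniformly sampled curve."""
    return sum((a + b) / 2.0 * dt for a, b in zip(series, series[1:]))


def impact_ratio(normal, attack, dt=1.0):
    """(area under attack curve - area under normal curve) / area under normal curve."""
    if len(normal) != len(attack):
        raise ValueError("curves must cover the same analysis window")
    base = area_under_curve(normal, dt)
    if base == 0:
        raise ValueError("normal operating curve has zero area")
    return (area_under_curve(attack, dt) - base) / base


# Hypothetical flow readings: the attack stops the flow halfway through.
normal_flow = [10.0] * 11
attack_flow = [10.0] * 6 + [0.0] * 5
ratio = impact_ratio(normal_flow, attack_flow)
print(ratio)
```

Here the area under the normal curve is 100 and the area under the attack curve is 55, so the ratio is negative, reflecting the loss of flow under attack.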

3.3.2 Time to Critical State.

Critical states are states where a system’s operational performance and/or safety has reached an unacceptable level. We can measure how fast a system can reach the nearest critical state by computing the time-to-critical-state [castellanos2019AModular] of the system from a particular time. A longer time-to-critical-state during an attack indicates that the system is better able to withstand the particular disruption and is hence more resilient to that attack.
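
On a simulated trace, one simple way to obtain this metric is to scan forward from the attack start until a critical condition first holds. The following is a minimal sketch under that assumption; the function name, the predicate and the sample values are hypothetical, and the cited work may compute the metric differently.

```python
def time_to_critical_state(series, is_critical, attack_start, dt=1.0):
    """Time from attack_start (sample index) to the first critical sample.

    Returns None if no critical state is reached within the trace.
    """
    for i in range(attack_start, len(series)):
        if is_critical(series[i]):
            return (i - attack_start) * dt
    return None


# Hypothetical flow trace: attack begins at index 2, flow reaches zero at index 6.
flow = [5.0, 5.0, 5.0, 5.0, 3.0, 1.0, 0.0, 0.0]
ttc = time_to_critical_state(flow, lambda x: x <= 0.0, attack_start=2)
print(ttc)
```

Returning `None` when the trace never becomes critical keeps "not reached within the simulated window" distinct from a time of zero.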

4 Model Validation

The approach outlined in Section 3 was applied to model the Secure Water Treatment Testbed (SWaT) [SWaT] using an attacker model and the operational specifications of the testbed as inputs. The attacker is assumed to have control of all the sensors and actuators in the testbed, and the attacker’s intention is to disrupt the flow of water through the testbed such that the treated water throughput of the plant is affected. As such, the attacker is able to spoof sensor measurements and actuator commands across the whole system to perform complex attacks that maximise damage. Consequently, the safety impacts of attacks that fulfil the attacker’s intention would be overflowing of tanks and dry running of centrifugal pumps.

4.1 SWaT Model

SWaT is a fully functional six-stage water treatment plant testbed with a throughput of 5 gallons/min (1.14 m³/h) of treated water that is primarily used by researchers to study attacks on ICS infrastructure. Figure 3 provides an overview of the processes within the water treatment plant and the various components within each process (LIT: Level Indicator Transmitter, AIT: Analytical Indicator Transmitter, FIT: Flow Indicator Transmitter, P: Pump).

Figure 3: Overview of Processes in SWaT. Figure from [SWaT]

4.2 Model Validation Approach

The work in [SWaT] provides a historical dataset of the testbed’s state under normal operating conditions and under various attack scenarios. We utilize the dataset to validate our model: the model uses the same initial conditions as provided by the dataset and the simulation is run for a stipulated time period. The validation experiments are as follows :

  1. Normal Operating Conditions for 4 hours

  2. Attack 7: Single Stage, Single Point Attack (SSSP) on Process 3’s LIT-301

  3. Attack 30: Multi-Stage, Multi-Point Attack (MSMP) on components in Process 1 and Process 2

We have only simulated Process 1 to Process 5, as Process 6 serves as a recycle stage and is out of scope for the study of impact on plant operations related to the attacker’s intent and capabilities.

4.2.1 Normal Operations.

The model was first validated against historical testbed data under normal operating conditions for 4 hours. Figures 4 and 5 show that the model is able to predict the behavior of the testbed, as the trends due to the control strategy of the plant were replicated. However, we observed that as simulation time increases, the historical plant data lags behind the model prediction. This is due to the ideal assumptions of the model, where state switching (i.e. changing valve and pump states) is instantaneous and flows are assumed to be constant at the design specifications. The time lag due to these ideal assumptions grows over simulation time; however, this can be mitigated by running shorter simulation cycles (about 1 hour).

[FIT-101] [FIT-201] [FIT-301] [FIT-401] [FIT-501] [FIT-502 / RO Permeate]

Figure 4: Summary of Flow Over 4hrs

[LIT-101] [LIT-301] [LIT-401]

Figure 5: Summary of Tank Levels Over 4hrs

4.2.2 Attack 7 : SSSP Attack.

Attack 7 from the dataset was used to validate a single-stage, single-point attack on the testbed. LIT-301 was spoofed to 1200mm (High High Level), triggering the controllers to close MV-201 and stop the P1 pumps. This would stop flow out of P1 and into P3; hence we would expect LIT-101 to increase and LIT-301 to decrease.

From Figure 7, it can be observed that the attack commenced at 500s, when the LIT-301 level was immediately changed to 1200mm. During the period of 500s when the level was spoofed, the tank level decreased as no water entered P3 (no flow in FIT-201) while water flowed out of P3 (indicated by non-zero flow in FIT-301). From Figures 6 and 7, the FIT-201 flow rate is 0 due to the closure of MV-201, indicating that there is no flow out of P1; hence, an increase in LIT-101 was observed. When the attack was removed, the model prediction for tank level LIT-301 matched the sensor data for the genuine tank level, indicating that the model was able to accurately predict the actual physical water tank level during an attack.

[FIT-101] [FIT-201] [FIT-301] [FIT-401] [FIT-501] [FIT-502 / RO Permeate]

Figure 6: Summary of Flow Over Duration of SSSP Attack

[LIT-101] [LIT-301] [LIT-401]

Figure 7: Summary of Tank Levels Over Duration of SSSP Attack

4.2.3 Attack 30 : MSMP Attack.

Attack 30 from the dataset was used to validate a multi-stage, multi-point attack on the testbed. The LIT-101 tank level was spoofed to a constant level of 700mm, and actuator commands were spoofed to keep MV-101 closed, P101 on and MV-201 open. The intended result was to cause Tank 1 to underflow and Tank 3 to overflow. The model was able to predict the behavior of the system during the attack: the responses of Tank 1 level decreasing and Tank 3 level increasing (Appendix G, Figure 12 and 12 respectively) were accurately predicted. Similarly, our model was able to accurately predict the actual physical level of Tank 1 (LIT-101) while the sensor level was being spoofed.

4.3 Summary

In all, the proposed model construction approach based on threat intent and system design specifications was able to extract relevant control components such as sensors and actuators and build the dependencies based on control strategies and system design. By identifying a subset of control variables and components, the model can be constructed easily, especially for large-scale systems where many components are not essential with respect to the threat model we wish to analyze. The model was validated and was able to simulate the plant operation with high accuracy for simulation times of less than an hour. The model allows us to simulate plant behavior under normal and attack conditions for analysis.

5 Sensitivity Analysis: Cascading Impact of Attacks

With the model validated, we proceed to conduct sensitivity analysis by performing SSSP attacks on P1 and P3 to study the cascading effect of attacks within the system from attacks that originate from a single point. For our experiments, the attacker’s goal is to decrease the throughput of the system by stopping flow of water. This could be achieved by spoofing actuator commands or spoofing sensor values to trick the controllers to change states of actuators such as valves and pumps to manipulate flow of water in and out of the whole system or specific processes.

5.1 Vulnerable and Critical States for SWaT

Vulnerable states are states in normal operation which are closest to critical states and are defined locally (i.e. for sub-processes), whereas critical states can be either local or global. Global critical states are states where the whole system’s performance or safety has reached an unacceptable level. For our study, we investigate an attack’s cascading effects with respect to global performance critical state where the throughput of the testbed has reached zero.

From the normal operating curves in Figures 4 and 5, we take the lowest and highest operating points (i.e. 500mm and 800mm for Tank 1 and 800mm and 1000mm for all other tanks) as the vulnerable states, as they are the states closest to local critical states such as overflow or underflow of tanks. We use the system state at these vulnerable states as the model’s initial conditions for the sensitivity analysis of data spoofing attacks.
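
Picking the vulnerable states from a normal operating curve amounts to locating its extremes. The sketch below does this for a made-up tank level trace; the function name and sample values are illustrative only, and the returned sample indices indicate where a full system state could be read off as initial conditions.

```python
def vulnerable_states(levels):
    """Return (index, value) of the lowest and highest points of a level trace."""
    low_idx = min(range(len(levels)), key=levels.__getitem__)
    high_idx = max(range(len(levels)), key=levels.__getitem__)
    return {"low": (low_idx, levels[low_idx]), "high": (high_idx, levels[high_idx])}


# Hypothetical tank level samples (mm) under normal operation.
tank_levels = [650.0, 700.0, 800.0, 720.0, 560.0, 500.0, 580.0]
states = vulnerable_states(tank_levels)
print(states)
```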

5.2 Time-to-Critical-State From Attack Commencement

The time-to-critical-state is the measure of distance of a system to an unacceptable state. For our experiments, we measure the time-to-critical-state of the system from the commencement of the attack where the attack location is a local vulnerable state. The time to critical state would be the time from attack when there is no flow measured or when there is a safety violation such as overflow or underflow of tank.

5.3 Analysis Approach

The cascading effects of attacks are measured for each process by: 1) computing the Impact Ratio of the tank levels and flow rates, and 2) determining the time-to-critical-state from the commencement of an attack. The local impact of an attack allows us to understand how the attack propagates from a specific location to other parts of the system.

5.4 Stopping Water Inflow Into System

We first explore the effects of attacks that stop the inflow of water into the system. This can be achieved by: a) spoofing actuator commands to close the inlet valve (MV-101), and b) spoofing the P1 tank level to “High” to trick the controller into closing the valve. The attacks were conducted at the system’s first High Vulnerable State and Low Vulnerable State, where Tank 1 is at its highest and lowest normal operating level respectively. The location of the attack is MV-101, the valve that controls water inflow into the system; the flow is measured by FIT-101.

[Impact Ratio] [Time-to Critical-State]

Figure 8: Metrics for Impact of Attacks That Close MV-101

It was observed that command injection to turn off MV-101 and spoofing of the tank level to the “High” state had the same impact on the system. The reason is that the control loop that controls MV-101 acts independently and has no control variables other than LIT-101. As such, spoofing the tank level was equivalent to injecting a command to turn off the valve. Operating curves of the system under normal and attack scenarios can be found in Appendix H, Figures 13 and 14.

5.4.1 Impact Ratio.

The Impact Ratio (Figure 8) of the operational metrics at the end of the attack shows the magnitude of the attack’s effect on the operation. For this particular attack, all ratios were negative and the impact was greatest at the location of the attack, indicating that the decrease in operational performance is greatest at the point of attack and that the magnitude of impact diminishes downstream. The impact of the attack was found to be greater when the initial state of the system was the high vulnerable state rather than the low vulnerable state. This is due to the additional capacity in the tank which has to be emptied to result in no flow.

5.4.2 Time-to-critical-state.

The time-to-critical-state is the time taken for each process to reach an unacceptable state. From Figure 8, it was observed that the time to reach a critical state increases downstream, as expected for processes that are connected sequentially. When the initial state was the low vulnerable state, the time-to-critical-state was shorter, implying that the system reaches an unacceptable state sooner and the consequence of the attack is achieved more quickly. It was also observed that the time-to-critical-state was identical for pairs of processes which do not have tanks (i.e. P1-P2 and P4-P5), whereas the time increases for processes with tanks. This implies that the tanks act as buffers against the impact of attacks.
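
The buffering effect can be illustrated with simple arithmetic on a chain of sequential stages. The sketch below is a deliberately idealized assumption, not the SWaT model: each tank holds a fixed usable volume when its inflow stops and then drains at a constant rate, a stage without a tank stops as soon as its upstream stage stops, and the stage names, volumes and rates are invented.

```python
def downstream_stop_times(stages):
    """Time at which each stage's outflow stops after inflow to the first stage stops.

    stages: list of (name, buffer_volume_litres, outflow_rate_litres_per_min).
    A buffer_volume of 0 means the stage has no tank.
    """
    elapsed = 0.0
    stop_times = {}
    for name, volume, rate in stages:
        if volume > 0:
            # The tank keeps the stage running until it is drained.
            elapsed += volume / rate
        stop_times[name] = elapsed
    return stop_times


# Hypothetical four-stage chain: stages A and C have tanks, B and D do not.
chain = [("A", 600.0, 20.0), ("B", 0.0, 20.0), ("C", 400.0, 20.0), ("D", 0.0, 20.0)]
times = downstream_stop_times(chain)
print(times)
```

In this toy chain, stages without tanks inherit the stop time of the stage before them, while each tank adds its drain time, mirroring the pattern of identical times for tankless pairs and longer times after tanks.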

5.5 Disrupting Flow in System by Manipulating Pump State

The effects of manipulating pump states on the system were explored. Pumps can be switched off by: a) command spoofing, and b) spoofing tank levels to trick the controller into sending commands to change the pump state. We conducted these attacks separately for P1 and P3 and analysed the system’s resilience to them.

5.5.1 Attack on P1 pumps.

The location of the attack is at the pumps of P1, where water is removed from Tank 1 (measured by LIT-101) and flow into P2 is measured by FIT-201. The following 3 attacks were conducted: 1) Command spoofing to switch off pumps, 2) Spoof Tank 1 state to “Low Low”, and 3) Spoof Tank 3 state to “High”. Operating curves of the system under normal and attack scenarios can be found in Appendix H, Figures 15 and 16.

[Impact Ratio] [Time-to Critical-State]

Figure 9: Metrics for Impact of Attacks That Turn Off Pumps in P1

It can be observed from Figure 9 that the effects of the attacks on P1 and P2 were instantaneous and flow was disrupted. However, due to the tank in P3, disruption of flow there was only observed after the tank was depleted. From Figure 9, the impacts of command spoofing and of spoofing the Tank 3 level were similar, with the largest impact on FIT-101 and FIT-201. Spoofing the Tank 1 level resulted in a positive ratio due to the increase in tank level from the exploitation of control actions to maintain inflow into the tank and stop outflow.

5.5.2 Attack on P3 pumps.

The location of the attack is at the pumps of P3, where water is removed from Tank 3 (measured by LIT-301) and flow into P4 is measured by FIT-301. The following 3 attacks were conducted: 1) Command spoofing to switch off pumps, 2) Spoof Tank 3 state to “Low Low”, and 3) Spoof Tank 4 state to “High”. Operating curves of the system under normal and attack scenarios can be found in Appendix H, Figures 17 and 18.

[Impact Ratio] [Time-to-Critical-State]

Figure 10: Metrics for Impact of Attacks That Turn Off Pumps in P3

From Figure 10, it was observed that the effect of the attacks on P3 and its preceding processes was instantaneous, with flow halted in these processes. The time taken for P4 to reach a critical state was attributed to the presence of Tank 4, whose contents had to be depleted first. Once flow in P4 was disrupted, the downstream process P5 was immediately affected. From the impact ratio in Figure 10, it was found that the impact of command spoofing and of spoofing the Tank 4 level was identical, with water flow from P1 to P3 disrupted. Spoofing the Tank 3 level exploited the control actions, resulting in continuous inflow into Tank 3 and no outflow, as can be observed from the positive ratio.

5.6 Summary and Discussion of Design Improvements

Various attacks were simulated on the model of the testbed and the behavior of the system under attack was analyzed. The cascading effects of attacks on each process were quantified using the Impact Ratio and the time taken for each process to reach critical state was measured.

It was found that processes with tanks have an increased ability to withstand attacks, as the time taken to reach a critical state increases in these processes. The tanks provide stored capacity and act as a buffer, so normal operation can continue until the tanks are emptied. To improve the overall system’s ability to withstand attacks, it is recommended that a tank be installed for every process to maximize the buffer and increase the overall time taken to reach a critical state, giving operators additional time to respond to attacks.
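The buffering argument can be made concrete with a back-of-the-envelope sketch. It assumes a constant flow rate through all stages, that a stage keeps delivering flow until its own stored volume is drained, and a layout in which only P1, P3 and P4 hold tanks (suggested by the level sensors LIT-101, LIT-301 and LIT-401). The volumes and the flow rate are hypothetical figures chosen for illustration and are not testbed specifications.

```python
from typing import List, Sequence


def cascade_times(buffers: Sequence[float], flow: float) -> List[float]:
    """Time at which each sequential stage loses flow after inflow to stage 0 stops.

    buffers[i] is the usable stored volume at stage i (0 for a stage
    without a tank); flow is the assumed constant throughput. Each tank
    adds volume / flow to the delay seen by its own stage and by every
    stage downstream of it.
    """
    times = []
    elapsed = 0.0
    for volume in buffers:
        elapsed += volume / flow
        times.append(elapsed)
    return times


# Hypothetical usable volumes in litres for P1..P5 and a flow of 20 L/min.
buffers = [600, 0, 500, 400, 0]
times = cascade_times(buffers, flow=20)
print(times)  # [30.0, 30.0, 55.0, 75.0, 75.0]

# Giving the two tankless stages a 300 L tank each lengthens the chain.
improved = cascade_times([600, 300, 500, 400, 300], flow=20)
print(improved)  # [30.0, 45.0, 70.0, 90.0, 105.0]
```

In this simplified model the stages without tanks reach the critical state at the same time as their upstream neighbour, while every tank adds to the delay of all stages downstream of it, which mirrors the pattern described for P1-P2 and P4-P5 and illustrates why adding tanks extends the time available to operators.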

By comparing Figures 9 and 10, it was observed that the attack on P3, a later process, resulted in a shorter time-to-critical-state. This suggests that, for sequentially connected processes, attacks on later processes leave the system less able to withstand their disruptive effects, decreasing the time available for operators to respond. With this knowledge, redundancy could be introduced for later processes that are critical to operational performance in order to improve the resilience of the system.

6 Conclusion and Future Work

Understanding how the adversary interacts with the physical system and studying the system’s response to cyber attacks can help in the design of defences that improve the system’s resilience. To this end, a novel approach was proposed that uses a threat model of the adversary’s intent and capabilities to determine the relevant control variables in a CPS. These control variables, together with their related control strategies and design specifications, were used to build a model of the CPS. The resulting model was used to study the cascading effects of attacks and to analyze the resilience of the CPS to data-oriented attacks.

The proposed modeling approach was demonstrated on an actual water treatment testbed and the model was validated with historical data. The model was determined to be accurate in simulating the testbed behavior under normal and attack scenarios. Attacks aimed at disrupting water flow through the testbed were simulated, and the impact of each attack was quantified by measuring the Impact Ratio and the time taken to reach a critical state. The cascading effects of the various attacks within the system were analyzed, and design enhancements were proposed to address the identified weaknesses and improve the system’s resilience to cyber attacks.

Further work remains for the water treatment testbed model to study the improvement in resilience from implementing the recommended design changes. The current work examines the cascading effects of attacks within a single system; however, CPSs are usually not independent systems but form connected systems-of-systems (e.g. a water treatment plant is connected to a water distribution network). Future research could focus on evaluating the cascading effects of attacks on inter-connected systems.

References

G Operating Curves for Validation of Attack 30 (MSMP)

[FIT-101] [FIT-201] [FIT-301] [FIT-401] [FIT-501] [FIT-502 / RO Permeate]

Figure 11: Summary of Flow Over Duration of MSMP Attack

[LIT-101] [LIT-301] [LIT-401]

Figure 12: Summary of Tank Levels Over Duration of MSMP Attack

H Operating Curves for Attacks to Disrupt Water Flow

[FIT-101] [FIT-201] [FIT-301] [FIT-401] [FIT-501] [FIT-502 / RO Permeate]

Figure 13: Summary of Flow Over Duration of Attack to Close MV-101

[LIT-101] [LIT-301] [LIT-401]

Figure 14: Summary of Tank Levels Over Duration of Attack to Close MV-101

[FIT-101] [FIT-201] [FIT-301] [FIT-401] [FIT-501] [FIT-502 / RO Permeate]

Figure 15: Summary of Flow Over Duration of Attack to Stop Pumps in P1

[LIT-101] [LIT-301] [LIT-401]

Figure 16: Summary of Tank Levels Over Duration of Attack to Stop Pumps in P1

[FIT-101] [FIT-201] [FIT-301] [FIT-401] [FIT-501] [FIT-502 / RO Permeate]

Figure 17: Summary of Flow Over Duration of Attack to Stop Pumps in P3

[LIT-101] [LIT-301] [LIT-401]

Figure 18: Summary of Tank Levels Over Duration of Attack to Stop Pumps in P3