Colonel Blotto Game for Secure State Estimation in Interdependent Critical Infrastructure

09/28/2017 ∙ by Aidin Ferdowsi, et al. ∙ Rutgers University ∙ Virginia Polytechnic Institute and State University

Securing the physical components of a city's interdependent critical infrastructure (ICI) such as power, natural gas, and water systems is a challenging task due to their interdependence and large number of involved sensors. Using a novel integrated state-space model that captures the interdependence, a two-stage cyber attack on ICI is studied in which the attacker first compromises the ICI's sensors by decoding their messages, and, subsequently, it alters the compromised sensors' data to cause state estimation errors. To thwart such attacks, the administrator of the CIs must assign protection levels to the sensors based on their importance in the state estimation process. To capture the interdependence between the attacker and the ICI administrator's actions and analyze their interactions, a Colonel Blotto game framework is proposed. The mixed-strategy Nash equilibrium of this game is derived analytically. At this equilibrium, it is shown that the administrator can strategically randomize between the protection levels of the sensors to deceive the attacker. Simulation results coupled with theoretical analysis show that, using the proposed game, the administrator can reduce the state estimation error by at least 50% compared to any non-strategic action. The results also show that the ICI's administrator must consider the CIs inside a city as a unified ICI for security analysis instead of assigning independent protection levels to each individual CI, as is conventionally done.


I Introduction

The services delivered by a smart city’s critical infrastructure (CI) such as power, natural gas, and water will be highly interdependent [1, 2, 3, 4]. CIs are cyber-physical systems (CPSs) that encompass physical infrastructure whose performance is monitored and controlled by a cyber system, typically consisting of a massive number of sensors. These CPSs exhibit close interactions between their cyber and physical components [4, 5, 6]. State estimation, which uses the cyber elements to monitor the physical elements, is a crucial stage in controlling the functionality of the CIs. However, the interdependence between CIs and the high synergy between their physical and cyber components make them vulnerable to attacks and failures [7, 8, 9, 10].

Numerous solutions have been presented for securing the state estimation of CPSs as well as for CI failure detection and identification [11, 12, 13, 14]. In [11], the authors presented a control-theoretic approach for attack detection and identification in noiseless environments using centralized and distributed attack detection filters. The works in [12, 13, 14] considered the estimation of a CPS under stealthy deception and replay cyber-attacks using a Kalman filter. Moreover, the security of interdependent critical infrastructure (ICI) has been studied in recent works such as [15, 16, 17]. In [15], the authors assessed the security of interdependent power and natural gas systems under multiple hazards, considering the ICI’s performance as a measure of security. In [16], the authors proposed an agent-based model to capture the effects of interdependencies and quantify the coupling strength between the CIs. Also, the impact of natural and human-induced disasters has been studied in [17].

Furthermore, the security and protection of sensor networks, which collect data from CIs, have been studied in [18, 19, 20]. In [18], the authors proposed a novel method for physical attack protection with human virtualization in the context of data centers, using sensors that detect an impending physical/human attack and then raise alarms to mitigate the attack. The authors in [19] introduced a learning algorithm that extracts features from the sensor messages to detect cyber attacks. The work in [20] proposed a distributed observer for state estimation of CIs in lossy sensor networks with cyber attacks. The works in [21, 22, 23] used game-theoretic tools to study the interactions between the defender of a single CPS and an attacker that seeks to compromise the various nodes of a CPS.

However, the works in [11, 12, 13, 14, 15, 16, 17, 18, 19, 20] do not consider the limitations of the available security resources for the protection, detection, and identification of CI attacks. In practical smart cities, such resource limitations may substantially affect the security of the CIs. In a city, to prevent an attacker from breaking into the sensors of an ICI, numerous methods can be adopted, such as encryption of sensor data, implementation of attack detection filters, or periodic monitoring algorithms. However, because of the massive data transmission from the sensors to the central processing unit, such security solutions will require a large number of computations, a high communication bandwidth, or a considerable amount of financial resources, all of which are limited for the ICI’s administrator. Therefore, unlike the idealized security solutions in [11, 12, 13, 14, 15, 16, 17, 18, 19, 20], due to resource limitations, the administrator of an ICI has to prioritize the protection of the cyber components of the ICI based on their importance in the state estimation process [24].

Another key limitation in the current literature is that the majority of the existing works, such as [11, 12, 13, 14] and [21, 22, 23], do not take into account the interdependence between the CPSs. Meanwhile, those that account for interdependencies, such as [15, 16, 17], are mostly based on graph-theoretic constructs that abstract much of the functionalities of the CIs. In practice, the functionalities of CIs in a city are largely interdependent and cannot be simply captured by a graph. For instance, power generation in generators that are supplied by natural gas requires instantaneous natural gas transmission from the natural gas CI. Therefore, the sensors of each CI will collect the physical measurements of interdependent CIs and, to protect them from cyber attacks, the ICI administrator must consider realistic models for the CI interdependencies.

The main contribution of this paper is a novel game-theoretic framework for analyzing and optimizing the security of a large-scale ICI’s state estimation. To build this unified security framework, this paper makes several contributions:

  • We first introduce a novel integrated state-space model that captures the dynamics of an ICI consisting of power, natural gas, and water distribution systems. For enabling state estimation of the proposed ICI dynamics model, we implement a centralized Kalman filter that uses the sensor measurements collected to estimate the ICI’s state.

  • We consider a two-stage cyber attack that targets the sensors of the ICI so as to manipulate the state estimation. In the first stage, the attacker aims to compromise the ICI sensors by breaking their protection algorithm (e.g., end-to-end encryption or sensor attack detection filter). In the second stage, the attacker manipulates the ICI’s state estimation by altering the compromised sensors’ data to induce state estimation errors. To defend against such attacks and protect the sensors, the administrator of the ICI can assign different protection levels to each group of sensors. The protection levels can vary across sensors for two reasons: a) the available resources which are used in the protection procedure are limited and b) the state estimation sensors have different importance levels.

  • Since the actions of the attacker and the defender are interdependent, we propose a Colonel Blotto game framework [25] to analyze the interactions between the attacker and the administrator. In this game, the attacker chooses the set of sensors to compromise while the administrator assigns protection levels to the sensors. In contrast to existing works on Colonel Blotto for CPS security [21, 22, 23], our game considers the interdependence between multiple CPSs. For this game, we derive the mixed-strategy Nash equilibrium for the administrator and the attacker as a function of their available resources and the maximum state estimation error due to the attack.

  • We simulate a cyber attack on a power-gas-water ICI in which a sensor network collects data from the physical components of the ICI. Simulation results show that the mixed strategy for the administrator increases the protection of the cyber system of a large-scale ICI and reduces the estimation error of the ICI by at least 50% compared to a baseline case in which the administrator of the ICI does not randomize between the assignment of protection levels. Theoretical and simulation results also show that the ICI administrator must consider all the CIs as a unified system while assigning the protection levels to the sensors, rather than separately protecting them without accounting for their interdependence.

The rest of the paper is organized as follows. Section II presents the dynamic ICI model and studies the state estimation using Kalman filtering. In Section III, we analyze the maximum reachability of the estimation error. The game-theoretic framework is discussed in Section IV. In Section V, we present and analyze the simulation results. Finally, conclusions are drawn in Section VI.

II Interdependent Critical Infrastructure and Attack Model

Consider an ICI as a CPS whose physical system consists of three interdependent power, natural gas, and water distribution CIs and whose cyber system is a network of sensors that collect data from the physical components of the CIs and transmit it to a central processing unit. We first derive a state-space model for the physical system of each CI separately and then present the general ICI model. Finally, we discuss the associated cyber system and its vulnerability to attacks.

II-A Physical System

The power system can be modeled as a linear dynamic system whose inputs are the electrical power demands from the load buses [26, 27]. We focus only on generators that are supplied by natural gas [28], and we consider water as a requirement for vapor condensation and cooling in some of the generators [29]. Then, the dynamic model for a power infrastructure can be given by:

(1)

where is the power CI state matrix, is the power demand matrix, and and are the matrices of natural gas demand and water demand of power CI, respectively. Moreover, and

are vectors that capture power state variables, power demand, natural gas demand of power CI, and water demand of power CI.

The natural gas and water CIs are designed to supply natural gas and water to consumers in a city. Gas compressors and water pumps are used to compensate for the pressure loss at the junctions of these two CIs. We can thus capture the operation of the natural gas CI using the following dynamic model [30]:

(2)

where is the matrix of natural gas CI state variables, is the natural gas demand matrix, is the power demand matrix of the natural gas CI, and is the matrix of relationship between the natural gas output and power demand of natural gas CI. , and are the vectors of natural gas CI state variables, natural gas demand, and power demand of natural gas CI. Similarly, for the water CI, we have [31]:

(3)

where , and are, respectively, the vectors of water CI state variables, water demand, and power demand of water CI. is the state matrix of water CI, is the water demand matrix, is the power demand matrix of water CI, and is the matrix of relationship between the water output and power demand of water CI. Note that, we derived all the matrices in (1), (2), and (3) and the interdependence of CIs as summarized in Appendix A.

Considering the interdependence between power-gas-water CIs, we can derive a unified model for each CI as follows:

(4)

where

(5)

Here, ,,, and are matrices connecting the inputs and outputs of the three CIs whose elements are equal to one if the output of one CI is connected to the input of another CI or is equal to zero otherwise. By substituting (5) into (4), we will have the following state-space model for the interdependent critical gas-power-water infrastructure:

(6)

where

(7)

and and are defined in (8). (6) captures the dynamics of an ICI. In this model, the state variables of three CIs are mutually interdependent, and changes in one CI can affect the other two CIs.

Figure 1: An illustrative example of an ICI
(8)
Figure 2: Interdependence between the state variables of power, gas, and water CI.

In Fig. 1, we present an illustrative example of an ICI that can show the interdependence between the state variables of three CIs. In this example, we consider 10 generators out of which are supplied by natural gas and require water flow to operate. Here, natural gas pipelines, and water pipelines distribute natural gas and water to the demand junctions. Based on this example we find the matrices and in (6) and simulate the ICI. To illustrate how the changes in one CI can affect the state variables of other CIs, we increase the power demand in generator , , at time . Fig. 2 shows the change of state variables of the natural gas pipeline between junctions and and state variables of water pipeline between the junctions and . The reason is that any increase in power demand results in increase of electric power generation, and due to the interdependence between electric power generation and the consumption of the natural gas and water, the state variables of the natural gas and water CIs change.

II-B Cyber System

To monitor the state variables in (6), a cyber system is needed. For the considered ICI, the cyber system consists of a number of sensors spread around the ICI that collect different measurements from the ICI’s components. Sensors and meters in the power infrastructure measure the instantaneous frequency of each generator, the mechanical input power to the generator, and the line powers between the generators. In the natural gas and water CIs, sensors collect the outlet pressure and inlet flow rate of each pipeline. As shown in Fig. 1, we consider a sensor network that is used to collect data from the ICI and send it to a central server. Each area in Fig. 1 corresponds to the set of all neighboring sensors. The sensor data collected from each CI can be expressed as a linear equation of the states of the ICI, as follows:

$$y(t) = C\,x(t), \tag{9}$$

where $x(t)$ is given in (6), $y(t)$ is a vector of all the sensor data at each time instant, and $C$ is a matrix relating the states to the sensor data. However, due to the inaccuracy in measurements and the process noise in the infrastructure, the owner of each CI must estimate the system state at each time instant. Due to the interdependence between the CIs, their owners have to share the collected data from the components with a single administrator who has access to the ICI model [32]. Note that a lack of cooperation between the owners of the CIs can yield estimation error since the administrator will not be able to capture the interdependencies. Considering the process and measurement noise, and also the discrete sensor data, we can rewrite the state-space model equations and the sensor outputs as a discrete linear dynamic system:

$$x_{k+1} = A x_k + B u_k + w_k, \qquad y_k = C x_k + v_k, \tag{10}$$

where $A$, $B$, and $C$ are defined in (6) and (9), $x_k$ is the vector of state variables of the ICI at time step $k$, $u_k$ is the vector of external inputs of the ICI at time step $k$, $w_k$ is the process noise at time $k$, $v_k \in \mathbb{R}^m$ is the measurement noise at time $k$, and $m$ is the number of sensors. Because the sensor data is discrete, we hereinafter use the discrete ICI model in (10) in our analysis. For notational simplicity, we keep the symbols $A$, $B$, and $C$ for the discrete state-space model; these matrices are, however, first transformed to their discrete form using available methods such as the bilinear transformation. Note that $x_0$ is the initial state of the ICI, and $x_0$, $w_k$, and $v_k$ are independent Gaussian random variables with $x_0 \sim \mathcal{N}(0, \Sigma)$, $w_k \sim \mathcal{N}(0, Q)$, and $v_k \sim \mathcal{N}(0, R)$.
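The bilinear transformation mentioned above can be illustrated on a scalar model. The following is a minimal sketch, assuming a hypothetical continuous-time model dx/dt = a·x + b·u, a sampling period `T`, and an input held constant over each sampling step; none of these names or values come from the paper.

```python
from typing import Tuple


def bilinear_discretize(a: float, b: float, T: float) -> Tuple[float, float]:
    """Bilinear (Tustin) discretization of the scalar model dx/dt = a*x + b*u.

    Applies the trapezoidal rule over one sampling period T, assuming the
    input is constant over the step, so that x[k+1] = a_d * x[k] + b_d * u[k].
    """
    denom = 1.0 - a * T / 2.0
    if denom == 0.0:
        raise ValueError("sampling period T makes the transform singular")
    a_d = (1.0 + a * T / 2.0) / denom
    b_d = b * T / denom
    return a_d, b_d


# Hypothetical stable continuous-time pole at a = -2, sampled every 0.1 s.
a_d, b_d = bilinear_discretize(a=-2.0, b=1.0, T=0.1)
```

For a stable continuous-time pole (a < 0), the discrete pole `a_d` falls inside the unit circle, and the steady-state gain `b_d / (1 - a_d)` equals the continuous-time gain `-b / a`.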

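The ICI administrator seeks to estimate the state of the ICI using (10). However, due to sensor error and operation noise, a noise-resilient method is needed to estimate the state variables. To this end, the authors in [33] showed that a Kalman filter can compute the state estimate $\hat{x}_k$ from the observations $y_k$. Since the initial time of the ICI is considered to be $-\infty$, the Kalman filter converges to a fixed-gain linear estimator. To find the state estimate of the system, we first compute the steady-state error covariance matrix $P$ of the Kalman filter:

$$P \triangleq \lim_{k \to \infty} P_{k|k-1}, \tag{11}$$

where $P_{k|k-1}$ is the covariance of the one-step-ahead prediction error. Then, we compute the fixed Kalman gain as follows:

$$K = P C^T \left(C P C^T + R\right)^{-1}, \tag{12}$$

where $R$ is the covariance of the measurement noise. Next, we find the predicted state at time $k+1$ given the state estimate at time $k$:

$$\hat{x}_{k+1|k} = A \hat{x}_k + B u_k. \tag{13}$$

Finally, we compute the state estimate using the Kalman filter:

$$\hat{x}_k = \hat{x}_{k|k-1} + K\left(y_k - C \hat{x}_{k|k-1}\right), \tag{14}$$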

where the initial state is defined as .
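The fixed-gain Kalman filter described above can be sketched for a scalar system. This is a minimal illustration, assuming a hypothetical scalar model x[k+1] = a·x[k] + b·u[k] + w[k], y[k] = c·x[k] + v[k] with process noise variance `q` and measurement noise variance `r`. The steady-state variance is found here by simply iterating the standard Riccati recursion, which is one of several possible methods and is not necessarily the one used by the authors.

```python
def steady_state_kalman(a, c, q, r, tol=1e-12, max_iter=10_000):
    """Return the steady-state a priori error variance p and the fixed gain k.

    Iterates the scalar Riccati recursion until successive values of p
    differ by less than tol (or max_iter is reached).
    """
    p = q  # any non-negative starting value works for this sketch
    for _ in range(max_iter):
        gain = p * c / (c * p * c + r)
        p_next = a * (p - gain * c * p) * a + q
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    k = p * c / (c * p * c + r)
    return p, k


def kalman_step(x_hat, u, y_next, a, b, c, k):
    """One fixed-gain update: predict, form the residue, then correct."""
    x_pred = a * x_hat + b * u        # prediction of the next state
    residue = y_next - c * x_pred     # innovation seen by the detector
    return x_pred + k * residue, residue


# Hypothetical parameters, chosen only for illustration.
p, k = steady_state_kalman(a=0.9, c=1.0, q=0.1, r=0.5)
x_hat, residue = kalman_step(x_hat=0.0, u=0.0, y_next=1.0,
                             a=0.9, b=0.0, c=1.0, k=k)
```

The residue returned by `kalman_step` is the quantity that a failure detector would monitor at each time step.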

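We can define the estimation error at time $k$ as the difference between the state $x_k$ and its estimate $\hat{x}_k$:

$$e_k \triangleq x_k - \hat{x}_k. \tag{15}$$

Using (14) and (15), we have:

$$e_{k+1} = (A - KCA)\, e_k + (I - KC)\, w_k - K v_{k+1}. \tag{16}$$

We also define the residue of the Kalman filter:

$$z_{k+1} \triangleq y_{k+1} - C\left(A \hat{x}_k + B u_k\right). \tag{17}$$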

Because of the process and measurement noise, we need to validate the estimation of the states and detect failures of the estimation filter. To this end, we use a $\chi^2$ failure detector that computes the following value at each time step [34]:

$$g_k = z_k^T \mathcal{P}^{-1} z_k, \tag{18}$$

where $\mathcal{P}$ is the diagonal covariance matrix of the residue $z_k$, which implies that the residues are independent of each other. If $g_k$ exceeds the threshold level, then the detector will trigger an alarm.
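Because the residue covariance is diagonal, the detector value reduces to a sum of squared residues, each scaled by its own variance. A minimal sketch follows, assuming the quadratic-form statistic described above; the function names, the residue values, and the threshold are hypothetical.

```python
def detector_statistic(residue, residue_variances):
    """Quadratic form z^T P^{-1} z for a diagonal residue covariance P."""
    return sum(z * z / s for z, s in zip(residue, residue_variances))


def alarm(residue, residue_variances, threshold):
    """True when the detector statistic exceeds the alarm threshold."""
    return detector_statistic(residue, residue_variances) > threshold


# Hypothetical two-sensor residue with variances 1 and 4.
g = detector_statistic([1.0, 2.0], [1.0, 4.0])
triggered = alarm([1.0, 2.0], [1.0, 4.0], threshold=3.0)
```

In this toy example each sensor contributes one unit to the statistic, so the value stays below the threshold of 3 and no alarm is raised.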

II-C Attack Model

Consider the cyber system of the ICI in Fig. 1, where sensors collect measurements from the physical components of the ICI and transmit the measurement data to a central node in their proximity. The central nodes then transmit the data to a central server that estimates the ICI state variables using the Kalman filter in (14). We refer to the group of sensors that connect to a single central node as a sensor cluster (SC). We consider a two-stage attack on the cyber system of our ICI. In the first stage, the attacker aims to compromise the ICI’s state estimation sensors by collecting their data and sending it to its central processor. After compromising some of the sensors, in the second stage, the attacker alters the sensor data to increase the estimation error of the ICI. In the first stage, the attacker has to break the security solution that is implemented by the administrator of the ICI (referred to as the defender, hereinafter).

Our model can capture any ICI security solution, including end-to-end encryption of the sensor data [35], implementation of an attack detection filter [11], or physical protection of the sensors [18]. Therefore, to compromise any SC inside the ICI, the attacker has to collect the data broadcast from the sensors to the central nodes and break the implemented security solution. However, this requires processing of the data collected from the sensors across the ICI, physical presence of the attacker in the proximity of the central nodes to collect data, or communication resources for transmitting the collected data to the attacker’s central processing unit. Since processing, communication, and human resources are limited, the attacker needs to prioritize between the sensors based on their importance in the state estimation of the ICI. From the defender’s point of view, implementing the aforementioned security solutions requires computational resources, communication bandwidth, or financial resources, all of which are limited for the defender. Therefore, the defender must also prioritize between the sensors of the ICI that it seeks to protect.

In summary, the attacker aims to maximize the number of compromised sensors and the defender seeks to protect the SCs of the ICI from this cyber attack, under strict resource limitations on both sides. To analyze these interactions between the attacker and the defender, we first analyze the second stage of the attack to find the maximum estimation error caused by the cyber attack and quantify the importance of each SC in the ICI. Then, using these values, we formally analyze the attacker-defender interaction and derive optimal defense strategies.

III Maximum State Estimation Error in the Compromised Sensors

In this section, we analyze the impact of the second stage of the cyber attack in order to quantify the ability of an attacker to increase the estimation error by altering the sensor data. We assume a worst-case scenario for security analysis in which the attacker has complete knowledge about the system as done in [11] and was able to compromise some of the SCs in the first stage. We assume that the attacker can change the data of the compromised sensors to a desired value in order to disturb the ICI’s state estimation. Given the set of all compromised SCs, , we define attack vector where is the number of SCs, and is the attack vector on SC where is the number of sensors in SC . Also, if . Therefore, the linear relationship between the state variables of the ICI and the sensor data under attack will be:

(19)

where is the vector of sensor measurements under attack, and is independent from , , and . Here, we assume that the attack to the sensors starts from . When the ICI’s cyber system is under attack, the Kalman state estimation filter of the ICI in (14) changes as follows:

(20)

where is the estimate of the states under attack. The new residue and estimation error are defined as follows:

(21)

We can define the error difference between the ICI under attack and in absence of attack as follows:

(22)

Using (14) and (20), we can find the following model for the difference in error and residue:

(23)
(24)

We define cumulative error difference (CED) at time step :

(25)

where is a positive semidefinite diagonal matrix which is the relative cost of each state error. For convenience, we define which is a Hurwitz stable matrix since the ICI model is a stable system [33]. Next, we derive the maximum CED caused by an impulse attack to an SC.

Proposition 1.

If is a Hurwitz stable matrix, then the maximum CED caused by an impulse attack to a set of sensors is:

(26)
Proof.

Since is a vector with for and the attack is an impulse input, then, we have for . Therefore, using (23), we have . For a stable system, the impulse response of the state vector returns to the origin at for an arbitrary error deviation in , , or:

(27)

for any , where the columns of

are the eigenvectors and

is the diagonal matrix of eigenvalues of

[33]. Assume that where are the eigenvalues of , then, all the elements of are a linear combination of components . And since in a Hurwitz stable matrix then for . Therefore, we have:

(28)

where is the element in the -th column and -th row of matrix . Note that based on the definition of diagonal semi definite matrix, has a non-negative value. (III) states that the maximum value of for occurs in . Thus, the maximum value of is . Since the initial condition of is , this results in:

(29)

This proves (26). ∎
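The decay argument in the proof can be illustrated numerically. The sketch below is a scalar toy case, assuming that the error difference follows a recursion of the standard form Δe[k+1] = ā·Δe[k] − K·yᵃ[k+1], with a stable closed-loop coefficient (|ā| < 1 in the discrete-time sense), a scalar gain, and an attack that is nonzero only at the first step. The names `a_bar`, `k_gain`, `amplitude`, and `weight` are hypothetical and the values are chosen only for illustration.

```python
def impulse_ced(a_bar, k_gain, amplitude, weight, steps):
    """Return [CED_1, ..., CED_steps] for a scalar impulse attack at step 1.

    The error difference starts at zero, is kicked once by the attack, and
    is then propagated by the stable coefficient a_bar. The CED at each
    step is the weighted squared error difference.
    """
    delta_e = 0.0
    ced = []
    for step in range(1, steps + 1):
        attack = amplitude if step == 1 else 0.0
        delta_e = a_bar * delta_e - k_gain * attack
        ced.append(weight * delta_e * delta_e)
    return ced


# Hypothetical values: closed-loop coefficient 0.6, gain 0.3, impulse of size 2.
ced = impulse_ced(a_bar=0.6, k_gain=0.3, amplitude=2.0, weight=1.0, steps=6)
```

In this scalar sketch the CED peaks at the step where the impulse is injected and then shrinks geometrically by a factor of `a_bar ** 2` per step.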

Proposition 1 shows that the maximum CED caused by an impulse attack occurs during first two time instants, , after the initiation of the impulse attack to the sensors. In the presence of the attack, the failure detector in (18) computes the following value in each time step:

(30)

Now, we define two new parameters for the analysis of probability of failure in the system as follows:

(31)
(32)

where and capture probabilities of failure in absence and existence of attack, respectively, and is the failure trigger threshold.

Definition 1.

An impulse attack to set is -feasible if:

(33)

for all , where and is the Kullback-Leibler (KL) distance between and .

Using [14, Theorem 1], we can directly prove the convergence of to as , as follows.

Lemma 1.

For any , there exists , such that if

(34)

for , then

(35)

for all .

Lemma 1 shows that, if the probability of alarm triggering at time increases by a value of in presence of attack, , then, there exists a value for such that an impulse attack can be designed with a KL distance lower than . Now, if the attacker wants to design an -feasible impulse attack then it should change the sensor data such that the KL distance never exceeds . Next, we find the maximum KL distance caused by an impulse attack to a set of sensors.

Lemma 2.

If is a Hurwitz stable matrix, then the maximum KL distance caused by an impulse attack to a set of sensors will be:

(36)
Proof.

From (23) and (24) we have:

(37)
(38)

therefore, all the elements of vector will be a linear combination of eigenvalues of powered by , , for and thus for . Then, we have:

(39)

where is the element in the -th row and -th column of . In addition, since is a diagonal matrix with positive entities, then its inverse has positive values, hence . (39) implies that the KL distance is decreasing for , and hence, the maximum KL distance will occur in or and this proves (36). ∎

Lemma 2 finds the maximum KL distance caused by an impulse attack to a set of sensors. We use the maximum error caused by an impulse attack and maximum KL distance to find the maximum CED caused by an -feasible attack in the following theorem. This theorem essentially quantifies the maximum CED that the attacker can cause without triggering the alarm to a set of sensors.

Theorem 1.

The maximum CED caused by an impulse -feasible attack to a set of sensors is the solution of the following quadratic program with quadratic constraints:

(40)
s.t. (41)
(42)

where

Proof.

From Proposition 1, we know that the maximum CED caused by an impulse attack which we define it as vector in time step to a sensor set is:

(43)

where for and is the -th entity of vector . From Lemmas 1 and 2, we know that the maximum KL distance caused by an -feasible attack to the sensor set can not exceed and therefore we have:

(44)

then, should maximize (43) with constraints in (44), and considering for . Also, since and are positive definite matrices then , , , , and are all semi positive definite matrices. Theorem 1 provides a method for the attacker to find the maximum CED caused by altering a set of sensors without triggering failure alarm. To solve the optimization problem in Theorem 1, known techniques such as quadratic programming can be used [36]. Using theorem 1, we can assign a value to quantify the maximum CED for each of the ICI’s SCs. To do so, for each of the ICI’s SCs we calculate the following value:

(45)

where is the value of SC in the state estimation and is the set of sensors inside SC . This value captures the importance of each SC for the attacker and the defender in the first stage of attack, because the attacker can increase the estimation error by in the second stage of attack after compromising the SC in the first stage. Based on this value both the attacker and the defender can prioritize between their actions in the first stage. Since we can now quantify the importance of different SCs under attack, next, we study how the ICI can defend against the first stage of attack during which the attacker and the defender should allocate their available resources on all the SCs based on their values.
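To give a feel for how a value can be assigned to an SC, the sketch below solves a toy quadratically constrained quadratic program. It is not the program in (40)-(42): it assumes a single constraint and that both the objective and the constraint are diagonal quadratic forms, in which case the maximum is the budget times the largest objective-to-constraint ratio. The names `objective_weights`, `constraint_weights`, and `budget` are hypothetical.

```python
def max_diagonal_qcqp(objective_weights, constraint_weights, budget):
    """Maximize sum(m_i * y_i**2) subject to sum(n_i * y_i**2) <= budget.

    Toy diagonal case: assumes m_i >= 0, n_i > 0, and budget >= 0. The
    whole budget is spent on the coordinate with the largest ratio m_i / n_i.
    """
    best = max(range(len(objective_weights)),
               key=lambda i: objective_weights[i] / constraint_weights[i])
    y = [0.0] * len(objective_weights)
    y[best] = (budget / constraint_weights[best]) ** 0.5
    value = objective_weights[best] * y[best] ** 2
    return value, y


# Hypothetical three-sensor cluster: error weights, detection weights, budget.
value, attack = max_diagonal_qcqp([1.0, 4.0, 2.0], [1.0, 2.0, 4.0], budget=0.5)
```

For general, non-diagonal matrices or several constraints, a dedicated quadratic programming solver would be needed instead of this closed-form shortcut.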

IV ICI Security Resource Allocation as a Colonel Blotto Game

In this section, we analyze the resource allocation of the attacker and the defender in the first stage of the cyber attack. In the considered model, the available resources for the defender and attacker are denoted by , and , respectively. Consequently, the defender and the attacker must simultaneously allocate their resources across a finite number of SCs, . Moreover, each SC in ICI has a value, given by (45) which quantifies the maximum CED caused by compromising SC . This value captures both the cyber and physical nature of the ICI as per (45). Hereinafter, we use subscripts and to denote the attacker and the defender, respectively.

Also, denotes player ’s allocation vector across SCs. In each SC , the defender assigns a protection level which requires resources. In contrast, the attacker spends some effort to break the sensor’s security mechanism, which requires resources in SC . For instance, in end-to-end encryption, for any encryption algorithm the defender must consider a number of computations in the decryption of each SC’s messages in the central server [35]. To break this encryption, the attacker must collect the messages of each SC and break the encryption using a large number of computation, which requires the attacker to assign a portion of its available computational resources for each SC. Such a resource limitation is not restricted to cases of encryption as it can also be applied to other protection methods such as attack detection filters [11].

Therefore, for any protection method, if the defender allocates more resources to an SC than the attacker, then the defender prevents that SC from being compromised. In this case, the defender wins the SC, and we assign the normalized value of that SC to the defender and zero to the attacker (i.e., the CED is zero, and the defender perfectly protects its SC). In contrast, if the attacker allocates more resources to an SC, then the attacker can compromise that SC. In this case, the attacker wins the SC, and we assign the normalized value of that SC to the attacker and zero to the defender. In the case of equal allocation of resources, which has probability zero due to the continuous action space of the attacker and the defender, we share the normalized value of the SC equally between the attacker and the defender. Therefore, in each SC, the normalized payoff for the attacker and defender is given by:

(46)

where is the opponent of and

(47)

The total payoff of the defender and the attacker resulting from allocations across all SCs is the sum of the individual payoffs in (46) received from each individual SC:

(48)

Here, we define the total CED caused by the allocation vectors and as follows:

(49)

since captures summation of the estimation errors from all the SCs. The attacker aims to increase its utility function in (48) by maximizing the number of compromised SCs which results in maximizing the total state estimation error. Also, the defender seeks to increase its utility function in (48) by maximizing the number of protected SCs from the cyber attack to minimize the state estimation error. Moreover, the payoff for each player depends on the actions of both players and, thus, we can use a game-theoretic approach to solve this problem [37]. In particular, next, we first model the problem as a two-player Colonel Blotto game [25] between the attacker and the defender, and then present the solution for the game. The Colonel Blotto game framework is particularly suitable for the considered ICI security problem since, in this game, two colonels simultaneously allocate their available military resources on battlefields, where the winner of each battlefield is the colonel with a more allocated resources and both the colonels aim to maximize the number of won battlefields. This is similar to the problem in (48), in which SCs are the battlefields and the defender maximizes the number of protected SCs while the attacker seeks to maximize the number of compromised SCs.
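The payoff rules described above (winner takes the normalized value of an SC, ties split it, and per-SC payoffs are summed) can be sketched as follows. This is a minimal illustration with hypothetical function names, allocations, and SC values; the normalization divides each value by the sum of all values, which is an assumption about the form of (47).

```python
def normalize(values):
    """Scale the SC values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def sc_payoff(own, opponent, value):
    """Payoff of one player in one SC: all, half, or none of its value."""
    if own > opponent:
        return value
    if own == opponent:
        return value / 2
    return 0.0


def total_payoffs(defender_alloc, attacker_alloc, values):
    """Return (defender utility, attacker utility) summed over all SCs."""
    v = normalize(values)
    u_d = sum(sc_payoff(d, a, x)
              for d, a, x in zip(defender_alloc, attacker_alloc, v))
    u_a = sum(sc_payoff(a, d, x)
              for d, a, x in zip(defender_alloc, attacker_alloc, v))
    return u_d, u_a


# Hypothetical three-SC example: the defender concentrates on the most
# valuable SC while the attacker spreads over the other two.
u_d, u_a = total_payoffs([3, 1, 1], [0, 2, 3], [2.0, 1.0, 1.0])
```

Because every SC's normalized value is awarded in full (or split evenly), the two utilities always sum to one, which makes the game constant-sum.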

IV-A Game Formulation and Pure Strategy Nash Equilibrium

To model the interdependent decision making processes of the attacker and defender, we introduce a noncooperative Colonel Blotto game[25] defined by six components: a) the players which are the attacker and the defender in the set , b) the strategy spaces for , c) available resource for , d) number of the SCs , e) normalized value of each SC for , , and f) the utility function, , for each player. For both players, the set of pure strategies corresponds to the different possible resource allocations across the SCs:

(50)

Also, the utility function of each player, , can be defined as in (48). The utility function in (48) corresponds to the symmetric case of the Colonel Blotto game where , which indicates that the values of the SCs are equal for the defender and the attacker. In the following subsection, we first present the solution of the Colonel Blotto game for the general case of , and then derive the solution for the symmetric case. For notational simplicity, hereinafter, we drop the arguments in the notation of variables and .

One of the most important solution concepts for noncooperative games is that of the Nash equilibrium (NE). The NE characterizes a state at which no player can improve its utility by changing its own strategy, given the strategy of the other player is fixed. For a noncooperative game, the NE in pure (deterministic) strategies can be defined as follows:

Definition 2.

A pure-strategy Nash equilibrium of a noncooperative game is a vector of strategies such that , the following holds true:

(51)

The NE characterizes a stable game state at which the defender cannot improve the protection of the ICI’s SCs by unilaterally changing its action given that the action of the attacker is fixed at . Likewise, at the NE, the attacker cannot increase the state estimation error of the ICI by changing its action, , when the defender keeps its action fixed at . However, the NE is not guaranteed to exist in pure strategies. In particular, for a Colonel Blotto game, without loss of generality, if , then it can be proven that, for there exists no pure-strategy NE [25]. Nevertheless, it is proven that there exists at least one NE in mixed strategies [38] for noncooperative games. When using mixed strategies, each player assigns a probability to playing each one of its pure strategies. For an ICI security problem, the use of mixed strategies is motivated by two facts: a) both players must randomize over their strategies in order to make it nontrivial for the opponent to guess their potential action, and b) the allocation of resources can be repeated over an infinite time duration, and mixed strategies can capture the frequency with which each player chooses certain strategies. A mixed strategy, which can be termed a distribution of resources, for player is an -variate distribution function with support contained in player ’s set of feasible allocations, . We also define the univariate marginal distribution functions (MDFs) for each SC , which can be called the distribution of resources on each SC .
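The absence of a pure-strategy NE can be checked by brute force on a small instance. The sketch below is an illustration only, under several assumptions that are not taken from the paper: resources are integer units, each player must allocate its whole budget, all SCs have equal value, and ties split an SC evenly. The function names are hypothetical.

```python
from itertools import product


def allocations(budget, n_sc):
    """All integer allocations of exactly `budget` units over `n_sc` SCs."""
    return [a for a in product(range(budget + 1), repeat=n_sc) if sum(a) == budget]


def payoffs(defender, attacker):
    """Equal-valued SCs; a tie splits the SC evenly (assumption)."""
    d_pay = sum(
        1.0 if d > a else 0.5 if d == a else 0.0
        for d, a in zip(defender, attacker)
    )
    # The game is constant-sum: the attacker gets the remaining SCs.
    return d_pay, len(defender) - d_pay


def pure_nash_equilibria(def_budget, att_budget, n_sc):
    """Return every pure profile from which no player gains by deviating alone."""
    def_allocs = allocations(def_budget, n_sc)
    att_allocs = allocations(att_budget, n_sc)
    equilibria = []
    for d in def_allocs:
        for a in att_allocs:
            u_d, u_a = payoffs(d, a)
            # Best payoff each player could reach by a unilateral deviation.
            best_d = max(payoffs(d2, a)[0] for d2 in def_allocs)
            best_a = max(payoffs(d, a2)[1] for a2 in att_allocs)
            if u_d >= best_d and u_a >= best_a:
                equilibria.append((d, a))
    return equilibria
```

Under these assumptions, `pure_nash_equilibria(6, 6, 3)` returns an empty list: every allocation of six units over three SCs can be beaten on two SCs by some opposing allocation, so no deterministic profile is stable. When the defender's budget dominates, as in `pure_nash_equilibria(6, 1, 3)`, pure equilibria do exist, which is consistent with nonexistence holding only for a range of resource ratios.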

IV-B Mixed-Strategy Nash Equilibrium Solution

In a game-theoretic setting, each player chooses its own mixed-strategy distribution to maximize its expected utility. We first derive the solution for a special case of our problem in which the attacker and the defender consider the expected allocation of their resources on each SC instead of the exact allocation. This is a special case of the Colonel Blotto game known as the General Lotto game [39]. In a Colonel Blotto game, the sum of allocated resources cannot exceed the limited resources of the players, as in (50). In contrast, in a General Lotto game, the sum of the expected resources allocated to the SCs cannot exceed the limited resources of the players:

(52)

where is the expected value of resources allocated by player on SC . In this case, the utility of each player is defined as the expected value over its mixed strategies:

(53)
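The difference between a budget that holds for every draw and one that holds only in expectation can be illustrated with a short Monte Carlo sketch on a single SC. Everything specific here is an assumption made for illustration: the names `r_a` and `r_d` for the attacker's and defender's resources, the uniform distribution for the defender, and the attacker's mixture of a point mass at zero with a uniform draw. These are not claimed to be the distributions in (56) and (57).

```python
import random


def sample_defender(r_d, rng):
    """Defender draw: uniform on [0, 2 * r_d], so the mean is r_d."""
    return rng.uniform(0.0, 2.0 * r_d)


def sample_attacker(r_a, r_d, rng):
    """Attacker draw (requires 0 <= r_a <= r_d and r_d > 0).

    With probability r_a / r_d draw uniformly on [0, 2 * r_d], otherwise
    allocate nothing. The mean is (r_a / r_d) * r_d = r_a.
    """
    if rng.random() < r_a / r_d:
        return rng.uniform(0.0, 2.0 * r_d)
    return 0.0


def estimate(r_a, r_d, n_samples=200_000, seed=0):
    """Estimate mean allocations and the attacker's probability of winning."""
    rng = random.Random(seed)
    att_total = 0.0
    def_total = 0.0
    att_wins = 0
    for _ in range(n_samples):
        x_a = sample_attacker(r_a, r_d, rng)
        x_d = sample_defender(r_d, rng)
        att_total += x_a
        def_total += x_d
        att_wins += x_a > x_d  # attacker compromises the SC on this draw
    return att_total / n_samples, def_total / n_samples, att_wins / n_samples
```

Individual draws can exceed a player's resources (up to `2 * r_d`), which a Colonel Blotto constraint would forbid, yet the sample means stay close to `r_a` and `r_d`, which is all the General Lotto constraint requires. For these particular distributions the attacker's winning probability is `r_a / (2 * r_d)` by direct calculation, so `estimate(1.0, 2.0)` should return a third component near 0.25.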

Therefore, player ’s optimization problem considering its constraint on the available resource is:

(54)

where is a multiplier for player ’s expected resource allocation constraint. For each , the corresponding first-order condition for maximizing (54) is given by:

(55)

where (55) is equivalent to the necessary condition for a single all-pay auction game where player ’s value for the prize in auction is [40]. In such an all-pay auction, if the solution of (55) is described as follows:

(56)
(57)

Now, to find the multipliers , let and assume that is the set of SCs in which . Then using (54), (56), and (57), we have:

(58)
(59)

From [39, Proposition 1] we know that there exists at least one solution to the system of equations in (58) and (59).

Now that we have characterized the functions that maximize the expected utility of the players in (53), we first define the solution concept of the mixed strategy Nash equilibrium (MSNE) and then complete the solution of the General Lotto game by deriving its MSNE. The MSNE is defined as follows:

Definition 3.

A mixed strategy profile constitutes a mixed strategy Nash equilibrium if for player we have:

(60)

where is the set of all probability distributions for player over its action space .
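For a game with finitely many pure strategies, the condition in Definition 3 can be verified numerically. The sketch below is a simplified illustration, not the continuous setting of the paper: it assumes a two-player game given by payoff matrices (hypothetical names `u_row` and `u_col`) and uses the fact that expected utility is linear in a player's own mixed strategy, so it suffices to test deviations to pure strategies.

```python
def expected_utility(u, p, q):
    """Expected payoff when the row player mixes with p and the column player with q."""
    return sum(
        p[i] * q[j] * u[i][j] for i in range(len(p)) for j in range(len(q))
    )


def is_mixed_nash(u_row, u_col, p, q, tol=1e-9):
    """Check that neither player gains by deviating unilaterally from (p, q)."""
    current_row = expected_utility(u_row, p, q)
    current_col = expected_utility(u_col, p, q)
    # Best payoff achievable by any pure deviation, opponent's mix held fixed.
    best_row = max(
        sum(q[j] * u_row[i][j] for j in range(len(q))) for i in range(len(p))
    )
    best_col = max(
        sum(p[i] * u_col[i][j] for i in range(len(p))) for j in range(len(q))
    )
    return best_row <= current_row + tol and best_col <= current_col + tol
```

As a usage example, in the matching pennies game with `u_row = [[1, -1], [-1, 1]]` and `u_col = [[-1, 1], [1, -1]]`, the uniform profile `p = q = [0.5, 0.5]` passes the check, whereas replacing `p` with the pure strategy `[1, 0]` fails it because the column player can then profitably deviate.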

The MSNE for this game characterizes a state of the system at which the defender has chosen its optimal randomization over the allocation of resources on the SCs and, thus, cannot improve the protection of the ICI’s SCs by changing this choice. Likewise, the MSNE for the attacker is a probability distribution that captures the allocation of its resources over the SCs in a way that maximizes the state estimation error when the defender chooses its MSNE strategy. Using the definition of the MSNE, we define the expected CED at the MSNE as follows:

(61)

It is proven in [39, Theorem 1] that, for each solution of the system of equations in (58) and (59), each player in a General Lotto game has a unique MSNE with the univariate marginal distributions in (56) and (57). In the following proposition, we characterize the solution of our problem when the values of the ICI’s SCs are equal for the attacker and the defender and, then, we find the expected state estimation error.

Proposition 2.

For the problem of resource allocation over SCs having equal values for the attacker and the defender , at the MSNE, the MDFs for the attacker and the defender, when the defender’s resources are greater than the attacker’s resources, , will be given by:

(62)
(63)

and the expected CED at MSNE will be .

Proof.

Considering and the solution for the system of equations in (59) and (58) is