Security, physical and cyber, has come to the forefront of national attention, particularly after 9/11. Among the variety of approaches that are used to tackle security problems, from risk analysis to red teaming, game theory has had a significant impact, with tools based on game theoretic analysis having been deployed at LAX airport to schedule canine patrols Paruchuri08,Jain08:Bayesian,Pita09:Using, by the Federal Air Marshal Service (FAMS) to schedule air marshals Kiekintveld09:Computing,Jain10,Jain10a, and by the US Coast Guard to schedule boat patrols Shieh12. All of these deployments, and numerous other related efforts, have cast security as a Stackelberg game between a single defender and an attacker, in which the defender leads (i.e., acts first), choosing a probability distribution over defense actions, and the attacker, upon learning this probability distribution, chooses a response Conitzer06:Computing. In many cases, the attacker is modeled as a rational agent who selects an optimal response and, in the many applications that compute a Strong Stackelberg equilibrium, the attacker is assumed to break ties in the defender’s favor Paruchuri08,Korzhyk11.
A crucial assumption that all these efforts have in common is that a single defender is responsible for all the targets that need protection, and that she has control over all of the security resources. However, there are many domains in which there are multiple defender agencies who are in charge of different subsets of all targets. In practice, numerous parties are responsible for security; indeed, the fact that the basic framework has been deployed by different entities and agencies makes this manifest already. If security decisions made by different parties were entirely independent, both from the defender’s and the attacker’s perspective, a single-defender model would be entirely satisfactory. However, the assets protected by different entities are typically interdependent, or, more generally, have value to others who are not involved in security decisions. Additionally, attackers, insofar as they may target different sectors under the charge of different defenders, are resource constrained, implicitly coupling otherwise independent targets.
An important motivating application for our multidefender security game is security and reliability in the power grid. Independent System Operators (ISOs) and profit-driven independent utility operators are largely responsible for operating and controlling subsystems of the entire grid RAP11a. These operators are held responsible for the reliability of their system, and thus have independent, and possibly even competing, goals with neighboring ISOs. As such, their security decisions are made independently, despite the interdependencies present between subsystems. As a result of this organization, cascading failures in the power grid can present a great threat to the entire system, even when an explicit attacker is not present. This problem is exacerbated by the fact that components in the grid are controlled by multiple entities and are also dependent on other independently operated utility networks (water, communications, natural gas, etc.).
We extend the previous Stackelberg game models in two ways:
an analytic equilibrium and price of anarchy (PoA) characterization of multidefender security scenarios, in which we assume homogeneous and independent valuations of the targets for each defender; and
a computational analysis leveraging a novel mixed-integer linear program (MIP) approach for computing a defender’s best response, combined with a novel heuristic method for approximating equilibria in interdependent multi-defender security games with heterogeneous targets.
In the case where there are multiple defenders, and the values of the targets are independent and homogeneous among the defenders, our analysis focuses on three models of such multi-defender games, which differ in their level of generality. We show that a Nash equilibrium among defenders in this two-stage game model need not always exist, even when the defenders utilize randomized strategies (i.e., probability distributions over target protection levels); this is distinct from a model in which the attacker moves simultaneously with the defenders, where a mixed strategy equilibrium is guaranteed to exist. When an equilibrium does exist, we show that the defenders protect all of their targets with probability 1 in all three models, whereas the socially optimal protection levels are generally significantly lower. When no equilibrium exists, we characterize the best approximate Nash equilibrium (that is, one in which defenders have the least gain from deviation), showing that over-investment is substantial in this case as well. Our price of anarchy (PoA) analysis, which relies on the unique equilibrium when it exists, and the approximate equilibrium otherwise, demonstrates a surprising finding: whereas PoA is unbounded in the simpler models, increasing linearly with the number of defenders, the more general model shows this to be an atypical special case achieved when several parameters are exactly zero. More generally, PoA is bounded by a constant.
For the second extension, we introduce interdependencies between targets. Because closed-form analysis in this setting is intractable, we propose a novel mixed-integer linear programming approach combined with a novel heuristic method to approximate equilibrium behavior. Unlike other multi-defender models (e.g., Kunreuther03,Chan12,Bachrach12,Acemoglu13), our approach retains the typical complexity of each individual defender’s decision process in the multi-defender framework, with each defender responsible for securing many, possibly interdependent, targets. Our setup gives rise to two competing externalities of security decisions: a positive externality, where greater security implies reduced contagion risk to other defenders, and a negative externality, which arises because high security by one player pushes the attacker to attack someone else’s assets. We study the impact of these competing externalities on the resulting Nash equilibrium outcomes as a function of network topology (using both synthetic and real networks), interdependent risk, and the level of system decentralization. One of our key findings is that the impact of system decentralization on security and welfare can be non-monotonic when systems are highly interdependent: high levels of decentralization can yield near-optimal outcomes, even as moderate decentralization results in significant underinvestment. With weak interdependencies, on the other hand, an increasingly decentralized system tends more strongly to over-invest in security.
The remainder of our paper is organized as follows. In Section 2, we give an overview of related work. In Section 3, we briefly outline the definitions and solution concepts of our independent and interdependent multi-defender security games. Section 4 provides an equilibrium and PoA analysis of the homogeneous, independent security game models. A subsequent section further explains the interdependent multi-defender model, and presents results on well-studied synthetic networks, as well as on real-world power grid networks.
2 Related Work
Our work, like much work in the recent security game literature, builds on the notion of Stackelberg games [Osborne and Rubinstein, 1994], which model commitment in strategic settings. The first thorough computational treatment of randomized (mixed strategy) commitment was due to Conitzer06:Computing. In this line of work, of greatest relevance to our effort are multiple-leader Stackelberg games Sherali84,DeMiguel09,Leyffer07,Rodoplu10,Kulkarni14,Sinha14. In many cases, these approaches leverage specialized problem structure, and are not immediately applicable to our setting. In particular, Sherali84 and DeMiguel09 focus on relatively simple models with firms setting production quantity (a single variable), aiming to maximize profit. Both show existence and uniqueness of equilibria in their setting, and leverage these characterization results to obtain solutions to the games. Similarly, Rodoplu10 consider a relatively simple model of network competition in which leaders are nodes setting prices for packets transmitted through them; again, each leader only sets a single variable, the utility functions are problem-specific, and algorithms are specialized to the particular problem structure (and are inapplicable to our setting).
Sinha14 propose an evolutionary algorithm for solving bi-level Stackelberg problems, but their problem structure is also highly specific to the domain of interest (firms choosing production, investment, and marketing, and maximizing profit), and the evolutionary algorithm leverages significant simplifications, such as the assumption that the market eventually clears. Leyffer07 present a very generic multi-leader multi-follower setting and solution framework in the context of shared complementarity constraints (which is the case for our problem, where a single follower attacks a single target), but rely on separability of objective functions in leader and follower variables, an assumption that does not hold in our setting (in addition, their approach only scales to 2-4 leaders, whereas we are able to approximately solve games with 64 leaders). Kulkarni14 offer a deep theoretical treatment of a relatively broad class of multi-leader multi-follower games, but much of their analysis and positive results are restricted to potential games, and they do not offer specific algorithmic suggestions. Like us, they leverage shared constraints to resolve the issue of incompatible leader assumptions about the follower’s tie-breaking behavior.
Our point of departure is a class of Stackelberg games specifically pertinent to security: commonly, these are simply known as security games Korzhyk11,Paruchuri08,Jain10a,Vorobeychik11. In these games, a single defender allocates a set of resources among potential targets of attack in a randomized fashion (that is, the defender commits to a probability distribution over resource-to-target mappings), with an attacker choosing a single target to attack after observing the defender’s strategy. Almost universally in this domain, a Strong Stackelberg equilibrium (SSE) is a solution concept of choice. In SSE, the follower (attacker) is assumed to break ties in the defender’s favor. As we will see below, this solution concept presents conceptual and technical problems in a multi-defender setting.
A similar, though mostly orthogonal, line of work studies network interdiction problems Cormican98,Woodruff03, in which a leader attempts to interdict a network over which the follower subsequently solves a variation of a network flow problem. Unlike the literature on security games, as well as our setting, network interdiction problems are almost universally zero-sum (minimax).
Another somewhat related line of work considers the problem of coordination and teamwork among multiple defenders in a purely cooperative setting [Jiang et al., 2013; Shieh et al., 2014a]. This work, however, is entirely unlike ours: in particular, our primary focus is on the impact of incentive misalignment among the defenders with different (though certainly related) motivations, rather than coordination issues and teamwork. While effective coordination among multiple defenders can often be achieved, just as often (if not predominantly) decentralization of decision making processes and resources inherently gives rise to distinct, and often conflicting, incentives among defenders.
Among the earliest multi-defender models is the literature on interdependent security games Kunreuther03, in which interactions among multiple defenders are modeled as an $n$-player, two-action game, where each player decides whether to invest in security; however, no attacker is considered. More recently, time-dependent scenarios in which coordination of resources amongst multiple defenders is assumed have been studied using Markov decision processes Shieh14a. Since total cooperation is assumed, this model effectively reduces to a single defender game in which the defender controls all resources. A recent extension, interdependent defense games Chan12, does consider an attacker, but one who acts simultaneously with the defenders, rather than after observing the joint defense configuration, as in our model. Interdependent defense games have also been studied in the context of traffic infrastructure defense Alderson11a. Two recent efforts studying multi-defender games explicitly model interdependence among targets through a probabilistic contagion process Bachrach12,Acemoglu13. Like our paper, they consider attackers who observe the joint defense prior to making a decision, but each defender is restricted to securing a single node, and the strategy space is assumed to be continuous. Vorobeychik11 is, to our knowledge, the only other attempt to study strategic settings related to security in which each player’s decision space is combinatorial. However, this work does not consider a strategic attacker.
3 Multidefender Models
Our modeling effort proceeds in four steps, each generalizing the previous. As we see below, each generalization step reveals new and surprising insights about the multi-defender security setting, allowing us to appreciate the fundamental incentive forces. The first three models deal with homogeneous, independent targets and will be analyzed exactly using Nash equilibrium and price of anarchy (PoA) analysis. Our final model introduces interdependencies, and will be analyzed using computational methods.
3.1 The Baseline Model
We start with a model which most reflects the related literature: in particular, this model involves defenders and a single attacker, with each defender engaged in protecting a single target. Each target has the same value to the defender . We suppose that the defender has two discrete choices: to protect the target, or not. In addition, the defender can randomly choose among these; our focus is on these coverage probabilities (i.e., the probability of protecting, or covering, the target), which we denote by for a given defender
. The attacker is strategic, can observe the defenders’ coverage probabilities, and chooses a target that maximizes the damage. We assume that attacker is indifferent among the targets, and attacks the target with the lowest coverage probability, breaking ties uniformly at random. In a given scenario, for all defenders, the attacker’s strategy is a vector of probabilities, where is the probability of attacking the target protected by defender , with .
We assume that if the attacker chooses to attack a target corresponding to defender and defender chooses to protect the target, then the utility of both is , and if the attacker attacks the target but it is not protected, then the utility of the defender is while the attacker’s utility is . If a defender chooses to cover a target, it will incur a cost . Additionally, we assume that targets are independent, i.e., if defender is successfully attacked, all other defenders receive 0 utility. We can thus define the expected utility of a defender as
where is the utility of if it is attacked, and is the utility of if it is not attacked. By the assumptions above,
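The baseline model can be made concrete with a short sketch. The symbols did not survive in the text above, so the names used here are assumptions introduced only for illustration: `coverage[i]` is defender i's coverage probability, `v` is the common target value, and `c` is the cost of covering a target. The payoffs are also assumed: a covered target that is attacked yields 0 to both sides, and an uncovered attacked target costs its defender `v`.

```python
from typing import List

TOL = 1e-12  # numerical tolerance for detecting ties (an implementation choice)


def attack_distribution(coverage: List[float]) -> List[float]:
    """Attacker hits a least-covered target, breaking ties uniformly at random."""
    lowest = min(coverage)
    tied = [q - lowest <= TOL for q in coverage]
    return [1.0 / sum(tied) if is_tied else 0.0 for is_tied in tied]


def defender_utility(i: int, coverage: List[float], v: float, c: float) -> float:
    """Expected utility of defender i under the assumed baseline payoffs."""
    p_i = attack_distribution(coverage)[i]
    q_i = coverage[i]
    # If attacked: pay the coverage cost with probability q_i,
    # lose the target value v with probability 1 - q_i.
    u_attacked = -(1.0 - q_i) * v - q_i * c
    # If not attacked: only the expected coverage cost is incurred.
    u_not_attacked = -q_i * c
    return p_i * u_attacked + (1.0 - p_i) * u_not_attacked
```

For instance, `defender_utility(0, [0.5, 0.5], v=1.0, c=2.0)` evaluates a defender who shares the attack risk equally with one other defender.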
3.2 The Multi-Target Model
Our key conceptual departure from related work is in allowing each defender to protect multiple targets, aligning it better with practical security domains. Specifically, suppose that there are defenders, each protecting targets. Then the strategy of defender will be a vector . The strategy profile of the attacker can then be described as a matrix of probabilities,
in which and for each and . The expected utility of a defender in this model is
where is the utility of target to defender if it is attacked, and is the utility of target to if it is not attacked. Using the notation introduced earlier, we have
3.3 The General Model
Generalizing further, we assume that if the attacker chooses to attack a target protected by defender and the defender chooses to protect the target, then the utility of the target to defender is , and if the attacker attacks the target but it is not protected, then the utility of the target to the defender is . It is reasonable to assume that . If the target of defender is not attacked, then we assume that the utility of the target for defender is . Other assumptions are the same as those in the multi-target model. In the general model, therefore,
3.4 The Interdependent Model
The previous three models featured three important restrictions: first, that target values are homogeneous, second, that targets are independent, and third, that defenders protect the same number of targets. We now relax these restrictions. Suppose that a defender can choose from a finite set of security configurations for each target , where is the set of targets under ’s direct influence. Let be the set of all targets, that is, , with . A configuration for target incurs a cost to the defender . If the attacker attacks a target while configuration is in place, the expected value to a defender is denoted by , while the attacker’s value is . We assume in this model that each player’s utility depends only on the target attacked and its security configuration Kiekintveld09:Computing,Letchford12. We denote by the probability that the defender chooses at target .
While the problem we study assumes that the utility of any player for a given target depends only on its security configuration , there is a rather natural way to model interdependencies while retaining this structure, proposed by Letchford12. Specifically, suppose that dependencies between targets are represented by a graph , with the set of targets (nodes) as above, and the set of edges (), where an edge from to means that a successful attack on may have impact on . Each target has associated with it a value, , for the defender , which is the loss to if is affected (e.g., compromised, broken). The security configuration determines the probability that target is affected if the attacker attacks it directly and the defense configuration is . We model the interdependencies between the nodes as independent cascade contagion Kempe03,Letchford12. The contagion starts at an attacked node , affecting its network neighbors each with probability ; it then spreads from the newly affected nodes to their neighbors, and so on. The contagion can occur only once along any network edge, and once a node is affected it stays affected throughout the diffusion process. Each player’s valuation for each target is then updated based on the probability of a failure cascading to one of the player’s owned targets.
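The independent cascade process described above can be sketched as a simple simulation. This is a minimal illustration, not the paper's implementation: the graph is a hypothetical adjacency dictionary, `edge_prob` is a hypothetical map from directed edges to transmission probabilities, and the attacked node is assumed to be already affected (the dependence on the security configuration is left out).

```python
import random
from typing import Dict, Hashable, List, Set, Tuple

Node = Hashable


def simulate_cascade(
    graph: Dict[Node, List[Node]],
    source: Node,
    edge_prob: Dict[Tuple[Node, Node], float],
    rng: random.Random,
) -> Set[Node]:
    """Run one independent cascade from `source`; return the affected nodes."""
    affected = {source}
    frontier = [source]
    while frontier:
        newly_affected = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor in affected:
                    continue  # affected nodes stay affected
                # Each node enters the frontier once, so each edge is tried at most once.
                if rng.random() < edge_prob.get((node, neighbor), 0.0):
                    affected.add(neighbor)
                    newly_affected.append(neighbor)
        frontier = newly_affected
    return affected


def affected_frequencies(
    graph: Dict[Node, List[Node]],
    source: Node,
    edge_prob: Dict[Tuple[Node, Node], float],
    trials: int = 2000,
    seed: int = 0,
) -> Dict[Node, float]:
    """Monte Carlo estimate of how often each node is affected."""
    rng = random.Random(seed)
    counts: Dict[Node, int] = {}
    for _ in range(trials):
        for node in simulate_cascade(graph, source, edge_prob, rng):
            counts[node] = counts.get(node, 0) + 1
    return {node: hits / trials for node, hits in counts.items()}
```

Weighting the estimated frequencies by a defender's values for her own targets would give one way to approximate her expected loss from an attack on `source`.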
3.5 The Weakness of Strong Stackelberg Equilibrium
By far the most important solution concept in Stackelberg security games is a Strong Stackelberg equilibrium (SSE). A SSE is characterized by an assumption that the attacker breaks ties in defender’s favor. When there is a single defender, this is well defined, and quite reasonable when the defender can commit to a mixed strategy: a slight adjustment in the defense policy will force the attacker to strictly prefer the desired option, with little loss to the defender. As we now illustrate, however, SSE is fundamentally problematic in a multi-defender context, because the notion of “breaking ties in defender’s favor” is no longer well defined in general, as we must specify which defender will receive the favor.
To see concretely what goes wrong, consider the example in Figure 1. In this example there are two defenders, one who defends the target on the left, while the other defends the target on the right. Both defenders value their respective targets at , and have no value for the counterpart’s target. The cost of defending each target is . Now, consider a strategy profile in which for both targets , and let us focus on the best response of the first (left) defender. If this defender attempts to compute an SSE by fixing the strategy of the second player, he perceives his utility under the current strategy profile to be , since he would assume that the attacker breaks ties in his favor and, thus, attacks the defender on the right. By the same logic, the defender on the right will assume that the attacker will attack his counterpart, and perceive to be the best response. Since the attacker actually attacks one of them, the best response of the defender being attacked is to defend with a small probability, pushing the attacker towards the other target. What goes wrong here is that both players assume that the attacker attacks the other (breaks ties in their favor), which is inconsistent with the assumption that the attacker will certainly attack some target. (Footnote 1: The problem we observe is similar to the issue of inconsistent conjectures the leaders in a multi-leader Stackelberg game could have about follower behavior noted by Kulkarni and Shanbhag [Kulkarni and Shanbhag, 2014]. The solution we propose below, ASE, has the effect of imposing a shared constraint, the idea introduced by Kulkarni and Shanbhag generically. We note, however, that our concern here is not merely the fact that inconsistent conjectures lead to disequilibrium; rather, they could lead to nonsensical equilibria!)
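The inconsistency can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the example of Figure 1: the coverage value 0.3 is hypothetical, and `favorable_conjecture` encodes the belief of a defender who assumes that every tie is broken away from her own target.

```python
from typing import List

TOL = 1e-12  # numerical tolerance for detecting ties (an implementation choice)


def uniform_tie_break(coverage: List[float]) -> List[float]:
    """Attack probabilities when ties among least-covered targets are split evenly."""
    lowest = min(coverage)
    tied = [q - lowest <= TOL for q in coverage]
    return [1.0 / sum(tied) if is_tied else 0.0 for is_tied in tied]


def favorable_conjecture(i: int, coverage: List[float]) -> float:
    """Attack probability defender i expects if ties are always broken in her favor."""
    others = [q for j, q in enumerate(coverage) if j != i]
    if not others:
        return 1.0  # a lone target is always attacked
    # She expects to be attacked only when her target is strictly the least covered.
    return 1.0 if coverage[i] < min(others) - TOL else 0.0


coverage = [0.3, 0.3]  # hypothetical symmetric profile with two defenders
conjectured_total = sum(favorable_conjecture(i, coverage) for i in range(2))
uniform_total = sum(uniform_tie_break(coverage))
```

Under the favorable conjecture the beliefs about the attack add up to `conjectured_total`, which is zero here even though some target is certainly attacked; a shared tie-breaking rule such as the uniform one keeps the total at one.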
3.6 Solution Concepts
Since the classic (two-player) SSE solution concept used in Stackelberg security games does not extend conceptually to an individual defender’s best-response problem in the multi-defender setting, we need to consider an alternative. One option is to compute an arbitrary subgame perfect equilibrium. However, we wish to impose a natural constraint on the solution concept: the attacker’s best response must be computed consistently for any joint defense policy, just as it is in an SSE (in other words, we wish to fix a tie-breaking rule). One natural tie-breaking rule is that the attacker chooses a target uniformly at random from the set of all best responses. We call the corresponding solution concept (which is a refinement of the subgame perfect equilibrium of our game) the Average-case Stackelberg Equilibrium (ASE). The crucial property of this solution concept that we desire is that the attacker’s behavior presumed by a defender’s best response problem is independent of that defender’s identity, a property that SSE violates. As we demonstrate below, ASE is not guaranteed to exist, in which case we focus on $\epsilon$-equilibria, in which no defender gains more than $\epsilon$ by deviating; in particular, we will consider $\epsilon$-equilibria with the smallest attainable $\epsilon$.
To measure how the efficiency of the game degrades due to selfish behavior of the defenders, we consider Utilitarian Social Welfare and $\epsilon$-Price of Anarchy in our paper. Utilitarian Social Welfare is the sum of all defenders’ payoffs. For the smallest attainable $\epsilon$, we define $\epsilon$-Price of Anarchy ($\epsilon$-PoA) as follows:
$$\epsilon\text{-PoA} = \frac{SW_O}{SW_E},$$
where $SW_O$ is the optimal (utilitarian) social welfare that can be obtained (i.e., if there was a single defender), and $SW_E$ is the worst-case (utilitarian) social welfare in $\epsilon$-equilibrium. An underlying assumption of this definition is that the values of $SW_O$ and $SW_E$ are both positive. If they are both negative, then $\epsilon$-PoA will be the reciprocal of the above expression. Note that the ordinary Price of Anarchy is a special case of $\epsilon$-Price of Anarchy with $\epsilon = 0$.
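The sign convention in this definition is easy to get wrong, so a small helper may be useful. This is only a sketch: the function name and arguments are hypothetical, and the ratio is taken as optimal welfare over equilibrium welfare when both are positive and as the reciprocal when both are negative, as described above.

```python
def epsilon_poa(optimal_welfare: float, equilibrium_welfare: float) -> float:
    """Price of anarchy ratio with the sign handling described in the text."""
    if optimal_welfare > 0 and equilibrium_welfare > 0:
        # Positive welfare: optimum over (worst-case) equilibrium.
        return optimal_welfare / equilibrium_welfare
    if optimal_welfare < 0 and equilibrium_welfare < 0:
        # Negative welfare (losses): use the reciprocal so the ratio is at least 1.
        return equilibrium_welfare / optimal_welfare
    raise ValueError("ratio is only defined here when both values share a strict sign")
```

With losses of 1 at the optimum and 4 in equilibrium, `epsilon_poa(-1.0, -4.0)` gives a ratio of 4.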
4 Equilibrium Analysis of Independent Multi-Defender Security Games
In this section, we consider scenarios in which the values of the targets are independent and homogeneous among the defenders. Our equilibrium and Price of Anarchy analysis will show that a Nash equilibrium among defenders in the two-stage game model (equivalently, ASE) need not always exist, even when the defenders utilize randomized strategies (i.e., probability distributions over target protection levels). For cases when there is no Nash equilibrium, we make use of approximate Nash (ASE) equilibrium and the associated $\epsilon$-Price of Anarchy.
4.1 The Baseline Model
Our first result presents necessary and sufficient conditions for the existence of a Nash equilibrium among defenders in the baseline model, and characterizes it when it does exist.
In the Baseline model, Nash equilibrium exists if and only if . In this equilibrium all targets are protected with probability 1.
Firstly, we claim that Nash equilibrium among defenders can appear only if all targets have the same coverage probability . Otherwise, some defender who has probability 0 of being attacked has the incentive to decrease her . To find the Nash equilibria, we need only consider strategy profiles in which all targets have the same coverage probability.
When all defenders use the same coverage probability , each defender’s expected utility is
If , some defender could increase to , where is a small positive real number, to avoid being attacked, and receive utility , so that
As can be arbitrarily small, when . Consequently, when , a defender always has an incentive to deviate, which implies that the only possible Nash equilibrium can be for all players to play .
When all defenders use coverage probability , each earns an expected utility of
If a defender decreases her coverage probability to , then she will be attacked with probability 1, and receive expected utility , so that
If , then , and is indeed a Nash equilibrium. If , however, , which implies that a Nash equilibrium does not exist. ∎
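The deviation argument in the proof can be checked numerically. The sketch below rests on assumptions that are not recoverable from the text as extracted: `v` denotes the common target value, `c` the coverage cost, `n` the number of defenders, and the payoffs are 0 for a covered attacked target and a loss of `v` for an uncovered one. It evaluates unilateral deviations from a symmetric profile on a grid.

```python
from typing import List


def utility(i: int, coverage: List[float], v: float, c: float) -> float:
    """Defender i's expected utility; the attacker splits ties uniformly."""
    lowest = min(coverage)
    tied = [j for j, q in enumerate(coverage) if q == lowest]
    p_i = 1.0 / len(tied) if i in tied else 0.0
    # Lose v if attacked while uncovered; always pay the expected coverage cost.
    return -p_i * (1.0 - coverage[i]) * v - coverage[i] * c


def best_deviation_gain(n: int, q: float, v: float, c: float, steps: int = 200) -> float:
    """Largest gain defender 0 finds on a grid when all others keep coverage q."""
    base = utility(0, [q] * n, v, c)
    best = 0.0
    for k in range(steps + 1):
        deviation = k / steps
        gain = utility(0, [deviation] + [q] * (n - 1), v, c) - base
        best = max(best, gain)
    return best
```

Under these assumptions, full coverage admits no profitable deviation on the grid when the value exceeds the cost (for example `best_deviation_gain(3, 1.0, v=3.0, c=2.0)`), while a lower-value target (`v=1.0`) makes dropping coverage attractive.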
Thus, if a Nash equilibrium does exist, it is unique, with all defenders always protecting their targets. But what if the equilibrium does not exist? Next, we characterize the (unique) $\epsilon$-equilibrium with the minimal $\epsilon$ that arises in such a case. We will use this approximate equilibrium strategy profile as a prediction of the defenders’ strategies.
In the Baseline model, if , the optimal -equilibrium is for all defenders to cover their target with probability . The corresponding is .
We firstly consider strategy profiles in which all targets have the same probability of being protected. Then each defender’s expected utility is
Suppose . If a defender slightly increases to , she could receive a utility , with
Suppose . If a defender slightly decreases to , she could receive utility , with
Let , . If , a defender could deviate to increase utility by at most . If , a defender could deviate to increase utility by at most . When , we have that , and it is -equilibrium. When , we have that , and it is -equilibrium.
Putting everything together, there is an -equilibrium with
Moreover, gives us the unique minimal when all defenders use the same coverage probabilities.
We now claim that the -equilibrium could only exist when all defenders play an identical coverage probability. Suppose defenders use different coverage probabilities. Then there are defenders for who have the same minimal probability of protecting their targets. The expected utility for each defender among these defenders is:
When , some defender among these defenders could decrease her probability to to get a utility of , with
When , some defender among these defenders could increase her probability to to get utility with
where the first inequality follows because can be made arbitrarily small. ∎
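The trade-off between the two kinds of deviation can be explored with a grid search over symmetric profiles. This is a sketch under assumed notation and payoffs (`v` for the target value, `c` for the coverage cost, `n` for the number of defenders, a loss of `v` only when an uncovered target is attacked); the closed-form expressions of the proof are not reproduced here, and the deviation gains below are derived from those assumptions alone.

```python
from typing import Tuple


def max_deviation_gain(q: float, n: int, v: float, c: float) -> float:
    """Supremum of a single defender's gain from leaving the symmetric profile q."""
    base = -(1.0 - q) * v / n - q * c  # attacked with probability 1/n
    # Raising coverage by a vanishing amount diverts the attacker entirely.
    gain_up = (-q * c - base) if q < 1.0 else 0.0
    # Lowering coverage draws the attack for sure; utility is linear in the new
    # coverage, and dropping to zero is the only candidate that can beat the base.
    gain_down = (-v - base) if q > 0.0 else 0.0
    return max(0.0, gain_up, gain_down)


def best_symmetric_profile(n: int, v: float, c: float, steps: int = 1000) -> Tuple[float, float]:
    """Grid search for the symmetric coverage with the smallest deviation gain."""
    candidates = [k / steps for k in range(steps + 1)]
    q_star = min(candidates, key=lambda q: max_deviation_gain(q, n, v, c))
    return q_star, max_deviation_gain(q_star, n, v, c)
```

Calling `best_symmetric_profile(4, v=1.0, c=2.0)` returns the grid point at which the incentive to raise coverage and the incentive to drop it balance.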
Armed with a complete characterization of predictions of strategic behavior among the defenders, we can now consider how this behavior relates to socially optimal protection decisions. Since the solutions are unique, there is no distinction between the notions of price of anarchy and price of stability; we term the ratio of socially optimal welfare to welfare in equilibrium the price of anarchy for convenience.
First, we characterize the socially optimal outcome.
In the Baseline model, the optimal social welfare is
We firstly claim that we could get optimal social welfare only if all defenders use the same coverage probability . If their coverage probabilities are different, and some defender has probability of being attacked, we could decrease to improve social welfare. Therefore we need only to consider identical coverage probabilities in determining optimal social welfare. Welfare, as a function of symmetric coverage is
When , is optimal, whereas is optimal otherwise, giving the desired result. ∎
From this result, it is already clear that defenders systematically over-invest in security, except when values of the targets are quite high. This stems from the fact that the attacker creates a negative externality of protection: if a defender protects his target with higher probability than others, the attacker will have an incentive to attack another defender. In such a case, we can expect a “dynamic” adjustment process with defenders increasing their security investment well beyond what is socially optimal. To see just how much the defenders lose in the process, we now characterize the price of anarchy of our game.
If , social welfare in the unique equilibrium with is
The associated PoA is then
Figure 2 shows the relationship among PoA, the number of defenders, and the ratio . From the figure we can see that when number of defenders and are small (e.g. 5 and ), the price of anarchy is close to . Otherwise, PoA is unbounded, growing linearly with .
When , there is no Nash equilibrium. However, the optimal -equilibrium features all defenders with the same coverage probability for their targets. The corresponding Social Welfare is
and the associated -PoA is , which is, again, linear in .
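The linear growth of the price of anarchy with the number of defenders can be reproduced in a few lines. As before, this is a sketch under assumed notation and payoffs (`v` for the target value, `c` for the coverage cost, `n` for the number of defenders), covering only the case in which full coverage is taken as the equilibrium; welfare is negative here, so the ratio is equilibrium welfare over optimal welfare.

```python
def symmetric_welfare(q: float, n: int, v: float, c: float) -> float:
    """Sum of defender utilities when every target is covered with probability q."""
    per_defender = -(1.0 - q) * v / n - q * c  # each is attacked with probability 1/n
    return n * per_defender


def baseline_poa(n: int, v: float, c: float) -> float:
    """PoA when all defenders fully cover (assumes v >= c so this is an equilibrium)."""
    if v < c:
        raise ValueError("this sketch only covers the case v >= c")
    equilibrium = symmetric_welfare(1.0, n, v, c)
    # Welfare is linear in q, so the symmetric optimum sits at an endpoint.
    optimum = max(symmetric_welfare(0.0, n, v, c), symmetric_welfare(1.0, n, v, c))
    return equilibrium / optimum  # both values are negative


ratios = [baseline_poa(n, v=3.0, c=2.0) for n in (2, 4, 8)]
```

With these hypothetical values the entries of `ratios` double each time `n` doubles, while a sufficiently valuable target (for example `baseline_poa(2, v=10.0, c=2.0)`) makes full coverage socially optimal as well.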
4.2 The Multi-Target Model
Armed with observations from the model with a single target for each defender, we now extend the model to a case not as yet considered in the literature in a theoretical light: each defender protects a set of targets. This gives rise to a combinatorial set of possible decisions for each defender, so that even computing a best response is not necessarily easy. Remarkably, we are able to characterize equilibria and approximate equilibria in this setting as well. The proofs for this subsection are in the appendix.
Our first result is almost a mirror-image of the corresponding result in the baseline model: when a Nash equilibrium exists, all defenders protect all of their targets with probability 1.
In the Multi-Target model, Nash equilibrium exists if and only if . In this equilibrium all targets are protected with probability 1.
Next, we consider scenarios when , in which there is no Nash equilibrium. Our next result characterizes optimal (lowest-) approximate Nash equilibria.
In the Multi-Target model, if , then in the optimal -equilibrium all targets are protected with probability . The corresponding is .
Thus, as increases, the optimal approximate equilibrium approaches a Nash equilibrium. Figure 3 illustrates the relationship between and the number of targets each defender protects when and . In this figure, when , which means that an exact Nash equilibrium exists; increases with when , but at a decreasing rate, converging to as .
Finally, we characterize socially optimal welfare, and, subsequently, put everything together in describing the price of anarchy.
In the Multi-Target model, the optimal social welfare is
Thus, just as in the baseline case, the defenders will generally over-invest in security.
If , there is a unique Nash equilibrium with all targets protected with probability 1. The corresponding social welfare is
Because it is the only Nash equilibrium, the Price of Anarchy is
If , there is no Nash equilibrium. The optimal approximate equilibrium features identical coverage probability of for all targets. The corresponding Social Welfare is
and the associated -Price of Anarchy is . Clearly, in either case, and just as in the baseline model, the price of anarchy is unbounded, growing linearly with .
We now consider how PoA changes as a function of , i.e. the number of targets each defender has. When , a Nash equilibrium exists and the PoA is ; when , PoA increases linearly in with slope . However, when , a Nash equilibrium does not exist and the approximate PoA is , which increases very slowly with , and is bounded by when . Figure 4 illustrates the relationship between (approximate) Price of Anarchy and for . When is very small, PoA = 1. For intermediate , PoA increases linearly, and when is sufficiently large, Nash equilibrium no longer exists, and -PoA increases quite slowly, converging to 3 when .
4.3 The General Model
Both the baseline and the multi-target models made rather strong assumptions about the structure of the utility functions of the players. In the general model, we relax these assumptions, allowing for arbitrary utilities for the players when the target is attacked or not, and when it is protected or not (when attacked). Quite surprisingly, our findings here are qualitatively different: the special case of the baseline and multi-target models turns out to be an exception, rather than the rule when more general models are considered.
Just as before, we start by characterizing Nash and approximate Nash equilibria.
In the General model, Nash equilibrium exists if and only if . In this equilibrium all targets are protected with probability 1.
We first claim that a Nash equilibrium can exist only if the coverage probabilities of all targets are identical. Otherwise, there will be a target which has probability 0 of being attacked, and the defender has an incentive to decrease . To determine a Nash equilibrium, we therefore need only consider scenarios in which all targets have the same coverage probability.
When all targets have the same coverage probability to be protected, the utility of each defender is
If , then some defender could increase to for all of her targets to ensure none of them are attacked, and obtain utility of , so that
As , , and can be arbitrarily small, when , which means that this cannot be a Nash equilibrium. Thus, the only possible equilibrium can be for all targets .
When all targets have the same coverage probability , each defender’s utility is
We claim that if a defender has an incentive to deviate, it is optimal for this defender to use the same coverage probability for all her targets. Otherwise, for some target which has probability of being attacked, she could decrease to obtain higher utility. If probabilities of targets protected by defender are all , then her expected utility is , and
We therefore have two cases:
If , then , and for all targets is a Nash equilibrium.
If , the maximal value of corresponds to :
If , , it is a Nash equilibrium; otherwise, it is not.
To sum up, a Nash equilibrium exists if and only if , and the equilibrium corresponds to all targets having probability 1 of being protected. ∎
Next, we characterize the optimal approximate equilibrium when no Nash equilibrium exists.
In the General model, in the optimal -equilibrium all targets are protected with probability . The corresponding is .
When all targets have the same coverage probability , the expected utility of each defender is
Suppose . If some defender increases to for target , then she would obtain utility , and
Now we consider scenarios in which a defender could obtain higher utility by decreasing protection probability. We claim that if a defender has an incentive to deviate, it is optimal for this defender to use the same coverage probability for all her targets. Otherwise, for some target which has probability of being attacked, she could decrease to obtain higher utility. Thus, we need only consider cases in which a defender deviates by decreasing coverage probabilities for all her targets to . Her utility will become . Since , (the maximal value of ) maximizes :
When , we get the minimal .
We claim that the -equilibrium can appear only if all targets have the same coverage probability . We prove this by contradiction. Suppose that targets have different coverage probabilities. This gives rise to two cases: Each defender uses an identical coverage probability for each target she owns (these may differ between defenders); and Some defender has different coverage probabilities for her targets. In case , there exist defenders () who have the same minimal coverage probability . The expected utility for each defender among these is
When , some defender among these could decrease the coverage probability of all her targets to and obtain the utility of , so that
When , some defender among these can increase coverage probabilities of all her targets to to obtain utility of , with
where the inequality holds because can be arbitrarily small. Thus, no profile in case can be a -equilibrium. In case , any defender who has different coverage probabilities among her targets can always increase her payoff by decreasing the coverage probabilities of the targets with higher coverage to yield identical coverage for all targets. Consequently, no profile in case can be a -equilibrium. ∎
As the final step towards characterizing the Price of Anarchy, we derive optimal social welfare in this model.
In the General model, the optimal social welfare is
We first claim that optimal social welfare can be attained only if all targets have the same coverage probability . Otherwise, some target has probability of being attacked, and we can decrease to improve social welfare. Consequently, we need only consider an optimal symmetric coverage probability to maximize social welfare, which can be done in a manner similar to that for the baseline case. ∎
If , the Nash equilibrium is unique, with all targets protected with probability 1. The corresponding social welfare is
So far we have not added any constraints on the values of , , and (except that ). To make the Price of Anarchy well-defined, we need to require that the values of , , and are all non-positive (just as in the previous two models) or all non-negative. To be consistent with the previous models, we require that , and are all non-positive (little changes if all are non-negative).
In the case of a unique Nash equilibrium, the price of anarchy is
If , there is no Nash equilibrium. The Social Welfare in the optimal approximate equilibrium is
and the -Price of Anarchy is .
We now analyze the relationship between (-)PoA and the values of and . The key differences from the Multi-Target Model are as follows. First, we consider (-)PoA as a function of . If , the result is the same as in the Multi-Target Model: (-)PoA increases linearly in , and is therefore unbounded. However, if , while PoA and -PoA are increasing in , as , they approach and , respectively. In other words, PoA (exact and approximate) is bounded by a constant, for a constant !
Consider now the approximate price of anarchy as a function of . If , it is bounded by . However, if , when , it is an increasing function of . When , it may at first increase or decrease in , depending on the values of the model parameters. However, when is large enough, the price of anarchy will invariably be decreasing in , and as , -PoA . Figure 5 provides an example of the relationship between -PoA and . Observe that all the curves begin to decrease when , and they all approach 1 as . Thus, the price of anarchy in the general model is only unbounded in the special case when , whereas when , the price of anarchy is always bounded by a constant. This observation is particularly surprising and significant considering the fact that the baseline and simplified multi-target models are quite natural, and seemingly innocuous, restrictions of the general case.
5 Analysis of Interdependent Multi-defender Security Games
We now develop and analyze a computational framework for approximating Nash equilibria in interdependent multi-defender security games. A crucial step in computing (or approximating) a Nash equilibrium of a game is to consider the problem of computing a best response for an arbitrary player (in our case, a defender, since the attacker’s best response is straightforward). Next, we develop a novel mixed-integer linear programming formulation for computing an ASE best response, and then propose a highly effective heuristic method for approximating ASE in multi-defender games.
5.1 Computing Defender Best Response: A Mixed-Integer Linear Programming Formulation
While ASE seems a very natural alternative to SSE even in two-player security games, we are not aware of any proposals for computing it. Below, in equations 3-13, we present the first (to our knowledge) mixed-integer linear programming formulation for computing ASE which, in our case, would compute a best response for an arbitrary defender when the strategies of all other players, , are fixed.
where is a very large number and
While constraint 13 is non-linear, we can linearize it using McCormick inequalities. Constraints 4 and 5 ensure that the defender’s strategy is a valid probability distribution. Constraint 7 ensures that at least one target is chosen by the attacker. Constraints 8 and 9 compute the optimal attacker utility ; alone, they ensure that this utility corresponds to some attack target. Constraints 10 and 11 compute an auxiliary variable , which is if and only if attacking a target yields an optimal utility to the attacker. These variables, together with constraints 12 and 8-9, ensure that the binary variable if and only if the attacker (weakly) prefers to attack target ; that is, these jointly compute the set of optimal attack targets. Finally, constraint 13 computes the expected utility to the defender if the attacker chooses one of his most preferred targets uniformly at random.
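As a hedged illustration of the linearization step (constraint 13 itself is not reproduced here, so every symbol below is an assumption rather than the paper's notation): suppose the non-linear term is a product $w = a\,x$ of a binary variable $a \in \{0,1\}$ and a continuous variable $x$ with known bounds $x^{L} \le x \le x^{U}$. The McCormick inequalities then replace the product with four linear constraints, which are exact when $a$ is binary:

```latex
\begin{align*}
  w &\ge x^{L}\, a,               & w &\le x^{U}\, a, \\
  w &\ge x - x^{U}\,(1 - a),      & w &\le x - x^{L}\,(1 - a).
\end{align*}
% a = 0: the first row forces w = 0.
% a = 1: the second row forces w = x.
```

Whether the products arising in constraint 13 all take this binary-times-bounded-continuous form is an assumption of this sketch.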
If is infinite and numbers can be computed to arbitrary precision, the above formulation is correct. In practice, of course, numerical precision and stability are an issue, and they arise with this formulation. Consider constraints 10 and 11, which compute for all targets . These require that the expected value of the target exactly equal the optimal attacker utility computed in constraints 8-9; even a slight error will technically violate our requirement that at an optimal target. Moreover, even if the difference is, indeed, non-zero, from an attacker’s perspective it seems intuitive that sufficiently small differences from optimal utility are ignored. We address these problems by adding a fixed small quantity to the right-hand-side of constraints 8, 9, while subtracting it from the right-hand-side of constraint 12. Because the constraints are interrelated, we cannot simply choose an arbitrary , but must ensure that and satisfy the constraint that .
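The effect of the tolerance can be sketched outside the MILP. The snippet below is a minimal illustration, not the formulation above: given attacker and defender utilities per target (for a fixed coverage profile), it treats every target whose attacker utility is within a small tolerance of the best as optimal, and averages the defender's utility over that set. The function name `ase_defender_utility` and the parameter `tol` are hypothetical.

```python
def ase_defender_utility(attacker_utils, defender_utils, tol=1e-6):
    """Return (optimal_targets, expected_defender_utility).

    Assumption of this sketch: the attacker attacks uniformly at random
    among all targets whose utility is within `tol` of his best utility.
    """
    best = max(attacker_utils)
    # Targets the attacker (weakly, up to tol) prefers.
    optimal = [t for t, u in enumerate(attacker_utils) if u >= best - tol]
    # Defender's expected utility under a uniform choice among them.
    expected = sum(defender_utils[t] for t in optimal) / len(optimal)
    return optimal, expected


if __name__ == "__main__":
    # Target 1 is 1e-9 worse than target 0: treated as tied when tol=1e-6.
    print(ase_defender_utility([1.0, 1.0 - 1e-9, 0.5], [-2.0, -4.0, 0.0]))
    # With zero tolerance only target 0 counts as optimal.
    print(ase_defender_utility([1.0, 1.0 - 1e-9, 0.5], [-2.0, -4.0, 0.0], tol=0.0))
```

With zero tolerance, a difference at the level of rounding error changes the defender's computed utility discontinuously, which is the instability the added slack is meant to absorb.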
5.2 Approximating ASE
Previously, Vorobeychik and Wellman [2008] presented a convergent equilibrium approximation algorithm based on simulated annealing (SA) that would be applicable in our setting. They additionally showed in simulation that SA is actually outperformed by a simple heuristic based on iterated best response (IBR) dynamics. Here, we interpret IBR as a local search heuristic, with the property that if the starting point is a Nash equilibrium, IBR will never deviate from it (i.e., Nash equilibrium is a fixed point). Clearly, then, the choice of a starting point can be significant for the performance of IBR, making it natural to consider coupling it with random restarts. Our main contribution in this section is to present evidence that IBR with random restarts is a highly effective equilibrium approximation approach in our setting (and outperforms several alternatives). This is both of broad significance, and of particular importance in our setting, as we use this algorithm for our analyses below.
We compare the following Nash equilibrium approximation algorithms executed for 1000 iterations: random search (RS), which simply generates 1000 strategy profiles randomly, computes the game theoretic regret of each, and chooses a profile with the smallest regret; simulated annealing (SA), with the temperature exponentially increasing with iterations; and iterated best response (IBR) with no restarts. We also include in the comparison two additional variations of IBR: the first uses SA for the first 100 iterations, and then switches to IBR for the remainder (starting with the best approximation produced by SA); the second is IBR with random restarts, which we term RIBR. RIBR includes initial corner cases that may be hard to converge to in a limited amount of time (i.e., all defenders not defending, all defenders defending completely). We execute our comparison on games with 2 players and 10 targets and games with 5 players and 20 targets. In all cases, targets are divided evenly among the players, and values over the targets are generated uniformly at random. The cost of defense is fixed at , and the targets are assumed to be independent (but players may have values for targets under the control of other defenders). Figure 6 demonstrates that in both settings, RIBR outperforms other alternatives.
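The following is a minimal sketch of iterated best response with restarts on a small discretized game, intended only to make the search loop and the regret measure concrete. It is not the implementation used in the experiments. The toy game (one target per defender, attacker hitting a least-covered target), the constants `COST` and `VALUE`, the coverage grid, and all function names are hypothetical. As in the description above, the two corner profiles are tried before the random starting points.

```python
import random

COST, VALUE = 0.2, 1.0               # hypothetical toy parameters
GRID = (0.0, 0.25, 0.5, 0.75, 1.0)   # discretized coverage levels


def toy_utility(i, profile):
    """Defender i owns target i; the attacker hits a least-covered target."""
    low = min(profile)
    hit = [j for j, c in enumerate(profile) if c == low]
    p_attack = 1.0 / len(hit) if i in hit else 0.0
    return -COST * profile[i] - p_attack * (1.0 - profile[i]) * VALUE


def deviate(profile, i, s):
    """Profile in which player i switches to strategy s."""
    return profile[:i] + (s,) + profile[i + 1:]


def regret(profile, utility, strategies):
    """Largest gain any single player can obtain by deviating unilaterally."""
    return max(utility(i, deviate(profile, i, s)) - utility(i, profile)
               for i in range(len(profile)) for s in strategies)


def ribr(n, utility, strategies, restarts=10, steps=50, seed=0):
    """Iterated best response from corner and random starts; keeps lowest regret."""
    rng = random.Random(seed)
    starts = [(strategies[0],) * n, (strategies[-1],) * n]   # corner cases
    starts += [tuple(rng.choice(strategies) for _ in range(n))
               for _ in range(restarts)]
    best, best_regret = None, float("inf")
    for profile in starts:
        for _ in range(steps):
            r = regret(profile, utility, strategies)
            if r < best_regret:
                best, best_regret = profile, r
            if r <= 1e-12:            # fixed point: a Nash equilibrium on the grid
                return best, best_regret
            for i in range(n):        # one sweep: each player best-responds in turn
                s = max(strategies, key=lambda x: utility(i, deviate(profile, i, x)))
                profile = deviate(profile, i, s)
    return best, best_regret


if __name__ == "__main__":
    print(ribr(2, toy_utility, GRID))
```

In this toy instance, best responses started from the no-defense corner ratchet coverage upward until both defenders protect fully, at which point no unilateral deviation helps and the regret is zero.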
5.3 Analysis of Multi-Defender Games on Synthetic Networks
For our first set of experiments, we use RIBR on three types of artificially generated networks, with samples for each parameter variation, and compare the results of our interdependent multi-defender game across them. We use three commonly analyzed network structures: a grid, Erdős-Rényi networks, and preferential attachment networks. In all of the generated networks, there are nodes or targets. For the latter two, we use the Metis graph partitioning software to partition the nodes (targets) among defenders. This software partitions nodes to minimize connectivity among the targets belonging to different defenders, a property that we expect to commonly hold in real networks due to efficiency considerations.
We begin by considering average strategies, as well as social welfare, for the three different synthetic networks (grid, Erdős-Rényi, and preferential attachment), as a function of the number of players (degree of decentralization) and the cascade probability (interdependent risk). The results are shown in Figure 7. We do not show these as a function of defense cost as increasing defense cost roughly mirrors decreasing cascade probability . The first rather stark observation is that network structure makes little difference when each node is controlled by a single player, but it makes a significant qualitative difference both for social welfare and actual strategies utilized by the players in all other cases.
Looking at the results in greater detail, let us first consider social welfare (Figure 7, top). First, when interdependent risk is low (), social welfare follows a relatively simple pattern: increasing decentralization makes initially almost no difference, until sufficiently many players are involved, at which point social welfare falls rather dramatically; this pattern is roughly monotonic with increasing decentralization, with worst outcomes emerging when each player controls a single node, and mirrors previous findings [Vorobeychik et al. 2011]. Both Erdős-Rényi and preferential attachment networks are less susceptible to the negative effects of decentralization in this case than the grid network, where the dropoff occurs with fewer players (less decentralization). This may be largely a consequence of the fact that the network partitioning tools we use attempt to minimize interdependence among players (something that is likely to mirror reality), and far more opportunities for doing so exist in Erdős-Rényi and preferential attachment models.
When is higher (greater interdependencies), the results exhibit an entirely new phenomenology. Across all three network models, for a sufficiently large , the impact of decentralization is non-monotonic: an intermediate level of decentralization has the most detrimental impact on security, while a highly decentralized system becomes near-optimal!
Figure panel labels (images not reproduced): Power Network 1, Power Network 2, Power Network 3; 1 player, 4 players, 64 players.
Investigating actual (average) strategic decisions by the players yields deeper insights into the findings above. When interdependencies are weak, the optimal decision is to invest relatively little in security, in any generative model. Increasing decentralization, therefore, gives rise primarily to over-investment, mirroring our analytical results for the limiting case when targets were independent, although the tendency to over-invest is quite weak until the network is extremely decentralized, except in the grid network. When is high, on the other hand, the predominant phenomenon is underinvestment. This in itself is not surprising: after all, a high level of interdependencies should imply that positive externalities of security are dominant. What is surprising is, again, non-monotonicity in the level of decentralization: when decentralization is moderate, underinvestment can be quite dramatic. On the other hand, a high level of decentralization often appears to dull this effect, and the level of investment in security becomes much closer to optimal.
5.4 Results on power grid networks
The grid network studied above is arguably the most “artificial”, in the sense that both Erdős-Rényi and preferential attachment models were developed in part to resemble real networks (this is particularly true of the latter, which aims to replicate the scale-free properties of observed networks). Surprisingly, however, the approximate equilibrium results applied to three snippets of actual power networks most resemble the phenomena observed for the grid, as can be seen in Figure 8. In particular, just as in the grid above, over-investment in security appears to dominate, even at relatively high levels of interdependence, but only when decentralization is significant, while most levels of decentralization are relatively robustly near-optimal (these are, in fact, more robust to decentralization than the grid network above).
To dig somewhat deeper into the rather complex phenomenology we have observed, Figure 9 shows several examples of actual strategy realizations. First, consider the top series of plots for the grid with cascade probability . As previously described, we can clearly see that the optimal security configuration involves no security investment (leftmost grid), whereas an increasing level of decentralization gives rise to increased security investment, culminating, ultimately, with full protection in the extreme level of decentralization. The contrast between the two extremes offers some guidance: even though optimal global configuration involves no security, when each player controls (and cares about) only a single node, the best response of an attacked node is to defend it just enough to force the attacker to attack another; for example, slightly more than the next weakest node. Iterating on this idea, strategies “cascade” to full defense. When the player controls more than one node, however, there is suddenly strategic tension: higher security on one node may well push the attacker to attack another node under this player’s control. Positive externalities become more significant as well: pushing the attacker to attack another node “nearby” is likely to gain little when cascade probabilities are high and multiple nodes owned by the defender could be affected. For sufficiently high cascade probabilities, and sufficiently low number of players, such positive network effects can actually sway players to under-invest in security, as we can see both in the middle and last rows of Figure 9 (the 4-player case). Here, strategic complementarities make security investment not worthwhile in equilibrium: the nodes that need to be defended are relatively central, and cut across different players (i.e., the critical central nodes create a kind of “buffer” between defenders). This behavior diminishes as decentralization increases.
5.5 Security and Network Centrality
Finally, we consider the relationship between network centrality and strategic choice of a node. The results, shown in Figure 10, plot each node’s closeness centrality and corresponding security across instances for each network. When the interdependence of networks is low, security investment appears to correlate rather strongly with closeness in synthetic networks: in other words, nodes more central in the network invest more in security. A similar relationship is seen with node degrees. This correlation, however, largely disappears with greater interdependence, likely because the difference between being one hop vs. two hops away from an attacked node (i.e., being a low-connected neighbor of a high-connected node) matters considerably less in such a case.
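For readers who want to reproduce this kind of comparison, the sketch below computes closeness centrality by breadth-first search on an unweighted graph and correlates it with per-node coverage. It is an illustrative sketch only: the adjacency-dictionary representation, the five-node path graph, the coverage values, and the function names are all hypothetical, and the closeness convention used here, (reachable nodes minus one) divided by the sum of shortest-path distances, is one common definition that may differ from the one used for Figure 10.

```python
from collections import deque


def closeness(adj):
    """Closeness centrality of every node of an unweighted graph."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        # Only nodes reachable from `source` are counted.
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    # Hypothetical path graph 0-1-2-3-4 and made-up coverage values.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    coverage = {0: 0.1, 1: 0.3, 2: 0.5, 3: 0.3, 4: 0.1}
    c = closeness(path)
    nodes = sorted(path)
    print(pearson([c[v] for v in nodes], [coverage[v] for v in nodes]))
```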
For the power networks, the correlation between centrality and strategy remains even with higher cascade probabilities. This relationship can be seen by comparing the strategy profile in the second column of Figure 9 and the closeness plot for the power network. The structure of the power network has several small “chains” that have low centrality. The node that connects the chain to the rest of the network (higher centrality) incurs most of the defense cost, since a failure cascade would have to pass through this node to reach the rest of the chain. As the amount of interdependence increases (column 3 in Figure 9), these chains become partitioned between defenders, weakening the relationship between centrality and strategy.
In this work, we have extended the current state of Stackelberg security games to include multiple defenders in non-cooperative scenarios with independent and interdependent targets.
For the independent case, we provided complete characterizations of Nash and approximate equilibria, socially optimal solutions, and price of anarchy (PoA) for three models of varying generality. Our analysis showed that defenders generally over-protect the targets, but different modelling assumptions give rise to qualitatively different outcomes: a simpler model gives rise to an unbounded PoA, whereas a more general model sees PoA converge to a constant when the number of defenders increases.
For the interdependent case, we developed a novel computational framework to overcome the difficulties of providing a concise formal analysis of such a complex model. Our simulations characterize a broad space of strategic predicaments, varying cascade probability, network structure, and system decentralization. In contrast to the independent models, our results show differing behavior in terms of security investment depending on the strength and structure of interdependencies. One of our starkest findings is the non-monotonicity of welfare and strategic choices as a function of the number of players: in a number of cases, higher levels of decentralization become near-optimal, even while intermediate decentralization leads to very poor outcomes. As security decisions are almost universally decentralized, and often highly interdependent, our findings enable a deeper understanding of practical security considerations, highlighting the importance of both over- and under-investment in security, and the dependence of each on network structure, the magnitude of network externalities, and the level of decentralization. Finally, we have shown how security behavior in our model on real-world power networks relates to that in synthetic networks, highlighting similar behaviors with grid networks, and unveiling structural differences with Erdős-Rényi and preferential attachment networks using the relationship between strategy and centrality.
- [Acemoglu et al. 2013] Acemoglu, D., Malekian, A., Ozdaglar, A. 2013. Network security and contagion. Working paper.
- [Alderson et al. 2011] Alderson, D. L., Brown, G. G., Carlyle, W. M., Wood, R. K. 2011. Solving defender-attacker-defender models for infrastructure defense. In INFORMS Computing Society Conference.
- [Bachrach et al. 2013] Bachrach, Y., Draief, M., Goyal, S. 2013. Contagion and observability in security domains. In Allerton Conference.
- [Chan et al. 2012] Chan, H., Ceyko, M., Ortiz, L. E. 2012. Interdependent defense games: Modeling interdependent security under deliberate attack. In Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, 152–162.
- [Conitzer and Sandholm 2006] Conitzer, V., Sandholm, T. 2006. Computing the optimal strategy to commit to. In Proceedings of the 7th ACM conference on Electronic commerce, EC ’06, 82–90, New York, NY, USA. ACM.
- [Cormican et al. 1998] Cormican, K. J., Morton, D. P., Wood, R. K. 1998. Stochastic network interdiction. Operations Research, 46(2), 184–197.
- [DeMiguel and Xu 2009] DeMiguel, V., Xu, H. 2009. A stochastic multiple-leader Stackelberg model: analysis, computation, and applications. Operations Research, 57(5), 1220–1235.
- [Jain et al. 2010] Jain, M., Kardes, E., Kiekintveld, C., Tambe, M., Ordonez, F. 2010. Security games with arbitrary schedules: A branch and price approach. In Twenty-Fourth National Conference on Artificial Intelligence.
- [Jain et al. 2008] Jain, M., Pita, J., Tambe, M., Ordonez, F., Paruchuri, P., Kraus, S. 2008. Bayesian Stackelberg games and their application for security at Los Angeles International Airport. SIGecom Exch., 7, 10:1–10:3.
- [Jain et al. 2010] Jain, M., Tsai, J., Pita, J., Kiekintveld, C., Rathi, S., Tambe, M., Ordonez, F. 2010. Software assistants for randomized patrol planning for the LAX airport police and the Federal Air Marshal Service. Interfaces, 40, 267–290.
- [Jiang et al. 2013] Jiang, A. X., Procaccia, A. D., Qian, Y., Shah, N., Tambe, M. 2013. Defender (mis)coordination in security games. In Twenty-Third International Joint Conference on Artificial Intelligence, 220–226.
- [Kempe et al. 2003] Kempe, D., Kleinberg, J. M., Tardos, É. 2003. Maximizing the spread of influence in a social network. In Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 137–146.
- [Kiekintveld et al. 2009] Kiekintveld, C., Jain, M., Tsai, J., Pita, J., Ordonez, F., Tambe, M. 2009. Computing optimal randomized resource allocations for massive security games. In Proceedings of the Eighth International Conference on Autonomous Agents and Multiagent Systems.
- [Korzhyk et al. 2011] Korzhyk, D., Yin, Z., Kiekintveld, C., Conitzer, V., Tambe, M. 2011. Stackelberg vs. Nash in security games: An extended investigation of interchangeability, equivalence, and uniqueness. Journal of Artificial Intelligence Research, 41, 297–327.
- [Kulkarni and Shanbhag 2014] Kulkarni, A. A., Shanbhag, U. V. 2014. A shared-constraint approach to multi-leader multi-follower games. Set-Valued and Variational Analysis.
- [Kunreuther and Heal 2003] Kunreuther, H., Heal, G. 2003. Interdependent security. Journal of Risk and Uncertainty, 26(2-3), 231–249.
- [Lazar 2011] Lazar, J. 2011. Electricity regulation in the US: A guide. Regulatory Assistance Project.
- [Letchford and Vorobeychik 2012] Letchford, J., Vorobeychik, Y. 2012. Computing optimal security strategies for interdependent assets. In Conference on Uncertainty in Artificial Intelligence, 459–468.
- [Leyffer and Munson 2007] Leyffer, S., Munson, T. 2007. Solving multi-leader-common-follower games. Argonne National Laboratory.
- [Osborne and Rubinstein 1994] Osborne, M., Rubinstein, A. 1994. A Course in Game Theory. MIT Press.
- [Paruchuri et al. 2008] Paruchuri, P., Pearce, J. P., Marecki, J., Tambe, M., Ordonez, F., Kraus, S. 2008. Playing games with security: an efficient exact algorithm for Bayesian Stackelberg games. In Proceedings of the Seventh International Conference on Autonomous Agents and Multiagent Systems, 895–902.
- [Pita et al. 2009] Pita, J., Jain, M., Ordonez, F., Portway, C., Tambe, M., Western, C., Paruchuri, P., Kraus, S. 2009. Using game theory for Los Angeles airport security. AI Magazine, 30(1), 43–57.
- [Rodoplu and Raj 2010] Rodoplu, V., Raj, G. S. 2010. Computation of a Nash equilibrium of multiple-leader Stackelberg network games. In International Conference on Systems and Networks Communications, 232–237.
- [Sherali 1984] Sherali, H. D. 1984. A multiple leader Stackelberg model and analysis. Operations Research, 32(2), 390–404.
- [Shieh et al. 2014a] Shieh, E., Jiang, A. X., Yadav, A., Varakantham, P., Tambe, M. 2014a. Unleashing dec-mdps in security games: Enabling effective defender teamwork. In European Conference on Artificial Intelligence.
- [Shieh et al. 2014b] Shieh, E., Jiang, A. X., Yadav, A., Varakantham, P., Tambe, M. 2014b. Unleashing dec-mdps in security games: Enabling effective defender teamwork. In European Conference on Artificial Intelligence.
- [Shieh et al. 2012] Shieh, E., Yang, R., Tambe, M., Baldwin, C., DiRenzo, J., Maule, B., Meyer, G. 2012. Protect: A deployed game theoretic system to protect the ports of the United States. In Proceedings of the Eleventh International Conference on Autonomous Agents and Multiagent Systems, 13–20.
- [Sinha et al. 2014] Sinha, A., Malo, P., Frantsev, A., Deb, K. 2014. Finding optimal strategies in a multi-period multi-leader-follower Stackelberg game using an evolutionary algorithm. Journal of Computers and Operations Research, 41, 374–385.
- [Vorobeychik et al. 2011] Vorobeychik, Y., Mayo, J., Armstrong, R., Ruthruff, J. 2011. Noncooperatively optimized tolerance: Decentralized strategic optimization in complex systems. Physical Review Letters, 107(10), 108702.
- [Vorobeychik and Wellman 2008] Vorobeychik, Y., Wellman, M. P. 2008. Stochastic search methods for Nash equilibrium approximation in simulation-based games. In Seventh International Conference on Autonomous Agents and Multiagent Systems, 1055–1062.
- [Woodruff 2003] Woodruff, D. L. 2003. Network Interdiction and Stochastic Integer Programming. Kluwer Academic Publishers.
We first claim that a Nash equilibrium must have identical coverage probabilities for all targets . Otherwise, there will be a target which has probability 0 of being attacked, and defender has an incentive to decrease .
When all targets have the same coverage probability , the expected utility of each defender is
If , then some defender can increase to for all of her targets to make sure none are attacked, obtaining utility , with
As can be arbitrarily small, when , and it cannot be a Nash equilibrium.
When all targets have the same coverage probability , the expected utility of each defender is . If a defender wants to deviate, then one of her targets will be attacked. Suppose this target is , with . Then her expected utility is
We now have two cases:
If , then and , and the defender has an incentive to deviate, and there is no Nash equilibrium.
If , then, because for all ,
In case 2, therefore, iff , which yields the desired result. ∎
When all defenders have the same coverage probability , the expected utility of each defender is
Suppose . If some defender increases to for target , then she would obtain a utility of , and