Attribute Evaluation on Attack Trees with Incomplete Information

12/27/2018 ∙ by Ahto Buldas, et al.

Attack trees are considered a useful tool for security modelling because they support qualitative as well as quantitative analysis. The quantitative approach is based on values associated with each node in the tree, expressing, for instance, the minimal cost or probability of an attack. Current quantitative methods for attack trees allow the analyst to derive the values of the higher nodes in the tree from an initial assignment of values to the leaf nodes. In practice, however, it proves very difficult to obtain reliable values for all leaf nodes. The main reasons are that data is only available for some of the nodes, that data is available for intermediate nodes rather than for the leaf nodes, or even that the available data is inconsistent. We address these problems by generalising the standard bottom-up calculation method in three ways. First, we allow initial attributions of non-leaf nodes. Second, we admit additional relations between attack steps beyond those provided by the underlying attack tree semantics. Third, we support the calculation of an approximate solution in case of inconsistencies. We illustrate our method, which is based on constraint programming, by a comprehensive case study.




1 Introduction

Attack trees are a useful and intuitive graphical modeling language introduced by Bruce Schneier [46, 47] in 1999. Since then, they have enjoyed popularity in the security industry, as well as in the research community. Attack trees have been equipped with various semantics [40, 25, 23] and supported by tools [2, 21]. They have also been enhanced with various methods for quantitative analysis [27, 4, 33, 5, 43, 10, 12, 36, 37, 35], which allow determining for a given attack tree, for example, an organisation’s losses due to an attack, the probability that such an attack succeeds, or the cost of a successful attack [22].

The underlying assumption upon which all these quantification methods are based is similar to the popular divide and conquer paradigm, in which a problem is recursively broken down into smaller problems that are theoretically simpler to reason about and solve.

A quantification method for attack trees often reduces to the assignment of attribute values to basic attack steps (leaf nodes in the tree). Such assignments are propagated bottom-up to determine the value at the root node, which is a quantification measure for the scenario expressed in the tree [27]. It is widely believed that assigning a reliable attribute value to a basic attack step is relatively easy, because such a step is precise and refined enough. Popular tools operating with attack trees, such as ADTool [28, 21] and SecurITree [2], work exactly under this premise.

In practice, the assumption that attribute values for more concrete attack steps are easier to obtain has proven incorrect. Indeed, most companies manage to obtain statistical data for abstract attacks, e.g. frequency of skimming attacks, while they might struggle to come up with similar data for more refined attacks, e.g. frequency of stereo skimming attacks based on audio technology. For security consultants, it might be feasible to obtain reliable estimations for (at least some) abstract attacks in relevant domains, but precise historical data for low-level attack steps might be out of reach. Thus, we observe that there is a tension between the limited availability of data and the requirement to provide data values for all leaves in an attack tree before proceeding with a quantification method.

Today, existing quantitative approaches for attack trees cannot handle values of intermediate nodes in the tree that may become available from historical data. Moreover, they do not support the use of additional constraints over nodes in the tree, which are obtained from external sources of information rather than from the attack tree model itself. For example, the analysts may be confident that card skimming attacks are more frequent than physical attacks on card holders. Such a relation cannot be captured in an attack tree model, because it is not a hierarchical relation, hence it is ignored in current quantitative approaches for attack trees.

There is clearly a need for novel computation methods on attack trees that account for available historical data and domain-specific knowledge. In this paper, we formulate a general attack-tree decoration problem that treats assignment of values to tree nodes as a problem of finding a set of data values satisfying a set of predicates. These predicates arise from the attack tree structure and the target attribute to be computed (i.e. semantics) and from the attainable historical values and domain knowledge (i.e. available data). Our methodology to solve the attack-tree decoration problem accounts for scenarios in which the set of predicates cannot be jointly satisfied, due to inconsistencies or possible noise in the data.

Contributions. In this work:

  • We transform an attack tree semantics together with an attribute interpretation into a constraint satisfaction problem (Section 3). If the attack tree semantics is consistent with the attribute interpretation, this allows us to determine appropriate attribute values for all nodes in the tree.

  • Because confidence in the available historical data and domain-specific knowledge may vary, we provide a methodology to deal with inconsistencies (Section 4). The usefulness of our approach is that any consistent valuation is better than no valuation, as it enables the follow-up process of using the attack tree for what-if analysis. The standard bottom-up approach, by contrast, yields no valuation at all until all leaf node values can be assigned.

  • We introduce two concrete approaches to deal with inconsistencies (Section 5). The first one determines the smallest subset of constraints that makes the decoration problem inconsistent, which is useful to find contradictory or wrong assumptions. The second one is suitable for constraints that are expressed in the form of inequalities. In this approach constraints are regarded consistent and an optimal decoration is always found. The proposed methodology has been implemented as proof-of-concept software tools.

  • We validate our methodology and the implementations through a comprehensive case study on the security of Automatic Teller Machines (Sections 6-7).

2 Related Work

Research articles on quantitative security analysis with attack trees in all their flavors (attack-defense trees, defense trees, etc. [30]) often focus on providing extensions to attack trees enabling more complex scenarios [31, 5, 26, 19, 3] and defining metrics for evaluating scenarios captured as attack trees. Various metrics have been considered in the literature, for instance, the probability/likelihood of an attack [7, 35], expected time until a successful attack [7], attacker’s utility [34, 12, 36, 37, 35], return on security investment [10], or assessment of risks [20, 42]. Bagnato et al. [7] present a list of metrics found in security literature that can be computed on attack trees. Yet, all these approaches assume that the data values needed for quantitative analysis of system security are readily available. Indeed, to the best of our knowledge, no methodologies have been developed for integrating historical data and domain-specific knowledge in quantitative analysis of attack trees. At the same time, even Bruce Schneier, the inventor of attack trees, has acknowledged the painstaking data collection that is a prerequisite for quantitative analysis on the trees [47, Chap. 21].

Benini and Sicari [9] have proposed a framework for attack tree-based security risk assessment. The approach relies on the identification of security vulnerabilities, which are placed in the leaves of an attack tree. Quantitative parameters of the vulnerabilities, such as exploitability and damage potential, allow estimating security risks to a system. In [9], the exploitability parameters are initially evaluated based on the CVSS (Common Vulnerability Scoring System) scores of the respective vulnerabilities, and are then adjusted based on expert judgement about the system context and about the mutual effect of vulnerabilities on each other, expressed in a vulnerability dependency graph. While this methodology offers more precise quantitative risk assessment with attack trees, it assumes that the attack tree is constructed in a bottom-up manner. All system vulnerabilities have to be identified using a suitable technology, and accommodated in an attack tree. In complex environments it will likely be impractical to apply the bottom-up approach due to the huge number of potential vulnerability combinations that can be exploited in various attacks. Indeed, in practice attack trees are typically designed in a top-down manner: the analyst starts by conjecturing the attacker’s main goal and iteratively breaks it down into smaller subgoals [40, 46, 17, 47].

Recently, de Bijl [15] studied the use of historical data values to obtain attribute values for attack tree nodes. He proposed several heuristics to deal with missing data values, including the standard bottom-up algorithm to infer parent node values, the reuse of values for recurring nodes, and the use of various data sources to estimate certain attributes. For example, the paper refers to distance to the police station as a hidden variable influencing probability of attack. Yet, [15] does not define a precise methodology to perform computations on attack trees with missing leaf node values.

Quantitative analysis of fault trees. Fault trees are close relatives of attack trees that are widely used in the reliability domain. There exists a large body of literature dedicated to quantitative analysis of fault trees. Yet, the standard bottom-up approach in attack trees is not the standard approach in fault trees, where min-cut set analysis and translation into more intricate models are common [44].

Fuzzy fault trees address evaluation of fault trees under uncertainty [38], when failure statistics are not fully available. The main difference is that fuzzy fault trees have been developed to serve the needs of the reliability community and fault tree application methods (fuzzy probability functions, error propagation estimates, etc.), while the security community and attack tree application methods have different needs (bottom-up computation for a large variety of attributes). Therefore, solutions designed for fault trees do not fully address the problems in the attack tree space.

Data issues in quantitative risk assessment. Attack trees are typically used for threat modeling and security risk assessment [48]. Thus, it is necessary to evaluate the data availability perspective also in the more general context of security risk assessment. Indeed, the general question of data validity in quantitative security risk assessment (QSRA for short) and the reliability of QSRA results in presence of uncertainty in data values has been raised by many practitioners and researchers [50].

QSRA enables decision making based on quantitative estimations of some relevant variables (e.g. probability of an event, cost, time, vulnerability, etc.). These quantitative estimations are typically aggregated in a model that can then be utilised by a decision maker [50]. Many studies, books on security, and industry reports have acknowledged that the quality of quantitative risk analysis, and, correspondingly, of the decisions based on it, heavily depends on the quality of the data used [50, 24, 47, 8]. Notably, it has been established that probabilities of particular loss events and costs associated with security spending can be hard to obtain from historical data [24, 41, 6, 11, 1]. This body of knowledge serves as evidence of the inherent difficulty of obtaining meaningful estimates for the probability and cost of detailed attack steps, i.e., values for leaf nodes in attack trees.

Nevertheless, it has been acknowledged that, for instance, for insurance companies it might be feasible to get meaningful data, because they have access to an entire population, i.e. they have good statistics [41]. It has also been demonstrated, for example, that breach statistics can be used to predict future breaches in different segments [45], and that statistics pertinent to different user profiles can be applied to estimate success rates of intrusions [14].

Furthermore, for security assessment, it has long been established that external data sources, such as threat level indicators (e.g. malware numbers), can be helpful to update quantitative risk assessment models [11]. Therefore, enabling better usage of available historical data, which may not directly correspond to information about low-level attack events (leaf nodes), will be a valuable enhancement for quantitative analysis of attack trees. From this review of the relevant scientific literature, we conclude that there is a strong need for an approach to perform quantitative analysis on attack trees when the analyst cannot confidently assign values to all leaf nodes. Furthermore, this approach needs to integrate available historical data, which can come in the form of values for some abstract attacks (intermediate nodes) or constraints (equalities and inequalities) on combinations of attack tree node values. In the remainder of this paper, we propose such an approach.

3 Attack-Tree Decoration

In this section we give, to the best of our knowledge, the first formulation of the attack-tree decoration problem as a constraint satisfaction problem. We start by introducing the necessary attack tree basics. The interested reader can find more details about the attack tree formalism in the paper by Mauw and Oostdijk [40].

3.1 Attack Trees

In an attack tree the main goal of the attacker is captured by the root node. This goal is then iteratively refined into subgoals, represented by the children of the root node. Leaf nodes in an attack tree are called atomic subgoals, as they are not refined any further. Non-leaf nodes, instead, can be of two types: disjunctive (OR) or conjunctive (AND). A conjunctive refinement expresses that all subgoals must be achieved in order to succeed on the main goal, while in a disjunctive refinement the achievement of a single subgoal is already enough.

Figure 1: An attack tree representing stealing money from someone’s bank account.
Example 1.

Consider the simple attack tree in Figure 1. The root node of this tree represents the main goal of the attack: to steal money from a bank account. This goal is disjunctively refined into two alternative sub-attacks: the attacker may try to get money from an automated teller machine (ATM), or they might attempt to hack the online bank account system. The sub-goal that explores getting money at an ATM is further conjunctively refined into two complementary activities: the attacker must steal the credit card of the victim and they also need to obtain the PIN code by shoulder-surfing. Note that a conjunctive refinement is denoted graphically with an arc spanning the child nodes.

Definition 1 (Attack tree).

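Given a set of labels $\mathbb{L}$, an attack tree (ATree) is constructed according to the following grammar (for $b \in \mathbb{L}$):

$$t ::= b \mid b \lhd \mathtt{OR}(t, \ldots, t) \mid b \lhd \mathtt{AND}(t, \ldots, t)$$

Our grammar above slightly differs from the grammar used in other notations to represent attack trees [40, 29], as we require every node in the tree to be annotated with a label $b$. The reason for this is that, as opposed to standard attack tree semantics that focus on the leaf nodes, we render every node in the tree equally important.

To provide a definition of our running example we will use shorter labels than those in Figure 1. The actual mapping between labels should become clear through a quick visual inspection.

We say that an attack tree has unique labels if it does not contain two distinct nodes with the same label. We use $\mathbb{T}$ to denote the universe of attack trees. We also use the auxiliary functions $\mathit{root}$ and $\mathit{labels}$ to obtain, respectively, the root node’s label and all labels of a given tree. Formally,

  • $\mathit{root}(t) = b$ if $t = b$ or $t = b \lhd \Delta(t_1, \ldots, t_k)$ for some $\Delta \in \{\mathtt{OR}, \mathtt{AND}\}$

  • $\mathit{labels}(t) = \{b\}$ if $t = b$, otherwise $\mathit{labels}(t) = \{b\} \cup \mathit{labels}(t_1) \cup \cdots \cup \mathit{labels}(t_k)$ when $t = b \lhd \Delta(t_1, \ldots, t_k)$ for some $\Delta \in \{\mathtt{OR}, \mathtt{AND}\}$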

For example, given the tree from Fig. 1, we have that and .
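The labelled tree structure and the two auxiliary functions described above can be illustrated with a minimal Python sketch. The class name `ATree`, the helper names, and the node labels are hypothetical choices made for this illustration; they only mimic the shape of the tree in Figure 1 and are not the labels of the running example.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class ATree:
    """A labelled attack tree node; kind is "LEAF", "OR" or "AND"."""
    label: str
    kind: str = "LEAF"
    children: Tuple["ATree", ...] = ()


def root_label(t: ATree) -> str:
    """Label of the root node of t."""
    return t.label


def all_labels(t: ATree) -> Set[str]:
    """Set of all labels occurring in t, collected recursively."""
    result = {t.label}
    for child in t.children:
        result |= all_labels(child)
    return result


def has_unique_labels(t: ATree) -> bool:
    """True if no two distinct nodes of t share a label."""
    def node_count(node: ATree) -> int:
        return 1 + sum(node_count(c) for c in node.children)
    return node_count(t) == len(all_labels(t))


# Hypothetical labels mimicking the shape of the tree in Figure 1.
tree = ATree("steal_money", "OR", (
    ATree("via_atm", "AND", (ATree("steal_card"), ATree("get_pin"))),
    ATree("hack_online"),
))
```

Every node carries a label here, including the internal ones, which mirrors the requirement that all nodes are annotated.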

3.2 The Attack-Tree Decoration Problem

We proceed by formulating the attack-tree decoration problem as a constraint satisfaction problem. Intuitively, we map an attack tree to a set of boolean expressions whose variables are drawn from the set of labels of the tree. Such a set of boolean expressions, defined over a given domain, can be seen as a constraint satisfaction problem whose solutions correspond to solutions of the attack-tree decoration problem. The remainder of this section is dedicated to formalising this intuition.

Decorating an attack tree is a process whereby nodes in the tree are assigned with values. Given an attack tree , we use a total function from labels of the tree to values in a domain to represent the decoration process, and to denote the universe of such functions. To that effect, we often refer to labels as variables and to as a valuation. The co-domain of a valuation is determined by the attribute of the tree under evaluation. For example, minimum time of a successful attack uses the natural number domain to express discrete time, while required attacker skill to succeed typically uses a discrete and categorical domain, such as .

Definition 2 (Attribute semantics).

Given an attack tree and a domain , an attribute semantics is a set of valuations with domain and co-domain .

A semantics provides an attribute with the set of valuations that the attribute regards as valid in a given tree. Because defining an attribute semantics by exhaustive enumeration of its valuations might be cumbersome, we consider in this article attributes whose semantics can be derived from a constraint satisfaction problem.

An attribute constraint is defined as a boolean expression over the set of labels of a tree. To that effect, when we use labels in expressions we will consider them as variables over a given domain . For example, if the attribute minimum time taken by an attack is being computed over an attack tree of the form , it is typically required that  [31]. The intuition for such constraint is that, because is disjunctively refined, the minimum time needed by an attacker to meet the goal is considered to be the least time required by any of ’s children.

We use predicates as short-hand notations for boolean expressions. For example, can be used to denote the boolean expression . We say that a predicate is valid under interpretation , denoted , if evaluates to true. Likewise, a set of predicates is said to be valid under , denoted , if all predicates in are valid under . When it does not lead to confusion, we will often refer to a predicate as .

Definition 3 (Attribute constraint-set).

Given an attack tree and a domain , an attribute constraint-set is a set of predicates over whose variables range over . Its semantics is defined by .

There exist in the literature various ways to relate the value of a parent node in an attack tree to the values at its children [31], of which the bottom-up approach is the most common one [40, 29]. This bottom-up approach starts from an assignment of concrete values to the leaf nodes of the tree and uses two functions (one for disjunctive refinement and one for conjunctive refinement) to recursively calculate the value of a parent node from the values of its children. We will next define how an attribute constraint-set can be recursively derived from two unranked aggregation operators associated with a bottom-up approach. The actual values of the leaf nodes will have to be defined by the analyst through additional constraints.

Definition 4 (Bottom-up attribute constraint-set).

Let be an attack tree, and and two unranked function symbols (symbols without fixed arity) with domain . We use to denote the boolean expression . Similarly, the predicate denotes the boolean expression . The bottom-up attribute constraint-set of is recursively computed as follows:

  • If for some label , then .

  • If with for , then ;

  • If with for , then .

Definition 4 is based on the standard bottom-up approach [40, 29] for attack trees where child nodes are aggregated together based on two functions: for children of a conjunctive refinement and for children of a disjunctive refinement. In literature there exist concrete definitions of and for various attributes. For example, when computing probability of success it is usually considered that and , for cost and , and for minimum time and .
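The standard bottom-up calculation that Definition 4 builds on can be sketched in Python as follows. This is only an illustration under stated assumptions: the tuple encoding of trees, the function name `bottom_up`, the labels, and the leaf values are hypothetical, and the operator pair used in the example (minimum for disjunctive nodes, sum for conjunctive nodes, a common reading of minimum attack time) is an assumption rather than a quotation from the text.

```python
from typing import Callable, Dict, Sequence

# A tree is encoded as (label, kind, children) with kind in {"LEAF", "OR", "AND"}.
# Labels are assumed to be unique.


def bottom_up(tree, leaf_values: Dict[str, float],
              f_or: Callable[[Sequence[float]], float],
              f_and: Callable[[Sequence[float]], float]) -> Dict[str, float]:
    """Return a valuation for every label, computed from the leaf values."""
    label, kind, children = tree
    if kind == "LEAF":
        return {label: leaf_values[label]}
    valuation: Dict[str, float] = {}
    child_values = []
    for child in children:
        valuation.update(bottom_up(child, leaf_values, f_or, f_and))
        child_values.append(valuation[child[0]])
    aggregate = f_or if kind == "OR" else f_and
    valuation[label] = aggregate(child_values)
    return valuation


# Hypothetical labels and leaf values, for illustration only.
tree = ("steal_money", "OR", [
    ("via_atm", "AND", [("steal_card", "LEAF", []), ("get_pin", "LEAF", [])]),
    ("hack_online", "LEAF", []),
])
leaf_times = {"steal_card": 3, "get_pin": 2, "hack_online": 10}

# Assumed operators: OR takes the minimum, AND takes the sum.
min_time = bottom_up(tree, leaf_times, f_or=min, f_and=sum)
```

Passing the two aggregation functions as arguments keeps the traversal independent of the attribute, so other attributes only require a different operator pair.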

Definition 5 (The attack-tree decoration problem).

Given an attack tree and an attribute constraint-set for , the attack-tree decoration problem consists in finding a valuation in .

The attack-tree decoration problem corresponds to the well-known Constraint Satisfaction Problem (CSP) [49], where a solution is a valuation that satisfies a set of constraints. Finding a solution for a CSP, or even deciding whether one exists, is NP-hard in general.

We say that the attack-tree decoration problem is:

  • Determined: If the cardinality of is one, i.e. there exists a single valid valuation only.

  • Inconsistent: If , i.e. there does not exist a valid valuation.

  • Undetermined: If the cardinality of is larger than one, i.e. the problem is neither inconsistent nor determined.
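The three cases above can be illustrated with a brute-force Python sketch that enumerates all valuations over a small finite domain and counts those satisfying every predicate. The labels `a`, `c`, `p`, the domain `0..10`, the concrete values, and the constraint `a = c + p` (a conjunctive node under an assumed sum operator) are hypothetical; exhaustive enumeration is used only for clarity and does not scale.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List, Sequence

Valuation = Dict[str, int]
Predicate = Callable[[Valuation], bool]


def solutions(labels: Sequence[str], domain: Iterable[int],
              predicates: Sequence[Predicate]) -> List[Valuation]:
    """All valuations over the given domain that satisfy every predicate."""
    found = []
    for values in product(list(domain), repeat=len(labels)):
        valuation = dict(zip(labels, values))
        if all(pred(valuation) for pred in predicates):
            found.append(valuation)
    return found


def classify(labels, domain, predicates) -> str:
    """Classify the decoration problem by the number of valid valuations."""
    count = len(solutions(labels, domain, predicates))
    if count == 0:
        return "inconsistent"
    if count == 1:
        return "determined"
    return "undetermined"


labels = ["a", "c", "p"]
domain = range(11)  # assumed finite domain, for illustration only

bottom_up_rule = lambda v: v["a"] == v["c"] + v["p"]  # assumed AND = sum
c_is_3 = lambda v: v["c"] == 3
p_is_2 = lambda v: v["p"] == 2
a_is_4 = lambda v: v["a"] == 4
a_is_5 = lambda v: v["a"] == 5

only_rule = classify(labels, domain, [bottom_up_rule])
contradictory = classify(labels, domain, [bottom_up_rule, c_is_3, p_is_2, a_is_4])
leaves_fixed = classify(labels, domain, [bottom_up_rule, c_is_3, p_is_2])
parent_and_leaf = classify(labels, domain, [bottom_up_rule, a_is_5, c_is_3])
```

In this sketch, fixing both leaves reproduces the standard bottom-up calculation, while fixing the parent and one leaf also pins down a single valuation.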

We illustrate these concepts with the following example that utilises a subtree from our running example. Consider the tree depicted in Figure 2(a) whose set of labels is

Figure 2: (a) An example of undeterminism. (b) An example of inconsistency. (c) A determined attribute domain.

Further, we consider subsets of the following set of constraints:

Predicate follows from the standard interpretation of minimum attack time in an attack tree [27], where and defined over the natural numbers . This leads to the bottom-up attribute constraint-set . The attack-tree decoration problem with this attribute constraint-set is clearly undetermined given that, for example, the valuations and both satisfy .

Now, consider predicates , and . This type of boolean expression represents a variable assignment. We observe that the attribute constraint-set leads to an inconsistent decoration problem (see Figure 2(b)), while the decoration problem defined by is determined as there exists a unique valuation satisfying all three predicates, namely (see Figure 2(c)). Finally, we remark that the attribute constraint-set also leads to a decoration problem that is determined. Moreover, it corresponds to the standard bottom-up calculation in attack trees.

This last example illustrates the general observation that, given an assignment of values to the leaves of an attack tree and a bottom-up attribute constraint-set (Def. 4), the attack-tree decoration problem is determined. This is formalized in the following proposition, which can be easily proved by induction on the structure of the tree.

Proposition 1.

Let be an attack tree with unique labels, let be the set of labels of its leaf nodes and let be a domain. Let be a set of constraints assigning values to the leaf nodes and let be the bottom-up attribute constraint-set of . Then the attack-tree decoration problem for and constraint-set is determined.

4 A Methodology for Attack-Tree Decoration

As indicated above, our approach extends the rather rigid bottom-up way in which attack trees are currently decorated. Our methodology consists of two main steps that complement each other: (1) generation of the attribute constraint-set and (2) analysis of valid valuations. The former boils down to the definition of predicates over the set of labels of a tree. We make a distinction between two types of predicates: hard predicates and soft predicates.

4.1 Hard Predicates

Hard predicates are derived from the attack tree refinement structure rather than from knowledge databases or an expert’s opinion. This choice establishes that all predicates derived from the attack tree structure should be satisfied, as otherwise the attribute semantics and the tree contradict each other. The term hard stems from the notion of hard and soft constraints in satisfaction problems. Soft constraints represent desirable properties, while hard constraints are a must.

In this article, we consider as hard predicates those contained in the bottom-up attribute constraint-set of the tree (see Def. 4). This is a conservative choice that allows us to extend existing bottom-up quantification methods based on the refinement relation of the tree [40, 29]. In fact, it follows from Prop. 1 that, if all labels in a tree are unique, then the resulting attack-tree decoration problem based on this bottom-up attribute constraint-set is either undetermined or determined. We remark, nonetheless, that our methodology can also be used to model other computational approaches such as the Bayesian reasoning proposed by Kordy, Pouly, and Schweitzer [32]. It is ultimately the analyst who decides what constitutes a set of hard predicates, although we require the analyst to come up with hard predicates that are satisfiable, as we do in this article.

4.2 Soft Predicates

Statistical data and constraints extracted from industry-relevant knowledge-bases and experts are too valuable to ignore, and our methodology treats them as first-class citizens. As usual, we encode this information in predicates. For example, assume that comprehensive empirical data indicates that the probability of a bank account being hacked is less than . The semantics of such an attribute in our running example tree can be defined by a set containing the predicate .

In our methodology, predicates obtained from experts and knowledge-bases are regarded as soft. The reason is that, when it comes to opinion and empirical data, inconsistencies are common. Hence we do allow these predicates to be violated up to some extent. For example, consider that for a particular attack tree we obtain that the probability that an account is hacked is . Although such an outcome violates the predicate , one may find it acceptable and not far from the considered empirical data.

4.3 Analysis of Attribute Semantics with Hard and Soft Predicates

Given an attack tree and attribute constraint-set over labels of and domain , we use and to denote the partition of into hard and soft predicates, respectively. As described in the previous section, we analyse an attribute constraint-set by looking at solutions of the corresponding constraint satisfaction problem. Formally, given an attribute constraint-set , we aim at finding a valuation such that . However, such formulation makes no distinction between hard and soft predicates, which is a feature we regard important in our methodology. For example, it may be the case that no valuation satisfying exists, while we can still find such that . Note that, although the constraint-set is an oversimplification of the original attribute constraint-set with all soft constraints being removed, satisfies all hard constraints and thus may be worth considering.
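The situation described above, where the full constraint-set has no solution but the hard predicates alone do, can be reproduced with a small Python sketch. The labels, the finite domain, the hard rule `a = c + p`, and the three soft predicates are hypothetical and chosen only so that the soft predicates contradict the hard one.

```python
from itertools import product

LABELS = ["a", "c", "p"]
DOMAIN = range(11)  # assumed finite domain, for illustration only


def first_valuation(predicates):
    """Return the first valuation satisfying all predicates, or None."""
    for values in product(DOMAIN, repeat=len(LABELS)):
        valuation = dict(zip(LABELS, values))
        if all(pred(valuation) for pred in predicates):
            return valuation
    return None


# Hard predicate: derived from the tree structure (assumed AND = sum).
hard = [lambda v: v["a"] == v["c"] + v["p"]]

# Soft predicates: hypothetical historical data and expert opinion.
soft = {
    "c = 3": lambda v: v["c"] == 3,
    "p = 2": lambda v: v["p"] == 2,
    "a <= 4": lambda v: v["a"] <= 4,
}

# Hard and soft together: 3 + 2 = 5 > 4, so no valuation exists.
all_together = first_valuation(hard + list(soft.values()))

# Hard predicates only: a valuation exists, at the price of ignoring soft ones.
hard_only = first_valuation(hard)
violated = [name for name, pred in soft.items() if not pred(hard_only)]
```

Listing the violated soft predicates makes explicit what is lost when all soft constraints are simply dropped, which is the motivation for weakening them in a more controlled way.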

In our methodology, when the original attack-tree decoration problem has no solution we propose to solve a weaker version: the relaxed attack-tree decoration problem. This new problem allows soft predicates to be weakened, which consists in replacing any soft predicate by a predicate that logically follows from . We define this type of entailment on predicates over an attack tree by:

Using this notation, we can define the notion of a weakening relation on sets of predicates.

Definition 6 (Weakening relation).

Let be a partial order on sets of predicates. Then we say that is a weakening relation if and only if for all sets of predicates and it holds that

If , we say that is a weakening of under the weakening relation . We provide three examples of weakening relations.

  1. Set equality (), which is the trivial weakening relation.

  2. Set inclusion (), which allows one to weaken a set of predicates by deleting one or more of its elements.

  3. The maximal weakening relation (), which is defined by

The proofs that these are indeed weakening relations and that is maximal are straightforward.

Using this notion of a weakening relation we reformulate the attack-tree decoration problem as an optimisation problem in the following way.

Definition 7 (The relaxed attack-tree decoration problem).

Let be an attack tree and an attribute constraint-set over and domain , where and are hard and soft predicates, respectively. Let be a weakening relation. The relaxed attack-tree decoration problem consists of two stages:

  1. Finding a set of predicates over and domain such that:

    • ,

    • , and

    • .

  2. Solving the attack tree decoration problem with constraint-set .

A solution is a pair , such that .
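For the set-inclusion weakening relation, one plausible reading of the relaxed problem is to delete as few soft predicates as possible while keeping the hard predicates satisfiable. The following Python sketch implements that reading by brute force; the function names, labels, domain, and predicates are hypothetical, and the search returns only the subsets of maximum cardinality, which are maximal under inclusion but need not be the only inclusion-maximal subsets.

```python
from itertools import combinations, product


def satisfiable(labels, domain, predicates) -> bool:
    """True if some valuation over the finite domain satisfies all predicates."""
    for values in product(list(domain), repeat=len(labels)):
        valuation = dict(zip(labels, values))
        if all(pred(valuation) for pred in predicates):
            return True
    return False


def largest_consistent_subsets(labels, domain, hard, soft):
    """Largest subsets of soft predicate names consistent with the hard ones."""
    names = list(soft)
    for size in range(len(names), -1, -1):
        found = [
            set(subset)
            for subset in combinations(names, size)
            if satisfiable(labels, domain, hard + [soft[n] for n in subset])
        ]
        if found:
            return found
    return []  # the hard predicates themselves are unsatisfiable


labels = ["a", "c", "p"]
domain = range(11)  # assumed finite domain, for illustration only
hard = [lambda v: v["a"] == v["c"] + v["p"]]  # assumed AND = sum
soft = {
    "c = 3": lambda v: v["c"] == 3,
    "p = 2": lambda v: v["p"] == 2,
    "a = 4": lambda v: v["a"] == 4,
}

weakenings = largest_consistent_subsets(labels, domain, hard, soft)
```

In this example any single soft predicate can be dropped to restore consistency, so the search reports three alternative weakenings; choosing among them is left to the analyst.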

The choice of the weakening relation is relevant in an instantiation of the relaxed attack tree decoration problem, as we show in the next section. In particular, we analyse two relevant decoration problems resulting from two concrete weakening relation definitions, namely the set inclusion () and maximal weakening relation ().

5 Decoration Algorithm for Specific Classes of Predicates

Solving a constraint satisfaction problem is in general NP-hard. Thus, this section is devoted to instantiating each component of the developed theory into concrete predicate languages that can be used in standard solver tools to find solutions for the relaxed attack-tree decoration problem. In Section 7 below we show how those instantiations of the theory can be used to analyse a comprehensive attack tree case study.

5.1 Maximal weakening over inequality relations

Here we address the question of whether there exists a meaningful predicate language and constraint satisfaction solver that can be used to solve the relaxed attack tree decoration problem with respect to the maximal weakening relation. Note that, among all possible weakening relations, the maximal one is the least restrictive. Hence it leads to more fine-grained solutions than other weakening relations.

The chosen predicate language defines predicates of three types, all based on comparing one or two labels to a constant value. The three types of predicates are:

where and are labels and is a real number (positive or negative). We denote the set of all such predicates by and we will often use to denote a predicate in this set with constant value . It is easy to verify that can only hold if predicates and are of the same type.

Lemma 2.

Let be two predicates, then implies that, for some labels , , and some real numbers , ,

The three types of predicates have been chosen in such a way that the maximal weakening relation on single predicates can be easily characterised by the numerical order of their constant values . This characterisation will allow us later to define the distance between two predicates as the difference between their constant values and .

Lemma 3.

Let , be labels and , be real numbers. Then the following properties hold.

From this lemma it follows, for instance, that for , if and , then there exists , such that . Hence we consider the set containing all total functions such that . The Euclidean distance between two predicate is given by if and are of the same type, otherwise. Given , we define , and the Euclidean distance between two sets of predicates by

We restrict the distance measure above to bijective functions only. That is to say, we consider from now on to be the set containing all bijective functions such that . A consequence of such restriction is that predicate sets with different cardinality have distance , which simplifies the proof of Theorem 4 below.
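A distance of this kind can be sketched in Python under explicit assumptions: each predicate is reduced to a type tag and a constant, two predicates of the same type are at distance equal to the absolute difference of their constants, predicates of different types are infinitely far apart, and the distance between two sets is the smallest Euclidean norm over all bijective matchings. The `Pred` class and the type tags are hypothetical, and the sketch ignores any further condition the matching functions may have to satisfy.

```python
from itertools import permutations
from math import inf, sqrt
from typing import NamedTuple, Sequence


class Pred(NamedTuple):
    kind: str     # opaque tag for the predicate type, including its labels
    const: float  # the constant value the labels are compared to


def pred_distance(p: Pred, q: Pred) -> float:
    """Difference of constants for same-type predicates, infinity otherwise."""
    return abs(p.const - q.const) if p.kind == q.kind else inf


def set_distance(ps: Sequence[Pred], qs: Sequence[Pred]) -> float:
    """Minimum Euclidean distance over all bijections between ps and qs."""
    if len(ps) != len(qs):
        return inf  # no bijection exists
    best = inf
    for matching in permutations(qs):
        total = sum(pred_distance(p, q) ** 2 for p, q in zip(ps, matching))
        best = min(best, sqrt(total))
    return best


# Hypothetical soft predicates and a candidate weakening of them.
soft = [Pred("x <= c", 3.0), Pred("y >= c", 5.0)]
weakened = [Pred("y >= c", 4.0), Pred("x <= c", 6.0)]

d = set_distance(soft, weakened)
```

Enumerating all permutations is factorial in the number of predicates, so this serves only to make the definition concrete for very small sets.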

Next we provide sufficient conditions for a set of predicates to be part of a solution of the relaxed attack-tree decoration problem. Theorem 4 states that a set , which minimises its distance to , where is the set of soft predicates for a given tree , and that satisfies , where is the set of hard predicates, leads to a solution of the relaxed attack-tree decoration problem.

Theorem 4.

Let be an attack tree and an attribute constraint-set for , where . Let be a set of predicates such that and is minimum and defined, i.e. . Then there exists a valuation such that is a solution of the relaxed attack-tree decoration problem with respect to .


Proof. First, if , then there exists and , which is minimum. Thus in the remainder of the proof we assume that .

Now, notice that , otherwise . Thus we obtain that , implying that satisfies the first condition of the relaxed attack tree decoration problem. Moreover, we also conclude that , given that . Next, we will show that satisfies the third condition of the problem definition as well.

Suppose we have , but . Because , there must exist such that no satisfies that . Let us analyse the three possible predicate types of .

  1. Assume . Because , there must exist in with . Now, let be a bijective function such that . Such a function exists given that and . Let be the predicate in such that . Overall we obtain that and . Now, given that does not imply , according to Lemma 3 it must be the case that . Therefore we obtain the order . Consider the set of predicates . On the one hand, because it follows that and . On the other hand, because and , we obtain that . Considering that , then , which contradicts the assumption that is minimum.

  2. The case is analogous to the previous one.

  3. Finally assume . As in the first case we obtain that there exist predicates and in and , respectively, such that and . Given that does not imply , then we obtain the following order, . Again, such an order implies that . Therefore, the set of predicates satisfies that , which contradicts the assumption that is minimum.

The proof concludes by remarking that satisfies the second condition of the relaxed attack tree decoration problem, as stated in the body of the theorem. ∎

This theorem demonstrates a reduction of the relaxed attack-tree decoration problem for the maximal weakening relation on the given types of predicates to an optimisation problem that we solve via nonlinear programming. The formulation of the optimisation problem is given below.

Corollary 5.

Let be an attack tree and an attribute constraint-set for where contains only predicates from . We create the set of predicates by replacing every predicate by.

where are variables. A solution to the following optimisation problem leads to a solution of the relaxed attack tree problem.

subject to

Implementation. To find a valuation function and a set of weakening predicates so that the distance function is minimised, we rely on the interpretation of the relaxed attack-tree decoration problem as a Sequential Quadratic Programming (SQP) problem. We further refer to this tool as the SQP-based tool. It is implemented using the Python scipy library, which provides the Sequential Least Squares Programming (SLSQP) algorithm for solving optimisation problems of this type. Our implementation (code is available online) does not impose any burden on the analyst, as it allows a loose interpretation of all constraints together. In case the set of constraints is satisfiable, our tool finds an optimal solution. In case of an unsatisfiable set of constraints, our implementation will find an optimal solution that minimises the distance function between the sets of predicates.
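To give a flavour of this formulation, the following minimal sketch applies scipy's SLSQP method to a hypothetical three-node tree. It is not the tool's code. The tree (a root that is the conjunction of two leaves `a` and `b`), the product rule for conjunction, and the three soft values are assumptions made purely for illustration. The sketch minimises the squared distance, which has the same minimiser as the distance itself but is smooth.

```python
import numpy as np
from scipy.optimize import minimize

# Decision variables x = [root, a, b]: the probabilities assigned to the nodes.
# Hypothetical soft predicates: root = 0.2, a = 0.5, b = 0.5.
# They are inconsistent with the hard constraint below, since 0.5 * 0.5 != 0.2.
soft_targets = np.array([0.2, 0.5, 0.5])


def squared_distance(x):
    """Squared Euclidean distance between the valuation and the soft values."""
    return float(np.sum((x - soft_targets) ** 2))


# Hard predicate (assumed AND rule for probabilities): root = a * b.
hard_constraints = [{"type": "eq", "fun": lambda x: x[0] - x[1] * x[2]}]

result = minimize(
    squared_distance,
    x0=soft_targets,                # start from the analyst's soft values
    method="SLSQP",
    bounds=[(0.0, 1.0)] * 3,        # probabilities stay in [0, 1]
    constraints=hard_constraints,
    options={"ftol": 1e-12},
)
root, a, b = result.x
```

The resulting valuation satisfies the hard constraint exactly (up to solver tolerance) and moves each soft value as little as possible; the weakened soft predicates are then read off from `root`, `a` and `b`.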

5.2 Set inclusion weakening over propositional logic

As the analyst may require a predicate language richer than the one described above, we provide tool support for predicates written in propositional logic. We do so via a transformation of a relaxed attack-tree decoration problem instance into a Satisfiability Modulo Theories (SMT) instance [16]. SMT is the problem of determining whether a formula in first-order logic, where some operator symbols are provided with a theory, is satisfiable.

Given a set of predicates , we use to denote the first-order logic formula formed by all predicates in in the conjunctive form. Then the SMT instance resulting from the relaxed attack-tree decoration problem instance with the attribute constraint-set is defined by . If is satisfiable, it follows that the decoration problem does not need to be relaxed. Otherwise, we use Algorithm 1 to find a subset of soft predicates that solves the relaxed attack tree decoration problem with respect to the inclusion relation.

Input: The relaxed attack-tree decoration problem defined by attack tree , attribute constraint-set .
Output: is a set of maximum cardinality such that and .
  Let = be the soft constraint set.
  Initialise the candidate set to the empty set.
  for  do
      Add the current soft predicate to the candidate set.
      if  is not satisfiable then
          Remove the current soft predicate from the candidate set.
  return where .
Algorithm 1 Solving the decoration problem w.r.t. the set inclusion weakening relation.

Initially, the set is empty. Algorithm 1 works by iteratively adding predicates to the set , until the formula is satisfiable, while the formula is unsatisfiable for any . It is easy to prove that such a procedure provides a solution to the relaxed attack-tree decoration problem with respect to the subset inclusion weakening relation, based on the assumption that hard predicates are satisfiable.
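The greedy procedure described above can be sketched in a few lines of Python. The sketch is illustrative only: predicates are modelled as Boolean tests on a valuation, and the satisfiability check is a brute-force search over a small finite grid of values (`find_valuation`), standing in for a call to an SMT solver. The function names, the toy tree and its values are all hypothetical.

```python
import itertools
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Valuation = Dict[str, float]
Check = Callable[[Valuation], bool]  # a predicate, modelled as a Boolean test


def find_valuation(checks: Sequence[Check], labels: Sequence[str],
                   domain: Sequence[float]) -> Optional[Valuation]:
    """Toy satisfiability oracle: try every assignment of grid values to labels."""
    for values in itertools.product(domain, repeat=len(labels)):
        valuation = dict(zip(labels, values))
        if all(check(valuation) for check in checks):
            return valuation
    return None


def relax_by_inclusion(hard: Sequence[Check], soft: Sequence[Check],
                       labels: Sequence[str], domain: Sequence[float]
                       ) -> Tuple[Valuation, List[int]]:
    """Greedily keep each soft predicate that is consistent with the hard
    predicates and the soft predicates kept so far."""
    hard = list(hard)
    if find_valuation(hard, labels, domain) is None:
        raise ValueError("the hard predicates must be satisfiable")
    kept: List[int] = []  # indices of the soft predicates that are retained
    for index, check in enumerate(soft):
        candidate = hard + [soft[i] for i in kept] + [check]
        if find_valuation(candidate, labels, domain) is not None:
            kept.append(index)  # still satisfiable, so keep this predicate
    valuation = find_valuation(hard + [soft[i] for i in kept], labels, domain)
    assert valuation is not None  # guaranteed by the checks above
    return valuation, kept


labels = ["root", "a", "b"]
grid = [0.0, 0.25, 0.5, 1.0]
hard = [lambda v: v["root"] == v["a"] * v["b"]]  # assumed AND rule
soft = [lambda v: v["a"] == 0.5,
        lambda v: v["b"] == 0.5,
        lambda v: v["root"] == 0.5]              # conflicts with the first two
valuation, kept = relax_by_inclusion(hard, soft, labels, grid)
```

In this toy instance the first two soft predicates are kept and the third is dropped. The order in which soft predicates are visited determines which of several conflicting predicates is discarded.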

Implementation. We implemented our transformation relying on the well-known theorem prover Z3 from Microsoft. Z3 can be utilised as a constraint solver, i.e., it can find a solution satisfying a set of constraints expressed as equalities and inequalities. Our current implementation (code is available online) can handle all attribute domains on real numbers that are defined in the ADTool [28], e.g. probability, minimal cost of attack, and minimal time; and it is trivially extensible to attribute domains defined on Boolean values, e.g. satisfiability of a scenario. The analyst can further specify additional constraints, if desired. The resulting set of constraints is passed to the Z3 prover, which reports whether the problem is solvable or not. In case the constraint satisfaction problem is solvable, the prover will report a complete consistent valuation for the given tree. This valuation will satisfy the tree structure and the constraints expressed by the analyst, and it will agree with the given initial valuation. If the constraint satisfaction problem is not solvable (i.e. the initial valuation is inconsistent), we use Algorithm 1 to find a subset of soft predicates of maximum cardinality that is satisfiable, and report a solution satisfying the found maximal predicate set.

6 ATM Case Study

We now proceed to show how our approach to attack-tree decoration can be applied to a real-life security scenario. The goal is to show that decoration can be performed in a systematic way even when the analyst only has partial information about attribute values.

In this section, we introduce a case study related to capturing automated teller machine fraud scenarios as an attack tree, and we describe the historical data available for decorating this tree.

6.1 ATM Security: a Case Study

Automated teller machines (ATMs) are complex and expensive systems used daily by millions of bank customers worldwide. Because each carries a significant amount of cash, ATMs are the target of large-scale criminal actions. In 2015 alone, more than ATM incidents were reported in Europe, causing losses of over million euros.

In an attempt to provide structure to the risk assessment process and catalogue ATM threats, Fraile et al. created a comprehensive attack-defence tree capturing the most dangerous attack scenarios applicable to ATMs [17]. The tree modelled in [17] contains three main branches: brute-force attacks, fraud attacks, and logical attacks. The attacks of the logical type make use of malicious software, while a brute-force attack typically ends up destroying the ATM. Differently from these two attack scenarios, ATM fraud attacks involve conventional electronic devices (such as card skimmers) and require the participation of the victim.

Figure 3: An attack tree modelling ATM fraud. The tree is loosely based on the attack-defence tree published by Fraile et al. [17].

In this empirical evaluation section we focus on ATM fraud, because more empirical data is available for these attacks than for the other types of attacks. Figure 3 presents an attack tree characterising such attacks that is loosely based on the attack-defence tree published by Fraile et al. [17]. In ATM fraud, criminals need covert access to the ATM, as this attack typically requires opening the machine’s case either by force or with a generic key, and installing a special device (e.g. a skimmer). Then the attacker waits until a victim uses the ATM and, as a consequence, enables the installed device. Lastly, the attacker gets cash from the victim’s account by means of various techniques, such as cash trapping, card cloning, etc.

6.2 Decorating the ATM Fraud Attack Tree

The decoration process we propose in this paper consists of three independent steps that are executed for a given attack tree. First, an attribute is chosen. In this case study, we focus on probability of success, that is, the probability that a given ATM is used to successfully execute ATM fraud. The attack tree structure jointly with the attribute rules determine the hard constraint set, i.e. the standard bottom-up constraints derived from the attack tree structure. Second, statistical information (historical data) related to the chosen attribute is gathered. Such statistical values are used to provide tree nodes with probability values. For the ATM fraud scenario, the available statistical data is presented in Section 6.2.1. Lastly, relations among nodes in the tree are established based on the analyst’s insight and domain knowledge. Section 6.2.2 presents the corresponding analysis for the ATM fraud scenario. The full set of constraints for the ATM fraud scenario is summarised in Section 6.3.

6.2.1 Statistical Analysis

The statistical values we consider here have been derived from the ATM Crime Report 2015 (EAST). In our case, we analyse ATM fraud incidents in Lisbon, which hosts 300 ATMs. We remark, however, that these values have been derived for illustrative purposes only and may not be accurate.

Between and , ATM fraud attacks have been reported in Lisbon. This gives a probability of an ATM being the target of fraud within a calendar year, if we assume a uniform distribution of these attacks. Because the report categorises ATM fraud into different attack types, we can provide probability values for some attack types by analysing the attack frequency as reported in the EAST report. The results can be found in Table 1. For the results reported in this table, we assume that historical attacks were uniformly distributed, and we rely on frequencies of attacks over long time periods to estimate probabilities, as in the OCTAVE method [13].
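The frequency-based estimate can be made concrete with a small Python sketch. The helper name `annual_attack_probability` and the incident count and observation period used below are hypothetical and are not the values from the EAST report; only the figure of 300 ATMs comes from the text. The sketch assumes, as above, that attacks are uniformly distributed over ATMs and over time.

```python
def annual_attack_probability(incidents: int, atms: int, years: float = 1.0) -> float:
    """Estimate the probability that a single ATM is attacked within one year,
    assuming attacks are spread uniformly over all ATMs and over the period."""
    if atms <= 0 or years <= 0:
        raise ValueError("atms and years must be positive")
    # Incidents per ATM per year, capped at 1 so the result is a probability.
    return min(1.0, incidents / (atms * years))


# Hypothetical figures: 30 reported incidents over 2 years across 300 ATMs.
p_fraud = annual_attack_probability(incidents=30, atms=300, years=2)
```

The same function can be applied per attack type by passing the number of incidents recorded for that type.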

Node Prob. Source
ATM Crime Report 2015 (EAST)
ATM Crime Report 2015 (EAST)
ATM Crime Report 2015 (EAST)
ATM Crime Report 2015 (EAST)
ATM Crime Report 2015 (EAST)
Table 1: Historical data values identified for some attack tree nodes from the ATM Crime report.

6.2.2 Domain Knowledge Constraints

In the previous subsection, we have shown how available statistical data can be used as a constraint in our decoration process. Another novelty of our approach is that we allow for domain knowledge constraints, that is, facts that must be additionally satisfied in the attack tree. The following list of facts is based on the previously mentioned ATM Crime Report 2015 and also on the European Central Bank report on card fraud (2015).

  •  is more likely than . Moreover,  is more likely than , which is more likely than .

  •  is more likely than .

  • ,  and  are all equally likely.

  •  and  are equally likely.

6.3 Full Set of Predicates

We now list the full set of predicates that will be used by the tools to solve the decoration problem. To simplify the presentation, we use a short-hand notation. All predicates listed in this section can be straightforwardly transformed into our predicate notation.

Hard predicates. Considering the attribute domain of probability of success, the attack tree shown in Fig. 3 corresponds to the following set of hard predicates:

  •  = (, ),

  •  = (, ),

  •  = (, , ),

  •  = (, ),

  •  = (, , ),

  •  = (, , ),

  •  = (, ),

  •  = (, ).
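For readers unfamiliar with the bottom-up rules behind such hard predicates, the following Python sketch evaluates a small tree in the probability domain. It assumes the usual rules for independent attack steps (a conjunction multiplies the children's probabilities, a disjunction is the complement of the product of the complements). The node names and the tree are hypothetical and much smaller than the tree in Fig. 3.

```python
from math import prod
from typing import Dict, List, Tuple

# Each inner node maps to its operator and its children; leaves are absent.
Tree = Dict[str, Tuple[str, List[str]]]


def bottom_up(tree: Tree, leaf_probs: Dict[str, float], node: str) -> float:
    """Compute the success probability of `node` from the leaf probabilities."""
    if node not in tree:
        return leaf_probs[node]  # leaf: value supplied by the analyst
    operator, children = tree[node]
    values = [bottom_up(tree, leaf_probs, child) for child in children]
    if operator == "AND":
        return prod(values)                      # all children must succeed
    if operator == "OR":
        return 1 - prod(1 - v for v in values)   # at least one child succeeds
    raise ValueError(f"unknown operator: {operator}")


# Hypothetical toy tree with made-up leaf probabilities.
toy_tree: Tree = {
    "fraud": ("AND", ["access", "cash_out"]),
    "cash_out": ("OR", ["clone_card", "trap_cash"]),
}
toy_leaves = {"access": 0.5, "clone_card": 0.5, "trap_cash": 0.5}
p_root = bottom_up(toy_tree, toy_leaves, "fraud")
```

Each hard predicate corresponds to one application of such a rule: it ties the value of an inner node to the values of its children, which is what allows values known for intermediate nodes to constrain the leaves.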

Soft predicates. Historical data values from Table 1 are encoded in the form of soft predicates:

  •  = ,

  •  = ,

  •  = ,

  •  = ,

  •  = .

We will subsequently refer to the soft predicates listed above as historical data constraints.

Domain knowledge from the ATM Crime report is encoded in the form of soft predicates as well:

  •   ,

  •   ,

  •   ,

  •   ,

  •  = ,

  •  = ,

  •  = ,

  •  = ,

Subsequently, we will refer to the set of soft predicates above as domain knowledge predicates.

6.4 Goals of the Analysis

We consider that the analyst has designed an attack tree covering ATM fraud scenarios as presented in Figure 3. This attack tree gives them the set of hard constraints given in Section 6.3. Furthermore, the analyst has elicited a set of soft constraints based on their knowledge of the problem space and the information available in the ATM Crime Report (also listed in Section 6.3). However, the analyst is not able to find enough data to estimate probabilities for all leaf nodes in the attack tree, which prevents them from straightforwardly applying the bottom-up evaluation procedure to compute the probabilities for all intermediate nodes and, ultimately, for the root node.

The analyst can, however, apply our methodology and decorate the attack tree. We consider the following possible analysis questions that can be investigated with our approach:

  • Are the attack tree and the set of constraints elicited by the analyst compatible? I.e. does the corresponding decoration problem have a solution? In our notation, for an attack tree , and attribute constraint-set , is ?

  • What is a solution for the given decoration problem? In our notation, the analyst is interested in finding a solution .

  • If the decoration problem has no solution, what is a solution that is the closest to satisfying all constraints? This question corresponds to solving the relaxed attack tree decoration problem formulated in Definition 7, for a chosen weakening relation.

We will demonstrate in the next section how our two implementations solve the relaxed attack tree decoration problem of the ATM case study, for the maximal and set inclusion weakening relations.

Note that one may argue that in our case study the analyst already has the probability of ATM fraud (the root node) from the historical data, and therefore, they can skip decorating the whole tree. However, the analyst may still want to perform the so-called what-if analysis, which consists in analyzing different but related scenarios. For example, the analyst could answer questions such as: What if the probability of this attack is in fact higher than I envisage? How will this affect my security posture? The what-if analysis requires a fully annotated tree, which can be provided using our decoration technique even over partially available data. Furthermore, in general, it cannot be assumed that the root node value will always be available from the historical data.

7 Empirical Evaluation Results

We now show how our two implementations can be applied to analyse the ATM case study introduced previously.

Table 2: Solutions of the ATM fraud attack tree decoration problem found by our tools. Columns: Node; Label; probability of attack success for each tool and set of predicates (hard predicates + historical data; hard predicates + historical data + domain knowledge; hard predicates + historical data + domain knowledge).
Figure 4: A valuation identified by the CSP-based tool for the attack tree in Fig. 3 with the hard constraints and the historical data predicates used as soft constraints (predicates are listed in Section 6.3).
Figure 5: A valuation identified by the CSP-based tool for the attack tree in Fig. 3 with the predicates listed in Section 6.2.2.
Figure 6: A valuation identified by the SQP-based tool for the attack tree in Fig. 3 with all hard and soft predicates listed in Section 6.2.2.

7.1 The CSP-based Implementation Showcase

We first exemplify the results of the CSP-based implementation on the ATM case study tree presented in Fig. 3. The first solution in Table 2 presents a possible valuation for this attack tree found by the CSP-based tool with the hard predicates and the historical data predicates listed in Section 6.3. Figure 4 visualises this solution in the ATM fraud attack tree. We have used the open source ADTool software [28, 21] for visualising attack trees.

If the analyst introduces the domain knowledge predicates as an additional set of constraints, the CSP tool indicates that the constraint satisfaction problem becomes unsatisfiable. Indeed, the constraints on  and  are contradictory, because historical data does not indicate that the probabilities of these attacks are exactly equal.

The Z3 solver that we use as the underlying constraint satisfaction engine is capable of finding an unsatisfiable core of the problem. In the ATM fraud case, the solver reports that three predicates constitute the unsatisfiable core:

  •   ,

  •  ,

  •  .

The solver then proceeds to identify a maximal set of predicates that is still satisfiable. In our case, the solver is able to satisfy all predicates but