1 Introduction
Industrial Control Systems (ICS) are cyber-physical systems responsible for maintaining the normal operation of industrial plants such as water treatment facilities, gas pipelines, power plants and industrial manufacturing. Modern industrial organizations perform an increasingly large number of operations across IT and Operational Technology (OT) infrastructures, resulting in interconnected ICS. This interconnection creates new challenges for protecting such integrated industrial environments and makes cyber-physical security threats even more difficult to mitigate [1]. Many industrial organizations have therefore started looking for ways to converge IT and OT infrastructures more securely and resiliently. In this paper, we consider software diversification as a way of deploying products across ICS and improving the resilience of the integrated systems. However, finding an optimal diversification strategy is subject to various real-world constraints, for instance limited flexibility of diversification for legacy systems, strict configuration policies and other (un)desirable configuration requirements. Our approach therefore incorporates these constraints into the optimization and evaluates their impact on the optimal diversification.
Software monoculture has been recognized as one of the key factors that promote and accelerate the spread of malware. It is widely acknowledged that diversifying network resources (e.g. software packages, hardware, protocols, connectivity) significantly mitigates the spread of malware between similar products and reduces the likelihood that a single exploit can be applied repeatedly [2]. When facing attacks using zero-day exploits (i.e. unknown exploits), the situation becomes even worse as there are no available countermeasures to stop them. Stuxnet, the first cyber weapon against ICS, leveraged four zero-day vulnerabilities. By September 2010, about 100,000 hosts in over 155 countries had been infected by Stuxnet [3]. The invariability or high similarity of products used in most affected hosts accounts for the rapid infection and prevalence of Stuxnet. Therefore, diversity-inspired countermeasures have been introduced to improve the resilience of a network against malware propagation. However, it remains unclear (i) how much diversification is required to reach optimal/maximal resilience, (ii) how exactly diverse resources should be deployed across a network, and (iii) how configuration constraints affect the optimal diversification. In this paper, we aim to mitigate Stuxnet-like worm propagation by optimally diversifying resources. We consider a variety of off-the-shelf products to provide services at each host, and find the optimal assignment of them to maximize network resilience.
There are two main trends of research investigating diversity as an effective defense mechanism. One trend seeks solutions in software development, such as n-version programming [4], program obfuscation [5] and code randomization [6]. The other trend studies diversity-inspired defense strategies from the perspective of security management. Specifically, the goal of this trend is to find an optimal assignment of diverse products for each host in a network. For example, O’Donnell and Sethu proposed diversifying products in a network using a distributed coloring algorithm [7]. Newell et al. focused on diversifying routing nodes and found an efficient way to compute the optimal solutions [8]. A set of security metrics has been introduced by Zhang et al. [9] to evaluate network diversity and its impact on the resilience against zero-day attacks. More details about related work are provided in Section 2.
Our work lies in the second trend of research, in which we aim to find the optimal assignment of products to diversify a network. Most of the existing work makes three critical assumptions: (i) there are no configuration constraints when searching for an optimal assignment of products; (ii) each node (or host) in a network is modelled by a single label, indicating that there is only one vulnerable product (or service) running on a node, namely there is only one attack vector on each node that requires diversification; (iii) each individual product shares no vulnerability with any other, which implies that unique exploits are necessary to compromise different products. We contend that these assumptions are unrealistic, and we therefore drop them in this work. We explicitly incorporate configuration constraints into the optimization process, and we consider a more realistic infection model of malware. In the following subsection, a simple example demonstrates how these assumptions prevent us from modelling the actual infection model of malware.
In this paper, we start by formally defining the similarity of vulnerabilities between products to reflect the similar exploitability of products. We conduct a statistical study to estimate such vulnerability similarities by using public vulnerability databases such as Common Vulnerabilities and Exposures (CVE) [10] and the National Vulnerability Database (NVD) [11]. Furthermore, we represent each host in a network by a multi-label node, which can be formally mapped to a discrete Markov Random Field (MRF) model. By combining the similarity metric and the MRF model, we can construct the corresponding infection model of potential zero-day exploits across a network with a given product assignment. We then focus on computing an optimal product assignment to minimize the prevalence of zero-day exploits. Before our main contributions are enumerated in Section 1.2, we present an illustrative example in Section 1.1 to further explain the motivation.
1.1 Motivational Example
We use a simple example in Figure 1 to explain the motivation of this work, where a simplified network with 8 hosts is presented. Most existing work models the network as in Figure 1(a), where each host is represented by a single-label node. A zero-day exploit breaks into the network from the entry node. To prevent the exploit (which targets circle labels) from infecting more hosts, most existing work suggests diversifying the hosts as indicated by the triangle and circle labels in Figure 1(a). The illustrated configuration is effective because the spread of the exploit stops after it compromises the entry node, and hence the chance of the target node being infected is 0.
Nevertheless, this assumes that different products share no vulnerabilities, which is not always the case according to our statistical study of the CVE/NVD database in Section 3. We find that most vulnerabilities reported to NVD can actually affect multiple products. Therefore, we improve the model by considering the vulnerability similarities between different products. Figure 1(b) demonstrates the zero-day propagation when the two products (circle and triangle labels) have a vulnerability similarity of 0.5, namely there is a 50% chance that the same zero-day vulnerability exploited at circle labels can also be exploited at triangle labels, and vice versa. In this case, the probability of the target node being breached increases to approximately 0.125.
In most realistic scenarios, a host is expected to deliver multiple services (e.g. operating systems, web servers, email servers, databases, etc.), each of which is potentially vulnerable to zero-day attacks. Each host therefore offers several alternative attack vectors, and sophisticated attackers can choose the vulnerability with the highest success rate to exploit the host. We thus represent each host by multiple labels corresponding to the different services on the host. As shown in Figure 1(c), we add another shape of label (i.e. red squares) to some hosts, and introduce a sophisticated attacker in possession of two zero-day exploits (one for circle labels and the other for square labels). As can be seen in Figure 1(c), the attacker uses the square-label exploit (rather than the circle-label exploit) to infect its adjacent node, which gives a greater chance of success. Consequently, with the combination of two zero-day exploits, the risk of the target node being compromised is further increased to approximately 0.5.
1.2 Main Contributions
From the example above, we learn that in order to find the optimal way to diversify network resources, we first need to model the resources accurately. Based on this model, we can determine precisely the infection model of potential exploits across a network and find the optimal assignment of products to minimize the prevalence of exploits. We summarize the main contributions of this paper as follows:

We demonstrate that our optimization approach is directly applicable in practice to find the optimal diversification strategy when integrating ICS with modern IT infrastructure. We use a real-world case study inspired by Stuxnet propagation to find optimal diversification solutions for the IT-OT convergence of ICS, in particular accommodating real-world configuration constraints and limited flexibility of diversification in certain areas.

To the best of our knowledge, our work is the first attempt to explicitly consider the vulnerability similarity between products when finding the optimal diversification solutions. Specifically, we represent each host by a multi-labelling model with each label corresponding to a service on the host. A variety of products for each service is also modelled to render different assignments of products. By means of the vulnerability similarity between assigned products, the infection of malware across the network can be accurately estimated.

In order to compute the optimal assignment of products, we model the network by a discrete Markov Random Field (MRF), which can then be optimized by an efficient sequential tree-reweighted message passing (TRW-S) algorithm [12]. In this way, our approach scales well to large-scale and high-density networks.
1.3 Paper Organization
The rest of the paper is organized as follows. We discuss related work in Section 2. The similarity metric is introduced in Section 3, together with the statistical analysis based on the CVE/NVD databases. We formally represent the network and the addressed research problem in Section 4. The computation of optimal solutions is given in Section 5. A diversity metric based on Bayesian Networks (BN) is given in Section 6 to evaluate our solutions. The case study about mitigating Stuxnet propagation in integrated ICS is presented in Section 7 to demonstrate the practical usage of our optimization approach, and an in-depth evaluation of our optimization approach is given in Section 8. The scalability analysis can be found in Section 9. The paper finishes with a discussion and conclusion in Section 10.
2 Related Work
Software diversity has long been recognized as a mechanism for improving the resilience and security of networked computing systems [13, 2, 14]. The rationale is that it forces attackers to develop a unique exploit to compromise an individual product at each node in a network, thus substantially increasing the time and cost needed to penetrate a networked system at a massive scale.
A variety of methodologies for software diversity have been studied in the literature, among which the first direction of research focuses on software development diversity. Examples include n-version programming [4], execution environment diversity [15] and code randomization [6].
The second direction, which is also the focus of this paper, concerns strategies for the diversified deployment of resources in a networked system. For instance, based on the assumption that different variants of products share no common vulnerabilities, O’Donnell and Sethu [7] proposed assigning diverse software packages in a communication network by a distributed coloring algorithm to limit the total number of nodes an attacker can compromise using a limited attack toolkit. Newell et al. [8] found an efficient approach to compute the optimal solution for placing diverse software/OS variants on routing nodes to improve overall network resilience, given the assumption that each variant is compromised independently with some probability. In addition, some work has defined formal security metrics for software diversity. For example, Wang et al. [16] defined a network security metric, k-zero day safety, for measuring the risk of unknown vulnerabilities based on the number of unknown vulnerabilities required to compromise network assets. Furthermore, Zhang et al. [9] introduced three security metrics to evaluate the resilience against zero-day attacks using different diversity strategies, based on the number and distribution of distinct resources inside a network, the least attacking effort required to compromise certain important resources, and the average attacking effort required to compromise critical assets, respectively. Borbor et al. [17] explicitly considered cost constraints when optimizing software diversity strategies. We note that most existing work assumes a very limited attack surface at each host, namely that there is only one vulnerable product at each host for attackers to exploit. By contrast, we explicitly model various attack vectors (offered by multiple products) at each host.
Vulnerability databases such as CVE/NVD can provide statistical evidence for measuring software diversity. For example, Garcia et al. [18] presented a study on the overlap of vulnerabilities in 11 different OSes with OS vulnerability data from NVD. In [19], Bozorgi et al. trained classifiers to predict whether and how soon a vulnerability is likely to be exploited by applying machine learning techniques to CVE data. On the validity of CVE/NVD, Johnson et al. assessed several well-known vulnerability databases and concluded that NVD was the most trustworthy database [20]; we therefore use NVD in this paper.
Some existing work [16][9][21] studied malware propagation based on attack graphs to assess the risk of malware along specific attack paths and network topologies. Attack graphs have been extensively studied in the community to express the exploitation conditions of vulnerabilities. However, due to the unknown nature of zero-day vulnerabilities, we contend that such approaches are not always feasible for modelling zero-day malware. In contrast to existing work using attack graphs, our work focuses on how quickly zero-day exploits spread across a network configured with similar products. Highly similar configurations (in terms of potential vulnerabilities) would accelerate the prevalence of zero-day exploits. Instead of producing specific propagation paths, we use more general undirected edges to represent the connections (rather than directed information flow) between different hosts. We then use the proposed similarity metric to estimate the infection of zero-day exploits on each edge and find optimal diversification solutions.
3 Similarity of Product Vulnerability
In this section, we formally introduce the notion of vulnerability similarity between a pair of products, namely the likelihood of an exploit compromising both products.
Definition 1 (Similarity of Product Vulnerability)
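Let $p_i$, $p_j$ be a pair of products, and let $V_i$ and $V_j$ be the sets of vulnerabilities of $p_i$ and $p_j$ respectively. The vulnerability similarity between $p_i$ and $p_j$ can be obtained by the Jaccard similarity coefficient [22]:
$$sim(p_i, p_j) = \frac{|V_i \cap V_j|}{|V_i \cup V_j|}$$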
Given a pair of products, the vulnerability similarity is estimated by the ratio of the number of shared vulnerabilities between the two products to the total number of vulnerabilities. The rationale for this is to capture statistically how similar the vulnerabilities found on two products are.
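As a minimal sketch of Definition 1, assuming each product's vulnerabilities are available as a set of identifiers, the Jaccard coefficient can be computed as below. The function name and the synthetic identifiers are illustrative assumptions; only the set sizes (479, 1028 and 328 shared) are taken from the Windows XP and Windows 7 entries of Table II.

```python
from typing import Set


def vulnerability_similarity(vulns_a: Set[str], vulns_b: Set[str]) -> float:
    """Jaccard coefficient |A & B| / |A | B|; taken to be 0.0 when both sets are empty."""
    union = vulns_a | vulns_b
    if not union:
        return 0.0
    return len(vulns_a & vulns_b) / len(union)


# Synthetic identifiers (not real CVE ids), sized to match Table II:
# 479 and 1028 vulnerabilities in total, of which 328 are shared.
shared = {f"V{i}" for i in range(328)}
win_xp = shared | {f"X{i}" for i in range(479 - 328)}
win_7 = shared | {f"S{i}" for i in range(1028 - 328)}

print(round(vulnerability_similarity(win_xp, win_7), 3))  # 328 / 1179
```

With these sizes the union contains 479 + 1028 - 328 = 1179 identifiers, so the ratio agrees with the 0.278 reported for this pair in Table II.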
To make the metric concrete, we can use the data from the NVD database [11] to calculate the similarity metric for any pair of off-the-shelf products. There are more than 92,492 vulnerabilities published by NVD at the time of this paper. Each entry in the NVD feed describes a specific vulnerability. An example of an NVD entry is given in Table I, which includes a unique Common Vulnerabilities and Exposures (CVE) identifier in the format “CVE-YEAR-NUMBER”, the attack scenarios using the vulnerability, and the affected products sorted by Common Platform Enumerations (CPEs). CPE provides a well-formed naming scheme for IT systems, platforms and packages.
Table I: An example NVD entry

| Field | Content |
| --- | --- |
| CVE-ID | CVE-2016-7153 |
| Overview | The HTTP/2 protocol does not consider the role of the TCP congestion window in providing information about content length, which makes it easier for remote attackers to obtain cleartext data by leveraging a web-browser configuration in which third-party cookies are sent, aka a "HEIST" attack. |
| Vulnerable software & versions | |

Given the large number of vulnerabilities in NVD, CPE serves to sort vulnerabilities according to their affected products. We developed a program based on cve-search [23] to fetch the necessary data from NVD, filter out the vulnerabilities for each studied product, and calculate the similarity of vulnerabilities between products. The pairwise similarities are stored as Similarity Tables. In this way, we can calculate the similarity of vulnerabilities between any pair of products listed in NVD.
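The pipeline above can be sketched as follows. This is not the authors' program and does not call cve-search; it assumes the NVD data has already been parsed into records with an `id` and a list of affected `cpes`, and the record identifiers and helper names are hypothetical.

```python
from itertools import combinations

WIN7 = "cpe:/o:microsoft:windows_7"
WIN81 = "cpe:/o:microsoft:windows_8.1"

# Hypothetical hand-written records standing in for parsed NVD entries.
RECORDS = [
    {"id": "EXAMPLE-1", "cpes": [WIN7, WIN81]},
    {"id": "EXAMPLE-2", "cpes": [WIN7 + ":-:sp1"]},
    {"id": "EXAMPLE-3", "cpes": [WIN81]},
    {"id": "EXAMPLE-4", "cpes": [WIN7, WIN81]},
]


def index_by_product(records, product_prefixes):
    """Map each studied CPE prefix to the set of vulnerability ids affecting it."""
    index = {prefix: set() for prefix in product_prefixes}
    for record in records:
        for cpe in record["cpes"]:
            for prefix in product_prefixes:
                # Match the product itself or any more specific CPE below it.
                if cpe == prefix or cpe.startswith(prefix + ":"):
                    index[prefix].add(record["id"])
    return index


def similarity_table(index):
    """Pairwise Jaccard similarity for every unordered pair of products."""
    table = {}
    for a, b in combinations(sorted(index), 2):
        union = index[a] | index[b]
        table[(a, b)] = len(index[a] & index[b]) / len(union) if union else 0.0
    return table


table = similarity_table(index_by_product(RECORDS, [WIN7, WIN81]))
print(table[(WIN7, WIN81)])
```

Keying the table by sorted product pairs stores each symmetric entry only once, mirroring the lower-triangular layout of the similarity tables.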
For the purpose of illustration, we use operating systems and web browsers as examples. We collect vulnerabilities for 9 common OS products and 8 common web browsers reported in the period between 1999 and 2016. Table II enumerates the pairwise similarity between the chosen OS products and Table III that between the chosen web browsers. These products were chosen mainly because they have been ranked among the most vulnerable products by CVE Details [24]. Each entry of the two tables contains the pairwise similarity calculated by Definition 1 and the number of shared vulnerabilities between products in brackets. The diagonal entries give the total number of vulnerabilities of the row/column product. As the pairwise similarity is symmetric, the other half of each similarity table is omitted. To preserve the generality and flexibility of our study, we consider each release/version of a product as a distinct product. For instance, Windows 8.1 and Windows 7 are treated as two individual products and sorted by two different CPE queries cpe:/o:microsoft:windows_7 and cpe:/o:microsoft:windows_8.1.
From the tables, we can observe that products of the same vendor tend to have higher similarities. Two exceptions are observed in Table II: Mac OS X 10.5 and Windows 7 share 8.1% of their vulnerabilities, and Ubuntu 14.04 and Debian 8.0 have 20.8% of their vulnerabilities in common, despite these two pairs of products being from different vendors. We also note that products with a longer gap between their release dates have a lower similarity. Based on the statistical studies in both tables, we conclude that a single vulnerability is likely to affect multiple products across different versions, different vendors and different platforms, which implies that a single zero-day vulnerability could be exploited on heterogeneous hosts in a network. Therefore, to maximize the resilience of a network against zero-day exploits, it is desirable to use up-to-date products from diverse vendors across a network. For instance, Windows 10 has much lower similarities of vulnerabilities with the other Windows OSes, and even shares no vulnerability with Windows XP. However, it is not always feasible to deploy the latest and most diverse products due to their incompatibilities with other services. For instance, SIMATIC WinCC is one of the most widely applied SCADA systems, but it can only operate on Windows OS, and most releases of WinCC do not fully support Windows 10 yet.
It is worth mentioning that the versions of the selected software in both tables are constrained by the granularity of the CPE search engine. The CPE entries for many vulnerabilities in NVD are incomplete or vary in granularity.
In this section, we demonstrated the use of CVE data to calculate the vulnerability similarity. The NVD database is one of the most well-known publicly accessible vulnerability databases, and it covers most off-the-shelf products and up-to-date vulnerability information.
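Table II: Pairwise vulnerability similarity between OS products (number of shared vulnerabilities in brackets; diagonal entries give the total number of vulnerabilities of each product)

| | WinXP2 | Win7 | Win8.1 | Win10 | Ubt14.04 | Deb8.0 | Mac10.5 | Suse13.2 | Fedora |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| WinXP2 | 1.00 (479) | | | | | | | | |
| Win7 | 0.278 (328) | 1.00 (1028) | | | | | | | |
| Win8.1 | 0.009 (10) | 0.228 (298) | 1.00 (572) | | | | | | |
| Win10 | 0 (0) | 0.124 (164) | 0.697 (421) | 1.00 (453) | | | | | |
| Ubt14.04 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1.00 (612) | | | | |
| Deb8.0 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.208 (195) | 1.00 (519) | | | |
| Mac10.5 | 0 (0) | 0.081 (109) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1.00 (424) | | |
| Suse13.2 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.170 (161) | 0.112 (102) | 0 (0) | 1.00 (492) | |
| Fedora | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0.083 (75) | 0.049 (41) | 0.001 (1) | 0.116 (89) | 1.00 (367) |
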
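Table III: Pairwise vulnerability similarity between web browsers (number of shared vulnerabilities in brackets; diagonal entries give the total number of vulnerabilities of each product)

| | IE8 | IE10 | Edge | Chrome | Firefox | Safari | SeaMonkey | Opera |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IE8 | 1.0 (349) | | | | | | | |
| IE10 | 0.386 (240) | 1.0 (513) | | | | | | |
| Edge | 0.014 (7) | 0.121 (73) | 1.0 (194) | | | | | |
| Chrome | 0 (0) | 0 (0) | 0.001 (2) | 1.0 (1661) | | | | |
| Firefox | 0 (0) | 0 (0) | 0.001 (2) | 0.005 (15) | 1.0 (1502) | | | |
| Safari | 0 (0) | 0 (0) | 0.002 (2) | 0.009 (21) | 0.003 (6) | 1.0 (766) | | |
| SeaMonkey | 0 (0) | 0 (0) | 0 (0) | 0.001 (3) | 0.450 (683) | 0.001 (1) | 1.0 (492) | |
| Opera | 0 (0) | 0 (0) | 0.003 (1) | 0.003 (6) | 0.004 (7) | 0.004 (4) | 1.00 (492) | 1.00 (225) |
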
4 Diverse Product Assignment
In this section, we present the formal model of a product assignment for a given network. The goal is to find a diversification solution that assigns products to each host such that malware propagation between similar products is effectively mitigated.
Each host has to provide a set of services, such as an operating system, a web browser and a database server. Each service can be provided by a range of diverse products. Therefore, we formally define a network in terms of hosts, links, services and products as below.
Definition 2 (Network)
Let $N = \langle H, L, S, P \rangle$ be a network, where

$H = \{h_1, \dots, h_n\}$ is a set of hosts.

$L \subseteq H \times H$ captures the links between a pair of hosts,

$S = \{s_1, \dots, s_m\}$ is a set of services, and $S_h$ denotes a set of services provided by a host $h$.
$$S_h \subseteq S \qquad (1)$$

$P$ is a set of off-the-shelf products, and hence each service $s$ can be provided by a range of diverse products,
$$P_s \subseteq P \qquad (2)$$
Assigning one product to each service on a host is termed an assignment of products for a host.
Definition 3 (Product Assignment)
Given a network $N$, an assignment of products is captured by $\alpha$, such that $\alpha(h, s)$ is the product assignment for a service $s$ at the host $h$: $\alpha(h, s) \in P_s$. The assignment for all services at a host can be derived by $\alpha(h)$:
$$\alpha(h) = \{\alpha(h, s) \mid s \in S_h\}$$
Therefore $\alpha(h)$ allocates products to all services running on a host, whilst $\alpha(h, s)$ assigns a product to a specific service of a host. An example network is illustrated in Figure 2, where a network consisting of 6 hosts is modelled. Each host provides up to two essential services: web browser and database. Three diverse web browser products and three database products are available to choose from. Each host might have a different range of products to choose from. A possible product assignment is highlighted by red circles in Figure 2.
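The network and assignment definitions can be pictured with a small data-structure sketch. All host, service and product names below are hypothetical, and for brevity the sketch assumes the range of products depends only on the service, whereas the text allows it to differ per host.

```python
from typing import Dict, Set, Tuple

# Hypothetical three-host network; names are illustrative only.
links = {("h1", "h2"), ("h2", "h3")}
services_on: Dict[str, Set[str]] = {
    "h1": {"browser", "database"},
    "h2": {"browser"},
    "h3": {"browser", "database"},
}
products_for: Dict[str, Set[str]] = {
    "browser": {"b1", "b2", "b3"},
    "database": {"d1", "d2", "d3"},
}

# One product per (host, service) pair.
assignment: Dict[Tuple[str, str], str] = {
    ("h1", "browser"): "b1", ("h1", "database"): "d1",
    ("h2", "browser"): "b2",
    ("h3", "browser"): "b3", ("h3", "database"): "d2",
}


def is_valid_assignment(assignment, services_on, products_for) -> bool:
    """True when every service on every host gets exactly one product offering that service."""
    required = {(h, s) for h, services in services_on.items() for s in services}
    if set(assignment) != required:
        return False
    return all(assignment[(h, s)] in products_for[s] for (h, s) in required)


def host_assignment(assignment, host) -> Dict[str, str]:
    """Products assigned to all services of one host, keyed by service."""
    return {s: p for (h, s), p in assignment.items() if h == host}


print(is_valid_assignment(assignment, services_on, products_for))
print(host_assignment(assignment, "h1"))
```

Keying the assignment by (host, service) pairs keeps the per-service and per-host views of Definition 3 in one structure.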
Now the problem is to find an optimal assignment which allocates the most diverse products to each pair of connected hosts, so that the likelihood of malware propagation between two hosts is minimized. Nevertheless, some configuration requirements might prevent us from choosing the optimal product assignment in practice. Therefore, we formally introduce local and global constraints to incorporate those requirements into the optimization process.
A local constraint indicates that for a particular host, a product is required to either configure with another product (expressed by ), or avoid the product (expressed by ). Such requirements can also be applied to all hosts by using global constraints.
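One way to picture these constraints is the sketch below. It assumes a particular reading: a constraint only applies to a host on which its first product is installed, and then either requires or forbids the second product on that host. The `Constraint` class and the product labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Constraint:
    product: str                 # product that triggers the constraint
    other: str                   # product it must be combined with, or must avoid
    together: bool               # True: must be combined; False: must be avoided
    host: Optional[str] = None   # None marks a global constraint (all hosts)


def satisfies(assignment: Dict[Tuple[str, str], str],
              constraints: List[Constraint]) -> bool:
    """Check every local and global constraint against a product assignment."""
    installed_on: Dict[str, set] = {}
    for (host, _service), product in assignment.items():
        installed_on.setdefault(host, set()).add(product)
    for c in constraints:
        hosts = installed_on if c.host is None else {c.host: installed_on.get(c.host, set())}
        for installed in hosts.values():
            if c.product not in installed:
                continue  # constraint is not triggered on this host
            if c.together != (c.other in installed):
                return False
    return True


assignment = {
    ("h1", "os"): "win7", ("h1", "scada"): "wincc",
    ("h2", "os"): "win10", ("h2", "browser"): "ie8",
}
constraints = [
    Constraint("wincc", "win7", together=True, host="h1"),  # local, desirable
    Constraint("ie8", "win10", together=False),             # global, undesirable
]
print(satisfies(assignment, constraints))
```

Representing a global constraint as a local one with no host attached reflects the remark that a global constraint is the same requirement applied to all hosts.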
Definition 4 (Configuration Constraints)
Given a network , a set of constraints expresses any (un)desirable product combinations in the solution. A constrained solution allocates products subject to .

a local constraint is applied to a specific host in the form of: or such that the constrained solution satisfies:

a global constraint is applied to all hosts in in the form of: or such that satisfies:
The usage of constraints is demonstrated in the later case study (Section 7.2). Now we can define the optimal assignment of products and the constrained optimal assignment as follows.
Definition 5 (Optimal Diversification)
Given a network , an optimal assignment of products is captured by , such that is the optimal product assignment for a host . A constrained optimal solution is denoted by which provides an optimal product assignment subject to a set of local and global constraints .
We adopt the following notation convention throughout this paper. denotes an assignment of products for a network in general. is for an optimal assignment without constraints, and is for a constrained optimal assignment. Specifically, includes the products assigned to a host , and is the product assigned to a particular service at the host .
In the next section, we focus on finding such an optimal assignment of products for a given network, as well as computing constrained optimal solutions in Section 5.2.
5 Finding the Optimal Diversification
First of all, we need a model that accurately represents a network in which each host has multiple services and each service can be provided by a range of products. More importantly, this model has to offer sufficient flexibility, because each host runs a customized set of services and even the same service may have different selections of products at different hosts due to compatibility requirements. Furthermore, we have to consider whether an efficient optimization algorithm exists for such a model. For these purposes, the optimal diversification problem is represented by a discrete Markov Random Field (MRF), which turns it into an optimal assignment problem on the MRF that can be solved by an efficient message passing algorithm.
Specifically, we model this problem as a discrete MRF where each host has up to services, and there are up to products for each service . The optimization assigns up to products – one for each service on each host – to reach the global minima of the propagation. Given a network , we derive the energy function to denote the unary cost for each host and pairwise cost between a pair of connected hosts.
(3) 
where the unary term denotes how likely a product is preferred by a host to deliver a given service, and the pairwise term is a cost between the products assigned to a pair of connected hosts, which in our context is the pairwise similarity between products. Our problem is then mapped to the context of Conditional Random Fields [25], in which a minimum of the energy corresponds to a maximum a posteriori (MAP) labeling of the services. In the following subsections, we discuss the formulation of the unary cost and pairwise cost in more detail.
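The structure of such an energy, a sum of unary costs over assigned labels plus pairwise costs over links, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact formulation: the unary cost is a constant 0.01, pairs absent from the lookup are given similarity 0, and the pairwise cost is applied only to products that provide the same service on two connected hosts. The two similarity values are taken from Table III.

```python
# Similarities taken from Table III; unlisted pairs default to 0.0 (assumption).
SIMILARITY = {
    frozenset({"IE8", "IE10"}): 0.386,
    frozenset({"Chrome", "Firefox"}): 0.005,
}


def pairwise(p, q):
    """Pairwise cost: similarity of two products (1.0 for identical products)."""
    if p == q:
        return 1.0
    return SIMILARITY.get(frozenset({p, q}), 0.0)


def unary(host, service, product):
    """Assumed constant unary cost: no preference amongst products."""
    return 0.01


def energy(assignment, links, unary, pairwise):
    """Unary costs over all (host, service) labels plus pairwise costs over links."""
    total = sum(unary(h, s, p) for (h, s), p in assignment.items())
    for a, b in links:
        services_a = {s for (h, s) in assignment if h == a}
        services_b = {s for (h, s) in assignment if h == b}
        for s in services_a & services_b:
            total += pairwise(assignment[(a, s)], assignment[(b, s)])
    return total


links = [("h1", "h2")]
similar = {("h1", "browser"): "IE8", ("h2", "browser"): "IE10"}
diverse = {("h1", "browser"): "Chrome", ("h2", "browser"): "Firefox"}
print(energy(similar, links, unary, pairwise), energy(diverse, links, unary, pairwise))
```

The more diverse assignment yields the lower energy, which is the behaviour the minimization exploits.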
5.1 Unary Cost
The unary cost is derived from the preference of a specific product for a host. By considering one product being assigned to each host, our unary cost is expressed as
(4) 
where presents the probability that a product is assigned to . In general cases, there is no specific preference amongst available products for each host to deliver a service. Therefore this term can be replaced by a small constant for optimization. Although such a unified cost provides fast convergence in optimization, the realworld networks would be more complex and constrained by practical requirements as discussed in Section 4. Therefore, the unary cost is further refined subject to any constraints in the next subsection.
5.2 Unary Cost with Constraints
In our system, constraints are implemented as conditional patches to our energy function, in particular the unary cost. For a local constraint expressing an undesirable requirement or a desirable requirement , our unary cost can be represented as follows:
For unconstrained services, there is no preference amongst products and the unary cost is given by a small constant . For the constrained services (when ), the unary cost is given by which is interpreted as below according to the two types of local constraints above:
where the desirable constraint contributes no additional cost whilst the undesirable constraint introduces a large cost. In this case, the optimization is induced to reach desirable assignments, but avoid undesirable ones in energy minimization. Note that such customized unary cost can also be applied for any global constraint, which is equivalent to applying a local constraint to all hosts.
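A minimal sketch of this patching idea is given below. It assumes that the host-level label is the set of products assigned to the host, that the small constant is 0.001, and that each violated constraint adds a penalty of 10^6; these values, the tuple layout of a constraint and the product labels are illustrative assumptions, not the paper's settings.

```python
EPSILON = 1e-3     # assumed small constant for unconstrained choices
LARGE_COST = 1e6   # assumed penalty for violating a constraint


def constrained_unary_cost(host, products, constraints):
    """Unary cost of assigning `products` (a set) to `host`.

    Each constraint is a tuple (host_or_None, product, other, together):
    host_or_None is None for a global constraint, and `together` states
    whether `product` must be combined with `other` or must avoid it.
    """
    cost = EPSILON
    for c_host, product, other, together in constraints:
        applies = c_host is None or c_host == host
        if not applies or product not in products:
            continue
        if together != (other in products):
            cost += LARGE_COST  # violated constraints are heavily penalized
    return cost


constraints = [("h1", "wincc", "win10", False)]  # hypothetical local constraint
print(constrained_unary_cost("h1", {"wincc", "win10"}, constraints))
print(constrained_unary_cost("h1", {"wincc", "win7"}, constraints))
```

Because the penalty only changes the unary term, the same minimization procedure can be reused unchanged for constrained and unconstrained problems.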
5.3 Pairwise Cost
The pairwise cost is derived from the similarity between the assigned products that provide the same service. As mentioned previously, a pair of connected hosts being assigned with more similar products would have greater infection rate, namely a zeroday exploit at one host is more likely to infect the other. When defining the pairwise cost, we penalize such similarities in order to provide a more diverse product assignment for the network. To achieve that, we define the pairwise cost term as:
(5) 
where and denote a pair of connected hosts, and presents the similarity between two products providing the same service on a pair of connected hosts. It serves as a strong regularization on the product assignment as it ideally prevents the same product from being assigned to connected hosts.
5.4 Energy Optimization
Based on the unary cost and pairwise cost, we can determine the optimal assignment for by minimizing the energy function as below:
Minimizing such an energy is NP-hard, so in practice an approximate minimization algorithm is used to obtain a solution. The well-known techniques for solving such problems are based on graph cuts and belief propagation (BP). The former is currently considered the most accurate minimization approach for energy functions arising in many complex scenarios, but it can only be applied to a limited class of energy forms. If the form is outside this class, like our energy function in Eq. 3, BP is the common alternative. However, BP might not converge when applied to a wide range of convex functions. Instead, we employ a sequential tree-reweighted message passing algorithm (TRW-S) [12]. Similar to BP, TRW-S can be applied to problems with the energy form in Eq. 3. It is also guaranteed to give an optimal MAP solution in most cases [12]. TRW-S outperforms BP and graph cuts on many heavy tasks. It also shows great potential for labeling problems with nearly flat probabilities, as well as for large-scale networks.
Our optimization scheme mainly follows [12], extended in a multilevel fashion to better fit our problem. Specifically, we enable parallel computation and even GPU acceleration. In addition, the optimization of the constrained energy is straightforward because our constraints are efficiently encoded into the unary cost by manipulating the cost for specific hosts and assignments. More details about the scalability analysis are given in Section 9. A case study using our optimization approach in practice can be found in Section 7.
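To make the objective concrete, the sketch below minimizes a toy energy by exhaustive search. This is deliberately not TRW-S: it enumerates every labeling and is only feasible for tiny networks, which is exactly why a message passing algorithm is needed at scale. It assumes one service per host, two hypothetical products A and B with similarity 0.2, and no unary preference.

```python
from itertools import product as cartesian


def brute_force_assignment(hosts, links, options, similarity, unary=0.0):
    """Exhaustively search all labelings and return the one with minimal energy."""
    best_labels, best_energy = None, float("inf")
    for labels in cartesian(*(options[h] for h in hosts)):
        chosen = dict(zip(hosts, labels))
        energy = unary * len(hosts) + sum(
            similarity(chosen[a], chosen[b]) for a, b in links
        )
        if energy < best_energy:
            best_labels, best_energy = chosen, energy
    return best_labels, best_energy


def similarity(p, q):
    """Assumed toy similarity: identical products 1.0, different products 0.2."""
    return 1.0 if p == q else 0.2


hosts = ["h1", "h2", "h3"]
links = [("h1", "h2"), ("h2", "h3")]   # a path of three hosts
options = {h: ["A", "B"] for h in hosts}

best, cost = brute_force_assignment(hosts, links, options, similarity)
print(best, cost)
```

On this path the minimum alternates the two products along the links. The search space grows exponentially with the number of hosts, so the sketch serves only as a reference against which an approximate solver could be checked on small instances.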
6 Evaluation of Network Diversity
The purpose of this section is to evaluate how much diversity a specific product assignment brings to a network, and we achieve this by using a network diversity metric based on Bayesian Networks [9]. Given a network and a specific product assignment , we first construct its corresponding Bayesian Network (in Section 6.1) to estimate the infection rate on each edge between hosts, based on which we can evaluate the network diversity by calculating the value of the metric (in Section 6.2).
6.1 Bayesian Network Evaluation Model
Before we define the complete Bayesian Network, we need a way to capture the impact of the attacker’s behavior on malware propagation. From an attack entry host, there are different ways to reach the final target by successively exploiting a number of stepping-stone hosts. At each attack step from one host to another, there is often more than one vulnerable product to exploit and induce further spread of the malware. Therefore, we introduce the notions of exploitation paths and attack nodes. A conventional attack path chains a number of hosts from an entry to the final target, whereas an exploitation path explicitly indicates the product that is exploited at each host along an attack path. Attack nodes then capture which product is chosen for exploitation between a pair of connected hosts.
Definition 6 (Attack Nodes)
Given a network with a specific product assignment , denotes a set of attack nodes connecting a pair of connected hosts. Each attack node includes a set of products on the destination host , which can be exploited from a source host , and a silent action (i.e. none). Therefore the domain states of an attack node is .
Attackers can choose one of the products to exploit or remain silent. Different choices lead to different propagation rates to the destination host.
Now we can formally define the Bayesian Network (BN) for a given network subject to a specific assignment . Attack nodes are added into the BN. defines for each attack node the likelihood of a product being selected to exploit. For instance, defines the conditional probability tables (CPT) for attack nodes in Figure 3 and in this example attackers choose products to exploit uniformly. defines the risk distribution of each host without considering the vulnerability similarities of products, i.e. products share no vulnerabilities with each other, while takes the similarities into account. Therefore, is constant for a given static network, regardless of the assigned products, while is directly influenced by .
Definition 7
Let be a Bayesian Network for a given network with a specific product assignment , where

is the associated set of attack nodes.

is the prior probability distribution of root hosts.

is the average infection rate of a zeroday exploit.

includes conditional probability distribution (CPT) for all attack nodes such that
denotes , the probability distribution over a set of products to exploit next. 
includes conditional probabilities of all nonroot host nodes given their preceding attack nodes. denotes , and subject to noisyOR operator [26],
where is the probability of the host being compromised from .

includes conditional probabilities of all nonroot host nodes by considering the vulnerability similarity between products. denotes , and subject to noisyOR operator,
where is the probability of the host being compromised by considering the exploited products at the preceding hosts and , and the probability is estimated as follows:
Without considering the similarity, the probability of a host being infected depends only on the product chosen for exploitation at that host, and the infection rate is set to the average zero-day propagation rate. To model the reuse of exploits on similar products, we introduce an extra set of links into the BN, indicated by red dashed lines in Figure 3. These links connect the preceding attack nodes to the current host, so that the similarity-aware conditional probabilities can be represented. As demonstrated in Figure 3, the probability distribution of a host is then conditional on the current attack node as well as its preceding attack node. If both attack nodes exploit the same type of product, the chance of the host being compromised is the vulnerability similarity of the two exploited products, which is assumed to be 0.9 in this example. If different types of products are exploited, the average zero-day propagation rate is used. For this rate we use the same value as in the existing work [9][27]. It is a nominal value, reasonably low for zero-day vulnerabilities, and may be changed depending on the assessment of the actual application scenario.
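The rule described above, combined with the noisy-OR operator used in Definition 7, can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the function names, the (service type, product) tuple encoding, the treatment of the silent action as probability 0, and the placeholder rate 0.1 are assumptions, and only the 0.9 similarity comes from the example in the text.

```python
def edge_infection_prob(prev_exploit, cur_exploit, similarity, p_avg):
    """Probability that the destination host is compromised via one attack node.

    prev_exploit / cur_exploit are (service_type, product) pairs, or None for
    the silent action (assumed here to never infect the host).
    """
    if cur_exploit is None:
        return 0.0
    if prev_exploit is not None and prev_exploit[0] == cur_exploit[0]:
        # Same type of product exploited twice in a row: the exploit may be
        # reused, so the vulnerability similarity of the two products applies.
        return similarity[frozenset((prev_exploit[1], cur_exploit[1]))]
    # Different product types (or no preceding exploit): average zero-day rate.
    return p_avg


def noisy_or(probs):
    """Noisy-OR: the host stays clean only if every incoming attempt fails."""
    survive = 1.0
    for p in probs:
        survive *= 1.0 - p
    return 1.0 - survive


# Hypothetical inputs: 0.9 is the similarity used in the text's example;
# 0.1 is only a placeholder for the average zero-day propagation rate.
similarity = {frozenset(("Windows XP", "Windows 7")): 0.9}
P_AVG = 0.1

same_type = edge_infection_prob(("OS", "Windows XP"), ("OS", "Windows 7"), similarity, P_AVG)
diff_type = edge_infection_prob(("WB", "IE8"), ("OS", "Windows 7"), similarity, P_AVG)
combined = noisy_or([same_type, diff_type])
```

Here `combined` is the probability that the host is compromised when it can be reached through both attack nodes.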
From this simple example, we can see that different assignments would yield BN models with different infection rates on edges. Given a product assignment, we can construct the corresponding BN model to estimate the risk of the target node, which can be used to evaluate the current diversification introduced by the given assignment.
6.2 Network Diversity Metrics
In this subsection, we present the diversity metric used in this paper to evaluate product assignments for a network. The network diversity metric was proposed by Zhang et al. [9] to evaluate a diversified network by measuring the average attack effort needed to compromise the network. We adapt the metric to fit our model, which considers the vulnerability similarity of products.
Definition 8 (BN-based Diversity Metric)
Given a Bayesian Network
constructed for a diversified network , and a specific target host , the network diversity based on can be defined as below in terms of the probability of the target host being compromised:
where () is the probability of being infected with (without) considering the vulnerability similarity of products w.r.t Definition 7.
The probabilistic metric estimates the average attacking effort by combining all valid exploitation paths. Naturally, the diversity metric is always less than 1.0 and the greater value indicates higher diversity. With the help of Bayesian Networks, captures the risk of the target host when the vulnerability similarity of products is considered. reflects the current robustness of the network, which is provided by the given product assignment, against repeating uses of zeroday exploits. indicates the maximum potential of the network diversity. More explanations about this metric can be found in [9].
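For reference, the ratio form that this description points to can be written as follows. The symbols are assumptions introduced here: $h_t$ is the target host, $P(h_t = T)$ is its infection probability when vulnerability similarity is considered, and $P'(h_t = T)$ is the same probability when products are assumed to share no vulnerabilities.

```latex
d_{bn} = \frac{P'(h_t = T)}{P(h_t = T)}, \qquad P'(h_t = T) \le P(h_t = T)
```

Under this reading, similarity can only raise the infection probability of the target, so the ratio cannot exceed 1.0, and it approaches 1.0 as the assignment removes the advantage an attacker gains from reusing exploits.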
7 Case Study: Upgrading Legacy ICS with Modern Industrial Networks
In this section, we present a case study on upgrading legacy control systems with interconnected IT systems, to achieve the convergence of IT and OT in modern industrial networks. Such an integration can facilitate highly interconnected Industrial Internet of Things (IIoT) applications, but it also leaves ICS more vulnerable by introducing more attack vectors: as the control networks are no longer isolated, malware can propagate across IT systems and breach the core control units, causing physical damage.
Therefore, in this case study, we demonstrate the use of our approach to find an optimal diversification strategy that improves the resilience of the integrated systems. In particular, we consider three main constraints that might arise when applying the approach in practice:
(i) Most hosts in OT networks run legacy software, which has no flexibility to diversify or upgrade.
(ii) Some hosts in various networks are required to run specific software and hence cannot be diversified.
(iii) Some desirable and undesirable product combinations should be taken into account.
We start with a brief description of the case study in Section 7.1. An optimal solution and two constrained optimal solutions are then computed and illustrated in Section 7.2. In Section 7.3 we evaluate these solutions in terms of (i) the diversity metric discussed in Section 6, and (ii) the Mean-Time-To-Compromise (MTTC) obtained from NetLogo simulations that we have developed.
7.1 Experiment Configuration
The example is adapted from the Stuxnet-like worm propagation analysis in [28]. Figure 4 depicts a typical ICS architecture integrating existing OT zones (e.g. Operation Network, Control Network) with new IT zones (e.g. Corporate SubNetwork, Clients Network, Vendors Support Network). We use gray shading to indicate that OT zones have no flexibility to diversify or upgrade deployed software. Specific firewall whitelist access rules are also given in Figure 4 to provide perimeter protection between different zones.
We use the example to demonstrate the Stuxnet worm propagation across an ICS. The primary intrusion can be introduced from the Corporate Network, Clients Network or Vendors Support Network. Once a host has been exploited as a foothold, the worm can continue scanning other connected hosts for similar vulnerabilities, by which the worm can propagate itself through the network. Stuxnet eventually breached the hosts in Control Network, such as t4, t5 and t6 in Figure 4, which can access field devices.
In the following experiments, we consider the optimal assignment of products to provide three key services, i.e. an Operating System (OS), a Web Browser (WB) and a Database Server (DB). These services are distributed across all the hosts in the network according to the key role each host plays (indicated in Figure 4). For instance, the host c1 in the Corporate Network is configured as a WinCC Web Client, which runs WinCC V7.x as the main application. The essential requirements for this application are a Windows OS and an IE web browser [29], and hence the range of products that can be installed on c1 is listed in Table IV. The host z2 in the DMZ requires a Windows OS and a Microsoft Database Server to run the WSUS server, which is reflected accordingly in the table. In this way, Table IV lists the essential services required at each host and the corresponding selection of products for each service.
Serv.  Products  c1  c2  c3  c4  z1  z2  z3  z4  p1  p2  p3  t1  t2  t3  t4  t5  t6  e1  e2  e3  e4  r1  r2  r3  r4  r5  v1  v2  v3 
Windows XP  ✓  ✓  ✓  ✓  ✓  ✓  ✓  
Windows 7  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  
Ubuntu 14.04  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  
Debian 8.0  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  
IE8  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  
IE10  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  
Chrome 50  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  
MS SQL 2008  ✓  ✓  ✓  ✓  
MS SQL 2014  ✓  ✓  ✓  ✓  ✓  ✓  ✓  ✓  
MySQL 5.5  ✓  ✓  ✓  ✓  ✓  ✓  ✓  
MariaDB 10  ✓  ✓  ✓  ✓  ✓  ✓ 
We highlight the legacy hosts in grey in Table IV; these run outdated software and cannot be diversified (e.g. the hosts p2 and p3 in the Operations Network). The example also includes several outdated software versions running on legacy hosts, such as Windows XP and MS SQL 2008. All of these introduce extra constraints in finding the optimal diversification strategy. The other chosen products in Table IV are either frequently suggested in WinCC manuals or rated among the most vulnerable products by CVE Details [24].
The similarities of web browsers and operating systems are taken from Tables II and III, and the similarities for DB products are obtained in the same way as described in Section 3. Given the products for each host in Table IV, we can compute the optimal solution for diversifying the networked ICS using the approach discussed in Section 5. It is worth noting that our approach is general and flexible: each host can have a customized range of services, and each service can have its own range of products to deploy.
7.2 Optimal Assignment of Products
The optimal assignment for the case study is computed by the approach introduced in Section 5 and illustrated in Figure 5(a). The assignment indicates the optimal strategy to deploy the software in IT networks when integrating with OT systems. The solution attempts to minimize the vulnerability similarity between each pair of connected hosts. From the figure, we can find that each pair of connected hosts is generally assigned with different products from each other.
As mentioned at the beginning of this section, the second type of constraint we might encounter is that some hosts are required to run specific software according to company policies. For this case study, we specify that the hosts z4, e1, r1 and v1 are required to run specific products. The fixed choices for these hosts are outlined in grey in Table IV. Having added these constraints to the optimisation, we compute the constrained optimal assignment, which is given in Figure 5(b). Although the products of the four hosts are fixed, the new solution updates the product assignments of several other hosts accordingly to find a new optimal diversification.
We can also specify undesirable product combinations to avoid during optimisation. For instance, the solution in Figure 5(b) uses IE10 on Ubuntu 14.04 at host v2. To eliminate such undesirable assignments, we can specify product constraints and embed them in the computation of optimal solutions, as introduced in Definition 4.
For instance, the following set of global constraints captures the exclusive requirement between Ubuntu (Debian) OS and IE across all hosts:
With the constraints in , we can compute the constrained optimal solution that is illustrated in Figure 5(c). It can be found that the web browsers at c2 and v2 are changed to Chrome as required.
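A global exclusion of this kind can be checked mechanically against any candidate assignment. The following Python sketch is one hypothetical encoding, not the constraint syntax of Definition 4: the pair set, the `"OS"`/`"WB"` keys and the function name are assumptions, while the product names are those listed in Table IV.

```python
# Assumed encoding: (operating system, web browser) pairs that must not
# co-occur on any host. Product names follow Table IV.
FORBIDDEN_PAIRS = {
    (os_name, browser)
    for os_name in ("Ubuntu 14.04", "Debian 8.0")
    for browser in ("IE8", "IE10")
}


def violating_hosts(assignment, forbidden=FORBIDDEN_PAIRS):
    """Return the sorted list of hosts whose OS/browser pair is forbidden.

    `assignment` maps each host to a dict of service -> product.
    """
    bad = []
    for host, services in assignment.items():
        if (services.get("OS"), services.get("WB")) in forbidden:
            bad.append(host)
    return sorted(bad)


# The combination at v2 mentioned in the text (IE10 on Ubuntu 14.04) is
# flagged; switching the browser to Chrome 50 removes the violation.
before = {
    "v2": {"OS": "Ubuntu 14.04", "WB": "IE10"},
    "c1": {"OS": "Windows 7", "WB": "IE10"},
}
after = {
    "v2": {"OS": "Ubuntu 14.04", "WB": "Chrome 50"},
    "c1": {"OS": "Windows 7", "WB": "IE10"},
}
flagged_before = violating_hosts(before)
flagged_after = violating_hosts(after)
```

In an optimizer, the same pair set could be turned into a large cost on the forbidden label combinations so that the minimizer avoids them.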
The optimal solution is produced by minimizing the energy function presented in Section 5.4, and hence it guarantees the minimal infection rate of the worm and the most diverse product assignment possible. In order to accommodate the host and product constraints, the constrained solutions and have to sacrifice a certain amount of diversity. In the next section, we evaluate all these optimal solutions and quantify the compromised diversity of the constrained solutions in terms of the diversity metric proposed in Section 6.2 and MTTC by our NetLogo simulation.
7.3 Case Study Analysis
7.3.1 Evaluation by Network Diversity Metric
First, we construct a Bayesian Network for the case study with a given assignment of products in order to estimate the propagation of malware. In the following experiments, we consider an attacker who breaks into the system from c4 in the Corporate Network, and hence we set c4 to be the root, infected with prior probability 1.0. The final target of the attack is host t5, which has direct access to control the critical field devices. Therefore, the probability of the target t5 being infected is the key quantity in calculating the network diversity metric, as defined in Definition 8.
Given an assignment of products (e.g. the optimal one), we can determine the possible infection rate of zero-day malware on each edge with the help of the constructed Bayesian Network. Since we investigate infection by multiple zero-day exploits, we assume that the attacker possesses three unique zero-day exploits, each of which targets one type of product (i.e. OS, WB and DB). Once a host is infected, attackers search the connected hosts for similar products/vulnerabilities to exploit and proceed. When multiple exploits are feasible, attackers choose one uniformly at random, which defines the CPT for the attack nodes associated with each edge. The similarity between the source product and the chosen product determines the likelihood of infecting the chosen product.
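Table V: evaluation of product assignments for the case study (target t5)
Assignment  -log10 P(t5 infected), no similarity  -log10 P(t5 infected), with similarity  Diversity metric
optimal assign.  3.151  3.062  0.81457
host constr.  3.151  2.838  0.48590
product constr.  3.151  2.833  0.48119
random assign.  3.151  2.576  0.26622
mono assign.  3.151  1.978  0.06709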
The first row of Table V is the evaluation of the optimal assignment which reaches a very high diversity . The constrained optimal solutions and produce lower diversities as the two solutions are required to accommodate certain constraints. Discussion about the impact of such constraints on the optimal diversification continues in Section 8.3.
For comparison, we also generate a homogeneous assignment that generally allocates the same operating system, the same web browser and the same database server to all hosts. Such a mono-assignment provides the worst possible diversity for the ICS case. It also shows how vulnerable the network would become if homogeneous products were used. In addition, a randomly diversified assignment is provided, which delivers limited diversity that is significantly lower than that of our optimal solution.
The notation denotes the probability of the target t5 being infected without considering the vulnerability similarities between products. Therefore, has a constant value for all different assignments. When we take similarities into consideration, the probability of t5 being infected increases with less diverse assignments of products.
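The values in Table V can be cross-checked with a few lines of Python. The check rests on an assumption that is not stated in the text: the two probability columns appear to be reported on a base-10 logarithmic scale (negated), and the metric appears to be the ratio of the similarity-free probability to the similarity-aware one. Under that reading, the ratio recomputed from the two columns agrees with the reported metric to within rounding.

```python
# Rows of Table V: (column without similarity, column with similarity, metric).
TABLE_V = {
    "optimal assign.": (3.151, 3.062, 0.81457),
    "host constr.": (3.151, 2.838, 0.48590),
    "product constr.": (3.151, 2.833, 0.48119),
    "random assign.": (3.151, 2.576, 0.26622),
    "mono assign.": (3.151, 1.978, 0.06709),
}


def ratio_from_neg_logs(neg_log_without, neg_log_with):
    """Assuming both inputs are -log10 of a probability, return
    P_without / P_with = 10**(-neg_log_without) / 10**(-neg_log_with)."""
    return 10 ** (neg_log_with - neg_log_without)


# Relative gap between the recomputed ratio and the reported metric.
relative_gaps = {
    name: abs(ratio_from_neg_logs(a, b) - d) / d
    for name, (a, b, d) in TABLE_V.items()
}
```

The remaining gaps are what one would expect from the three-decimal rounding of the two logarithmic columns.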
7.3.2 Evaluation by NetLogo Simulation
NetLogo is an agent-based modelling tool that provides a programmable environment for simulating natural phenomena and the behaviour of complex systems over time [30]. We use NetLogo to construct the networked ICS shown in Figure 4 and to simulate the propagation of malware. After breaking into the system from a host, attackers can spread the worm further across the network. Given an assignment of products (e.g. the one in Figure 5(a)), we can determine the possible infection rate of zero-day exploits on each edge in NetLogo. By running the simulation with a given product assignment, we can determine how much time attackers need to penetrate the diversified network, which indicates the average effort required to compromise the network. A better assignment should make the network more resilient against penetration.
To test the resilience provided by the diversification, we designed five sets of experiments that simulate malware propagation from five different entry points: c1 and c4 from the Corporate Network, e3 from the Clients Network, r4 from the Remote Clients, and v1 from the Vendors Support Network. Once the entry host is infected, attackers search the connected hosts for similar products/vulnerabilities to exploit. We consider sophisticated attackers who conduct reconnaissance before launching attacks; hence, at each step, such attackers always choose the exploit with the highest success rate.
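Table VI: average MTTC (NetLogo ticks) by entry point
Assignment  c1  c4  e3  r4  v1
optimal assign.  45.313  37.561  52.663  52.491  24.053
host constr.  28.041  16.812  44.359  48.472  15.243
product constr.  14.549  15.817  45.118  46.257  14.749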
For each set of experiments, we deploy the network according to each of the three optimal assignments (unconstrained, host-constrained and product-constrained) and run the simulation 1,000 times. The average MTTC for each test is given in Table VI. The MTTC is the number of time steps (i.e. ticks in NetLogo) that attackers need to reach the final target. The results show that the unconstrained optimal assignment provides the strongest resilience, as it takes the longest to compromise in all five attack scenarios, while the two constrained optimal assignments can be compromised in a shorter period of time.
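The MTTC measurement can be approximated outside NetLogo with a small Monte Carlo loop. The Python sketch below is not the authors' NetLogo model: the tick semantics (every infected host attempts each outgoing edge once per tick), the function names and the toy edge rates are assumptions. Each edge rate is taken as the highest success rate among the feasible exploits, mirroring the sophisticated attacker described above.

```python
import random
from statistics import mean


def best_exploit_rate(rates):
    """A reconnaissance-capable attacker picks the most promising exploit."""
    return max(rates)


def simulate_ttc(edges, entry, target, rng, max_ticks=10_000):
    """Return the tick at which `target` is first infected, or None.

    `edges` maps (src, dst) to the per-tick infection probability.
    """
    infected = {entry}
    for tick in range(1, max_ticks + 1):
        newly = set()
        for (src, dst), p in edges.items():
            if src in infected and dst not in infected and rng.random() < p:
                newly.add(dst)
        infected |= newly  # hosts infected this tick start spreading next tick
        if target in infected:
            return tick
    return None


def mean_ttc(edges, entry, target, runs=1000, seed=0):
    """Average time-to-compromise over `runs` seeded simulations."""
    rng = random.Random(seed)
    times = [simulate_ttc(edges, entry, target, rng) for _ in range(runs)]
    times = [t for t in times if t is not None]
    return mean(times) if times else float("inf")


# Toy three-host chain with hypothetical per-exploit success rates.
diverse = {
    ("a", "b"): best_exploit_rate([0.1, 0.05]),
    ("b", "c"): best_exploit_rate([0.1]),
}
similar = {
    ("a", "b"): best_exploit_rate([0.9, 0.1]),
    ("b", "c"): best_exploit_rate([0.9]),
}
mttc_diverse = mean_ttc(diverse, "a", "c")
mttc_similar = mean_ttc(similar, "a", "c")
```

On this toy chain the low-similarity configuration takes noticeably longer to compromise, which is the qualitative effect that Table VI reports for the full network.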
8 Network Diversity Analysis
In addition to the vulnerability similarity of products, other factors affect the optimization of network diversity. In this section, we focus on three key factors: the network structure (Section 8.1), the variety of products per service (Section 8.2) and the number of configuration constraints (Section 8.3). We use an artificial network of 30 hosts, in which each host runs three different services and at least three products are available for each service. We set three entry nodes and one target. The diversity metric is calculated in terms of the probability of the target being compromised. In the following sections, we create a number of variants of the network to examine the impact of the aforementioned factors on the optimal network diversity.
8.1 Impact of Network Structure
We create different network structures with various numbers of routing nodes. Routing nodes are generally hosts with heavy traffic flows, and diversifying them is of great importance for improving network robustness [8]. We define routing nodes as hosts of degree at least 3, i.e. hosts with at least three connecting edges. We create different numbers of routing nodes by randomly adding edges to the network. The numbers of routing nodes in the following experiments are 6, 10, 12 and 14.
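The degree-based definition of routing nodes and the random edge augmentation can be sketched in Python as follows. The sketch assumes an undirected network stored as a list of host pairs; the function names and the seeded generator are illustrative choices rather than the setup used in the experiments.

```python
import random
from collections import defaultdict


def routing_nodes(edges, min_degree=3):
    """Hosts with at least `min_degree` incident edges (undirected graph)."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(h for h, d in degree.items() if d >= min_degree)


def add_random_edges(hosts, edges, count, seed=0):
    """Return a new edge list with `count` extra random, non-duplicate edges."""
    rng = random.Random(seed)
    existing = {frozenset(e) for e in edges}
    free = len(hosts) * (len(hosts) - 1) // 2 - len(existing)
    if count > free:
        raise ValueError("not enough free host pairs for the requested edges")
    result = list(edges)
    while count > 0:
        a, b = rng.sample(hosts, 2)
        if frozenset((a, b)) not in existing:
            existing.add(frozenset((a, b)))
            result.append((a, b))
            count -= 1
    return result


# A chain of six hosts has no routing nodes; extra edges may create some.
hosts = ["h1", "h2", "h3", "h4", "h5", "h6"]
chain = [(hosts[i], hosts[i + 1]) for i in range(len(hosts) - 1)]
denser = add_random_edges(hosts, chain, count=4)
routers_before = routing_nodes(chain)
routers_after = routing_nodes(denser)
```

Because the extra edges are random, the number of routing nodes obtained for a given number of added edges varies from run to run.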




6  15  3.896  3.814  0.82682  
10  48  6.239  5.496  0.18095  
12  60  6.238  5.495  0.18093  
14  66  5.829  5.238  0.25628 
Table VII gives the evaluation of the optimal diversification for each case. At a low number of routing nodes (i.e. 6), the optimal diversification provides a remarkable improvement in network diversity, with a metric value of 0.82682. As the number of attack paths increases, the optimal diversity metric decreases. Despite the growing number of attack paths, our optimal solutions still maintain a similar level of network diversity at 10, 12 and 14 routing nodes. Note that the additional routing nodes are created by adding random edges to the network, so the diversity may fluctuate as the number of routing nodes increases. The optimal solutions also improve the network robustness against the expanding attack vectors, which is reflected by the generally decreasing probability of the target being infected. Another noticeable observation is that optimal diversification tends to be more necessary when protecting dense networks with a larger number of exploitation paths.
8.2 Impact of the Variety of Products
The variety of products can also influence the optimal diversity. A wide variety of candidate products introduces more diversity and flexibility when assigning products and, more importantly, reduces the chance of a pair of connected nodes being assigned highly similar products. We again use the network of 30 hosts with 10 routing nodes and 3 services per host. We tested between 3 and 7 products per service, and the detailed evaluation is provided in Table VIII.



Table VIII: impact of the number of products per service
# products  -log10 P(target infected), no similarity  -log10 P(target infected), with similarity  Diversity metric
3  6.239  5.062  0.06653
4  6.239  5.216  0.09481
5  6.239  5.392  0.14226
6  6.239  5.655  0.26053
7  6.239  5.950  0.51437
The decreasing probability of the target being infected indicates that, with a higher variety of products, the network can provide better protection to the target. We also observe that the diversity improves as more products become available. However, when applying the optimal diversification in practice, as demonstrated by the ICS case study in Section 7.2, a number of configuration constraints can prevent us from using the optimal assignment. In the next section, we study the impact of an increasing number of constraints on the optimization.
8.3 Impact of Configuration Constraints
We specify different numbers of local constraints (i.e. 5, 10, 15) to compute the constrained optimal solution. If the added constraints were generated randomly, they might not conflict with the optimal solution, in which case the resulting constrained optimal solution would still provide the same diversity as the optimal solution. Therefore, we intentionally add constraints that conflict with the optimal solution reported in Table VII (at 10 routing nodes), so that we can evaluate how the constraints compromise the optimal diversity.
Table IX: impact of the number of local constraints
# local constraints  -log10 P(target infected), no similarity  -log10 P(target infected), with similarity  Diversity metric
0  6.239  5.497  0.18095
5  6.239  5.490  0.17838
10  6.239  5.443  0.15996
15  6.239  5.224  0.09669
The results are then compared with the optimal solution with no constraint in Table IX. The results clearly show that the increasing number of constraints can compromise the optimized diversity and reduce the protection to the target.
9 Scalability Analysis
In this section, we analyze the scalability of our optimization approach by running the optimization against a series of randomly generated networks. Figure 6 shows the computation time when optimizing networks with varying numbers of hosts (Figure 6(A)), varying degrees of hosts, i.e. edges per host (Figure 6(B)), and varying numbers of services per host (Figure 6(C)).
Table X: computation time (seconds) for varying numbers of hosts
Density  # deg.  # serv.  100  200  400  600  800  1000  2000  4000  6000
mid-density  20  15  0.239  0.438  1.099  1.478  1.944  2.784  6.706  16.517  33.392
high-density  40  25  0.640  1.766  3.553  5.881  8.135  10.999  27.484  82.500  151.110
Table XI: computation time (seconds) for varying degrees
Scale  # hosts  # serv.  5  10  15  20  25  30  35  40  45  50
mid-scale  1000  15  0.759  1.577  1.954  2.693  3.294  4.040  4.652  5.174  5.758  6.309
large-scale  6000  25  21.239  40.940  59.216  77.583  95.750  117.810  144.470  152.040  167.190  189.710
Table XII: computation time (seconds) for varying numbers of services per host
Scale  # hosts  # deg.  # edges  5  10  15  20  25  30
mid-scale  1000  20  20,000  0.603  1.608  2.709  4.008  5.253  6.974
large-scale  6000  40  240,000  10.306  27.214  51.587  90.407  134.340  188.050
For the best performance, our optimizer is implemented in C++ and uses multithreading to provide high convergence speed in multilevel optimization. We use the GPU-oriented Compute Unified Device Architecture (CUDA) to gain extra efficiency on complex matrix operations. All the experiments run on a mid-range computer with an Intel i5 2.8GHz CPU, 8GB of RAM and an Nvidia GTX 750.
We observe that the number of hosts has a major impact on the computation time. As shown in Figure 6(A), the time increases non-linearly with the number of hosts. However, our method remains efficient on mid-scale networks: the optimization converges within 0.24 seconds when the network size is up to 1000 hosts. A reasonably high speed is also achieved on large-scale networks: the optimal assignment for 10,000 hosts can be obtained within 7.342 seconds on average.
Figures 6(B) and 6(C) show that the computation time of our approach increases linearly with either the degree of nodes or the number of services. With the network size (100 hosts) and the number of services per host (3) fixed, our optimization converges within 0.253 seconds for a large degree (50), which results in more than 2900 edges. Our method behaves similarly in the experiments with varying numbers of services: the optimal solution is found within 1.01 seconds given a large number of services (30) on a network with 100 hosts. Such computation times are highly promising for most real-world networks.
We further test our optimization approach on the high-density and large-scale networks that we might encounter in practice. Table X provides the computation time for optimizing networks of middle density (20 degrees and 15 services per host) and high density (40 degrees and 25 services per host). Again we observe that the number of hosts has a major impact on the computation time, but our method still finds the optimal solution within 3 minutes for a large-scale (6000 hosts), high-density network. Moreover, we run experiments on mid-scale and large-scale networks with various densities, and the results in Table XI show that the degree has less influence on the computation time than the number of hosts. Finally, we vary the number of services for each host on both mid-scale and large-scale networks in Table XII. For a large-scale network of 6000 hosts with up to 240,000 edges and 30 services per host, our method still performs well and converges in about 3 minutes.
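The growth of the computation time with the number of hosts can be summarised by an empirical exponent computed from the endpoints of each row of Table X. The Python sketch below is a back-of-the-envelope reading of the reported numbers, not part of the original analysis; the function name and the endpoint-based estimate are assumptions.

```python
import math

# Table X: host counts and reported computation times in seconds.
HOSTS = [100, 200, 400, 600, 800, 1000, 2000, 4000, 6000]
MID_DENSITY = [0.239, 0.438, 1.099, 1.478, 1.944, 2.784, 6.706, 16.517, 33.392]
HIGH_DENSITY = [0.640, 1.766, 3.553, 5.881, 8.135, 10.999, 27.484, 82.500, 151.110]


def growth_exponent(sizes, times):
    """Exponent k such that time scales like size**k between the endpoints."""
    return math.log(times[-1] / times[0]) / math.log(sizes[-1] / sizes[0])


k_mid = growth_exponent(HOSTS, MID_DENSITY)
k_high = growth_exponent(HOSTS, HIGH_DENSITY)
```

Both exponents come out above 1 (about 1.2 for the mid-density row and about 1.3 for the high-density row), which is in line with the observation that the time grows non-linearly, though only mildly so, with the number of hosts.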
10 Discussion and Conclusion
Moving towards integrated ICS enables an efficient way to operate these systems, but also provides new attack vectors for adversaries to breach them. It is now a challenging and urgent issue for many industrial organizations to find a secure way to converge OT and IT systems to provide an efficient and also resilient industrial environment. Furthermore, there are other constraints hindering us from finding an optimal solution, such as outdated legacy systems, strict company policies and other configuration requirements. In this paper, we proposed an approach based on software diversification to increase the system resilience of the integrated ICS against malware propagation.
We introduced the similarity metric to capture how similar the vulnerabilities of two products are, which was then applied in a statistical study on CVE/NVD databases. The study showed that most vulnerabilities could affect multiple products, even from different vendors.
The similarity metric can estimate the likelihood of a zeroday exploit successfully propagating itself between two products. By assigning diverse products for a pair of connected hosts, such propagation can be effectively reduced.
We formally represented the network by an MRF model with different services (encoded as labels) and products (encoded as values) for each host. Such a model can then be efficiently optimized by the TRW-S algorithm, which yields an optimal assignment of products for a given network. The optimal solution maximizes the defense strength of the network against malware propagation. Compared to random diversification plans, the optimal solution is more effective in cutting off valid attack paths. In the scalability analysis, we showed that our method scales well to large-scale, high-density networks.
We contend that our approach has great value and potential in practical applications, by which we can advise on the best diversification strategy for a system operator to decide the most robust way to upgrade an existing ICS. We also demonstrated the practical usage of our optimization approach in a realistic case study. Furthermore, we provided a way to specify configuration constraints that we might encounter in practice. Constrained optimal solutions can be produced to accommodate these constraints.
There are several promising lines of future research. The vulnerability similarity of products in this work is estimated from data in the CVE/NVD databases. We are aware of the potential “publication bias” of CVE/NVD. However, as discussed in [20], NVD is currently the most trustworthy database compared to the others. One of our key contributions is the introduction of the similarity metric and its use in our optimization, which allows us to capture the spread of zero-day malware more accurately. We will keep looking for more convincing sources to provide values for this metric. We are also working on a more systematic way to estimate the vulnerability similarity between two products, for example (i) analyzing, from a software engineering perspective, the differences between the exploits for different products [31]; or (ii) estimating how diverse two products are [32]. Finally, as our approach offers competitive efficiency and scalability, we are looking at optimal diversification for dynamic networks.
References
 [1] K. Stouffer, V. Pillitteri, S. Lightman, M. Abrams, and A. Hahn, “Guide to industrial control systems (ics) security,” NIST Special Publication, vol. 800, p. 82, 2015.
 [2] K. J. Hole, “Diversity reduces the impact of malware,” IEEE Security & Privacy, vol. 13, no. 3, pp. 48–54, 2015.
 [3] N. Falliere, L. O. Murchu, and E. Chien, “W32. stuxnet dossier,” White paper, Symantec Corp., Security Response, vol. 5, 2011.
 [4] A. Avizienis, “The nversion approach to faulttolerant software,” IEEE Transactions on software engineering, no. 12, pp. 1491–1501, 1985.
 [5] S. Bhatkar, D. C. DuVarney, and R. Sekar, “Address obfuscation: An efficient approach to combat a broad range of memory error exploits.” in USENIX Security Symposium, vol. 12, no. 2, 2003, pp. 291–301.
 [6] V. Pappas, M. Polychronakis, and A. D. Keromytis, “Smashing the gadgets: Hindering returnoriented programming using inplace code randomization,” in Security and Privacy (SP), 2012 IEEE Symposium on. IEEE, 2012, pp. 601–615.
 [7] A. J. O’Donnell and H. Sethu, “On achieving software diversity for improved network security using distributed coloring algorithms,” in Proceedings of the 11th ACM conference on Computer and communications security. ACM, 2004, pp. 121–131.
 [8] A. Newell, D. Obenshain, T. Tantillo, C. NitaRotaru, and Y. Amir, “Increasing network resiliency by optimally assigning diverse variants to routing nodes,” IEEE Transactions on Dependable and Secure Computing, vol. 12, no. 6, pp. 602–614, 2015.
 [9] M. Zhang, L. Wang, S. Jajodia, A. Singhal, and M. Albanese, “Network diversity: a security metric for evaluating the resilience of networks against zeroday attacks,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 5, pp. 1071–1086, 2016.
 [10] MITRE, Common vulnerabilities and exposures, available at https://cve.mitre.org/, last acceessed on February 09, 2018.
 [11] NIST, National Vulnerability Database, available at https://nvd.nist.gov/, access date: February 09, 2018.
 [12] V. Kolmogorov, “A new look at reweighted message passing,” IEEE transactions on pattern analysis and machine intelligence, vol. 37, no. 5, pp. 919–930, 2015.
 [13] P. Larsen, A. Homescu, S. Brunthaler, and M. Franz, “Sok: Automated software diversity,” in Security and Privacy (SP), 2014 IEEE Symposium on. IEEE, 2014, pp. 276–291.
 [14] B. Baudry and M. Monperrus, “The multiple facets of software diversity: Recent developments in year 2000 and beyond,” ACM Computing Surveys (CSUR), vol. 48, no. 1, p. 16, 2015.
 [15] P. Pal, R. Schantz, A. Paulos, and B. Benyo, “Managed execution environment as a moving-target defense infrastructure,” IEEE Security & Privacy, vol. 12, no. 2, pp. 51–59, 2014.
 [16] L. Wang, S. Jajodia, A. Singhal, and S. Noel, “k-zero day safety: Measuring the security risk of networks against unknown attacks,” in Computer Security–ESORICS 2010. Springer, 2010, pp. 573–587.
 [17] D. Borbor, L. Wang, S. Jajodia, and A. Singhal, “Diversifying network services under cost constraints for better resilience against unknown attacks,” in IFIP Annual Conference on Data and Applications Security and Privacy. Springer, 2016, pp. 295–312.
 [18] M. Garcia, A. Bessani, I. Gashi, N. Neves, and R. Obelheiro, “OS diversity for intrusion tolerance: Myth or reality?” in Dependable Systems & Networks (DSN), 2011 IEEE/IFIP 41st International Conference on. IEEE, 2011, pp. 383–394.
 [19] M. Bozorgi, L. K. Saul, S. Savage, and G. M. Voelker, “Beyond heuristics: learning to classify vulnerabilities and predict exploits,” in Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2010, pp. 105–114.
 [20] P. Johnson, R. Lagerstrom, M. Ekstedt, and U. Franke, “Can the common vulnerability scoring system be trusted? a Bayesian analysis,” IEEE Transactions on Dependable and Secure Computing, 2016.
 [21] T. Li and C. Hankin, “Effective defence against zero-day exploits using Bayesian networks,” in International Conference on Critical Information Infrastructures Security. Springer, 2016.
 [22] S.-S. Choi, S.-H. Cha, and C. C. Tappert, “A survey of binary similarity and distance measures,” Journal of Systemics, Cybernetics and Informatics, vol. 8, no. 1, pp. 43–48, 2010.
 [23] P.-J. Moreels and A. Dulaunoy, cve-search, GitHub repository at https://github.com/cvesearch/cvesearch, access date: February 09, 2018.
 [24] CVEDetails, Top 50 Products By Total Number Of “Distinct” Vulnerabilities, available at http://www.cvedetails.com/top50products.php and http://www.cvedetails.com/top50versions.php, access date: February 09, 2018.
 [25] S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 6, pp. 721–741, 1984.
 [26] J. Pearl, Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, 2014.
 [27] L. Wang, M. Zhang, S. Jajodia, A. Singhal, and M. Albanese, “Modeling network diversity for evaluating the robustness of networks against zero-day attacks,” in Computer Security–ESORICS 2014. Springer, 2014, pp. 494–511.
 [28] E. Byres, A. Ginter, and J. Langill, How Stuxnet Spreads – A Study of Infection Paths in Best Practice Systems, available at https://www.tofinosecurity.com/howstuxnetspreads, access date: February 09, 2018.
 [29] SIEMENS, WinCC v7.4: General information and installation, available at https://cache.industry.siemens.com/dl/files/216/109736216/att_879785/v1/WinCC_GeneralInfo_Installation_Readme_enUS_enUS.pdf, access date: February 09, 2018.
 [30] U. Wilensky, NetLogo, available at http://ccl.northwestern.edu/netlogo/, access date: February 09, 2018.
 [31] A. Calleja, J. Tapiador, and J. Caballero, “A look into 30 years of malware development from a software metrics perspective,” in International Symposium on Research in Attacks, Intrusions, and Defenses. Springer, 2016, pp. 325–345.
 [32] K. Nayak, D. Marino, P. Efstathopoulos, and T. Dumitraş, “Some vulnerabilities are different than others,” in International Workshop on Recent Advances in Intrusion Detection. Springer, 2014, pp. 426–446.