Enterprise Cyber Resiliency Against Lateral Movement: A Graph Theoretic Approach

05/03/2019 ∙ by Pin-Yu Chen, et al. ∙ University of Michigan, IBM, PNNL, Colorado State University

Lateral movement attacks are a serious threat to enterprise security. In these attacks, an attacker compromises a trusted user account to get a foothold into the enterprise network and uses it to attack other trusted users, increasingly gaining higher and higher privileges. Such lateral attacks are very hard to model because of the unwitting role that users play in the attack and even harder to detect and prevent because of their low and slow nature. In this paper, a theoretical framework is presented for modeling lateral movement attacks and for proposing a methodology for designing resilient cyber systems against such attacks. The enterprise is modeled as a tripartite graph capturing the interaction between users, machines, and applications, and a set of procedures is proposed to harden the network by increasing the cost of lateral movement. Strong theoretical guarantees on system resilience are established and experimentally validated for large enterprise networks.


I Introduction

Cyber security is one of the most critical problems of our time. Notwithstanding the enormous strides that researchers and practitioners have made in modeling, analyzing and mitigating cyber attacks, black hats find newer and newer methods for launching attacks, requiring white hats to revisit the problem with a new perspective. One of the major ways (see http://www.verizonenterprise.com/DBIR) that attackers launch an attack against an enterprise is by what is known as “lateral movement via privilege escalation”. This attack cycle, shown in Fig. 1, begins with the compromise of a single user account (not necessarily a privileged one) in the targeted organization, typically via phishing email, spear phishing or other social engineering techniques. From this initial foothold and with time on his side, the attacker begins to explore the network, possibly compromising other user accounts until he gains access to a user account with administrative privileges to the coveted resource: files containing intellectual property, employee or customer databases or credentials to manage the network itself. Typically the attacker compromises multiple intermediate user accounts, each granting him increasing privileges. Skilled attackers frequently camouflage their lateral movements into the normal network traffic, making these attacks particularly difficult to detect.

Such lateral attacks are particularly insidious because authorized users play the role of an unwitting accomplice. End users have often been recognized as the “weakest links” in cyber security [1]. They do not follow security advice and often take actions that compromise themselves as well as others. While efforts to educate and train end users for cyber security are important steps, anecdotal evidence shows that they have not been as effective. Clearly, there is a need for designing large enterprises that are resilient against such lateral movement attacks. Our current work takes a step in this direction.

Fig. 1: An illustration of a cyber attack using privilege escalation techniques.
(a) A tripartite network
(b) Segmentation
(c) Edge hardening
(d) Node hardening
Fig. 2: (a) Illustration of a tripartite network consisting of a set of users, a set of hosts and a set of applications. (b) Segmentation: the user Charlie modifies his access configuration by disabling the access of the existing account (Charlie-2) to host H3 and by creating a new user account (Charlie-1) for accessing H3, such that an attacker cannot reach the data server H5 through the printer H3 if Charlie-2 is compromised. (c) Edge hardening via additional firewall rules on all network flows to H5 through HTTP. (d) Node hardening via system update or security patch installation on H5.

Resilient systems accept that not all attacks can be detected and prevented; nonetheless, the system should be able to continue operation even in the face of cyber attacks and provide its core services or missions even if in a degraded manner [2]. To build such a resilient system it is important to be proactive in understanding and reasoning about lateral movement in an enterprise network, its potential effects on the organization, and identify ways to best defend against these threats. Unfortunately, a theoretical framework for such risk analysis is currently missing. Our goal in this paper is to establish the theoretical foundations of a systematic framework for building networks resilient to lateral movement attacks.

We model a lateral movement attack on an enterprise’s mission as a graph walk on a tripartite user-host-application network that logically comprises two subgraphs: a user-host graph and a host-application graph. Fig. 2 illustrates the model and our methodology. The user-host-application paradigm entails richer information than single (homogeneous) graph models (e.g., host-host communication networks). For instance, the host-host communication network can be derived from the host-application subgraph. The user-host-application paradigm also allows us to develop an abstraction of a mission in terms of concrete entities whose behavior can be monitored and controlled, which captures interactions between diverse categories of users, software and hardware resources (e.g., virtual machines, workstations, mobile devices) and applications.

System Heterogeneity | Hardening Methods | Theoretical Guarantees
User-Host | Algorithms 1, 2 | Theorem 2, Corollary 1
Host-Application | Algorithms 3, 4 | Theorem 5, Corollary 2
User-Host-Application | all of the above | all of the above
TABLE I: Utility of the proposed algorithms and established theoretical results.

Defining lateral movements as graph walks allows us to determine which nodes in the tripartite graph can be reached starting at a given node. From an attacker’s perspective, these nodes that can be “reached” are exactly those mission components that can be attacked and compromised via exploits. The larger the number of nodes that can be reached by the attacker, the more “damage” he/she can cause to the mission. Given a system snapshot and a compromised workstation or mobile device, we can define the “Attacker’s Reachability” as a measure that estimates the number of hosts at risk through a given number of system exploits. Now, from a defender’s perspective, putting some defensive control on one of these nodes (or edges) allows the walk to be broken at that point. Intuitively, such a walk can also be used to identify mission hardening strategies that reduce risk. This central idea is illustrated in Fig. 2. The heterogeneity of a cyber system entails a network of networks (NoN) representation of entities in the system as displayed in Fig. 2, allowing us to devise effective hardening strategies from different perspectives, which differs from works focusing on manipulating the network topology under the assumption that the graph is homogeneous, that is, all nodes have an identical role in a cyber system.

As our model considers the heterogeneity of a cyber system and incorporates several defensive actions for enhancing the resilience to lateral movement attacks, the utility of the proposed approaches and the established theoretical results are summarized in Table I for ease of reading. The proofs of the established mathematical results are placed in the appendices in the supplementary file (supplementary material: https://goo.gl/h8XHZX).

The research contributions of this paper are listed as follows.

  1. By modeling lateral movements as graph walks on a user-host-application tripartite graph, we can specify the dominant factors affecting attacker’s reachability (Sec. III), setting the stage for proposing greedy hardening and segmentation algorithms for network configuration change recommendation to reduce the attacker’s reachability (Sec. IV and Sec. V).

  2. We characterize the effectiveness of three types of defensive actions against lateral movement attacks, each of which can be abstracted via a node or edge operation on the tripartite graph, which are (a) segmentation in user-host graph (Sec. IV), (b) edge hardening in host-application graph (Sec. V), and (c) node hardening in host-application graph (Sec. V).

  3. We provide quantifiable guarantees (e.g., submodularity) on the performance loss of the proposed greedy algorithms relative to the optimal batch algorithms that have combinatorial computation complexity (Theorem 2 and Theorem 5).

  4. We apply our algorithms to a collected real tripartite network dataset and demonstrate that the proposed approaches can significantly constrain attacker’s reachability and hence provide effective configuration recommendations to secure the system (Sec. VI).

  5. We collect traces of real lateral movement attacks in a cyber system for performance evaluation (Sec. VII). We benchmark our approach against the NetMelt algorithm [3] and show that our approach can achieve the same reduction in attacker’s reachability by hardening nearly 1/3 of the resources as recommended by NetMelt.

II Background and Related Work

Laterally moving through a cyber network looking to obtain access to administrator’s credentials or confidential information is a common technique in an attacker’s toolbox [4]. Particularly, privilege escalation through lateral movement is a critical challenge for the security community [5, 6, 7]. For anomaly detection the authors in [8] employ graph clustering to group activities with similar behavioral patterns and make change recommendations when the access control methods in place deviate from the real-world activity patterns. The authors in [9] use community structure to detect anomalous insiders in collaborative information systems. For attack prevention the authors in [10] use a graph partitioning approach to fragment the network to limit the possibilities of lateral movement. For risk assessment the authors in [11, 12] use epidemic models for modeling and controlling malware propagation.

Our work fits into two emerging areas of study, 1) Network of networks (NoN) representing multiple inter-related networks as a single model, and 2) studies on resilience of networks. Recently NoN has been an active area of research with diverse topics such as cascading analysis and control in interdependent networks [13, 14], improved grouping or ranking of entities in a network [15], and mapping of domain problems into the NoN paradigm [16]. Network resilience is a long studied topic [17], primarily focusing on the physical topology of communication networks. There has been a surge in focus on enterprise-level cyber resilience [2, 18], where the entire enterprise structure is modeled as a NoN.

Recent research has focused on altering the network structure to improve its resilience, as measured in terms of the spectral properties [19, 20]. Preventing contagion in networks is another attribute of resilience, and approaches such as [21] suggest algorithms that immunize a subset of nodes as a preventive measure. We contribute to this research area by unifying multiple data sources (e.g., different perspectives of user behaviors) into a single model. Integration of multiple data sources such as user access control and application traffic over the network makes the model more comprehensive and the resulting recommendations more profound [22]. This paper is tailored to providing action recommendations for enhancing the resilience of a heterogeneous cyber system based on the associated NoN representation, which differs from previous works that focus on manipulating the topology of a simple (homogeneous) network where each node in the graph has an identical role [3, 23]. To the best of our knowledge, this paper proposes the first representation of a cyber system using the NoN model for designing algorithms that improve resiliency against lateral movement attacks.

III Network Model and Iterative Reachability Computation of Lateral Movement

III-A Notation and Tripartite Graph Model

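Throughout this paper a scripted uppercase letter (e.g., $\mathcal{X}$) denotes a set, a boldfaced uppercase letter (e.g., $\mathbf{X}$ or $\mathbf{X}_k$) denotes a matrix, and its entry in the $i$-th row and the $j$-th column is denoted by $[\mathbf{X}]_{ij}$, a boldfaced lowercase letter (e.g., $\mathbf{x}$ or $\mathbf{x}_k$) denotes a column vector, and its $i$-th entry is denoted by $[\mathbf{x}]_i$, and a plain uppercase or lowercase letter (e.g., $X$ or $x$) denotes a scalar unless specified. The expression $|\mathcal{X}|$ denotes the number of elements in the set $\mathcal{X}$. The expression $e$ denotes Euler’s number, i.e., the base of the natural logarithm. The expression $\mathbf{e}_i$ denotes the canonical vector of zero entries except its $i$-th entry is $1$. The expression $\mathbf{I}_n$ denotes the $n \times n$ identity matrix. The expression $\mathbf{1}_n$ denotes the $n \times 1$ column vector of ones. The expression $\mathrm{col}_x(\mathbf{X})$ denotes the $x$-th column of $\mathbf{X}$. The expression $\lambda_{\max}(\mathbf{X})$ denotes the largest eigenvalue (in magnitude) of a square matrix $\mathbf{X}$. The operation $\cdot^T$ denotes matrix or vector transpose. The operation $\otimes$ denotes the Kronecker product which is defined in Appendix A. The operation $\odot$ denotes the Hadamard (entry-wise) product of matrices. The operator $T_a$ is a threshold function such that $[T_a(\mathbf{x})]_i = [\mathbf{x}]_i$ if $0 \leq [\mathbf{x}]_i \leq a$ and $[T_a(\mathbf{x})]_i = a$ if $[\mathbf{x}]_i > a$. The operator $H_{\mathbf{a}}$ is an entry-wise indicator function such that $[H_{\mathbf{a}}(\mathbf{x})]_i = 1$ if $[\mathbf{x}]_i > [\mathbf{a}]_i$, and $[H_{\mathbf{a}}(\mathbf{x})]_i = 0$ otherwise. The tripartite graph in Fig. 2 can be characterized by a set of users $\mathcal{V}_{\text{user}}$, a set of hosts $\mathcal{V}_{\text{host}}$, a set of applications $\mathcal{V}_{\text{app}}$, a set of user-host accesses $\mathcal{E} \subset \mathcal{V}_{\text{user}} \times \mathcal{V}_{\text{host}}$, and a set of host-application-host activities $\mathcal{E}_{\text{app}} \subset \mathcal{V}_{\text{host}} \times \mathcal{V}_{\text{app}} \times \mathcal{V}_{\text{host}}$. The cardinalities of $\mathcal{V}_{\text{user}}$, $\mathcal{V}_{\text{host}}$ and $\mathcal{V}_{\text{app}}$ are denoted by $U$, $N$ and $K$, respectively. The main notation and symbols are listed in Table II.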

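Symbol | Meaning
$U$ / $N$ / $K$ | number of users / hosts / applications
$\lambda_{\max}(\mathbf{X})$ | largest eigenvalue of matrix $\mathbf{X}$
$\otimes$ | Kronecker product
$\mathbf{1}_n$ | $n \times 1$ column vector of ones
$\mathbf{A}_C$ | User-host graph matrix
$\mathbf{A}$ | Host-application graph matrix
$\mathbf{P}$ | Compromise probability matrix
$\mathbf{r}$ / $\mathbf{a}$ | Reachability / Hardening level vector
$T_a(\mathbf{x})$ | Threshold function on vector $\mathbf{x}$
$H_{\mathbf{a}}(\mathbf{x})$ | Comparator function of $\mathbf{x}$ and $\mathbf{a}$
TABLE II: List of main notation and symbols.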
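
The threshold and comparator operators from the notation can be sketched in a few lines of Python. This is a minimal illustration, assuming the threshold caps each nonnegative entry at a given value and the comparator returns 1 wherever an entry strictly exceeds its reference value; the function names `threshold` and `comparator` are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def threshold(x: Sequence[float], a: float = 1.0) -> List[float]:
    """Cap every (nonnegative) entry of x at a; smaller entries pass through."""
    return [min(v, a) for v in x]


def comparator(x: Sequence[float], ref: Sequence[float]) -> List[int]:
    """Entry-wise indicator: 1 where x[i] > ref[i], otherwise 0."""
    if len(x) != len(ref):
        raise ValueError("x and ref must have the same length")
    return [1 if v > r else 0 for v, r in zip(x, ref)]


# Toy values: three hosts with made-up walk counts and reference levels.
walks = [0.0, 0.4, 3.0]
capped = threshold(walks, a=1.0)
flags = comparator(capped, [0.5, 0.5, 0.5])
```

Composing the two, as in `comparator(threshold(x), ref)`, is how the later sections decide whether a host counts as compromised.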

III-B Reachability of Lateral Movement on User-Host Graph

Let $G_C=(\mathcal{V}_{\text{user}},\mathcal{V}_{\text{host}},\mathcal{E})$ with $\mathcal{E}\subset\mathcal{V}_{\text{user}}\times\mathcal{V}_{\text{host}}$ denote the user-host bipartite graph, with $U$ users and $N$ hosts. The access privileges between users and hosts are represented by a binary $U\times N$ adjacency matrix $\mathbf{A}_C$, where $[\mathbf{A}_C]_{ij}=1$ if user $i$ can access host $j$, and $[\mathbf{A}_C]_{ij}=0$ otherwise. Let $\mathbf{r}_0$ be an $N\times 1$ binary vector indicating the initial host compromise status, where $[\mathbf{r}_0]_j=1$ if host $j$ is initially compromised, and $[\mathbf{r}_0]_j=0$ otherwise. Given $\mathbf{r}_0$, we are interested in computing the final binary compromise vector $\mathbf{r}_\infty$ when attackers leverage user access privileges to compromise other accessible hosts. The vector $\mathbf{r}_\infty$ specifies the reachability of a lateral movement attack, where reachability is defined as the fraction of hosts that can be reached via graph walks on $G_C$ starting from $\mathbf{r}_0$. Therefore, reachability is used as a quantitative measure of network vulnerability to lateral movement attacks. Furthermore, studying $\mathbf{r}_\infty$ allows us to investigate the dominant factor that leads to high reachability and more efficient countermeasures.

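The computation of $\mathbf{r}_\infty$ can be viewed as a cascading process of repetitive walks on $G_C$ starting from a set of compromised hosts. Let $\mathbf{r}_h$ denote the binary compromise vector after $h$-hop walks and let $\mathbf{w}_h$ be the number of $h$-hop walks starting from $\mathbf{r}_0$ and $\mathbf{w}_0=\mathbf{r}_0$. The hop count of a walk between two hosts in $G_C$ is defined as the number of traversed users. We begin by computing $\mathbf{r}_1$ from $\mathbf{r}_0$: the number of $1$-hop walks from $\mathbf{r}_0$ to host $j$ is $[\mathbf{w}_1]_j=\sum_{k=1}^{N}\sum_{i=1}^{U}[\mathbf{A}_C]_{ik}[\mathbf{A}_C]_{ij}[\mathbf{r}_0]_k$. Let $\mathbf{B}=\mathbf{A}_C^T\mathbf{A}_C$, an induced adjacency matrix of hosts in $G_C$, where $[\mathbf{B}]_{ij}$ is the number of common users that can access hosts $i$ and $j$. Then we have $\mathbf{w}_1=\mathbf{B}\mathbf{r}_0$ and $\mathbf{r}_1=T_1(\mathbf{w}_1)$, where $T_1$ is the threshold function with threshold $1$. Generalizing this result, we have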

(1)
(2)

The term in (2) accounts for the accumulation of compromised hosts up to hops. Note that based on the property of , (2) can be simplified as

(3)

The recursive relation of reachability in (3) suggests that the term $\mathbf{B}$ (the induced adjacency matrix of hosts) is the dominant factor affecting the propagation of lateral movement. Moreover, from (3) we obtain an efficient iterative algorithm for computing $\mathbf{r}_\infty$ that involves successive matrix-vector multiplications until $\mathbf{r}_h$ converges.
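
A minimal Python sketch of the iterative reachability computation on the user-host graph follows. It assumes the update $\mathbf{r} \leftarrow T_1(\mathbf{B}\mathbf{r} + \mathbf{r})$ with $\mathbf{B}=\mathbf{A}_C^T\mathbf{A}_C$, where the extra $+\,\mathbf{r}$ keeps already-compromised hosts marked; this is one recursion consistent with the description above rather than a transcription of (3). The function name `user_host_reachability` and the toy access matrix are illustrative.

```python
from typing import List


def user_host_reachability(access: List[List[int]], seeds: List[int]) -> List[int]:
    """Iterate r <- T_1(B r + r) with B = A^T A until r stops changing.

    access[i][j] == 1 means user i can access host j (a U x N matrix);
    seeds lists the indices of the initially compromised hosts.
    """
    n_hosts = len(access[0])
    r = [0] * n_hosts
    for j in seeds:
        r[j] = 1
    while True:
        # A r: how many compromised hosts each user can access.
        user_hits = [sum(a * x for a, x in zip(row, r)) for row in access]
        # A^T (A r) = B r: walks that reach each host through some user.
        walks = [sum(access[i][j] * user_hits[i] for i in range(len(access)))
                 for j in range(n_hosts)]
        # Threshold at 1 and keep hosts that were already compromised.
        nxt = [1 if (w > 0 or old) else 0 for w, old in zip(walks, r)]
        if nxt == r:
            return r
        r = nxt


# Toy example: users 0 and 1 share host 1; user 2 only reaches host 3.
A = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
final = user_host_reachability(A, seeds=[0])
reachability = sum(final) / len(final)
```

In this toy graph the compromise spreads from host 0 to hosts 1 and 2 through the shared accounts, while host 3 stays out of reach because its only user touches no compromised host.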

III-C Reachability of Lateral Movement on Host-Application Graph

The host-application graph contains the information of host-to-host communication through applications. Let $\mathbf{A}_k$ be an $N\times N$ binary matrix representing the host-to-host communication through application $k$, where $[\mathbf{A}_k]_{ij}=1$ means host $i$ communicates with host $j$ through application $k$, and $[\mathbf{A}_k]_{ij}=0$ otherwise. The binary $N\times KN$ matrix $\mathbf{A}=[\mathbf{A}_1~\mathbf{A}_2~\ldots~\mathbf{A}_K]$ is the concatenated matrix of the $K$ host-application-host matrices $\mathbf{A}_k$ for $k=1,\ldots,K$. Let $\mathbf{P}$ denote the compromise probability matrix, which is a $K\times N$ matrix where its entry $[\mathbf{P}]_{kj}$ specifies the probability of compromising host $j$ through application $k$. In addition, each host $j$ is assigned a hardening value $[\mathbf{a}]_j$ indicating its security level.

Similar to Sec. III-B, we are interested in computing the reachability of lateral movement on the host-application graph. The hop count of a walk between two hosts in the host-application graph is defined as the average number of paths between the two hosts through applications. Let be an matrix where is the average number of one-hop walk from host to host . Then we have . Let be an vector representing the average number of -hop walks of hosts and . Then the -th entry of the -hop vector is

(4)

Stacking (4) as a column vector gives

(5)

The -hop compromise vector is defined as . In effect the operator compares the thresholded average number of walks with the hardening level for each host, which means a host can be compromised only when the thresholded average number of -hop walk is greater than its hardening level . Generalizing this result to -hop, we have

(6)
(7)

The term in (7) has an equivalent expression

(8)

which is proved in Appendix E. Consequently, for lateral movement on the host-application graph the matrix is the dominant factor, and (8) leads to an iterative algorithm for reachability computation.
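
The host-application cascade can be sketched in Python under explicitly stated assumptions. The sketch assumes the expected one-hop walk matrix has entries $J_{ij}=\sum_k P_{kj}[A_k]_{ij}$ and that a host becomes compromised when the thresholded expected number of walks into it exceeds its hardening level; the paper's exact forms of (4) to (8) are not reproduced here, and the names `expected_walk_matrix` and `host_app_reachability` as well as all numbers are illustrative.

```python
from typing import List

Matrix = List[List[float]]


def expected_walk_matrix(flows: List[Matrix], prob: Matrix) -> Matrix:
    """J[i][j] = sum_k prob[k][j] * flows[k][i][j] (assumed form).

    flows[k][i][j] == 1 means host i talks to host j through application k;
    prob[k][j] is the chance of compromising host j through application k.
    """
    n = len(flows[0])
    return [[sum(prob[k][j] * flows[k][i][j] for k in range(len(flows)))
             for j in range(n)] for i in range(n)]


def host_app_reachability(J: Matrix, hardening: List[float],
                          seeds: List[int]) -> List[int]:
    """Spread compromise until no new host exceeds its hardening level."""
    n = len(J)
    r = [1 if j in seeds else 0 for j in range(n)]
    while True:
        # Expected one-hop walks into each host j from compromised hosts.
        walks = [sum(J[i][j] * r[i] for i in range(n)) for j in range(n)]
        capped = [min(w, 1.0) for w in walks]              # threshold at 1
        nxt = [1 if (r[j] or capped[j] > hardening[j]) else 0
               for j in range(n)]                          # comparator
        if nxt == r:
            return r
        r = nxt


# Toy example: 3 hosts, 2 applications (all values are made up).
flows = [
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],   # application 0: 0->1, 1->2
    [[0, 1, 0], [0, 0, 0], [0, 0, 0]],   # application 1: 0->1
]
prob = [[0.0, 0.5, 0.25], [0.0, 0.5, 0.0]]
J = expected_walk_matrix(flows, prob)
weak = host_app_reachability(J, hardening=[0.5, 0.5, 0.2], seeds=[0])
hard = host_app_reachability(J, hardening=[0.5, 0.5, 0.9], seeds=[0])
```

Raising the hardening level of host 2 from 0.2 to 0.9 is enough to stop the cascade at host 1, which is the effect node hardening aims for.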

III-D Reachability of Lateral Movement on Tripartite User-Host-Application Graph

Utilizing the developed results in Sec. III-B and Sec. III-C, the cascading process of lateral movement on the tripartite user-host-application graph can be modeled by

IV Segmentation on User-Host Graph

In this section we investigate segmentation on user-host graphs as a countermeasure for suppressing lateral movement. Segmentation works by creating new user accounts to separate user from host in order to reduce the reachability of lateral movement, as illustrated in Fig. 2 (b). In principle, segmentation removes some edges from the access graph and then merges these removed edges to create new user accounts. Therefore, segmentation retains the same access functionality and constrains lateral movement attacks at the price of additional user accounts. The following analysis provides a theoretical framework for different segmentation strategies.

Recall from (3) that the matrix $\mathbf{B}$ is the key factor affecting the reachability of lateral movement on $G_C$. Therefore, an effective edge removal approach for segmentation is reducing the spectral radius of $\mathbf{B}$ (i.e., $\lambda_{\max}(\mathbf{B})$) by removing some edges from $G_C$. Note that by definition $\mathbf{B}=\mathbf{A}_C^T\mathbf{A}_C$ so that $\mathbf{B}$ is a positive semidefinite (PSD) matrix, and all entries of $\mathbf{B}$ are nonnegative. Therefore, by the Perron-Frobenius theorem [24] the entries of $\mathbf{B}$’s largest eigenvector $\mathbf{u}$ (i.e., the eigenvector $\mathbf{u}$ such that $\mathbf{B}\mathbf{u}=\lambda_{\max}(\mathbf{B})\mathbf{u}$) are nonnegative.

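Here we investigate the change in $\lambda_{\max}(\mathbf{B})$ when an edge is removed from $G_C$ in order to define an edge score function that is associated with spectral radius reduction of $\mathbf{B}$. If an edge $(i,j)\in\mathcal{E}$ is removed from $G_C$, then the resulting adjacency matrix of $G_C$ is $\widetilde{\mathbf{A}}_C=\mathbf{A}_C-\mathbf{e}_i\mathbf{e}_j^T$, where $\mathbf{e}_i$ and $\mathbf{e}_j$ are canonical vectors of dimension $U$ and $N$, respectively. The corresponding induced adjacency matrix is

$\widetilde{\mathbf{B}}=\widetilde{\mathbf{A}}_C^T\widetilde{\mathbf{A}}_C=\mathbf{B}-\mathbf{A}_C^T\mathbf{e}_i\mathbf{e}_j^T-\mathbf{e}_j\mathbf{e}_i^T\mathbf{A}_C+\mathbf{e}_j\mathbf{e}_j^T$ (9)

By the Courant-Fischer theorem [24], with $\mathbf{u}$ the unit-norm largest eigenvector of $\mathbf{B}$, we have

$\lambda_{\max}(\widetilde{\mathbf{B}})\geq\mathbf{u}^T\widetilde{\mathbf{B}}\mathbf{u}=\lambda_{\max}(\mathbf{B})-f(i,j)$, where $f(i,j)=2[\mathbf{u}]_j[\mathbf{A}_C\mathbf{u}]_i-[\mathbf{u}]_j^2$ (10)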

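The relation in (10) leads to a greedy removal strategy that finds the edge that maximizes the edge score function $f(i,j)$, in order to minimize a lower bound on the spectral radius of $\widetilde{\mathbf{B}}$. Moreover, Lemma 1 below shows that the edge score function is also associated with an upper bound on the spectral radius of $\widetilde{\mathbf{B}}$. Following similar methodology, when a subset of edges $\mathcal{E}_R\subset\mathcal{E}$ is removed from $G_C$, we have

$\lambda_{\max}(\widetilde{\mathbf{B}}(\mathcal{E}_R))\geq\lambda_{\max}(\mathbf{B})-f(\mathcal{E}_R)$ (11)

where the function

$f(\mathcal{E}_R)=2\sum_{(i,j)\in\mathcal{E}_R}[\mathbf{u}]_j[\mathbf{A}_C\mathbf{u}]_i-\sum_{i=1}^{U}\Big(\sum_{j:(i,j)\in\mathcal{E}_R}[\mathbf{u}]_j\Big)^2$ (12)

Here $\mathbf{u}$ is the unit-norm largest eigenvector of $\mathbf{B}=\mathbf{A}_C^T\mathbf{A}_C$ and $\widetilde{\mathbf{B}}(\mathcal{E}_R)$ is the induced adjacency matrix after removing $\mathcal{E}_R$. In a nutshell, the function $f(\mathcal{E}_R)$ provides a score that evaluates the effect of the edge removal set $\mathcal{E}_R$ on the spectral radius of $\widetilde{\mathbf{B}}(\mathcal{E}_R)$. The lemma presented in Appendix G shows $f(\mathcal{E}_R)$ is nonnegative as it can be represented as a sum of nonnegative terms. The following lemma shows that $f(\mathcal{E}_R)$ is associated with an upper bound on the spectral radius of $\widetilde{\mathbf{B}}(\mathcal{E}_R)$. Therefore, maximizing $f(\mathcal{E}_R)$ can be an effective strategy for spectral radius reduction of $\mathbf{B}$.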

Lemma 1.

For any edge removal set with , if there exists one edge removal set such that , then there exists some constant such that

(13)
Proof.

The proof can be found in Appendix H. ∎

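Moreover, the lemma presented in Appendix I shows that the score function $f(\mathcal{E}_R)$ of an edge removal set $\mathcal{E}_R$ is a monotonic increasing set function, which means that for any two subsets $\mathcal{E}_{R_1},\mathcal{E}_{R_2}\subset\mathcal{E}$ satisfying $\mathcal{E}_{R_1}\subset\mathcal{E}_{R_2}$, $f(\mathcal{E}_{R_2})\geq f(\mathcal{E}_{R_1})$. In addition, the following theorem shows that $f(\mathcal{E}_R)$ is a monotone submodular set function [25], which establishes a performance guarantee of greedy edge removal on reducing the spectral radius of $\mathbf{B}$. Submodularity means $f(\mathcal{E}_R)$ has diminishing gain: for any $\mathcal{E}_{R_1}\subset\mathcal{E}_{R_2}\subset\mathcal{E}$ and any edge $e\in\mathcal{E}\setminus\mathcal{E}_{R_2}$, the discrete derivative satisfies $f(\mathcal{E}_{R_1}\cup e)-f(\mathcal{E}_{R_1})\geq f(\mathcal{E}_{R_2}\cup e)-f(\mathcal{E}_{R_2})$.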

Theorem 1.

$f(\mathcal{E}_R)$ is a monotone submodular set function.

Proof.

The proof can be found in Appendix J. ∎
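
The diminishing-gain property can be checked numerically on a toy graph. The sketch below assumes the set score takes the form $f(R)=\sum_i (2 s_i c_i - s_i^2)$, where $c_i$ sums the eigenvector entries over all hosts user $i$ can access and $s_i$ sums them over the removed accesses of user $i$; this form follows from expanding $\mathbf{u}^T\widetilde{\mathbf{B}}\mathbf{u}$ and the expression in (12) may be written differently. The name `removal_score` and the vector `u` are illustrative (the property holds for any nonnegative vector, so a hand-picked one is used instead of a computed eigenvector).

```python
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[int, int]  # (user, host)


def removal_score(removed: Iterable[Edge], edges: Set[Edge],
                  u: List[float]) -> float:
    """f(R) = sum_i (2 * s_i * c_i - s_i**2), an assumed form of the score.

    c_i sums u over every host user i can access; s_i sums u over the
    hosts whose access is removed from user i.
    """
    c: Dict[int, float] = {}
    for i, j in edges:
        c[i] = c.get(i, 0.0) + u[j]
    s: Dict[int, float] = {}
    for i, j in removed:
        if (i, j) not in edges:
            raise ValueError(f"edge {(i, j)} is not in the graph")
        s[i] = s.get(i, 0.0) + u[j]
    return sum(2 * s_i * c[i] - s_i ** 2 for i, s_i in s.items())


edges = {(0, 0), (0, 1), (0, 2), (1, 2)}
u = [0.5, 0.5, 0.5]                      # any nonnegative vector works here
small, large, extra = set(), {(0, 0)}, (0, 1)
gain_small = removal_score(small | {extra}, edges, u) - removal_score(small, edges, u)
gain_large = removal_score(large | {extra}, edges, u) - removal_score(large, edges, u)
```

Adding the same edge to the larger removal set yields the smaller gain, which is the diminishing-returns behavior that makes a greedy selection reasonable.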

With the established results, a greedy segmentation algorithm (Algorithm 1) is proposed. Algorithm 1 computes the edge score function for every edge and segments edges of highest scores to create new user accounts. For efficient computation step 2 of Algorithm 1 can be represented by the matrix form , where if , and otherwise.

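Input: $\mathbf{A}_C$, number of segmented edges $q$
Output: modified access adjacency matrix $\widetilde{\mathbf{A}}_C$
if recalculating score then
     Initialization: $\widetilde{\mathbf{A}}_C=\mathbf{A}_C$. $\widetilde{\mathcal{E}}=\mathcal{E}$. $\mathcal{E}_R=\emptyset$.
     for $t=1$ to $q$ do
         1. Compute the leading eigenvector $\mathbf{u}$ of
         $\widetilde{\mathbf{B}}=\widetilde{\mathbf{A}}_C^T\widetilde{\mathbf{A}}_C$
         2. Compute score $f(i,j)$ (with $\widetilde{\mathbf{A}}_C$ in place of $\mathbf{A}_C$)
              for all $(i,j)\in\widetilde{\mathcal{E}}$
         3. Remove the highest scored edge $(i^*,j^*)$
             from $\widetilde{\mathcal{E}}$
         4. $\widetilde{\mathbf{A}}_C=\widetilde{\mathbf{A}}_C-\mathbf{e}_{i^*}\mathbf{e}_{j^*}^T$. $\widetilde{\mathcal{E}}=\widetilde{\mathcal{E}}\setminus(i^*,j^*)$.
             $\mathcal{E}_R=\mathcal{E}_R\cup(i^*,j^*)$.
else
     1. Compute the leading eigenvector $\mathbf{u}$ of $\mathbf{B}=\mathbf{A}_C^T\mathbf{A}_C$
     2. Compute score $f(i,j)$
         for all $(i,j)\in\mathcal{E}$
     3. Remove the $q$ edges of highest scores from $\mathcal{E}$
     4. Store this set of edges in $\mathcal{E}_R$
5. Segment the removed edges in $\mathcal{E}_R$ to create new users. A new user created for user $i$ has access to the set of hosts $\{j:(i,j)\in\mathcal{E}_R\}$
6. Obtain the modified access adjacency matrix $\widetilde{\mathbf{A}}_C$
Algorithm 1 Greedy score segmentation algorithm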
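
The steps of Algorithm 1 can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the per-edge score is taken to be $f(i,j)=2u_j(\mathbf{A}\mathbf{u})_i-u_j^2$ with a unit-norm leading eigenvector $\mathbf{u}$ of $\mathbf{A}^T\mathbf{A}$, the eigenvector is approximated by power iteration, and one new account is created per affected user. The names `leading_eigvec`, `edge_scores` and `greedy_segmentation` are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[int]]


def leading_eigvec(A: Matrix, iters: int = 500) -> List[float]:
    """Power iteration for the leading unit eigenvector of B = A^T A."""
    n = len(A[0])
    u = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        au = [sum(a * x for a, x in zip(row, u)) for row in A]          # A u
        bu = [sum(A[i][j] * au[i] for i in range(len(A))) for j in range(n)]
        norm = math.sqrt(sum(x * x for x in bu))
        if norm == 0.0:          # graph without edges: nothing to iterate on
            return u
        u = [x / norm for x in bu]
    return u


def edge_scores(A: Matrix) -> List[Tuple[float, int, int]]:
    """Score every existing edge with f(i, j) = 2 u_j (A u)_i - u_j^2."""
    u = leading_eigvec(A)
    au = [sum(a * x for a, x in zip(row, u)) for row in A]
    return [(2 * u[j] * au[i] - u[j] ** 2, i, j)
            for i, row in enumerate(A) for j, a in enumerate(row) if a]


def greedy_segmentation(A: Matrix, q: int, recalc: bool = True) -> Matrix:
    """Remove q edges by score, then re-attach them under new user rows."""
    work = [row[:] for row in A]
    removed: List[Tuple[int, int]] = []
    if recalc:
        for _ in range(q):
            scores = edge_scores(work)       # rescore after every removal
            if not scores:
                break
            _, i, j = max(scores)
            work[i][j] = 0
            removed.append((i, j))
    else:
        for _, i, j in sorted(edge_scores(work), reverse=True)[:q]:
            work[i][j] = 0
            removed.append((i, j))
    # One new account per affected user, holding that user's removed accesses.
    for i in sorted({i for i, _ in removed}):
        new_row = [0] * len(A[0])
        for ii, j in removed:
            if ii == i:
                new_row[j] = 1
        work.append(new_row)
    return work


A = [
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
segmented = greedy_segmentation(A, q=1)
```

The returned matrix has extra rows for the new accounts, so every host keeps the same number of accesses while the walk through the highest-scoring edge is broken.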

Using the monotonic submodularity of in Theorem 1, the following theorem shows that this greedy algorithm (Algorithm 1 without score recalculation) has performance guarantee on spectral radius reduction relative to the optimal batch edge removal strategy of combinatorial computation complexity for selecting the best edges.

Theorem 2.

(Greedy segmentation without score recalculation)  Let be the optimal batch edge removal set with that maximizes and let with be the greedy edge removal set obtained from Algorithm 1. If , then there exists some constant such that

Proof.

The proof can be found in Appendix K. ∎

As a variant of Algorithm 1 without score recalculation, for better traceability one may desire to successively recalculate the largest eigenvector and update the edge score function after each edge removal. The following corollary provides a theoretical analysis of the greedy segmentation algorithm with score recalculation (Algorithm 1 with score recalculation), which shows that score recalculation can successively reduce the spectral radius of .

Corollary 1.

(Greedy segmentation with score recalculation)  Let denote the adjacency matrix of for some , and let denote the largest eigenvector of . For any edge removal set , let , and let be a maximizer of . Then . Furthermore, if , then .

Proof.

The proof can be found in Appendix L. ∎

In addition to establishing the performance guarantee of greedy score segmentation (Algorithm 1) for reducing , the following theorem shows that the two intuitive greedy segmentation algorithms proposed in Algorithm 2, with an aim of successively segmenting the edge connecting to the most connected user or host, are also effectively reducing an upper bound on . The terms and denote the degree vector of users and hosts, respectively, and the terms and denote the maximum degree of users and hosts in , respectively.

Theorem 3.

(Greedy user-(host-)first segmentation)  If an edge is removed from and is irreducible, then

Proof.

The proof and the case when is reducible can be found in Appendix M. ∎

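Input: $\mathbf{A}_C$, number of segmented edges $q$
Output: modified access adjacency matrix $\widetilde{\mathbf{A}}_C$
Initialization: $\widetilde{\mathbf{A}}_C=\mathbf{A}_C$. $\widetilde{\mathcal{E}}=\mathcal{E}$. $\mathcal{E}_R=\emptyset$.
for $t=1$ to $q$ do
     1. Compute user (host) degree vector $\mathbf{d}_{\text{user}}=\widetilde{\mathbf{A}}_C\mathbf{1}_N$
         ($\mathbf{d}_{\text{host}}=\widetilde{\mathbf{A}}_C^T\mathbf{1}_U$)
     2. Obtain $i^*=\arg\max_i[\mathbf{d}_{\text{user}}]_i$ and
         $j^*=\arg\max_{j:(i^*,j)\in\widetilde{\mathcal{E}}}[\mathbf{d}_{\text{host}}]_j$
         ($j^*=\arg\max_j[\mathbf{d}_{\text{host}}]_j$ and
         $i^*=\arg\max_{i:(i,j^*)\in\widetilde{\mathcal{E}}}[\mathbf{d}_{\text{user}}]_i$)
     3. Remove the edge $(i^*,j^*)$ from $\widetilde{\mathcal{E}}$.
     4. $\widetilde{\mathbf{A}}_C=\widetilde{\mathbf{A}}_C-\mathbf{e}_{i^*}\mathbf{e}_{j^*}^T$. $\widetilde{\mathcal{E}}=\widetilde{\mathcal{E}}\setminus(i^*,j^*)$.
         $\mathcal{E}_R=\mathcal{E}_R\cup(i^*,j^*)$.
5. Segment the removed edges in $\mathcal{E}_R$ to create new users. A new user created for user $i$ has access to the set of hosts $\{j:(i,j)\in\mathcal{E}_R\}$
6. Obtain the modified access adjacency matrix $\widetilde{\mathbf{A}}_C$
Algorithm 2 Greedy user-(host-)first segmentation algorithm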
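
A short Python sketch of the degree-based variants in Algorithm 2 is given below. It assumes that the user-first rule picks the highest-degree user and then that user's highest-degree accessible host, and that the host-first rule does the reverse; ties are broken by lowest index, which is a choice made here for determinism. The function name `greedy_degree_segmentation` and the toy matrix are illustrative.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def greedy_degree_segmentation(A: Matrix, q: int, user_first: bool = True
                               ) -> Tuple[Matrix, List[Tuple[int, int]]]:
    """Remove q edges by degree, then re-attach them under new user rows."""
    work = [row[:] for row in A]
    removed: List[Tuple[int, int]] = []
    n_hosts = len(A[0])
    for _ in range(q):
        d_user = [sum(row) for row in work]            # accesses per user
        d_host = [sum(col) for col in zip(*work)]      # accesses per host
        if max(d_user) == 0:                           # no edges left
            break
        if user_first:
            i = max(range(len(work)), key=d_user.__getitem__)
            j = max((h for h in range(n_hosts) if work[i][h]),
                    key=d_host.__getitem__)
        else:
            j = max(range(n_hosts), key=d_host.__getitem__)
            i = max((v for v in range(len(work)) if work[v][j]),
                    key=d_user.__getitem__)
        work[i][j] = 0
        removed.append((i, j))
    # One new account per affected user, holding that user's removed accesses.
    for i in sorted({i for i, _ in removed}):
        work.append([1 if (i, h) in removed else 0 for h in range(n_hosts)])
    return work, removed


A = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 1],
]
user_seg, user_removed = greedy_degree_segmentation(A, q=2, user_first=True)
host_seg, host_removed = greedy_degree_segmentation(A, q=2, user_first=False)
```

On this toy matrix the user-first rule keeps cutting accesses of the same heavy user, whereas the host-first rule spreads the cuts over the users of the hub host and therefore creates more new accounts.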

Since the term in Theorem 3 is a vector of access connections of user , Theorem 3 suggests a greedy user-first segmentation approach that segments the edge between the user of maximum degree and the corresponding accessible host of maximum degree in order to reduce the upper bound on spectral radius in Theorem 3. Similar analysis applies to the greedy host-first segmentation approach in Algorithm 2.

V Hardening on Host-Application Graph

In this section we discuss two countermeasures for constraining lateral movements on the host-application graph. Edge hardening refers to securing access from application to host , and in effect reducing the compromise probability . Node hardening refers to securing a particular host and in effect increasing its hardening level.

Recall from (8) that the reachability of lateral movement on host-application graph is governed by the matrix . Note that although is in general not a symmetric matrix, its entries are nonnegative and hence by the Perron-Frobenius theorem [24] is real and nonnegative, and the entries of its largest eigenvector are nonnegative.

Hardening a host for an application means that after hardening the compromise probability is reduced to some value such that . Let denote the set of hardened edges and let be the compromise probability matrix after edge hardening. Then we have Let and let be the largest eigenvector of . We can show that

(14)

The proof of (14) can be found in Appendix N.

Let be a score function that reflects the effect of the edge hardening set on spectral radius reduction of . The lemma presented in Appendix O shows that is a monotonic increasing set function of . The following analysis shows that is associated with a pair of upper and lower bounds on the spectral radius of after edge hardening.

Input: , number of hardened edges ,
Output: modified compromise probability matrix
if recalculating score then
     Initialization: . .
     for  to  do
         1. Compute the leading eigenvector of
         2. Compute score
         3. Obtain
         4. Edge hardening:
         5. (see Appendix P)      
else
     Initialization:
     1. Compute the leading eigenvector of
     2. Compute score
     3. Find the edges of highest scores
     4. Store this set of edges in
     5. Edge hardening: for all
Algorithm 3 Greedy edge hardening algorithm

The edge hardening algorithm proposed in Algorithm 3 is a greedy algorithm that hardens the edges of highest scores between applications and hosts, where the per-edge hardening score is defined as . Step 5 in Algorithm 3 with score recalculation can be updated efficiently by tracking the changes in the matrix caused by Step 4 (see Appendix P). The following theorem shows that the hardened edge set obtained from Algorithm 3 without score recalculation is a maximizer of .

Theorem 4.

(Greedy edge hardening without score recalculation) For any hardening set with , let with be the greedy hardening set obtained from Algorithm 3. Then is a maximizer of .

Proof.

The proof can be found in Appendix Q. ∎

Furthermore, the following theorem shows that Algorithm 3 without score recalculation has bounded performance guarantee on spectral radius reduction of relative to that of the optimal batch edge hardening set for which the computation complexity is combinatorial.

Theorem 5.

(Performance guarantee of greedy edge hardening without score recalculation) For any hardening set with , . Furthermore, let with be the optimal hardening set that minimizes and let with be the hardening set that maximizes . If and , then there exists some constant such that

Proof.

The proof can be found in Appendix R. ∎

The corollary below shows Algorithm 3 with score recalculation can successively reduce the spectral radius of .

Input: edge score , number of hardened nodes ,
Output: modified node hardening vector
Initialization:
1. Compute edge hardening score for all
     and
2. Compute node hardening score
    for all
3. Find the first nodes of highest scores and store this
    set of nodes in
4. Node hardening: for all
Algorithm 4 Greedy node hardening algorithm
Corollary 2.

(Greedy edge hardening with score recalculation) Let denote the largest eigenvector of and let . For any edge hardening set , let be a maximizer of . Then . Furthermore, if and , then .

Proof.

The proof can be found in Appendix S. ∎

Lastly, for node hardening we use the edge hardening score to define the node hardening score for host , where In effect, node hardening on host enhances its hardening level from to a value . A greedy node hardening algorithm based on the node hardening score is summarized in Algorithm 4. In Sec. VI we also investigate the performance of two other node score functions based on and for greedy node hardening, namely and .

VI Experimental Results

VI-A Dataset Description and Experiment Setup

To demonstrate the effectiveness of the proposed segmentation and hardening strategies against lateral movement attacks, we use the event logs and network flows collected from a large enterprise to create a tripartite user-host-application graph as in Fig. 2 (a) for performance analysis. This graph contains 5863 users, 4474 hosts, 3 applications, 8413 user-host access records and 6230 host-application-host network flows. All experiments assume that the defender has no knowledge of which nodes are compromised and the defender only uses the given tripartite network configuration for segmentation and hardening.

To simulate a lateral movement attack we randomly select 5 hosts (approximately 0.1% of the total number of hosts) as the initially compromised hosts and use the algorithms developed in Sec. III to evaluate the reachability, which is defined as the fraction of reachable hosts by propagating on the tripartite graph from the initially compromised hosts. The initial node hardening level of each host is independently and uniformly drawn from the unit interval between 0 and 1. The compromise probability matrix is a random matrix where the fraction of nonzero entries is set to be 10% and each nonzero entry is independently and uniformly drawn from the unit interval between 0 and 1. The compromise probability after hardening, , is set to be for all and . All experimental results are averaged over 10 trials.
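
The random parts of this setup can be sketched with the Python standard library. The sketch covers only what the text states: 5 randomly chosen initially compromised hosts, hardening levels drawn uniformly from the unit interval, and a compromise probability matrix with 10% nonzero entries drawn uniformly; the post-hardening probability is left out. The function name `sample_trial`, the dictionary layout and the seeding scheme are assumptions made for illustration.

```python
import random
from typing import Dict


def sample_trial(n_hosts: int, n_apps: int, n_seeds: int = 5,
                 density: float = 0.10, rng_seed: int = 0) -> Dict[str, list]:
    """Draw one random trial: seed hosts, hardening levels, probabilities."""
    rng = random.Random(rng_seed)
    # Initially compromised hosts, chosen without replacement.
    seeds = rng.sample(range(n_hosts), n_seeds)
    # Hardening level of each host, uniform on [0, 1).
    hardening = [rng.random() for _ in range(n_hosts)]
    # Sparse K x N compromise probability matrix with a fixed nonzero fraction.
    prob = [[0.0] * n_hosts for _ in range(n_apps)]
    n_nonzero = round(density * n_apps * n_hosts)
    for cell in rng.sample(range(n_apps * n_hosts), n_nonzero):
        k, j = divmod(cell, n_hosts)
        prob[k][j] = 1.0 - rng.random()   # uniform on (0, 1], never exactly 0
    return {"seeds": seeds, "hardening": hardening, "prob": prob}


# Sizes taken from the dataset description: 4474 hosts and 3 applications.
trial = sample_trial(n_hosts=4474, n_apps=3)
nonzero = sum(1 for row in trial["prob"] for p in row if p > 0.0)
```

Averaging over 10 trials then amounts to calling `sample_trial` with ten different values of `rng_seed` and averaging the resulting reachability values.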

(a)  
(b)  
Fig. 3: The effect of segmentation on the user-host access graph. (a) Reachability with respect to different segmentation strategies. (b) Fraction of newly created user accounts from segmentation. Given the same number of segmented edges, greedy host-first segmentation strategy (green curve) is the most effective approach to constraining reachability (Fig. 3 (a)) at the cost of most additional accounts (Fig. 3 (b)).

VI-B Segmentation against Lateral Movement

Fig. 3 shows the effect of different segmentation strategies proposed in Sec. IV on the user-host graph. In particular, Fig. 3 (a) shows that greedy host-first segmentation strategy is the most effective approach to constraining reachability given the same number of segmented edges, since accesses to high-connectivity hosts (i.e., hubs) are segmented. For example, segmenting 15% of user-host accesses can reduce the reachability to nearly one third of its initial value. Greedy segmentation with score recalculation is shown to be more effective than that without score recalculation since it is adaptive to user-host access modification during segmentation. Greedy user-first segmentation strategy is not as effective as the other strategies since segmentation does not enforce any user-host access reduction and therefore after segmentation a user can still access the hosts but with different accounts.

Fig. 3 (b) shows the fraction of newly created accounts with respect to different segmentation strategies. There is clearly a trade-off between network security and implementation practicality, since Fig. 3 suggests that segmentation strategies with better reachability reduction capability also lead to more additional accounts. In practice, a user might be reluctant to use many accounts for daily work, even though doing so can greatly mitigate the risk from lateral movement attacks.

VI-C Hardening against Lateral Movement

Fig. 4 shows the effect of the different hardening strategies proposed in Sec. V on the host-application graph. As shown in Fig. 4 (a), the proposed greedy edge hardening strategies with and without score recalculation have similar performance in reachability reduction, and they outperform the greedy heuristic strategy that hardens edges of highest compromise probability. This suggests that the proposed edge hardening strategies indeed find the nontrivial edges affecting lateral movement. Fig. 4 (b) shows that the node hardening strategies using either of the two node score functions lead to similar performance in reachability reduction, and they outperform the greedy heuristic strategy that hardens nodes of lowest hardening level. These results show that the greedy edge and node hardening approaches based on the proposed hardening matrix outperform heuristics using the compromise probability matrix and the hardening level vector, which suggests that the intuition of hardening the host of lowest security level might not be the best strategy for constraining lateral movement, as it does not take into account the connectivity structure of the host-application graph.
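The two baseline heuristics used for comparison above are simple enough to sketch directly. The snippet below only illustrates the baselines (hardening the edges of highest compromise probability and the hosts of lowest hardening level), not the proposed score-based strategies. The function names and the matrix orientation (rows are applications, columns are hosts) are assumptions made for this example.

```python
import heapq

def top_probability_edges(prob, budget):
    """Baseline: the `budget` host-application edges with the largest probability.

    Assumption: prob[app][host] holds the compromise probability of an edge,
    and a zero entry means that the edge does not exist.
    """
    edges = [(p, app, host)
             for app, row in enumerate(prob)
             for host, p in enumerate(row) if p > 0]
    return [(app, host) for p, app, host in heapq.nlargest(budget, edges)]

def weakest_hosts(hardening, budget):
    """Baseline: the `budget` hosts with the lowest hardening level."""
    return heapq.nsmallest(budget, range(len(hardening)),
                           key=hardening.__getitem__)
```

Both selections look only at local values and ignore how an edge or host sits in the host-application graph, which is the limitation the comparison in Fig. 4 points to.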

Fig. 4: The effect of hardening on the host-application graph. (a) Reachability with respect to different edge hardening strategies. (b) Reachability with respect to different node hardening strategies. The greedy hardening approaches based on the proposed hardening matrix (red and blue curves) outperform heuristics using the compromise probability matrix and the hardening level vector (green curve).

VI-D Segmentation and Hardening on Tripartite Graph

Lastly, we investigate the joint effect of segmentation and hardening on constraining lateral movement attacks on the user-host-application tripartite graph. Fig. 5 shows the lateral movement reachability under a selected combination of the proposed segmentation and hardening strategies. Since these joint segmentation and hardening strategies lead to similar results in reachability reduction, we display their mean and standard deviation. In addition, for clarity we only plot representative points to demonstrate the effectiveness. It can be observed that different combinations of the proposed strategies show a similar trend in constraining lateral movements. Originally, more than half of the hosts can be compromised if no preventative actions are taken. Nonetheless, the proposed segmentation and hardening strategies can greatly reduce the reachability of lateral movements to secure the network.

Fig. 5: The effect of segmentation and hardening on lateral movement attacks in the user-host-application tripartite graph. This figure shows the mean and the standard deviation (std) of reachability of four joint segmentation and hardening strategies. The size and the color of a point in the plot reflect the level of reachability.

VII Benchmark: Performance Evaluation on Actual Lateral Movement Attacks

This section demonstrates the importance of incorporating the heterogeneity of a cyber system for enhancing the resilience to lateral movement attacks. Specifically, real lateral movement attacks taking place in an enterprise network are collected as a performance benchmark (dataset available at https://sites.google.com/site/pinyuchenpage/datasets). This dataset contains the communication patterns between 2010 hosts via 2 communication protocols, and therefore the enterprise network can be summarized as a bipartite host-application graph. It also contains lateral movements originating from a single compromised host, and in total includes 2001 propagation paths. The details of the collected benchmark dataset are given in Appendix T. The experiment in this section differs from the analysis in Sec. VI: this dataset contains actual lateral movement traces on the host-application graph, whereas in Sec. VI we have a complete user-host-application tripartite graph of an enterprise, but without the actual attack traces.

We compare the performance of our proposed edge hardening method (Algorithm 3) to the NetMelt algorithm [3], which is a well-known edge removal method for containing information diffusion on a homogeneous graph. For the proposed edge hardening method, the edges in the host-application bipartite graph are hardened sequentially according to the computed scores, and the initial compromise probability matrix is set to be a matrix of ones. For every propagation path, the lateral movement will be contained if the edge it attempts to leverage is hardened. Since NetMelt can only deal with homogeneous graphs (in this case, the host-host graph), its recommendation on hardening a host pair is equivalent to hardening corresponding host-application edges (in this case, ), whereas our method has better granularity for edge hardening by considering the connectivity structure of the host-application bipartite graph. The computation complexity of NetMelt is [3], where is the number of edges in the host-host graph, is the number of hardened edges, and is the number of hosts. Since the operation of leading eigenpair computation in Algorithm 3 is similar to NetMelt, the computation complexity for Algorithm 3 without score recalculation is , where is the number of nonzero entries in the matrix . For Algorithm 3 with score recalculation, the computation complexity is .
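For readers unfamiliar with eigenvector-based edge selection on a homogeneous graph, the following is a minimal sketch in the spirit of NetMelt. It assumes the commonly described form of the method, in which each edge is scored by the product of the leading-eigenvector entries of its two endpoints and the top-scoring edges are selected. It is not Algorithm 3 of this paper and is not a reference implementation of [3]; the function names, the fixed iteration count, and the undirected-graph restriction are assumptions.

```python
def leading_eigenvector(adj, iters=200):
    """Power iteration on a symmetric graph given as {node: set(neighbors)}."""
    vec = {v: 1.0 for v in adj}
    for _ in range(iters):
        # Iterate with A + I instead of A: same leading eigenvector, but the
        # shift avoids oscillation on bipartite graphs.
        nxt = {v: vec[v] + sum(vec[w] for w in adj[v]) for v in adj}
        norm = sum(x * x for x in nxt.values()) ** 0.5
        vec = {v: x / norm for v, x in nxt.items()}
    return vec

def eigen_score_edges(adj, budget):
    """Return the `budget` undirected edges with the largest endpoint-score product."""
    vec = leading_eigenvector(adj)
    edges = {tuple(sorted((u, v))) for u in adj for v in adj[u]}
    ranked = sorted(edges, key=lambda e: (-vec[e[0]] * vec[e[1]], e))
    return ranked[:budget]
```

On a host-host graph, each selected pair would then have to be mapped to its corresponding host-application edges, which is the coarser granularity discussed above.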

Fig. 6: Performance evaluation on the collected benchmark dataset. The proposed approaches (blue and red curves) can restrain the reachability to roughly 10% by hardening less than 1.5% of edges, whereas NetMelt (green) requires hardening more than 5% of edges to achieve comparable reachability.

Fig. 6 shows the reachability of lateral movements with respect to the fraction of hardened edges. Initially the reachability is nearly 100%, suggesting that almost every host is vulnerable to lateral movement attacks without edge hardening. The proposed method (both with and without score recalculation) can restrain the reachability to roughly 10% by hardening less than 1.5% of edges, whereas NetMelt requires hardening more than 5% of edges to achieve comparable reachability, since it does not exploit the heterogeneity of the cyber system. Consequently, the results demonstrate the utility of incorporating heterogeneity for building resilient systems.
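The evaluation rule used for this benchmark (a propagation path is contained once the edge it attempts to leverage is hardened) can be sketched as below. The data layout is an assumption: each path is taken to be a list of hops `(source_host, application, destination_host)`, a hop is taken to leverage both the source-side and the destination-side host-application edge, and the protocol labels in the usage example are hypothetical placeholders, not the two protocols in the dataset.

```python
def reachability_after_hardening(paths, hardened, num_hosts, source):
    """Fraction of hosts still reached when the given host-application edges are hardened.

    paths:    list of paths, each a list of (src_host, app, dst_host) hops
    hardened: set of (host, app) edges that have been hardened
    """
    reached = {source}
    for path in paths:
        for src, app, dst in path:
            if (src, app) in hardened or (dst, app) in hardened:
                break  # the path is contained at its first hardened edge
            reached.add(dst)
    return len(reached) / num_hosts

# Usage with two toy paths starting from host 0 (labels are hypothetical).
toy_paths = [[(0, "proto_a", 1), (1, "proto_b", 2)], [(0, "proto_b", 3)]]
baseline = reachability_after_hardening(toy_paths, set(), 4, source=0)
contained = reachability_after_hardening(toy_paths, {(1, "proto_b")}, 4, source=0)
```

Sweeping the size of the hardened set in the order given by an edge-scoring method and recording this fraction would produce a curve of the kind plotted in Fig. 6.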

VIII Conclusion and Future Work

This paper developed a framework for joint modeling of multiple dimensions of cyber behavior (user access control, application traffic) for enhancing cyber enterprise resiliency in a unified, tripartite network model. Our experiments performed on a real dataset demonstrate the value of this unified model and the insights it provides, compared with analysis performed on a single-dimensional dataset. Through the tripartite graph model, the dominant factors affecting lateral movement are identified and effective algorithms are proposed to constrain the reachability with theoretical performance guarantees. We also synthesized a benchmark dataset containing traces of actual lateral movement attacks. The results showed that our proposed approach can effectively contain lateral movements by incorporating the heterogeneity of the cyber system. Our future work includes generalization to multipartite networks to model other dimensions of behavior (e.g., authentication mechanisms and social profile of users).

References

  • [1] M. A. Sasse, S. Brostoff, and D. Weirich, “Transforming the ‘weakest link’—a human/computer interaction approach to usable and effective security,” BT technology journal, vol. 19, no. 3, pp. 122–131, 2001.
  • [2] H. Goldman, R. McQuaid, and J. Picciotto, “Cyber resilience for mission assurance,” in IEEE International Conference on Technologies for Homeland Security (HST), 2011, pp. 236–241.
  • [3] H. Tong, B. A. Prakash, T. Eliassi-Rad, M. Faloutsos, and C. Faloutsos, “Gelling, and melting, large graphs by edge manipulation,” in ACM CIKM, 2012, pp. 245–254.
  • [4] N. Provos, M. Friedl, and P. Honeyman, “Preventing privilege escalation,” in USENIX, vol. 3, 2003.
  • [5] S. Bugiel, L. Davi, A. Dmitrienko, T. Fischer, A.-R. Sadeghi, and B. Shastry, “Towards taming privilege-escalation attacks on android,” in NDSS, vol. 17, 2012, p. 19.
  • [6] L. Xing, X. Pan, R. Wang, K. Yuan, and X. Wang, “Upgrading your android, elevating my malware: Privilege escalation through mobile os updating,” in IEEE Symp. on Security and Privacy, 2014, pp. 393–408.
  • [7] P.-Y. Chen, C.-C. Lin, S.-M. Cheng, H.-C. Hsiao, and C.-Y. Huang, “Decapitation via digital epidemics: a bio-inspired transmissive attack,” IEEE Commun. Mag., vol. 54, no. 6, pp. 75–81, June 2016.
  • [8] T. Das, R. Bhagwan, and P. Naldurg, “Baaz: A system for detecting access control misconfigurations,” in USENIX, 2010, pp. 161–176.
  • [9] Y. Chen, S. Nyemba, and B. Malin, “Detecting anomalous insiders in collaborative information systems,” IEEE Trans. Depend. Sec. Comput., vol. 9, no. 3, pp. 332–344, 2012.
  • [10] A. Zheng, J. Dunagan, and A. Kapoor, “Active graph reachability reduction for network security and software engineering,” IJCAI, vol. 22, no. 1, p. 1750, 2011.
  • [11] S.-M. Cheng, W. C. Ao, P.-Y. Chen, and K.-C. Chen, “On modeling malware propagation in generalized social networks,” IEEE Commun. Lett., vol. 15, no. 1, pp. 25–27, Jan. 2011.
  • [12] P.-Y. Chen, S.-M. Cheng, and K.-C. Chen, “Optimal control of epidemic information dissemination over networks,” IEEE Trans. on Cybern., vol. 44, no. 12, pp. 2316–2328, Dec. 2014.
  • [13] J. Gao, S. V. Buldyrev, S. Havlin, and H. E. Stanley, “Robustness of a network of networks,” Physical Review Letters, vol. 107, no. 19, p. 195701, 2011.
  • [14] A. Chapman, M. Nabi-Abdolyousefi, and M. Mesbahi, “Controllability and observability of network-of-networks via cartesian products,” IEEE Trans. Autom. Control, vol. 59, no. 10, pp. 2668–2679, 2014.
  • [15] J. Ni, H. Tong, W. Fan, and X. Zhang, “Inside the atoms: ranking on a network of networks,” in ACM KDD, 2014, pp. 1356–1365.
  • [16] M. Halappanavar, S. Choudhury, E. Hogan, P. Hui, J. Johnson, I. Ray, and L. Holder, “Towards a network-of-networks framework for cyber security,” in IEEE ISI, 2013, pp. 106–108.
  • [17] P. Demeester, M. Gryseels, A. Autenrieth, C. Brianza, L. Castagna, G. Signorelli, R. Clemenfe, M. Ravera, A. Jajszczyk, D. Janukowicz, K. V. Doorselaere, and Y. Harada, “Resilience in multilayer networks,” IEEE Commun. Mag., vol. 37, no. 8, pp. 70–76, 1999.
  • [18] S. Choudhury, P.-Y. Chen, L. Rodriguez, D. Curtis, P. Nordquist, I. Ray, and K. Oler, “Action recommendation for cyber resilience,” in ACM CCS Workshop, 2015, pp. 3–8.
  • [19] H. Chan, L. Akoglu, and H. Tong, “Make it or break it: Manipulating robustness in large networks,” in SIAM Data Mining, 2014, pp. 325–333.
  • [20] P.-Y. Chen and A. O. Hero, “Assessing and safeguarding network resilience to nodal attacks,” IEEE Commun. Mag., vol. 52, no. 11, pp. 138–143, Nov. 2014.
  • [21] L. A. Adamic, C. Faloutsos, T. J. Iwashyna, B. A. Prakash, and H. Tong, “Fractional immunization in networks,” in SIAM Data Mining, 2013, pp. 659–667.
  • [22] P. Hu and W. C. Lau, “How to leak a 100-million-node social graph in just one week?-a reflection on oauth and api design in online social networks,” in Black Hat, 2014.
  • [23] L. T. Le, T. Eliassi-Rad, and H. Tong, “Met: A fast algorithm for minimizing propagation in large graphs with small eigen-gaps,” in SIAM Data Mining, vol. 15, 2015, pp. 694–702.
  • [24] R. A. Horn and C. R. Johnson, Matrix Analysis.   Cambridge University Press, 1990.
  • [25] S. Fujishige, Submodular Functions and Optimization.   Annals of Discrete Math., North Holland, 1990.

Supplementary File

Appendix A Kronecker Product

If is an matrix and is an matrix, then the Kronecker product is an matrix defined as

(15)

Some useful properties of Kronecker product are

(16)
(17)

If is an matrix, is an matrix, is an matrix, and is an matrix, then

(18)
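As a concrete companion to this appendix, the following sketch implements the standard Kronecker product on small matrices and numerically checks two textbook identities: the transpose rule and the mixed-product rule for four conformable matrices. The matrix names and the choice of identities are assumptions made for illustration; they are well-known Kronecker product facts and are not claimed to coincide with the expressions numbered (16) to (18).

```python
def kron(a, b):
    """Kronecker product of matrices given as lists of rows.

    For a of size m x n and b of size p x q the result has size mp x nq,
    with entry (i*p + k, j*q + l) equal to a[i][j] * b[k][l].
    """
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    """Matrix transpose as a list of rows."""
    return [list(col) for col in zip(*a)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, 2]]

# Transpose rule: (A kron B)^T equals A^T kron B^T.
transpose_rule_holds = transpose(kron(A, B)) == kron(transpose(A), transpose(B))
# Mixed-product rule: (A kron B)(C kron D) equals (AC) kron (BD).
mixed_product_holds = matmul(kron(A, B), kron(C, D)) == kron(matmul(A, C), matmul(B, D))
```

Integer matrices are used so that both comparisons are exact rather than subject to floating-point rounding.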

Appendix B Proof of (3)

Following (2),

(19)

Appendix C Proof of (4)

Following the definition of , we have