Advances in sensing, communication, and computing technology have increased both the efficiency and the complexity of engineered system operations. For instance, smart cities of the future are envisioned to deploy billions of sensors, connecting various urban services that must interoperate. Such interconnectivity, and the resulting interdependence, is a prime contributor to the increasing complexity of these systems. Complex systems are typically designed to optimize efficiency, which is treated as the primary objective. However, resilience is an equally important property: it quantifies the ability of the system to absorb and recover from extreme conditions. This is especially true for interconnected and interdependent systems, whose holistic performance depends strongly on the nature and degree of interaction between components. With the increasing frequency of natural hazards and adversarial attacks, properties such as robustness and resilience of complex networks have become important design considerations.
Graph theory is a powerful framework for studying and analyzing complex systems: the physical system is represented as a graph consisting of nodes and edges, abstracting away the physical details of the components. It has been used extensively to study properties such as robustness and resilience [ferrario2016evaluation, bristow2017graph, munikoti2020robustness]. Resilience is typically related to the impact of an attack, in proportion to the level of disruption it causes. The magnitude of disruption varies with the node being attacked, which serves as a criterion for ranking nodes in the graph. Numerous metrics and methodologies are available in the literature to quantify this disruption, including effective graph resistance, flow robustness, and total graph diversity [alenazi2015comprehensive]. However, existing methods of ranking nodes based on these resilience metrics are computationally expensive, since they traverse the whole graph for every compromised node under consideration and iterate over all nodes in the graph. The complexity of existing approaches grows rapidly with network size; for example, identifying the optimal link whose removal maximally reduces robustness requires a search of high polynomial order in the number of nodes [wang2014improving].
As a result, several recent efforts have been made to approximate such algorithms. However, as the graph size (number of nodes) increases, the accuracy of these algorithms decreases and their execution time increases considerably. Moreover, several applications involve dynamically changing network topologies, requiring node resilience to be estimated and maintained dynamically. In addition, existing approaches are not inductive, in the sense that one has to repeat the same procedure whenever the graph structure changes. Therefore, this paper proposes a graph neural network-based inductive learning approach to efficiently estimate the ranking of nodes based on the resilience of the graph. Since it is the relative importance of nodes, rather than their absolute resilience values, that matters in several real-world applications such as targeted attacks, this article aims to approximate the relative ranks of nodes as a measure of resilience. Furthermore, the model learned on one graph can be used to predict resilience in a larger graph with similar characteristics.
I-A Related Work
Ranking nodes based on a resilience score is typically an iterative procedure. The authors in [wang2014improving] study effective graph resistance as a robustness measure of complex networks. Robustness is improved by protecting the link whose removal maximally increases the effective graph resistance. As stated earlier, an exhaustive search to identify such a link has high polynomial complexity in the number of nodes. Therefore, [wang2014improving] proposes various strategies that trade off scalable computation against accurate identification of links; still, even the lowest achieved complexity remains polynomial in the graph size. Similarly, the authors in [kermanshah2017robustness] study the robustness of road networks to extreme flooding events. They simulate extreme events by removing entire sections of a road system, each corresponding to a node in the graph, and conclude that robustness in spatial systems depends on many factors, including the topological structure. However, the adopted method is computationally exhaustive, and it is difficult to infer anything about a new road network. In [wang2015network], the authors deploy effective graph resistance as a metric to relate the topology of a power grid to its robustness against cascading failures. Specifically, they propose various strategies to identify node pairs where the addition of links optimizes graph robustness; the minimum achieved complexity is again polynomial in the number of links and nodes. Further, the authors in [pizzuti2018genetic] propose a method based on a genetic algorithm to enhance network robustness, focusing on the protection of links whose removal would severely decrease graph robustness. Simulations on real-world and synthetic networks show that their method is, in most cases, equivalent in complexity to an exhaustive search.
With a growing emphasis on resilience of complex interconnected systems, there is a need to identify nodes/links in the underlying graph model whose removal significantly decreases the graph robustness. However, as seen in the previous discussion, existing methods to identify such nodes/links are computationally inefficient. Therefore, the key contributions of our work include:
A computationally efficient framework built on a GNN-based architecture to quickly identify critical nodes (in terms of resilience) in a large complex network.
The proposed framework is both scalable and generalizable. That is, new predictive models for other resilience metrics or alternate graph types (other than the ones used to train the GNN), can be retrained very efficiently leveraging the existing models.
Results show that the model is highly accurate in identifying the Top-5% critical nodes, while its execution time is significantly lower than that of conventional approaches.
The rest of the paper is organized as follows. Section II describes various graph robustness metrics, followed by a brief introduction to graph neural networks. Section III presents the proposed framework, with experiments and results in Section IV. Final conclusions are provided in Section V.
This article proposes the use of graph neural networks for inductive learning of node resilience in a large complex network. In this section, we first discuss the resilience of graphs and the graph theoretic metric adopted in this study, followed by a discussion of graph neural networks that we adopt to approximate the metric.
II-A Robustness of Graphs
Graph robustness (also referred to as graph resilience) is the ability of a graph to continue to serve its role when subject to failures or attacks. In other words, it represents the resistive capability of a graph against disruptions; in this paper, we use resilience and robustness interchangeably. Intuitively, robustness is about back-up possibilities, or alternative paths, but capturing these concepts in a mathematical framework is a challenge. Disruptions in a graph signify the loss of nodes or links. For instance, in urban networks such as power or water distribution networks, a cyber attack or natural disaster could shut down a particular resource or functionality; this loss can be simulated by removing the corresponding node/link from the graph. Further, the loss of nodes/links can be random or targeted, depending on the triggering scenario. In a random attack, nodes/links are chosen at random, whereas in a targeted attack they are selected based on some prior information. In recent years, many network-based surrogate metrics have been proposed [alenazi2015comprehensive] that approximately quantify graph resilience against random and targeted attacks. The authors in [alenazi2015comprehensive, alenazi2015evaluation] extensively study various graph robustness metrics for different types of graphs and identify the most appropriate metrics for each type. Some well-known robustness metrics are effective graph resistance, weighted spectrum, total graph diversity, and network criticality. [alenazi2015comprehensive] concludes that there is no generic robustness metric that works for all types of graphs under all scenarios. Here, the type refers to various classes of graphs based on their degree distributions, average clustering coefficient, assortativity, etc.; classic examples include power-law and Erdős–Rényi graphs.
Thus, different metrics are needed depending on the objective and the type of graph. Of all the metrics proposed in the literature, effective graph resistance ($R_{EG}$) and weighted spectrum ($WS$) appear to be the most accurate and generic metrics for graph resilience; we have therefore selected these two metrics for our study. It is important to note, however, that the proposed approach is generic enough to work with any robustness metric. Further, in most modeling scenarios, a node of a graph represents a process/functionality and a link denotes the flow of matter or information. As we are more interested in attacks that degrade functionality, we study node failures in this work; the analysis of link failures will be pursued as future work.
Effective graph resistance ($R_{EG}$) has been used in various works to study graph robustness [wang2014improving, wang2015network, ellens2013graph]. It is the sum of the effective resistances over all pairs of nodes [ellens2011effective]. The effective resistance between any two nodes of a graph, viewing the network as an electrical circuit, is computed by the commonly known series and parallel operations. From these formulations, it follows that $R_{EG}$ considers both the number of paths between the nodes and their length (link weight), intuitively measuring the presence and quality of back-up possibilities. The spectral form of $R_{EG}$ can be expressed as,

$$R_{EG} = \frac{2}{N-1} \sum_{i=c+1}^{N} \frac{1}{\mu_i} \qquad (1)$$
where $\mu_i$ are the eigenvalues of the Laplacian matrix of graph $G$ (in ascending order), $N$ is the number of nodes, and $c$ is the number of connected components, which equals the number of zero eigenvalues. Thus, only the non-zero eigenvalues are used when computing $R_{EG}$. Further, equation (1) is a normalized expression, which allows the comparison of $R_{EG}$ across graphs of different dimensions.
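As a concrete illustration, the spectral computation of effective graph resistance can be sketched in a few lines of Python; the normalization by the number of node pairs is an assumption consistent with the normalized expression above.

```python
import numpy as np
import networkx as nx

def effective_graph_resistance(G):
    """Normalized effective graph resistance from the Laplacian spectrum.

    Sketch: sum reciprocals of the non-zero Laplacian eigenvalues
    (classical form R_G = N * sum(1/mu_i)) and normalize by the number
    of node pairs so graphs of different sizes are comparable.
    """
    N = G.number_of_nodes()
    mu = np.sort(nx.laplacian_spectrum(G))   # eigenvalues, ascending
    c = int(np.sum(mu < 1e-9))               # zero eigenvalues = components
    nonzero = mu[c:]
    return (N * np.sum(1.0 / nonzero)) / (N * (N - 1) / 2)

# Sanity check on a 4-node path: pairwise effective resistances are
# 1,1,1,2,2,3 (sum 10), so the normalized value is 10/6.
G = nx.path_graph(4)
r = effective_graph_resistance(G)
```

For the path graph the result matches the hand-computed value, which is a useful check that the spectral and the series/parallel formulations agree.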
Similar to $R_{EG}$, the weighted spectrum ($WS$) is also a widely used metric for graph robustness [fay2009weighted, long2014measuring]. It is defined as the normalized sum of $n$-cycles in a graph [fay2009weighted], where an $n$-cycle is a sequence of nodes $u_1, u_2, \ldots, u_n$ such that $u_i$ is adjacent to $u_{i+1}$ for $1 \le i < n$ and $u_n$ is adjacent to $u_1$. The metric was introduced to analyze the resilience of the Internet topology; it has since been compared to various other resilience metrics and found to be versatile, especially for geographically correlated attacks [alenazi2015evaluation]. The $WS$ can be expressed as,

$$WS(G, n) = \sum_{i=1}^{N} (1 - \lambda_i)^n \qquad (2)$$
where $\lambda_i$ are the eigenvalues of the normalized Laplacian of $G$, and different values of $n$ correspond to different graph properties. For instance, $n = 3$ counts the number of triangles in a graph, relating $WS$ to the weighted clustering coefficient; similarly, with $n = 4$, $WS$ is proportional to the number of disjoint paths in a graph. As we are concerned with resilience and connectivity, we use $n = 4$ in our study.
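The weighted spectrum is equally direct to compute from the normalized Laplacian spectrum. The sketch below assumes the unnormalized form $\sum_i (1-\lambda_i)^n$; for a triangle graph with $n=3$ the value recovers the triangle-counting interpretation exactly.

```python
import numpy as np
import networkx as nx

def weighted_spectrum(G, n=4):
    """Sketch of WS(G, n) = sum_i (1 - lambda_i)^n, where lambda_i are
    eigenvalues of the normalized Laplacian. n = 3 relates to triangle
    counts; n = 4 (the paper's choice) to disjoint paths.
    """
    lam = nx.normalized_laplacian_spectrum(G)
    return float(np.sum((1.0 - lam) ** n))

# For the 3-cycle, normalized Laplacian eigenvalues are 0, 1.5, 1.5,
# so WS(C3, 3) = 1 - 0.125 - 0.125 = 0.75.
ws_triangle = weighted_spectrum(nx.cycle_graph(3), n=3)
ws4 = weighted_spectrum(nx.karate_club_graph(), n=4)
```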
II-B Graph Neural Networks
Deep neural networks have shown tremendous success in capturing hidden patterns in Euclidean data. However, in a growing number of applications, data resides in the form of graphs. For example, in a protein network, a graph-based learning system can exploit molecular interactions for new drug discovery. Graph data presents various challenges: irregular structure, variable numbers of unordered nodes, and nodes with differing numbers of neighbors. This hinders the direct application of standard deep learning operations such as convolution. Therefore, new definitions and generalizations of standard operations have recently been developed for graph data. The work in [kipf2016semi] is one of the first to efficiently transfer the convolution operation from Euclidean to graph space. Its working principle resembles that of a convolutional neural network: it operates directly on graphs and takes advantage of their topological information. Standard learning tasks on graph data include node classification, link prediction, and graph classification. A graph neural network (GNN) typically learns node embedding vectors, followed by feedforward layers for regression or classification. The learning algorithm in [kipf2016semi] depends on the size of the graph, which leads to scalability issues. To address this, the authors in [hamilton2017inductive] propose an inductive learning framework in which node embeddings are learned from subgraphs, making training independent of graph size. Furthermore, this framework can be leveraged to infer embeddings for unseen or new nodes belonging to the same family of graphs. The standard procedure to learn the embedding vector is a "message passing mechanism": information (node features) is aggregated from the neighbors of a node and combined with the node's own features to generate a new feature vector. This process is repeated to generate the final embedding for each node of the graph. The GraphSAGE algorithm learns the mapping (aggregator) function instead of the embedding vectors themselves; hence, it can induce the embedding of a node unseen during training, given its features and neighborhood. Our proposed framework is based on GraphSAGE, as it is well suited to large graphs. GraphSAGE learns a representation for every node based on some combination of its neighboring nodes, parametrized by a depth $K$. The parameter $K$ controls the hierarchy level of the neighborhood, i.e., the number of hops considered. For instance, if $K$ is 2, then all nodes within two hops of the selected node are considered its neighbors.
After defining the neighborhood, the aggregate function associates each neighbor's embedding with weights to create a neighborhood embedding for the selected node. Thereafter, for each neighborhood depth up to $K$, a neighborhood embedding is generated with the aggregator function for each node and concatenated with the node's existing embedding. The update rule for node features can be written as,

$$h_v^{(k)} = \sigma\left( W^{(k)} \cdot \mathrm{CONCAT}\left( h_v^{(k-1)},\; \mathrm{AGG}^{(k)}\big(\{ h_u^{(k-1)} : u \in \mathcal{N}(v) \}\big) \right) \right)$$
where $h_v^{(k)}$ is the feature vector of node $v$ at layer $k$, $\mathcal{N}(v)$ is the neighborhood of $v$, $\sigma$ is a nonlinearity, and $W^{(k)}$ and the aggregator parameters are the learnable weights of the GNN. Finally, the concatenated vector is passed through a feedforward layer to produce the final node outputs. GraphSAGE learns the weights of the aggregator function and the feedforward layers by minimizing an appropriate loss function. In a supervised learning setting, the algorithm reduces the loss associated with the node target labels. Once the weights are learned, an embedding vector, and consequently the node label, can be predicted for a test node given its features and neighborhood information.
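The update rule can be illustrated with a minimal NumPy sketch of one GraphSAGE-style layer. A mean aggregator is used here for brevity (the framework proposed later uses max pooling), and the random weight matrices are stand-ins for trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def sage_layer(H, neighbors, W_self, W_neigh):
    """One GraphSAGE-style layer with a mean aggregator (simplified sketch):
    aggregate neighbor features, concatenate with the node's own projected
    features, and apply a ReLU nonlinearity.
    """
    H_new = []
    for v in range(H.shape[0]):
        nbrs = neighbors[v]
        agg = H[nbrs].mean(axis=0) if nbrs else np.zeros(H.shape[1])
        z = np.concatenate([H[v] @ W_self, agg @ W_neigh])  # CONCAT step
        H_new.append(np.maximum(z, 0.0))                    # ReLU
    return np.stack(H_new)

# Toy path graph 0-1-2 with 4-dimensional input features.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
H0 = rng.standard_normal((3, 4))
W_self = rng.standard_normal((4, 8))
W_neigh = rng.standard_normal((4, 8))
H1 = sage_layer(H0, neighbors, W_self, W_neigh)   # shape (3, 16)
```

Stacking $K$ such layers realizes the $K$-hop neighborhood aggregation described above.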
III Proposed Framework
We propose a two-step approach for inductive learning-based approximation of node resilience in graphs. In the first step, a computationally manageable graph is used to learn an appropriate node embedding that can rank nodes based on their resilience scores. The use of a small graph in this step allows quick learning of node embeddings. However, to obtain reliable predictions from the GNN, it is important that this graph exhibit properties similar to the desired application graph. One way to achieve this is to use a standard graph, such as a power-law or power-law cluster graph, that is known to exhibit properties similar to most real-world graphs. In applications where such a representative reference graph cannot be found, it is possible to use a subset of the nodes of the full graph for learning the node embedding while evaluating the network on the remaining nodes. The second step is the straightforward application of the trained GNN to predict node ranks based on the resilience scores for any given application.
The prediction of node resilience scores with a GNN consists of two components: node embedding and regression. The first component learns the embedding vector for each node by utilizing the graph structure and the node target ranks. This is achieved such that nodes that are close in the graph space also lie close in the embedding space, while maintaining a similar consistency between the node ranks. As an initialization, the embedding of each node is composed of only the degree of the node, followed by a predefined number of zeros. The embedding module is based on the GraphSAGE algorithm [hamilton2017inductive] described in the previous section. We use a max-pooling aggregator with a model depth of three; thus, the neighborhood aggregation for each node draws from all neighbors within three hops of the target node. Thereafter, at each layer of the network, the neighborhood aggregation is combined with the node's current embedding to generate a new embedding. The aggregation and combination operations involve trainable network weights.
The second component is straightforward: it consists of stacked feedforward layers that regress node ranks from node embeddings. It contains two dense layers with ten neurons each, followed by a final dense layer with one output. The activation function in the last layer is ReLU, as ranks are positive numbers. The output of this module corresponds to the node rank. Both modules are trained end to end, with the input being a specific node along with its neighbors and the output being the corresponding node rank. The complete framework is shown in figure 1.
The network weights can be updated by optimizing an appropriate loss function. In our framework, the output of the model is the node ranks, which are used to identify critical nodes in a network; the suitable loss function is therefore a ranking loss [chen2009ranking]. Unlike loss functions such as cross-entropy or mean squared error, whose objective is to predict a value or set of values given an input, the objective of a ranking loss is to preserve the relative order of the inputs. Here, we use a pair-wise ranking loss that considers one pair of node ranks at a time. The goal of training is to minimize the number of inversions in the ranking, i.e., cases where a pair of node ranks is in the wrong order relative to the ground truth. This loss function can be expressed as,

$$\mathcal{L}_{ij} = -\,\bar{y}_{ij} \log \sigma(s_i - s_j) - (1 - \bar{y}_{ij}) \log\big(1 - \sigma(s_i - s_j)\big)$$
where $y_i$ is the ground-truth resilience score for node $i$, $s_i$ is the predicted score whose relative order the model learns to infer by minimizing the loss, $\bar{y}_{ij} = 1$ if $y_i > y_j$ and $0$ otherwise, and $\sigma$ is a sigmoid function. The loss is aggregated over all training node pairs and then optimized to update the model weights.
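A minimal sketch of a pair-wise ranking loss follows, assuming a RankNet-style cross-entropy over sigmoid-transformed score differences; the averaging over pairs is an implementation detail, not the paper's exact formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pairwise_ranking_loss(scores, y):
    """For every ordered node pair where node i should outrank node j
    (ground truth y[i] > y[j]), penalize the predicted score difference
    via cross-entropy with target 1, so inversions cost the most.
    """
    loss, n_pairs = 0.0, 0
    for i in range(len(y)):
        for j in range(len(y)):
            if y[i] > y[j]:
                p = sigmoid(scores[i] - scores[j])
                loss += -np.log(p + 1e-12)
                n_pairs += 1
    return loss / max(n_pairs, 1)

y = np.array([3.0, 1.0, 2.0])                      # ground-truth scores
good = pairwise_ranking_loss(np.array([3.0, 1.0, 2.0]), y)  # correct order
bad = pairwise_ranking_loss(np.array([1.0, 3.0, 2.0]), y)   # inverted order
```

Predictions that preserve the ground-truth order yield a strictly lower loss than inverted ones, which is exactly the property the training objective exploits.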
In this work, a different model is trained for each type of synthetic graph. This is because different families of graphs vary in their overall structure and link connections, i.e., degree distributions, assortativity, and average clustering coefficient. For a particular graph family, different random instances of a graph are sampled, and the ground-truth node resilience score is computed for each sampled graph using Algorithm 1. The model is then trained on multiple graphs of small dimension and subsequently tested on similar graphs of higher dimension; for example, a model trained on multiple power-law cluster graphs of small dimension can be evaluated on much larger graphs of the same family. The training algorithm is summarized in Algorithm 2. The model is trained in the TensorFlow framework with the Adam optimizer [StellarGraph].
For real-world graphs such as the US power grid [snapnets, nr] and Wiki-vote, one can use the already trained model to predict node ranks based on the resilience score. However, many networks may not resemble any of the synthetic graphs. Under these circumstances, models trained on synthetic graphs may not give the desired results on real networks. To overcome this challenge, one can apply transfer learning to a synthetically trained model on the real network: the model can be fine-tuned with a very small fraction of the total nodes, thereby saving a significant amount of time. The computational advantage and accuracy of the proposed approach are shown in the next section.
III-B Evaluation metrics
The trained model predicts the ranks of test nodes in a graph. In a typical resilience analysis, it is more relevant to identify the top-ranked nodes that are most critical to graph robustness than to know the ranks of all nodes. Therefore, we use Top-N% accuracy to evaluate the proposed framework against the conventional approach. Top-N% accuracy is defined as the percentage of overlap between the top-N% nodes predicted by an approximation method and the top-N% nodes identified by the standard approach, i.e., Algorithm 1 here. It can be expressed as,

$$\text{Top-}N\% = \frac{\big| V^{N}_{\mathrm{pred}} \cap V^{N}_{\mathrm{true}} \big|}{\lceil nN/100 \rceil} \times 100$$
where $V^{N}_{\mathrm{pred}}$ and $V^{N}_{\mathrm{true}}$ denote the predicted and ground-truth top-$N\%$ node sets, $n$ is the number of nodes, and $N$ is the desired band. Here, we report results for Top-5% accuracy. Further, the computational efficiency of the proposed approach is demonstrated in terms of wall-clock running time, i.e., the actual time the computer takes to execute the program.
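The overlap metric can be sketched as follows; the index-based implementation and arbitrary tie-breaking are our assumptions.

```python
def top_n_accuracy(pred_scores, true_scores, n_pct=5):
    """Fraction of the ground-truth top-N% critical nodes that also
    appear in the predicted top-N%. Ties are broken arbitrarily by
    the sort, which is an implementation choice.
    """
    k = max(1, int(len(true_scores) * n_pct / 100))
    top_pred = set(sorted(range(len(pred_scores)),
                          key=lambda i: pred_scores[i], reverse=True)[:k])
    top_true = set(sorted(range(len(true_scores)),
                          key=lambda i: true_scores[i], reverse=True)[:k])
    return len(top_pred & top_true) / k

# Both score lists agree on the two highest-scoring nodes (indices 0, 1),
# so the Top-40% overlap over 5 nodes (k = 2) is perfect.
acc = top_n_accuracy([9, 8, 1, 2, 3], [9, 7, 2, 1, 3], n_pct=40)
```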
IV-A Baseline approach
To evaluate the performance of our proposed approach, we compare ILGR with a conventional method of estimating node resilience. The classic procedure is iterative: remove a node from the graph and compute the robustness metric of the residual graph, i.e., the graph left after the node's removal. This process is repeated for all nodes of the graph. The computed metric values are then sorted to generate node ranks and to identify the most critical nodes, i.e., those whose removal maximally decreases graph robustness. Algorithm 1 summarizes this existing approach to identifying the most critical nodes based on graph resilience.
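The baseline procedure can be sketched as below. The surrogate metric used in the example (largest connected component size) is purely illustrative and stands in for effective graph resistance or weighted spectrum; any graph-to-scalar robustness function can be plugged in.

```python
import networkx as nx

def rank_nodes_by_resilience(G, metric):
    """Sketch of the conventional baseline: remove each node in turn,
    recompute the robustness metric on the residual graph, and rank
    nodes so that the most critical (largest metric drop) come first.
    """
    impact = {}
    for v in G.nodes():
        H = G.copy()
        H.remove_node(v)
        impact[v] = metric(H)   # lower residual score = more critical v
    return sorted(impact, key=impact.get)

def largest_cc(G):
    """Illustrative surrogate metric: size of the largest component."""
    return max((len(c) for c in nx.connected_components(G)), default=0)

# Two 4-cliques joined through a single bridge node (node 4): removing
# that bridge disconnects the graph, so it should rank as most critical.
G = nx.barbell_graph(4, 1)
ranking = rank_nodes_by_resilience(G, largest_cc)
```

The quadratic structure (full metric recomputation per removed node) is exactly what makes this baseline expensive on large graphs.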
We evaluate the performance of ILGR on both synthetic and real-world graphs. The two synthetic graph families considered for analysis are scale-free (power-law) graphs and power-law cluster (PLC) graphs [holme2002growing]. Power-law (PL) graphs are graphs whose degree distribution follows a power law (heavy-tailed), and many real networks have been shown to belong to this family [barabasi2000scale]. Such a graph can be constructed through a process of preferential attachment, in which the probability that a new node connects with an existing node $v$ is proportional to the fraction of edges incident to $v$. PLC graphs, on the other hand, follow both a power-law degree distribution and the small-world phenomenon, and many real-world networks show these properties [newman2000models]. As shown in [holme2002growing], one can construct a PLC network whose degree distribution follows a power law but which also exhibits "clustering" by requiring that, in some fraction of cases, a new node connects to a random selection of the neighbors of the node to which it last connected. We generated these graphs using the Python NetworkX library [hagberg2008exploring].
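Both graph families can be generated directly with NetworkX; the parameter values below are illustrative, not the paper's settings.

```python
import networkx as nx

n = 1000   # number of nodes
m = 3      # edges attached from each new node (preferential attachment)
p = 0.5    # probability of a triangle-closing step (PLC only)

pl_graph = nx.barabasi_albert_graph(n, m, seed=42)        # power-law (PL)
plc_graph = nx.powerlaw_cluster_graph(n, m, p, seed=42)   # power-law cluster

# The triad-formation step of the PLC model should yield noticeably
# higher average clustering than the plain preferential-attachment model.
c_pl = nx.average_clustering(pl_graph)
c_plc = nx.average_clustering(plc_graph)
```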
The two real-world graphs analyzed in this work are the US power grid and Wiki-vote networks [snapnets, nr]. The US power grid network represents the western US power grid, with nodes representing buses and links representing transmission lines. The Wiki-vote network contains all Wikipedia voting data from the inception of Wikipedia until January 2008; nodes represent Wikipedia users, and an edge from node $i$ to node $j$ indicates that user $i$ voted on user $j$.
This subsection evaluates the proposed framework in terms of Top-5% accuracy and algorithm execution time; different types of graphs with varying dimensions are considered for assessment. Two models are trained separately on power-law (PL) and power-law cluster (PLC) graphs of small dimension and are then employed to predict node ranks in graphs of higher dimensions, i.e., 1000, 5000, and 10000 nodes. Tables I and II report the Top-5% accuracy for effective graph resistance and weighted spectrum on synthetic graphs, respectively. The mean accuracy in detecting the Top-5% nodes is consistently high for both robustness metrics, demonstrating the accuracy of the proposed framework in approximating node resilience scores.
The scalability of the proposed approach is illustrated in tables III and IV, which report the Top-5% accuracy on real-world networks predicted with models trained on synthetic graphs. The model achieves sufficiently high accuracy in detecting critical nodes for both resilience metrics, even though it never saw the real-world graphs during training. Performance can be improved further by applying transfer learning to a model trained on synthetic graphs; tuning on the real-world graph can be done very quickly using a small fraction of the nodes in the graph.
Along with accurate identification of nodes, the proposed framework provides an appreciable advantage in computational time, as indicated by the running times in table V. The proposed method is multiple orders of magnitude faster than the conventional approach. All training and experiments were conducted on a system with an Intel i7 processor running at 2.2 GHz and a 6 GB Nvidia RTX 2070 GPU. The time reported for the proposed approach includes only the prediction time, as training is done offline. Even if training time were included, the proposed method would remain substantially faster than the conventional method: training uses only a small subset of nodes, whose ground-truth generation takes far less time than for all nodes, and training the GNN on this subset is fast compared to computing ranks for all nodes with the conventional approach.
[Tables I–IV report, per graph specification, the Top-5% accuracy for test graphs of 1000, 5000, and 10000 nodes. Table V reports, per graph specification, test size, and model, the running time in seconds of the proposed versus the conventional approach.]
This paper proposes ILGR, a graph neural network based framework for fast identification of critical nodes in large complex networks. Criticality is defined based on two graph robustness metrics: effective graph resistance and weighted spectrum. The ILGR framework consists of two parts: in the first, a graph neural network model is trained on synthetic graphs using a small subset of nodes; the second predicts the ranks of unseen nodes of the graph. The mean Top-5% accuracy of the model is high for both robustness metrics. Further, the scalability of the model is demonstrated by predicting node ranks on real graphs, and the proposed approach is multiple orders of magnitude faster than the conventional method. As part of future work, we will explore other robustness metrics and extend the framework from node to link identification.