Given an undirected graph G = (V, E) and an integer K, the critical node detection problem (CNDP) is to identify a set of nodes whose removal maximally degrades network connectivity according to some predefined connectivity metric. An important variant of the CNDP is the critical node problem (CNP), in which the connectivity metric is defined as pairwise connectivity. Recently, the CNP has attracted much attention for its wide real-world applications in a number of fields, e.g., risk management [Arulselvan et al.2007], network vulnerability assessment [Shen et al.2013], biological molecule studies [Boginski and Commander2009, Tomaino et al.2012], and social network analysis [Fan and Pardalos2010, Leskovec et al.2007].
Di Summa et al. [2011] showed that the CNP can be solved in polynomial time with dynamic programming over trees, while for general graphs the CNP is known to be NP-hard [Arulselvan et al.2009]. Currently, there are mainly two types of algorithms for the CNP: exact algorithms and local search algorithms. Exact algorithms solve the CNP by using Integer Linear Programming (ILP) [Arulselvan et al.2009, Di Summa et al.2012, Veremyev et al.2014a, Veremyev et al.2014b, Ventresca and Aleman2014a, Ventresca and Aleman2014c, Shen et al.2013]. These ILP-based algorithms can guarantee the optimality of their solutions, but the drawback is that they may require exponential computation time in the general case.
Consequently, many efforts have gone into the study of local search algorithms that can return a good-quality solution within a reasonable time. An early greedy algorithm was proposed by Arulselvan et al. [2009] and improved later by many heuristic algorithms [Ventresca and Aleman2014b, Aringhieri et al.2015, Addis et al.2016, Aringhieri et al.2016b]. In [Aringhieri et al.2016b], the authors proposed a method based on a general Variable Neighborhood Search (VNS) framework and another based on an Iterated Local Search (ILS) framework. Moreover, two metaheuristic approaches, namely simulated annealing and population-based incremental learning, have been explored for large networks [Ventresca2012]. Recently, two evolutionary algorithms were proposed. The first is an efficient evolutionary framework for solving different variants of the CNDP, including the CNP [Aringhieri et al.2016a]. The second is an approach based on the Memetic Algorithm (MA), which achieves state-of-the-art performance [Zhou et al.2018]. For a detailed review of the CNP, we refer the reader to the comprehensive survey by Lalou et al. [2018].
Although a considerable number of algorithms for the CNP have been developed, they all suffer from the high computational cost of calculating the objective function value, i.e., the pairwise connectivity. A common drawback of all existing algorithms is that the objective function of a neighbor candidate solution has to be computed from scratch, resulting in a slow search process, especially during the exploitation phase. Indeed, this drawback has also been pointed out in recent works [Aringhieri et al.2016b, Zhou et al.2018].
To overcome this problem, Aringhieri et al. [2016b] presented an improved neighborhood search algorithm based on a modified Connect algorithm. As a result, they obtained two refined neighborhoods that preserve the quality of the neighbors while improving the efficiency of the original swap operation. Other algorithms instead reduce the size of the neighborhood by sacrificing the quality of the best neighbor during exploitation. For instance, Zhou and Hao [2017] break the traditional greedy swap into two distinct add and remove operations. Another alternative is to redefine the neighbors of a candidate solution by exploiting a problem feature, e.g., the largest component in the residual graph [Zhou et al.2018]. Overall, no faster method for the computation of the objective function itself has been found so far.
In this paper, we propose the first incremental evaluation mechanism (IEM) for the CNP, which speeds up the computation of the objective function. The basic idea of IEM is to track the component configuration, that is, to maintain and utilize the size and index of the component of each node during the search. Based on the maintained component configuration, the objective function of a candidate solution can be computed from the previously computed objective function value. Since computing the objective function value is necessary to evaluate a candidate solution, IEM can speed up the evaluation process in each iteration of all existing algorithms.
There are two important greedy operations in local search algorithms for the CNP. The first is the swap operation, which is widely used in previous algorithms [Arulselvan et al.2009, Aringhieri et al.2016b]; Aringhieri et al. [2016b] improved it to a more efficient version. The second is the add–remove operation, which can be regarded as a two-stage greedy operation and is used in [Zhou and Hao2017, Zhou et al.2018]. In this work, we equip both operations with IEM to obtain two new operations, and we compare the computational complexity of the proposed operations with previous work theoretically. Moreover, we carry out experiments to show the significant improvement that IEM brings to the two greedy operations.
Indeed, a common trait of CNDP variants is that computing the objective function from scratch is costly, hence IEM can be easily generalized to other variants of the CNDP. Finally, we implement a simple evolutionary algorithm and an improved version equipped with IEM to solve the CNP. By comparing our results with the state-of-the-art algorithm, we demonstrate the effectiveness of IEM.
The paper is organized as follows. The next section introduces some necessary technical preliminaries. We introduce our main idea, IEM, in Section 3. In Section 4, we present an application of IEM, i.e., improving the greedy operations, along with related experiments. Experiments comparing IEM with the state-of-the-art algorithm are shown in Section 5. Finally, we draw conclusions and outline future work.
2 Preliminaries

In this section, we begin with some basic definitions and notations. Then we review the greedy operations for the CNP.
2.1 Definitions and Notations
In the following, we use G = (V, E) to denote a graph, where V is the set of nodes and E is the set of edges. The size of a graph is defined as its number of nodes. Each edge e ∈ E is a 2-element subset of V. For an edge e = {u, v}, we say that u and v are the endpoints of e and that u is adjacent to v. A graph is connected when there is a path between every pair of nodes. A component is a maximal connected subgraph. We use C_i and s_i to denote the i-th component and its size, respectively. The neighborhood of a node v is N(v) = {u ∈ V : {u, v} ∈ E}.
Given an undirected graph G = (V, E) and an integer K, the CNP seeks a set S of at most K nodes whose deletion minimizes the pairwise connectivity of the remaining graph G[V \ S], i.e., the number of node pairs still connected by a path: f(S) = Σ_i s_i(s_i − 1)/2, where the sum ranges over the components of G[V \ S].
Figure 1 shows a CNP instance, where the undirected graph consists of seven nodes and seven edges. A candidate solution S is a set of removed nodes. Removing the two nodes of the candidate solution shown in the figure breaks the graph into two components; the pairwise connectivity of a component of size s_i is s_i(s_i − 1)/2, and the residual objective value is the sum of the pairwise connectivities of the two components.
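To make the objective concrete, the from-scratch evaluation used by all previous algorithms can be sketched in a few lines of Python. The graph below is a hypothetical 7-node path, not the actual instance of Figure 1, and the function name is ours:

```python
from collections import defaultdict

def pairwise_connectivity(n, edges, S):
    """f(S): number of connected node pairs in the residual graph G[V \\ S],
    computed from scratch via one DFS per component."""
    adj = defaultdict(list)
    for u, v in edges:
        if u not in S and v not in S:
            adj[u].append(v)
            adj[v].append(u)
    seen, total = set(), 0
    for start in range(n):
        if start in S or start in seen:
            continue
        stack, size = [start], 0          # iterative DFS over one component
        seen.add(start)
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        total += size * (size - 1) // 2   # each component contributes C(size, 2)
    return total

# Hypothetical 7-node path graph 0-1-2-3-4-5-6
path = [(i, i + 1) for i in range(6)]
print(pairwise_connectivity(7, path, set()))   # 21: one component of size 7
print(pairwise_connectivity(7, path, {1, 4}))  # 2: components {0}, {2,3}, {5,6}
```

Each call costs a full traversal of the residual graph, which is exactly the overhead IEM removes.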
A neighbor of a candidate solution S is a candidate solution that differs from S in exactly one node. We say a neighbor S_1 is better than a neighbor S_2 if f(S_1) < f(S_2). Given a candidate solution S, the decrement of a node v ∉ S (resp. the increment of a node u ∈ S) is defined as the decrease (resp. increase) in pairwise connectivity after removing node v from the residual graph (resp. reintroducing node u), i.e., f(S) − f(S ∪ {v}) (resp. f(S \ {u}) − f(S)).
2.2 Review of Greedy Operations
Deleting one more node from the residual graph always leads to a solution that is at least as good, thus the common practice for exploitation is to begin with an initial solution of size K, then update it to the best neighbor by greedy operations, e.g., the swap operation and the add–remove operation. We review both below.
Swap: Given a candidate solution S, a complete neighborhood evaluation requires selecting every node u ∈ S and pairing it with every node v ∈ V \ S, so the size of the whole neighborhood is K(|V| − K). The swap operation finds the best neighbor among these. Currently, f must be computed through an algorithm computing the connected components of a graph [Hopcroft and Tarjan1973], which requires O(|V| + |E|) time. Hence the complexity of finding the best neighbor is O(K(|V| − K)(|V| + |E|)), which is very time-consuming. Recently, Aringhieri et al. [2016b] proposed an improved swap algorithm based on the better of their two refined neighborhoods, which is the most efficient swap operation up to now.
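The exhaustive swap exploration with from-scratch evaluation can be sketched as follows. This is a minimal baseline, not the refined procedure of Aringhieri et al.; the union-find routine merely stands in for a connected-components algorithm, and the graph is again a hypothetical path:

```python
from collections import Counter

def f(n, edges, S):
    """Pairwise connectivity of G[V \\ S], via union-find over residual edges."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for u, v in edges:
        if u not in S and v not in S:
            parent[find(u)] = find(v)
    sizes = Counter(find(x) for x in range(n) if x not in S)
    return sum(s * (s - 1) // 2 for s in sizes.values())

def best_swap(n, edges, S):
    """Best neighbor among all K * (|V| - K) swaps, each evaluated from scratch."""
    best_val, best_sol = f(n, edges, S), set(S)
    for u in S:
        for v in range(n):
            if v in S:
                continue
            T = (S - {u}) | {v}
            val = f(n, edges, T)
            if val < best_val:
                best_val, best_sol = val, T
    return best_val, best_sol

path = [(i, i + 1) for i in range(6)]      # hypothetical 7-node path
val, sol = best_swap(7, path, {0, 6})
print(val)                                 # 4: best single swap from {0, 6}
```

Every inner iteration pays the full O(|V| + |E|) evaluation cost, which is precisely the bottleneck discussed above.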
Add–remove: Given a candidate solution of K nodes, this operation first expands it to a solution of K + 1 nodes by adding the node with maximum decrement, then removes from it the node with minimum increment. Each of the two stages evaluates the objective for every candidate node, so both inherit the cost of computing f from scratch.
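A minimal sketch of one add–remove step, still paying the from-scratch evaluation cost at every candidate (the helper `f` and the example graph are illustrative assumptions, not the paper's implementation):

```python
from collections import Counter

def f(n, edges, S):
    """Pairwise connectivity of G[V \\ S], via union-find over residual edges."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for u, v in edges:
        if u not in S and v not in S:
            parent[find(u)] = find(v)
    sizes = Counter(find(x) for x in range(n) if x not in S)
    return sum(s * (s - 1) // 2 for s in sizes.values())

def add_remove(n, edges, S):
    """One two-stage greedy step: add the node with maximum decrement,
    then remove the node with minimum increment."""
    S = set(S)
    candidates = [v for v in range(n) if v not in S]
    S.add(min(candidates, key=lambda v: f(n, edges, S | {v})))  # max decrement
    S.remove(min(S, key=lambda u: f(n, edges, S - {u})))        # min increment
    return S

path = [(i, i + 1) for i in range(6)]    # hypothetical 7-node path
print(add_remove(7, path, {0}))          # {3}: the path's center replaces node 0
```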
3 Main Ideas
In this section, we present the Incremental Evaluation Mechanism to speed up the computation of the objective function for the CNP. We first explain the details of IEM, then analyze its computational complexity.
3.1 Incremental Evaluation Mechanism
In this subsection, we present a new mechanism, named IEM, to compute the objective function incrementally. The basic idea of IEM is to maintain and utilize the component configuration during the search. The increment after removing node u from S can be computed by means of the component configuration alone, and the decrement after adding node v to S can be computed by traversing only the few components related to u and v. In the following, we first state two key definitions and then explain IEM.
Given a candidate solution S, a move is an update action applied to S, denoted (u, v), where u is the node to be removed from S and v is the node to be added to S.
The current component configuration is defined as a tuple (s, c), where s = {s_1, ..., s_m} is a set of variables indicating the sizes of the components of the residual graph and c : V \ S → {1, ..., m} is a function which maps a node to the index of the component containing it.
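As a concrete illustration, the component configuration can be built with one DFS pass over the residual graph; the function and variable names below are ours, not the paper's notation, and the example graph is a hypothetical path:

```python
from collections import defaultdict

def build_config(n, edges, S):
    """Initial component configuration (s, c) of the residual graph G[V \\ S]:
    size[i] is the size of component i, comp[v] the index of v's component."""
    adj = defaultdict(list)
    for u, v in edges:
        if u not in S and v not in S:
            adj[u].append(v)
            adj[v].append(u)
    size, comp, next_idx = {}, {}, 0
    for start in range(n):
        if start in S or start in comp:
            continue
        stack = [start]
        comp[start] = next_idx
        size[next_idx] = 0
        while stack:                       # DFS labels one whole component
            u = stack.pop()
            size[next_idx] += 1
            for w in adj[u]:
                if w not in comp:
                    comp[w] = next_idx
                    stack.append(w)
        next_idx += 1
    return size, comp

# Hypothetical 7-node path 0-1-2-3-4-5-6 with S = {1, 4}
size, comp = build_config(7, [(i, i + 1) for i in range(6)], {1, 4})
print(sorted(size.values()))              # [1, 2, 2]
print(comp[2] == comp[3], comp[2] == comp[5])   # True False
```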
The IEM consists of the following three steps. The first two steps track the component configuration (s, c); the last step computes the objective function value incrementally by means of (s, c).
Initialization: Given an initial candidate solution S, IEM computes f(S) and simultaneously obtains the initial component configuration by traversing the whole graph with a depth-first search (DFS) [Cormen et al.2009].
Update: Given the current component configuration (s, c) and a move (u, v), the component configuration after the move can be updated with Algorithm 1. In this algorithm, we initialize a new index for the component to be merged (line 1) and set its size to 1, accounting for u itself (line 2). Then we update the component configuration after putting node u back into the residual graph (lines 3-5), that is, increasing the size of the merged component (line 4), removing the size entries of the absorbed neighboring components (line 5), and updating c for each node of the merged component in the residual graph (line 6). Similarly, we update (s, c) after the component is split by deleting node v (lines 8-12).
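A runnable sketch of this update step, under the same assumptions as before (the helper names are ours, and Algorithm 1 itself is only paraphrased, not reproduced):

```python
from collections import defaultdict

def make_adj(edges):
    adj = defaultdict(list)
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj

def relabel(adj, S, size, comp, start, idx):
    """DFS from `start` over the residual graph, labelling one component `idx`."""
    stack, size[idx] = [start], 0
    comp[start] = idx
    while stack:
        x = stack.pop()
        size[idx] += 1
        for w in adj[x]:
            if w not in S and comp.get(w) != idx:
                comp[w] = idx
                stack.append(w)

def apply_move(adj, S, size, comp, u, v):
    """Update (s, c) for the move (u, v).
    Step 1: put u back, merging its neighbouring components under a fresh index.
    Step 2: delete v, splitting its component by relabelling from each neighbour."""
    S.discard(u)
    for w in adj[u]:
        if w not in S:
            size.pop(comp[w], None)        # drop sizes of absorbed components
    fresh = max(size, default=-1) + 1
    relabel(adj, S, size, comp, u, fresh)  # one traversal of the merged component
    dead = comp.pop(v)
    size.pop(dead, None)
    S.add(v)
    for w in adj[v]:
        if w not in S and comp.get(w) == dead:
            fresh += 1
            relabel(adj, S, size, comp, w, fresh)

# Hypothetical 7-node path 0-1-2-3-4-5-6, S = {1, 4}
adj = make_adj([(i, i + 1) for i in range(6)])
S, size, comp, idx = {1, 4}, {}, {}, 0
for start in range(7):
    if start not in S and start not in comp:
        relabel(adj, S, size, comp, start, idx)
        idx += 1
print(sorted(size.values()))      # [1, 2, 2]
apply_move(adj, S, size, comp, u=1, v=3)
print(sorted(size.values()))      # [2, 3]: components {0,1,2} and {5,6}
```

Note that only the components touched by u and v are traversed, which is the source of IEM's savings.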
Evaluation: Given the current candidate solution S, f(S), the component configuration (s, c), and a move (u, v), the increment of u can be computed easily by means of (s, c), as illustrated in formula (2):

f(S \ {u}) − f(S) = g(1 + Σ_{i ∈ I(u)} s_i) − Σ_{i ∈ I(u)} g(s_i),    (2)

where g denotes the mapping from a component size to the corresponding pairwise connectivity, i.e., g(x) = x(x − 1)/2, and I(u) = {c(w) : w ∈ N(u) \ S} is the set of indices of the components adjacent to u. Take the graph in Figure 1(b) as an example. If we put node u back into the residual graph, the size of the new component is the sum of the sizes of the two components adjacent to u plus u itself, which is four; hence the increment of node u is g(4) minus the pairwise connectivities of the two absorbed components. The decrement of v requires traversing only the c(v)-th component instead of the whole graph, with a modified Connect algorithm [Hopcroft and Tarjan1973]. Then the objective function after the move (u, v) can be directly computed with the formula below:

f((S \ {u}) ∪ {v}) = f(S) + inc(u) − dec(v),    (3)

where inc(u) is the increment of u and dec(v) is the decrement of v evaluated after u has been put back.
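The evaluation of formula (2) then reads the increment directly off the configuration. In this hedged Python sketch, `g` is the size-to-connectivity mapping and the hard-coded configuration corresponds to a hypothetical 7-node path with S = {1, 4}, not the graph of Figure 1:

```python
from collections import defaultdict

def g(x):
    """Pairwise connectivity of one component of size x: C(x, 2)."""
    return x * (x - 1) // 2

def increment(adj, S, size, comp, u):
    """Formula (2): increase of f if u (currently in S) were put back,
    computed from the component configuration alone in O(deg(u)) time."""
    idx = {comp[w] for w in adj[u] if w not in S}   # components adjacent to u
    merged = 1 + sum(size[i] for i in idx)
    return g(merged) - sum(g(size[i]) for i in idx)

# Hypothetical 7-node path 0-1-2-3-4-5-6 with S = {1, 4}:
# components {0}, {2,3}, {5,6}; f(S) = 0 + 1 + 1 = 2
adj = defaultdict(list)
for a, b in [(i, i + 1) for i in range(6)]:
    adj[a].append(b)
    adj[b].append(a)
S = {1, 4}
size = {0: 1, 1: 2, 2: 2}
comp = {0: 0, 2: 1, 3: 1, 5: 2, 6: 2}
print(increment(adj, S, size, comp, 1))   # 5 = g(4) - g(1) - g(2)
```

No traversal of the residual graph is needed: the increment depends only on the sizes of the components adjacent to u.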
By bringing in the component configuration, IEM computes the pairwise connectivity efficiently during the search, and the correctness of IEM is not difficult to prove.
3.2 Computational Complexity
Now we analyze the computational complexity of IEM. For the Initialization step, we need a traversal of the whole graph to obtain the initial component configuration, thus the complexity is O(|V| + |E|). When IEM updates (s, c) after a move (u, v), as described in Update, the complexity consists of two parts: the first part connects several components into one (lines 3-6), with complexity proportional to the size of the merged component containing u, and the second part splits the graph (lines 8-11), which needs a traversal through the component of v, with complexity proportional to that component's size. At the Evaluation step, we compute the increment with only O(Δ) complexity by formula (2), where Δ is the maximum node degree in G. For the decrement of v, we also need to traverse its component, but fortunately this can be done simultaneously when updating the component configuration (lines 3-6).
We summarize the time complexity of IEM and of the previous evaluation method (i.e., DFS from scratch) in Table 1. The 2nd-4th columns report the step of a move, the time complexity, and the space complexity, respectively; the traversal cost of IEM is the sum of the sizes of the components around u and v. Initially (#move = 0), both the original method and IEM compute the objective function from scratch. As the solution is optimized, the size of the biggest component decreases and the computational complexity of IEM gets smaller (this is also shown in an experiment in the next section). Intuitively, IEM does not traverse the components that are unchanged by a move, taking advantage of the component configuration. The space complexity of IEM is linear in the size of the component configuration, which is only O(|V|).
| Method | #move | Time complexity | Space complexity |
4 Applications of IEM
In this section, we evaluate the effectiveness of IEM on exploitation by applying IEM to the two important greedy operations, i.e., swap and add–remove. Then we evaluate the new operations on standard benchmarks for the CNP. Finally, we discuss the generalization of IEM.
4.1 New Swap Operation
We mentioned that Aringhieri et al. [2016b] already improved the original swap operation. Now we further improve the swap operation with IEM to obtain a faster operation, denoted Swap+IEM. The pseudocode of Swap+IEM is outlined in Algorithm 2.
In the beginning, we compute the decrement of each node by a modified depth-first search process, namely Connect (line 1), then we initialize the best move (line 2). A loop traverses all nodes in the current candidate solution (lines 3-8). In each iteration, Swap+IEM searches for the best move for u, denoted (u, v). If this move is better than the best move found so far, i.e., it leads to a larger decrement than increment, we update the best move (lines 9-10). After the loop, Swap+IEM computes the best neighbor (line 11) and its objective function value in an incremental way (line 12), then updates the component configuration (line 13), which is the key procedure of Swap+IEM. Finally, Swap+IEM returns the best neighbor, its objective value, and the updated component configuration (line 14).
In each iteration of the loop, the increment caused by reintroducing node u into the residual graph can be computed with formula (2) (line 4), while the computation of the best node v to remove is more involved (lines 5-8). The intuition behind this computation is to choose the larger decrement between a "global optimum", i.e., the biggest decrement over all nodes in the whole residual graph, and a "local optimum", i.e., the biggest decrement within the component newly merged by u. Since the global values are obtained before node u has been put back, the stored decrements of the nodes in the merged component may become invalid after u is put back. Therefore, we consider both the "global optimum" and the "local optimum".
For example, in Figure 1(b), once a node is put back, the decrements previously computed for the nodes of the newly merged component become invalid. Recomputing the "local optimum" within that component and comparing it with the "global optimum" yields the correct best node to remove.
With the method described above, we iteratively update the best move until the end of the loop. Compared with the previous swap operation, Swap+IEM does not need to traverse the whole graph in each step of the loop, resulting in a more efficient greedy operation. Moreover, we improve the add–remove operation (AR for short) with the refined neighborhood of [Aringhieri et al.2016b] and further equip it with IEM, obtaining AR+IEM. Due to the limit of space, we will not go into the details of AR but directly show the complexity comparison in the following table.
Table 2 illustrates the complexity of the original operations and their IEM versions, where C_max denotes the largest component during the search and Δ is the maximum node degree in G. The time complexity of Swap+IEM is obtained by replacing the from-scratch evaluation with the time complexity of IEM, plus only one DFS (with complexity O(|V| + |E|)) outside the loop to compute the decrements in line 1. For the CNP, the average size of the components is usually small, especially when the candidate solution is close to the optimum, thus Swap+IEM has lower complexity than the previous swap operation. Swap+IEM makes little difference only when the average component size is close to that of the whole graph, i.e., when the graph is dense; in that case its complexity matches that of the previous swap operation.
4.2 Experiments on Greedy Operations
To evaluate the effectiveness of Swap+IEM and AR+IEM, we conduct an experiment by applying the two previous operations and their IEM versions to the same generic local search algorithm for the CNP, which is based on Algorithm 4 of [Aringhieri et al.2016b], resulting in four corresponding algorithms. All four algorithms begin with a random initial solution, then continuously move to the next candidate solution with the corresponding operation until reaching a local optimum. If an algorithm gets stuck in a local optimum, it restarts with a new random solution. We run the four algorithms on a Linux machine with a 3.60 GHz Intel Core i7 and 8 GB RAM. The timeout is set to 30 minutes for each algorithm.
Table 3 presents the comparison results on these operations. The 1st column indicates the two popular benchmarks for the CNP, the Synthetic benchmark set [Ventresca2012] with 16 instances and the Real-world benchmark set [Aringhieri et al.2016a] with 26 instances. The 2nd-3rd (resp. 5th-6th) columns report the number of iterations of the previous swap and Swap+IEM (resp. AR and AR+IEM). Columns 4 and 7 indicate the percentage improvement in the number of iterations brought by IEM for the two operations, respectively. The results show that IEM dramatically increases the iteration speed, with the exception of one instance ('astroph') for the swap operation. The reason is that the complexity of Swap+IEM is no better than that of the previous swap when the graph is dense and the quality of the current candidate solution is bad, i.e., the graph is still connected after removing K nodes at random. However, this can easily be remedied in a well-developed algorithm by initializing the candidate solution with a vertex cover.
4.3 Discussions on IEM
Essentially, the main contribution of IEM is to speed up the computation of the objective function; hence, it can be directly used in many local search algorithms, e.g., the state-of-the-art algorithm MACNP [Zhou et al.2018].
Besides, the idea of IEM can be generalized to many variants of the CNDP. Recall that the CNDP is a class of problems that identify the critical nodes of a network in order to evaluate the robustness of the whole network. For a given candidate solution S, the objective function is often related to all nodes in the graph. For instance, MaxNum is a variant of the CNDP whose goal is to maximize the number of components in the residual graph after deleting K nodes. By using IEM, such an objective value can be computed even more easily than for the CNP.
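For example, for a MaxNum-style objective, the effect of putting a node back on the number of components can be read off the configuration in O(deg(u)) time; a small illustrative sketch (function names and the example configuration are ours):

```python
from collections import defaultdict

def components_after_put_back(adj, S, size, comp, u):
    """Number of components of the residual graph if u in S were put back:
    the |idx| components adjacent to u and u itself merge into one."""
    idx = {comp[w] for w in adj[u] if w not in S}
    return len(size) - len(idx) + 1

# Hypothetical 7-node path 0-1-2-3-4-5-6 with S = {1, 4}:
# components {0}, {2,3}, {5,6}
adj = defaultdict(list)
for a, b in [(i, i + 1) for i in range(6)]:
    adj[a].append(b)
    adj[b].append(a)
S = {1, 4}
size = {0: 1, 1: 2, 2: 2}
comp = {0: 0, 2: 1, 3: 1, 5: 2, 6: 2}
print(components_after_put_back(adj, S, size, comp, 1))   # 2: {0,1,2,3} and {5,6}
```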
5 Comparison with the State-of-the-Art Algorithm

In this section, we adopt the two popular benchmarks mentioned before, i.e., the Synthetic benchmark set and the Real-world benchmark set. We then evaluate the effectiveness of IEM by applying it to the state-of-the-art algorithm.
Since the source code of the state-of-the-art algorithm MACNP is not available, we reimplemented the algorithm entirely from the paper [Zhou et al.2018], along with a version equipped with IEM. Although IEM improves the results of our MACNP reimplementation, its performance is still not comparable with the results attained by running the authors' binary code (available at http://www.info.univ-angers.fr/pub/hao/cnps.html). Considering that the study of particle swarm optimization algorithms for the CNP is an interesting research direction that has not been investigated before [Lalou et al.2018], we implemented PSOCNP, a CNP solver based on a discrete particle swarm optimization algorithm [Pan et al.2008]. Based on IEM, we also implemented an improved version of PSOCNP, named PSOCNP+IEM. Both are implemented in C++ and compiled by g++ with the '-O3' option.
All the experiments were carried out on the same platform, a Linux machine equipped with a 3.6 GHz Intel i7-9800x processor and 32 GB RAM. We reran the binary code of MACNP. Each algorithm was run 30 times with a time limit of 3600 seconds per trial. All the results are shown in Table 4, which reports, for each algorithm, the best objective value, the average objective value, the average time in seconds to attain the best value, and the number of successful trials attaining it. The last row counts the instances on which the algorithm outperforms MACNP and the instances on which both MACNP and the algorithm achieve the same solution quality. The marked entries for the instances "ER2344" and "hepth" are two newly found upper bounds.
In bold we present the best results. PSOCNP+IEM outperforms PSOCNP on all instances: it attains better best and average objective values on 27 instances and achieves the same solution quality faster on the other 15. Although the results of PSOCNP are not comparable with MACNP, PSOCNP+IEM spends less time than MACNP to attain the best value on 14 instances and significantly improves the solution quality on 21 hard instances. Thanks to IEM, PSOCNP becomes a competitive algorithm.
6 Conclusions and Future Work
This paper focused on the computation of the objective function for the CNP. We introduced a new mechanism, called IEM, to compute the objective function incrementally with low complexity. To evaluate the effectiveness of IEM, we compared the swap operation equipped with IEM against the previous one, obtaining a faster exploitation process. Besides, we compared a simple PSOCNP and its IEM version with the state-of-the-art algorithm MACNP. The experimental results show that IEM significantly improves the performance of PSOCNP.
In the future, we plan to further study the variants of the CNDP with IEM and to improve the current PSOCNP algorithm. Moreover, it is interesting to seek even more efficient evaluation methods for computing the objective function.
- [Addis et al.2016] Bernardetta Addis, Roberto Aringhieri, Andrea Grosso, and Pierre Hosteins. Hybrid constructive heuristics for the critical node problem. Annals of Operations Research, 238(1-2):637–649, 2016.
- [Aringhieri et al.2015] Roberto Aringhieri, Andrea Grosso, Pierre Hosteins, and Rosario Scatamacchia. VNS solutions for the Critical Node Problem. Electronic Notes in Discrete Mathematics, 47:37–44, 2015.
- [Aringhieri et al.2016a] Roberto Aringhieri, Andrea Grosso, Pierre Hosteins, and Rosario Scatamacchia. A general Evolutionary Framework for different classes of Critical Node Problems. Engineering Applications of Artificial Intelligence, 55:128–145, 2016.
- [Aringhieri et al.2016b] Roberto Aringhieri, Andrea Grosso, Pierre Hosteins, and Rosario Scatamacchia. Local search metaheuristics for the critical node problem. Networks, 67(3):209–221, 2016.
- [Arulselvan et al.2007] Ashwin Arulselvan, Clayton W. Commander, Panos M. Pardalos, and Oleg Shylo. Managing network risk via critical node identification. Risk management in telecommunication networks, 2007.
- [Arulselvan et al.2009] Ashwin Arulselvan, Clayton W. Commander, Lily Elefteriadou, and Panos M. Pardalos. Detecting critical nodes in sparse graphs. Computers & Operations Research, 36(7):2193–2200, 2009.
- [Boginski and Commander2009] Vladimir Boginski and Clayton W Commander. Identifying critical nodes in protein-protein interaction networks. In Clustering challenges in biological networks, pages 153–167. 2009.
- [Cormen et al.2009] Thomas H Cormen, Charles E Leiserson, Ronald L Rivest, and Clifford Stein. Introduction to algorithms. 2009.
- [Di Summa et al.2011] Marco Di Summa, Andrea Grosso, and Marco Locatelli. Complexity of the critical node problem over trees. Computers & Operations Research, 38(12):1766–1774, 2011.
- [Di Summa et al.2012] Marco Di Summa, Andrea Grosso, and Marco Locatelli. Branch and cut algorithms for detecting critical nodes in undirected graphs. Computational Optimization and Applications, 53(3):649–680, 2012.
- [Fan and Pardalos2010] Neng Fan and Panos M Pardalos. Robust optimization of graph partitioning and critical node detection in analyzing networks. In International Conference on Combinatorial Optimization and Applications, pages 170–183, 2010.
- [Hopcroft and Tarjan1973] John Hopcroft and Robert Tarjan. Algorithm 447: Efficient Algorithms for Graph Manipulation. Communications of the ACM, 16(6):372–378, 1973.
- [Lalou et al.2018] Mohammed Lalou, Mohammed Amin Tahraoui, and Hamamache Kheddouci. The Critical Node Detection Problem in networks: A survey. Computer Science Review, 28:92–117, 2018.
- [Leskovec et al.2007] Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, and Natalie Glance. Cost-effective outbreak detection in networks. In KDD'07, pages 420–429, 2007.
- [Pan et al.2008] Quan-Ke Pan, M Fatih Tasgetiren, and Yun-Chia Liang. A discrete particle swarm optimization algorithm for the no-wait flowshop scheduling problem. Computers & Operations Research, 35(9):2807–2839, 2008.
- [Shen et al.2013] Yilin Shen, Nam P Nguyen, Ying Xuan, and My T Thai. On the discovery of critical links and nodes for assessing network vulnerability. IEEE/ACM Transactions on Networking, 21(3):963–973, 2013.
- [Tomaino et al.2012] Vera Tomaino, Ashwin Arulselvan, Pierangelo Veltri, and Panos M Pardalos. Studying connectivity properties in human protein–protein interaction network in cancer pathway. In Data Mining for Biomarker Discovery, pages 187–197. 2012.
- [Ventresca and Aleman2014a] Mario Ventresca and Dionne Aleman. A derandomized approximation algorithm for the critical node detection problem. Computers & Operations Research, 43:261–270, 2014.
- [Ventresca and Aleman2014b] Mario Ventresca and Dionne Aleman. A Fast Greedy Algorithm for the Critical Node Detection Problem. In Zhao Zhang, Lidong Wu, Wen Xu, and Ding-Zhu Du, editors, Combinatorial Optimization and Applications, volume 8881, pages 603–612. Cham, 2014.
- [Ventresca and Aleman2014c] Mario Ventresca and Dionne Aleman. A region growing algorithm for detecting critical nodes. In International Conference on Combinatorial Optimization and Applications, pages 593–602, 2014.
- [Ventresca2012] Mario Ventresca. Global search algorithms using a combinatorial unranking-based problem representation for the critical node detection problem. Computers & Operations Research, 39(11):2763–2775, 2012.
- [Veremyev et al.2014a] Alexander Veremyev, Vladimir Boginski, and Eduardo L Pasiliao. Exact identification of critical nodes in sparse networks via new compact formulations. Optimization Letters, 8(4):1245–1259, 2014.
- [Veremyev et al.2014b] Alexander Veremyev, Oleg A Prokopyev, and Eduardo L Pasiliao. An integer programming framework for critical elements detection in graphs. Journal of Combinatorial Optimization, 28(1):233–273, 2014.
- [Zhou and Hao2017] Yangming Zhou and Jin-Kao Hao. A fast heuristic algorithm for the critical node problem. In GECCO17, pages 121–122, Berlin, Germany, 2017.
- [Zhou et al.2018] Yangming Zhou, Jin-Kao Hao, and Fred Glover. Memetic search for identifying critical nodes in sparse graphs. IEEE Transactions on Cybernetics, pages 1–14, 2018.