Modern network science [1, 2, 3, 4, 5] has brought significant improvements to our understanding of complex systems. One of the most prominent features of real complex network systems is their community structure [7, 8], i.e. the organization of vertices into groups such that vertices in the same community have more connections with each other than with vertices in other communities. In a complex system, each community is often composed of multiple entities with similar properties, which provides deep insight into the structure and function of the whole network system [10, 11]. Therefore, detecting communities is of great importance for many applications in biology, physics, sociology and computer science, where the system is usually modeled as a complicated network with edges linking related node pairs.
In this respect, many methods have been proposed to solve the challenging problem of partitioning a given network into independent communities. One of the most popular approaches, proposed by Girvan and Newman, identifies network communities by maximizing the modularity of the associated network, which has become the essential criterion of many community detection methods. However, optimizing a network's modularity is mathematically nontrivial: recent studies demonstrated that exactly maximizing the modularity of a network is NP-complete, for which many polynomial-time approximation methods have been proposed, such as the greedy method, simulated annealing, extremal optimization, intelligent optimization [16, 17], game theory-based methods and spectral methods.
Another way to deal with such a network partition problem is to construct the affinity matrix of the associated network graph, which encodes the essential information about network structure and connectivity, and to compute optimized nodal labels over the given graph which directly separate the graph into different partitions or communities. Depending on the optimization criterion used for label estimation, many different approaches have been proposed to recover the labels of graph nodes, for example label propagation and random walks over graphs [20, 21], and balanced graph cut approaches [22, 23] including ratio cut, normalized cut, p-Laplacian based graph cut and Cheeger cut, etc.; however, most of them are computationally expensive and inefficient due to their non-convex optimization formulations. In contrast, min-cut-based methods [28, 29] can solve the studied combinatorial optimization problem efficiently in an approximate way, along with the capability to handle large-scale graphs. Recently, convex optimization was developed as a popular framework to build fast solvers for labeling recovery in the spatially continuous setting [30, 31, 32, 33]: the main idea of the related convex optimization approaches, i.e. relaxing the binary label constraints to a continuous convex set and rounding the result of the reduced convex optimization problem back to binary, can be directly extended to discover the optimum labeling of each graph node, for example Yin et al's total-variation-based region force method (TVRF) for semi-supervised clustering, which introduced a fast splitting optimization framework for the proposed convex optimization problem and outperformed most state-of-the-art graph partitioning algorithms.
Besides the matter of developing efficient optimization algorithms, the other major challenges of detecting communities in a network are twofold. First, reliable descriptions are lacking that encode the inherent network structure and the affiliation of any node to a specified community. One common way to partially overcome this difficulty is to assign some graph nodes to communities beforehand, the so-called benchmark nodes: benchmark nodes reveal limited network structure information which can still guide the recovery of other nodes belonging to the same community, through evaluating the given network's coreness, betweenness, etc.; in addition, benchmark nodes provide meaningful starting points from which to propagate labels along graph links to the other nodes, i.e. label inference, which can be incorporated into the often-used optimization procedures. Second, it can be hard to obtain enough benchmark nodes in real-world applications; especially for large sparse networks, low-degree benchmark nodes hardly provide useful information about network communities and worsen the performance of the subsequent partition procedure. Therefore, how to discover more 'trustworthy' benchmark nodes during the whole computing process becomes a key factor in extracting partitions or communities accurately from the given network.
I-A Contributions and Organization
Motivated by previous studies, we present a novel two-stage optimization method for network community partition, based on internal structure measures of the given network graph. The proposed approach utilizes a new network centrality measure of both links and vertices to construct the key affinity matrix of the given network correctly, for the cases with vanishing node similarities which commonly occur in network community detection. The network centrality information reveals the essential structure of the network; it hence provides a proper clue for detecting network communities and introduces an additional 'confidence' criterion for labelings by referencing the labeled benchmark nodes. In particular, the two-stage optimization method makes use of the network centrality-based 'confidence' measure for the stage of benchmark node refinement, and an efficient convex optimization algorithm, developed under a new variational perspective of primal and dual, to solve the subsequent challenging combinatorial optimization problem of graph clustering. Refining benchmark nodes effectively improves the accuracy and reliability of the proposed approach. Experiments over both artificial and real-world network datasets demonstrate that the proposed method of community detection outperforms state-of-the-art algorithms in terms of accuracy and reliability.
I-B Definitions and Notations
Let {C_1, …, C_K} be the set of communities of the graph G = (V, E) with vertices (or nodes) V and edges (or links) E; each edge e_{ij} ∈ E, i ≠ j, denotes an existing link between the two nodes v_i and v_j, and each community C_k, k = 1, …, K, is a distinct subgraph of G. The connectivity of G can be expressed by its adjacency matrix B, whose (i, j)-entry b_{ij} = 1 means there exists a link between the two nodes v_i and v_j, and b_{ij} = 0 otherwise. The matrix W represents the affinity matrix of graph G, where w_{ij} measures the similarity between the two vertices v_i and v_j, and is usually given as a symmetric matrix with non-negative entries. Additionally, the diagonal degree matrix D is given by d_{ii} = Σ_j w_{ij}, i = 1, …, |V|.
With this, the linear operators of gradient and divergence over the graph are introduced as follows: for a scalar function u given at each node v ∈ V, its gradient ∇u evaluates the difference of u between two nodes v_i and v_j along the link e_{ij}, such that
whose weighted 1-norm is measured as the sum of w_{ij} |u(v_j) − u(v_i)| over all edges e_{ij} ∈ E;
for a function p given at each edge e ∈ E, its divergence at the node v_i measures the balance of p over all the edges linking the neighbour nodes around v_i, such that
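As a concrete illustration, the sketch below implements these graph operators on a small toy graph in Python; the edge set, the weight values, and the use of the weights only in the 1-norm are assumptions for illustration, not the paper's exact definitions.

```python
import numpy as np

# Hypothetical small graph: 4 nodes, 4 undirected edges with affinities w_ij.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
w = {(0, 1): 1.0, (1, 2): 0.5, (2, 3): 1.0, (0, 3): 0.2}

def grad(u):
    # Gradient on the graph: difference of u along each (oriented) edge.
    return {e: u[e[1]] - u[e[0]] for e in edges}

def tv_norm(u):
    # Weighted 1-norm of the graph gradient (graph total variation).
    return sum(w[e] * abs(g) for e, g in grad(u).items())

def div(p):
    # Divergence at node i: balance of the edge flow p over incident edges.
    d = np.zeros(4)
    for (i, j), pe in p.items():
        d[i] -= pe   # flow leaving node i
        d[j] += pe   # flow entering node j
    return d

u = np.array([0.0, 0.0, 1.0, 1.0])   # indicator of the subset {2, 3}
print(tv_norm(u))          # 0.7: the weight of the two cut edges
print(div(grad(u)).sum())  # 0.0: divergence always sums to zero
```

Note that the divergence acts as the (negative) adjoint of the gradient, which is why its values always sum to zero over the whole graph.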
II Semi-Supervised Graph Partition and Convex Optimization Model
In this work, we aim to partition communities from a given graph network with the new two-stage optimization method outlined above.
II-A Semi-Supervised Graph Partitioning
Graph partitioning aims to cut the given graph into multiple independent subgraphs (or communities). Let U be a binary assignment matrix, whose entry u_{ik} = 1 denotes that the node v_i belongs to the community C_k, and u_{ik} = 0 otherwise. Then, graph partitioning tries to minimize the following energy function:
Also, the convex penalty function of each term in (4) can be the quadratic function or, more generally, a p-norm function with p > 1, which results in the weighted Laplacian or p-Laplacian as the energy function of (4) [23, 26].
In addition, each vertex belongs to only one subgraph/community, i.e.
where U_k denotes the k-th column of U.
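To make the objective concrete, here is a hypothetical sketch of a TV-type partition energy for a binary assignment matrix; the actual convex penalty in (4) may differ (quadratic or p-norm, as discussed), so this is one instance of the family, not the paper's exact formula.

```python
import numpy as np

def partition_energy(U, W, p=1):
    # Cut cost of a binary assignment matrix U (rows one-hot) over an
    # assumed symmetric affinity matrix W: edges between nodes with
    # different labels are paid at weight w_ij (p = 1 gives the TV cut).
    n, K = U.shape
    E = 0.0
    for i in range(n):
        for j in range(i + 1, n):          # each undirected edge once
            if W[i, j] > 0:
                E += W[i, j] * np.sum(np.abs(U[i] - U[j]) ** p)
    return E

W = np.array([[0.0, 1.0, 0.1],
              [1.0, 0.0, 0.1],
              [0.1, 0.1, 0.0]])
U = np.array([[1, 0], [1, 0], [0, 1]])     # nodes 0,1 together; node 2 alone
assert np.all(U.sum(axis=1) == 1)          # constraint (5): one community each
print(partition_energy(U, W))              # 0.4: only the two weak links are cut
```

The energy is minimized exactly when only weakly connected node pairs are separated, which is the intuition behind the cut-based objective.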
It is clear that the optimization model (4) has a trivial solution where all vertices belong to the same community. One important way to avoid this situation is to integrate prior information into the proposed optimization problem (4); for example, when some benchmark nodes are labeled, separating such a partially labeled graph into multiple independent subgraphs/communities introduces a proper semi-supervised graph partition problem [36, 34]. Indeed, the pre-labeled nodes help improve the partition results for the given graph in two ways: first, the labeled nodes provide the starting positions from which to propagate labels to the other vertices; second, they also reveal essential features for constructing graph partitioning hints, geometric or otherwise (see the following section).
Let S_k, k = 1, …, K, be the benchmark set which represents a sampled fraction of the community C_k, and S = S_1 ∪ … ∪ S_K the total benchmark set. To this end, we have
With the locations of the pre-labeled nodes, we define a novel measure which characterizes the probability of each vertex belonging to the community, such that
the matrix is the corresponding normalized affinity matrix; when the denominator is zero, the measure is set to zero. Therefore, we can integrate the cross-entropy between this probability and the label function into the optimization model (6), which gives rise to the following optimization problem:
subject to the constraint (5).
II-B Convex Relaxation and Dual Optimization
Finding the optimum of the proposed minimization problem over the binary constraint is challenging; it is in fact NP-hard, which means no efficient polynomial-time algorithm is known for such a combinatorial optimization problem. In practice, we instead replace the binary constraint by its convex relaxation; hence, we have
On the other hand, combining the relaxed constraint with (5), i.e.
which means that, for each node, the corresponding row of the matrix U belongs to the K-dimensional simplex set.
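Keeping each row of U on the simplex is typically enforced by a Euclidean projection inside splitting-type solvers. The sketch below uses the classical sorting-based projection algorithm, which is a standard choice but is not claimed to be the specific routine used in the paper.

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection of v onto {x : x >= 0, sum(x) = 1},
    # via the classical sorting-based algorithm.
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0                  # shifted cumulative sums
    rho = np.nonzero(u > css / np.arange(1, len(v) + 1))[0][-1]
    theta = css[rho] / (rho + 1.0)            # optimal shift
    return np.maximum(v - theta, 0.0)

row = np.array([0.6, 0.5, -0.2])   # a relaxed row of U after a gradient step
x = project_simplex(row)
print(x, x.sum())                  # [0.55 0.45 0.  ] 1.0
```

Each call costs O(K log K) per node, which keeps the per-iteration work of the relaxed problem low even for many communities.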
where , subject to
The convex optimization problem (10) is mathematically equivalent to the following maximization problem
In addition, the optima of the original convex optimization problem (12) are exactly the optimal multipliers of the above linear equality constraints.
The proof can be found in Appendix A.
Given the fact that the dual optimization model (13) is equivalent to the studied convex optimization problem (12), where the optimum of (12) acts as the optimal multiplier of the linear equality constraint of the dual problem (13), the energy function of the respective primal-dual model (23) is just the conventional Lagrangian function of (13):
In this paper, we employ the classical augmented Lagrangian method (ALM) to construct a novel efficient ALM-based algorithm for the linearly constrained dual optimization problem (13), which resolves both the labeling function and the additional dual variables simultaneously. Upon the above classical Lagrangian function, we define its augmented Lagrangian function
fix , compute by maximizing :
update by the computed :
Once the proposed ALM-based algorithm (Alg. 2) converges to some optimum, we can simply round it into its binary version, assigning each node to the community with the largest relaxed label value.
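The rounding step can be sketched as a plain row-wise argmax over the relaxed solution; the tie-breaking rule below (first maximum wins) is an assumption for illustration.

```python
import numpy as np

def round_labels(U_relaxed):
    # Round the relaxed assignment matrix back to binary: each node goes
    # to the community with the largest relaxed label value.
    n, K = U_relaxed.shape
    U_bin = np.zeros_like(U_relaxed)
    U_bin[np.arange(n), U_relaxed.argmax(axis=1)] = 1.0
    return U_bin

U_relaxed = np.array([[0.70, 0.30],
                      [0.40, 0.60],
                      [0.55, 0.45]])
print(round_labels(U_relaxed))   # [[1. 0.] [0. 1.] [1. 0.]]
```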
III Network Structure Centralities and Two-Stage Community Partition
Clearly, providing the right affinity description for the models above is the key to partitioning a graph or network accurately. The classical approach of most state-of-the-art clustering methods is to employ similarities between nodes, or specified nodal features, to construct the associated affinity matrix; such information is, however, unavailable in many cases of network clustering. In this work, we propose a novel method to calculate such a network affinity matrix based upon the inherent structure centrality of the network's links and vertices, and introduce a new two-stage optimization strategy (TSOS) to cluster the communities with both efficiency and accuracy.
III-A Network Structure and Betweenness of Links
In this section, we define the affinity/adjacency matrix directly from the network structure information of link centrality, i.e. betweenness of network links.
where σ_{st} is the total number of shortest paths from any node v_s to a different node v_t, and σ_{st}(e_{ij}) is the number of such paths passing through the link e_{ij}.
Betweenness of a link is actually one of the most important factors for network partition: a link with high betweenness can be regarded as a bridge connecting two communities, meaning that once it is removed, the number of isolated network blocks increases. Indeed, this is exactly the kind of edge expected to separate the network. To this end, for an edge we can define the corresponding weight through a strictly decreasing positive function of its betweenness, i.e. the cost of cutting an edge with a high betweenness value is low, hence partitioning is likely to happen on this edge. In this paper, we take the inverse of betweenness as the definition of the affinity matrix:
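A sketch of this construction with networkx is shown below; the karate-club test graph, the normalized betweenness, and the convention of giving zero affinity to zero-betweenness edges are assumptions for illustration.

```python
import networkx as nx
import numpy as np

# Affinity from link betweenness: w_ij is the inverse of the edge
# betweenness, so bridge-like edges between communities get low affinity
# and are cheap to cut.
G = nx.karate_club_graph()
eb = nx.edge_betweenness_centrality(G, normalized=True)

n = G.number_of_nodes()
W = np.zeros((n, n))
for (i, j), b in eb.items():
    W[i, j] = W[j, i] = 1.0 / b if b > 0 else 0.0

# Edge whose removal is most likely to separate communities:
bridge = max(eb, key=eb.get)
print(bridge, W[bridge])
```

Exact edge betweenness costs O(|V||E|) on unweighted graphs, which is affordable for the network sizes studied here.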
III-B Nodal Centrality, Benchmark Confidence and Two-Stage Optimization Strategy (TSOS)
Actually, the 'core members' or 'core nodes' of each network community are closely connected with each other and dominate more nodes within one-hop range than the other nodes do, which motivates the centrality of network nodes. Clearly, once such core nodes are discovered beforehand, a core node with the correct label will directly identify the other core nodes with the same label, i.e. the core members of the respective community (see the following section for details).
The topological centrality of each node can be quantified through the concept of the k-core, which is defined as the largest subnetwork in which every node has at least k links, i.e. degree no less than k. As shown in Fig. 2, the k-core of a given network can be obtained by recursively removing all nodes with degree less than k, until all the nodes in the remaining network have degree no less than k. Repeating this for increasing k finally determines the k-shell decomposition of the network. Hence, the coreness of a node is defined as the largest integer k for which this node belongs to the k-core but not to the (k+1)-core [41, 42]. In general, a node with a bigger coreness value has a higher centrality. In this paper, we therefore adopt such coreness to characterize the nodal centrality, and the 'core nodes', or 'core members', are the ones with the highest coreness number.
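The recursive peeling described above is exactly what networkx's core-number routine implements; the karate-club graph below is only a convenient test case, not one of the paper's datasets for this step.

```python
import networkx as nx

# Coreness of each node via k-shell decomposition; 'core nodes' are the
# ones attaining the maximum core number.
G = nx.karate_club_graph()
core = nx.core_number(G)          # node -> coreness, by recursive peeling
kmax = max(core.values())
core_nodes = [v for v, k in core.items() if k == kmax]
print(kmax, sorted(core_nodes))
```

By definition, every core node keeps at least kmax neighbours inside the subgraph induced by the core nodes, which is why these nodes make reliable benchmark candidates.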
On the other hand, the coreness of any node indicates the closeness of that node to the 'core nodes' of the associated community; hence, we define the related 'confidence' measure that evaluates the possibility of a node belonging to the corresponding community:
where a bigger value means lower betweenness in terms of (16) and a lower likelihood of cutting the associated link, hence higher 'confidence' in associating the two nodes. In this sense, the measure evaluates the 'confidence' with which a node can be merged into the benchmark set of the community.
With such a 'confidence' measure, we introduce a two-stage optimization framework: it first computes an initial network partition through the proposed ALM-based dual optimization algorithm; once the initial partitioned communities are obtained, the 'confidence' measure of each node within its initial partition is calculated by (17), so as to choose new nodes with high 'confidence', as in (18), for the related benchmark set; using the expanded benchmark sets, the proposed ALM-based dual optimization is then applied to recompute the network partition. More details of the two-stage optimization strategy can be found in Alg. 1.
In fact, the proposed two-stage optimization strategy does not require many initial benchmark nodes to ensure the accuracy of network partition, since the benchmark sets can be expanded with more dominant nodes of high confidence. Meanwhile, the two-stage optimization method, along with the increasing benchmark nodes, essentially reduces the total number of undetermined graph nodes, which improves the efficiency of the subsequent partition procedure. On the other hand, the alternating steps of optimization and benchmark expansion can be performed not just twice but multiple times, yielding a multi-stage optimization method. In practice, we found that the two-stage optimization already reaches sufficiently good results; using more than two optimization stages does not improve the results significantly (see the experiment results of Fig. 4 for details).
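The overall loop of Alg. 1 can be sketched as below. This is a structural sketch only: alm_partition is a one-sweep stand-in for the ALM dual solver, and within-community connection strength stands in for the confidence measure (17); the mean-plus-t-standard-deviations threshold mirrors rule (18).

```python
import numpy as np

def alm_partition(W, benchmarks):
    # Stand-in solver: assign each unlabeled node to the benchmark
    # community it is most strongly connected to (one sweep).
    n, K = W.shape[0], len(benchmarks)
    labels = -np.ones(n, dtype=int)
    for k, S in enumerate(benchmarks):
        for v in S:
            labels[v] = k
    for v in range(n):
        if labels[v] < 0:
            labels[v] = int(np.argmax([W[v, list(S)].sum() for S in benchmarks]))
    return labels

def tsos(W, benchmarks, n_stages=2, t=1.0):
    labels = alm_partition(W, benchmarks)          # stage 1: initial partition
    for _ in range(n_stages - 1):
        for k in range(len(benchmarks)):
            members = np.where(labels == k)[0]
            conf = W[np.ix_(members, members)].sum(axis=1)  # stand-in for (17)
            thresh = conf.mean() + t * conf.std()           # rule (18)
            benchmarks[k] |= {int(v) for v, c in zip(members, conf) if c > thresh}
        labels = alm_partition(W, benchmarks)      # next stage: refined partition
    return labels

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(tsos(W, [{0}, {3}]))     # [0 0 0 1]
```

Setting n_stages greater than 2 gives the multi-stage variant discussed above.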
In this work, we explore two artificial networks, GN and LFR, and five real-world networks to validate the effectiveness and efficiency of the proposed two-stage optimization strategy (TSOS), see Alg. 1, for partitioning the given network into multiple communities using inherent network structure information. Experiment results are recorded as the average performance over 20 independent trials and compared with the ground truth.
For the unweighted networks of GN, LFR, Dolphin, Football, and Polbooks, the affinity matrices are calculated by (16). For the data clustering networks of COIL and MNIST, we adopt their given similarity weights to construct the affinity matrices directly by equation (8). In addition, we compare our proposed method with one of the state-of-the-art data clustering approaches proposed by Yin et al, namely the total-variation-based data clustering algorithm with region force (TVRF).
IV-A Experiments of Benchmark Expansion, Optimization Stages and Parameter
IV-A1 Benchmark Expansion
In this section, we show that the proposed two-stage optimization strategy (TSOS) significantly improves the community partition results of networks, especially sparsely connected networks. For the given sparse network of Zachary's karate club, consisting of 34 vertices and 2 communities, as illustrated in Fig. 3, the labeled benchmark nodes dominate very few nodes (see Fig. 3(a)), thus providing little network structure information and resulting in an inaccurate initial partition (see Fig. 3(b)). In fact, the shortage of pre-labeled nodes, or of benchmark nodes with sufficient dominance, is often a big challenge for state-of-the-art semi-supervised partition methods, which suffer from the lack of network structure information. Expansion of the benchmark nodes is performed by (18) with the benchmark confidence measure (17). New benchmark nodes are selected as shown in Fig. 3(c), where four nodes are inserted into the two respective benchmark sets. The subsequent partition through the ALM-based algorithm significantly improves the accuracy of community partition, see Fig. 3(d).
IV-A2 Optimization Stages
The procedures of optimization and benchmark expansion, as in Alg. 2, can be performed more than two times, i.e. with multiple optimization stages. Experiment results shown in Fig. 4 indicate that the proposed two-stage optimization strategy reaches results with sufficient accuracy; performing more than two optimization stages does not improve the results significantly.
IV-A3 Selection of the Parameter in (18)
Nodes with high 'confidence' values are good candidates for benchmarks. In order to ensure that each new benchmark node is chosen correctly, the threshold should be selected high enough. By Chebyshev's inequality, it is guaranteed that, for any distribution, the fraction of data within t standard deviations of the mean is at least 1 − 1/t², which means
Thus, a high confidence value with a small probability of appearing stands out as a 'trustworthy' selection. Experiment results over five different networks, see Fig. 5, show that most experiments do not get noticeable improvements beyond a certain threshold. In this work, the threshold is chosen according to the network's average degree, with a separate value for the graphs of MNIST and COIL.
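The distribution-free nature of this bound can be checked numerically; the exponential confidence sample and t = 3 below are illustrative assumptions.

```python
import numpy as np

# Chebyshev's inequality: for any distribution, at least 1 - 1/t^2 of the
# mass lies within t standard deviations of the mean. Thresholding at
# mean + t*std therefore admits only rare, high-confidence outliers,
# whatever the shape of the confidence distribution.
rng = np.random.default_rng(0)
conf = rng.exponential(size=10_000)   # a skewed toy confidence sample
t = 3.0
frac_within = np.mean(np.abs(conf - conf.mean()) <= t * conf.std())
print(frac_within, 1 - 1 / t**2)      # empirical fraction vs. the 1 - 1/9 bound
```

For the empirical measure (with its own mean and standard deviation) the bound holds exactly, so the assertion below never fails regardless of the sample.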
IV-B Experiments on Artificial Networks of GN and LFR
TABLE I: Partition accuracy (%) of both algorithms on the classical GN network and its two variants.
The GN artificial network, proposed by Girvan and Newman, is still one of the most popular benchmarks discussed in the related literature. It provides a basic node set with communities, where the average total degree of each node is fixed. At the same time, GN provides a flexible network generation mechanism controlled by the number of nodes of each community, the number of communities, the number of internal half-edges per node, and the number of external half-edges per node, etc. Studies show that these parameters determine the detectability of the network communities: a higher number of external half-edges decreases the detectability. In this study, we test our proposed TSOS against TVRF over three different types of GN networks, including the classical GN network and two variants, for which each community has at least one benchmark node and the fraction of benchmark nodes is fixed.
As the results in Table I show, picking more benchmark nodes results in higher partition accuracy under the same algorithm configuration. For the classical GN network, both algorithms reach high accuracy when only a few nodes are used as benchmarks. The proposed TSOS can still obtain a completely correct result for the first GN variant, and a much higher partition accuracy than TVRF for the difficult case. This shows the effectiveness of the proposed strategy of incorporating new benchmark nodes into an additional step of network partition refinement.
TABLE II: Partition accuracy (%) of both algorithms on LFR networks with different mixing parameters.
In contrast to the homogeneous GN networks, whose nodes all have the same degree and which are thus not a good proxy for real networks with community structure, the artificial benchmark LFR network, proposed by Lancichinetti, Fortunato and Radicchi, has a power-law degree distribution. The LFR benchmark is basically a configuration model with built-in communities, which is built by joining stubs at random, once it has been established which stubs are internal and which are external. The mixing parameter is the ratio between the external degree and the total degree of each vertex. Obviously, when the mixing parameter is low, each community can be better separated from the others. Here, we generate networks with the mixing parameter ranging from 0.1 to 0.5, where the distributions of degree and community size follow respective power laws, and the average degree and the range of community sizes are fixed.
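Such LFR networks can be generated, e.g., with networkx; the parameter values below follow the networkx documentation example rather than the larger settings used in this section, since the generator can fail to converge for aggressive parameter combinations.

```python
import networkx as nx

# LFR benchmark: power-law degree distribution (exponent tau1) and
# power-law community sizes (exponent tau2), with mixing parameter mu.
G = nx.LFR_benchmark_graph(
    n=250, tau1=3, tau2=1.5, mu=0.1,
    average_degree=5, min_community=20, seed=10,
)
# Ground-truth communities are stored on the nodes by the generator.
communities = {frozenset(G.nodes[v]["community"]) for v in G}
print(G.number_of_nodes(), len(communities))
```

Raising mu toward 0.5 blurs the community boundaries, which is exactly the regime where the robustness comparison below is made.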
In the experiments, each community has at least one benchmark node, and a fixed number of nodes are selected as benchmarks for each experiment, so the numbers of benchmark nodes picked are slightly larger than the total number of communities, i.e. rather small samples. As shown in Tab. II, when the mixing parameter increases, our proposed TSOS method still keeps highly accurate results and performs much better than the TVRF algorithm, hence it is more robust to an increasing external degree, i.e. a high mixing parameter affects the performance of TSOS less than that of TVRF. On the other hand, choosing more benchmark nodes improves both algorithms' performance; however, the proposed TSOS improves more significantly.
IV-C Experiments on Real-World Networks
In this work, five real-world networks are used to validate the proposed TSOS method: three classical networks, the Dolphin network, the Football network and the Political books network, and two data clustering sets, MNIST and COIL. The three classical social networks are widely used in many community detection studies. The COIL-100 (Columbia object image library-100) data set contains color images of different objects; the related graph network used in this paper includes randomly selected objects (1500 images) from the dataset, and the edge weights of the built k-NN graph are calculated through the Euclidean distance between two images. The MNIST data set consists of 70000 size-normalized and centered images of handwritten digits; the images are naturally partitioned into roughly balanced clusters, and a k-NN graph is constructed from the original MNIST data set, with edge weights computed from the Euclidean distance between two images represented as vectors. These networks are treated as undirected, and their network parameters are shown in Tab. III.
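The k-NN graph construction used for these image data sets can be sketched as follows; the toy random "images", the values of k and sigma, and the Gaussian weighting of the Euclidean distance are assumptions for illustration (the paper's exact weight function is not reproduced here).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 784))   # 100 toy "images" flattened to 784-dim vectors
k, sigma = 10, 15.0

# Pairwise squared Euclidean distances, self-distances excluded.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
np.fill_diagonal(d2, np.inf)

# Connect each node to its k nearest neighbours with Gaussian weights.
W = np.zeros_like(d2)
for i in range(len(X)):
    nn = np.argpartition(d2[i], k)[:k]           # k nearest neighbours of i
    W[i, nn] = np.exp(-d2[i, nn] / (2 * sigma**2))
W = np.maximum(W, W.T)                           # symmetrize the affinity
print(W.shape, int((W > 0).sum()))
```

Symmetrizing by the elementwise maximum makes the graph undirected, matching how these networks are treated in the experiments.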
TABLE IV: Partition accuracy (%) of TVRF and TSOS on the real-world networks with different numbers of benchmark nodes.
Experiment results for the real-world networks are illustrated in Tab. IV. Similar to the other experiments, picking more benchmark nodes clearly improves network partition accuracy. In addition, the proposed TSOS method performs better for the cases with fewer initial benchmark nodes, while it still obtains partition accuracy similar to TVRF for the cases with more initial benchmark nodes. Clearly, for a really small ratio of selected benchmark nodes to the total number of network nodes, e.g. MNIST, TSOS achieves much better partition accuracy than TVRF. This should be attributed to the introduced intermediate step of benchmark expansion with a proper confidence criterion.
Moreover, the results computed through the proposed TSOS often have less variance while keeping higher accuracy, hence better numerical robustness. For example, when clustering the MNIST data graph with only a few nodes as benchmarks, the variance of the experiment results by TSOS is much less than the variance of the results by TVRF. Detailed performance for each experiment setting can be found in Fig. 6. In addition, the total numbers of benchmark nodes after expansion are much larger than the numbers of initial benchmark nodes, as the blue bars vs. yellow bars in Fig. 6 show. Such computational robustness is often a crucial factor in partitioning large-scale networks, especially when only a small portion of nodes is available as benchmarks.
V Conclusions and Future Studies
We introduce a novel two-stage optimization strategy for partitioning network communities, which makes use of inherent network structure information, i.e. a new network centrality measure of both links and vertices, to construct the key affinity description of the given network in cases where direct similarities between graph nodes or nodal features are not available for obtaining the classical affinity matrix. The calculated network centrality information provides an essential measure for detecting network communities, as well as a 'confidence' criterion for developing new benchmark nodes. We also develop an efficient convex optimization algorithm, under a new variational perspective of primal and dual, to tackle the challenging combinatorial optimization problem of network partitioning. Experiment results demonstrate that the proposed optimization approach largely improves the accuracy of clustering communities from various networks.
It is obvious that obtaining a reasonable affinity matrix is the key factor for most graph or network partition algorithms. One way to improve the effectiveness of the affinity matrix is to take into account pairs of nodes that are not directly connected, for example the affinity matrix defined through the principle of three degrees of influence:
where , or the more generalized affinity matrix given as:
where the attenuation constant should be small enough to ensure convergence.
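A sketch of such an extended affinity is given below as a truncated attenuated power series W + aW² + a²W³ (three degrees of influence); the attenuation value a and the choice of zeroing the diagonal are assumptions for illustration.

```python
import numpy as np

def extended_affinity(W, a=0.3, hops=3):
    # Accumulate attenuated walk counts: W + a*W^2 + a^2*W^3 + ...
    Wk, total = W.copy(), W.copy()
    for _ in range(1, hops):
        Wk = a * (Wk @ W)         # paths one hop longer, attenuated by a
        total = total + Wk
    np.fill_diagonal(total, 0.0)  # ignore self-influence
    return total

W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
print(extended_affinity(W))   # nodes 0 and 2 now gain indirect affinity 0.3
```

The untruncated series converges only when the attenuation constant is below the reciprocal of the largest eigenvalue of W, which is the convergence condition alluded to above.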
The computation of each variable, over every node and edge, in the introduced ALM-based dual optimization algorithm (Alg. 1) can be implemented edgewise and nodewise in parallel, which forms the basis for reimplementing the algorithmic steps on modern parallel computing platforms like GPUs or HPC clusters, so as to significantly improve numerical efficiency and handle very large-scale network partition problems.
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61877046, 61877047 and 11801200), Shaanxi Provincial Natural Science Foundation of China (Grant No.2017JM1001), the Fundamental Research Funds for the Central Universities, and the Innovation Fund of Xidian University.
-A Equivalent Convex Optimization Models
By simple convex analysis, we can equivalently express the absolute function |x| as the maximum of p·x subject to |p| ≤ 1. In this sense, we have the following equivalent expression for each absolute function term of (12):
We can also reformulate the energy term of (12) along with the constraint such that
It is clear that for any infeasible point the maximum reaches infinity, while for any feasible point the maximum is attained at a finite value.
In addition, the linear equality constraint of (11) can be identically rewritten as
and each variable is free.
where the divergence operator is given in (3). In this work, we call the above optimization problem the equivalent primal-dual model.
Minimizing the primal-dual formulation (23) over all primal variables, we can easily obtain the following maximization problem
Clearly, the optimization formulation (24) is also equivalent to the convex optimization problem (12), and is named the equivalent dual model in this paper. We actually focus on the optima of the optimization problem (12), which are the optimal multipliers of the linear equality constraints
in the sense of optimizing its identical dual model (24).
-B Detailed Augmented Lagrangian Method Based Algorithm for (12)
-  G. Chen, “Network science research: some recent progress in China and beyond,” National Science Review, vol. 1, no. 3, p. 334, 2014.
-  D. J. Watts and S. H. Strogatz, “Collective dynamics of ‘small-world’ networks,” Nature, vol. 393, pp. 440–442, 1998.
-  A.-L. Barabási and R. Albert, “Emergence of scaling in random networks,” Science, vol. 286, no. 5439, pp. 509–512, 1999.
-  M. E. Newman, Networks: an introduction. Oxford University Press, Oxford, 2010.
-  S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang, “Complex networks: Structure and dynamics,” Physics reports, vol. 424, no. 4-5, pp. 175–308, 2006.
-  M. Mitchell, “Complex systems: Network thinking,” Artificial Intelligence, vol. 170, no. 18, pp. 1194–1212, 2006.
-  J. Goldenberg, B. Libai, and E. Muller, “Talk of the network: A complex systems look at the underlying process of word-of-mouth,” Marketing letters, vol. 12, no. 3, pp. 211–223, 2001.
-  G. Palla, I. Derényi, I. Farkas, and T. Vicsek, “Uncovering the overlapping community structure of complex networks in nature and society,” Nature, vol. 435, no. 7043, p. 814, 2005.
-  S. Fortunato, “Community detection in graphs,” Physics Reports, vol. 486, no. 3-5, pp. 75–174, 2010.
-  L. Yang, X. Cao, D. Jin, X. Wang, and D. Meng, “A unified semi-supervised community detection framework using latent space graph regularization,” IEEE Transactions on Cybernetics, vol. 45, no. 11, pp. 2585–2598, 2015.
-  M. Rosvall and C. T. Bergstrom, “An information-theoretic framework for resolving community structure in complex networks,” Proceedings of the National Academy of Sciences, vol. 104, no. 18, pp. 7327–7331, 2007.
-  M. E. Newman, “Detecting community structure in networks,” The European Physical Journal B, vol. 38, no. 2, pp. 321–330, 2004.
-  M. Girvan and M. E. Newman, “Community structure in social and biological networks,” Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 12, pp. 7821–7826, 2002.
-  R. Guimera, M. Sales-Pardo, and L. A. N. Amaral, “Modularity from fluctuations in random graphs and complex networks,” Physical Review E, vol. 70, no. 2, p. 025101, 2004.
-  J. Duch and A. Arenas, “Community detection in complex networks using extremal optimization,” Physical Review E, vol. 72, no. 2, p. 027104, 2005.
[16] C. Liu, J. Liu, and Z. Jiang, "A multiobjective evolutionary algorithm based on similarity for community detection from signed social networks," IEEE Transactions on Cybernetics, vol. 44, no. 12, pp. 2274–2287, 2014.
[17] L. Ma, M. Gong, J. Liu, Q. Cai, and L. Jiao, "Multi-level learning based memetic algorithm for community detection," Applied Soft Computing, vol. 19, pp. 121–133, 2014.
[18] Z. Bu, H. J. Li, J. Cao, Z. Wang, and G. Gao, "Dynamic cluster formation game for attributed graph clustering," IEEE Transactions on Cybernetics, vol. PP, no. 99, pp. 1–14, 2017.
[19] H. T. Ali and R. Couillet, "Improved spectral community detection in large heterogeneous networks," The Journal of Machine Learning Research, vol. 18, no. 1, pp. 8344–8392, 2017.
[20] X. Zhu, Z. Ghahramani, and J. Lafferty, "Semi-supervised learning using Gaussian fields and harmonic functions," in ICML, 2003, pp. 912–919.
[21] A. Azran, "The rendezvous algorithm: Multiclass semi-supervised learning with Markov random walks," in ICML, 2007.
[22] F. Chung, Spectral Graph Theory, ser. CBMS Regional Conference Series in Mathematics, no. 92, 1996.
[23] U. von Luxburg, "A tutorial on spectral clustering," Statistics and Computing, vol. 17, no. 4, pp. 395–416, 2007.
[24] L. Hagen and A. Kahng, "Fast spectral methods for ratio cut partitioning and clustering," in IEEE International Conference on Computer-Aided Design (ICCAD-91), Digest of Technical Papers, 1991, pp. 10–13.
[25] J. Shi and J. Malik, "Normalized cuts and image segmentation," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888–905, 2000.
[26] T. Bühler and M. Hein, "Spectral clustering based on the graph p-Laplacian," in Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 2009, pp. 81–88.
[27] M. Hein and S. Setzer, "Beyond spectral clustering - tight relaxations of balanced graph cuts," in Advances in Neural Information Processing Systems, 2011, pp. 2366–2374.
[28] A. Blum and S. Chawla, "Learning from labeled and unlabeled data using graph mincuts," in ICML, 2001.
[29] Y. Boykov, O. Veksler, and R. Zabih, "Fast approximate energy minimization via graph cuts," IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 11, pp. 1222–1239, 2001.
[30] T. F. Chan and S. Esedoğlu, "Aspects of total variation regularized L1 function approximation," SIAM J. Appl. Math., vol. 65, no. 5, pp. 1817–1837, 2005.
[31] X. Bresson, X.-C. Tai, T. F. Chan, and A. Szlam, "Multi-class transductive learning based on relaxations of Cheeger cut and Mumford-Shah-Potts model," Journal of Mathematical Imaging and Vision, vol. 49, no. 1, pp. 191–201, 2014.
[32] J. Yuan, E. Bae, and X.-C. Tai, "A study on continuous max-flow and min-cut approaches," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pp. 2217–2224.
[33] J. Yuan, E. Bae, X.-C. Tai, and Y. Boykov, "A continuous max-flow approach to Potts model," in European Conference on Computer Vision. Springer, 2010, pp. 379–392.
[34] K. Yin and X.-C. Tai, "An effective region force for some variational models for learning and clustering," Journal of Scientific Computing, vol. 74, no. 1, pp. 175–196, 2018.
[35] D. Zhou and B. Schölkopf, "Regularization on discrete spaces," in Joint Pattern Recognition Symposium. Springer, 2005, pp. 361–368.
[36] X. Zhu and A. B. Goldberg, Introduction to Semi-Supervised Learning, ser. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2009.
[37] D. P. Bertsekas, Nonlinear Programming. Athena Scientific, 1999.
[38] Y. Bai, S. Liu, and Z. Zhang, "Effective hybrid link-adding strategy to enhance network transport efficiency for scale-free networks," International Journal of Modern Physics C, vol. 28, no. 08, p. 1750107, 2017.
[39] R. Dunn, F. Dudbridge, and C. M. Sanderson, "The use of edge-betweenness clustering to investigate biological function in protein interaction networks," BMC Bioinformatics, vol. 6, no. 1, p. 39, 2005.
[40] B. Bollobás, Modern Graph Theory, ser. Graduate Texts in Mathematics. Springer, 1998, vol. 184.
[41] Y. Yang, T. Nishikawa, and A. E. Motter, "Small vulnerable sets determine large network cascades in power grids," Science, vol. 358, no. 6365, p. eaan3184, 2017.
[42] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes, "K-core organization of complex networks," Physical Review Letters, vol. 96, no. 4, p. 040601, 2006.
[43] S. Ghahramani, Fundamentals of Probability. Prentice Hall, 2000.
[44] M. E. Newman and M. Girvan, "Finding and evaluating community structure in networks," Physical Review E, vol. 69, no. 2, p. 026113, 2004.
[45] S. Fortunato and D. Hric, "Community detection in networks: A user guide," Physics Reports, vol. 659, pp. 1–44, 2016.
[46] R. Guimerà and L. A. N. Amaral, "Functional cartography of complex metabolic networks," Nature, vol. 433, no. 7028, p. 895, 2005.
[47] A. E. Krause, K. A. Frank, D. M. Mason, R. E. Ulanowicz, and W. W. Taylor, "Compartments revealed in food-web structure," Nature, vol. 426, no. 6964, p. 282, 2003.
[48] A. Lancichinetti, S. Fortunato, and F. Radicchi, "Benchmark graphs for testing community detection algorithms," Physical Review E, vol. 78, no. 4, p. 046110, 2008.
[49] B. Bollobás, Extremal Graph Theory. Courier Corporation, 2004.
[50] D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson, "The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations," Behavioral Ecology and Sociobiology, vol. 54, no. 4, pp. 396–405, 2003.
[51] M. E. Newman, "Modularity and community structure in networks," Proceedings of the National Academy of Sciences, vol. 103, no. 23, pp. 8577–8582, 2006.
[52] O. Chapelle, B. Schölkopf, and A. Zien, Eds., Semi-Supervised Learning. MIT Press, 2006.
[53] S. A. Nene, S. K. Nayar, and H. Murase, "Columbia Object Image Library (COIL-100)," Technical Report CUCS-006-96, Columbia University, 1996.
[54] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[55] J. H. Fowler and N. A. Christakis, "Dynamic spread of happiness in a large social network: Longitudinal analysis of the Framingham Heart Study social network," BMJ: British Medical Journal, vol. 338, no. 7685, pp. 23–27, 2009.