I. Introduction
Modern network science [1, 2, 3, 4, 5] has brought significant improvements to our understanding of complex systems [6]. One of the most prominent features of real complex network systems is their community structure [7, 8], i.e. the organization of vertices such that vertices in the same community have more connections among themselves than with vertices in other communities [9]. In a complex system, each community is often composed of multiple entities with similar properties, which provides deep insight into the structure and function of the whole network system [10, 11]. Therefore, detecting communities is of great importance for many different applications in biology, physics, sociology and computer science, where the system is usually modeled as a complicated network with edges linking related node pairs [12].
In this respect, many different methods have been proposed to solve the challenging problem of partitioning a given network into independent communities. One of the most popular approaches, proposed by Girvan and Newman [13], identifies network communities by maximizing the modularity of the associated network, which has become an essential criterion for many community detection methods. However, optimizing a network's modularity is mathematically nontrivial: recent studies demonstrated that exactly maximizing the modularity of a network is NP-complete, for which many polynomial-time approximation methods have been proposed, such as the greedy method [12], simulated annealing [14], extremal optimization [15], intelligent optimization [16, 17], game-theory-based methods [18] and spectral methods [19]. Another way to deal with such a network partition problem is to construct the affinity matrix of the associated network graph, where the affinity matrix encodes the seminal information about network structure and connectivity, and to compute optimized nodal labels over the given graph which directly separate the network graph into different partitions or communities. Depending on the proposed optimization criterion of labeling estimation, many different approaches have been proposed to recover such labels of graph nodes, for example, label propagation and random walks over graphs
[20, 21], and balanced graph-cut approaches [22, 23] including ratio cut [24], normalized cut [25], p-Laplacian-based graph cut [26] and Cheeger cut [27], etc. However, most of these are computationally expensive and inefficient due to their nonconvex optimization formulations. In contrast, min-cut-based methods [28, 29] can solve the studied combinatorial optimization problem efficiently in an approximate way, along with the capability to handle large-scale graphs. Recently, convex optimization was developed as a popular framework to build fast solvers for labeling recovery in the spatially continuous setting [30, 31, 32, 33], etc. The main ideas of the related convex optimization approaches, i.e. relaxing the binary label constraints to a continuous convex set and rounding the result of the reduced convex optimization problem back to binary, can be directly extended to discover the optimal labeling of each graph node; for example, Yin et al.'s total-variation-based region force method (TVRF) for semi-supervised clustering [34] introduced a fast splitting optimization framework for the proposed convex optimization problem and outperformed most state-of-the-art graph partitioning algorithms.
Besides the matter of developing efficient optimization algorithms, detecting communities from a network poses two further major challenges. First, reliable descriptions are lacking to encode inherent network structures and the affiliation of a node to a specified community. One common way to partially overcome this difficulty is to sample some graph nodes into communities beforehand, the so-called benchmark nodes: benchmark nodes reveal limited network structure information which can still guide the recovery of other nodes belonging to the same community, through evaluating the given network's coreness, betweenness, etc.; in addition, benchmark nodes provide meaningful starting points to propagate labels along graph links to the other nodes, i.e. label inference, which can be incorporated into the often-used optimization procedures. Second, it can be hard to obtain enough benchmark nodes in real-world applications; especially for large sparse networks, low-degree benchmark nodes can hardly provide useful information about network communities and worsen the performance of the subsequent partition procedure. Therefore, how to discover more 'trustable' benchmark nodes during the whole computing process becomes one key factor in extracting partitions or communities accurately from the given network.
I-A. Contributions and Organization
Motivated by previous studies, we present a novel two-stage optimization method for network community partition, based on the internal structure measure of the given network graph. The proposed optimization approach utilizes a new network centrality measure of both links and vertices to correctly construct the key affinity matrix of the given network, for cases with vanishing node similarities, which commonly arise in network community detection. The network centrality information reveals the essential structure of the network, which hence provides a proper clue for detecting network communities and introduces an additional 'confidence' criterion for labelings by referencing the labeled benchmark nodes. In particular, the two-stage optimization method makes use of the network-centrality-based 'confidence' measure for the stage of benchmark node refinement, together with an efficient convex optimization algorithm, developed under a new variational perspective of primal and dual formulations, to solve the subsequent challenging combinatorial optimization problem of graph clustering. Refining benchmark nodes can effectively improve the accuracy and reliability of the proposed optimization approach. Experiments over both artificial and real-world network datasets demonstrate that the proposed optimization method of community detection outperforms state-of-the-art algorithms in terms of accuracy and reliability.
I-B. Definitions and Notations
Let be the set of communities, which is represented by the graph with vertices (or nodes) and edges (or links); each edge , where , denotes the existing link between two nodes and , and each community , , is a distinct subgraph of . The connectivity of can be expressed as its adjacency matrix whose entry means there exists a link between the two nodes and , and otherwise. The matrix represents the affinity matrix of graph , where measures the similarity between the two vertices of and , and is usually given as a symmetric matrix with nonnegative entries. Additionally, the diagonal matrix is given by , .
With this, the linear operators of gradient and divergence over the graph are introduced as follows [35]: for some scalar function given at each node , its gradient evaluates the difference of between two nodes and along the link such that
(1) 
whose norm, , is measured as
(2) 
for some function given at each edge , its divergence at the node measures the balance of over all the associated edges linking the neighbouring nodes around , i.e. such that
(3) 
II. Semi-Supervised Graph Partition and Convex Optimization Model
In this work, we aim to partition communities from a given graph network with the new two-stage optimization method summarized in Sec. I-A, which combines a network-centrality-based 'confidence' measure for benchmark node refinement with an efficient convex optimization algorithm for the subsequent combinatorial problem of graph clustering.
II-A. Semi-Supervised Graph Partitioning
Graph partitioning aims to cut the given graph into multiple independent subgraphs (or communities). Let be a binary matrix, where each entry indicates whether the node belongs to the community or not . Then, graph partitioning tries to minimize the following energy function:
(4) 
The convex penalty function of each term in (4) can also be the quadratic function or the norm function , , which results in the weighted Laplacian or the Laplacian as the energy function of (4) [23, 26].
In addition, each vertex belongs to only one subgraph/community, i.e.
(5) 
Using the definitions of graph gradients (1) and (2), the optimization problem (4) can be written in the more concise form of minimizing the corresponding graph total-variation function such that
(6) 
where denotes the th column of .
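Assuming the graph total-variation energy above takes the standard weighted form (the symbols below are our own, since the inline math is elided in the text: an affinity matrix `W` and a labeling matrix `U` with one column per community), it can be sketched numerically as:

```python
import numpy as np

def graph_total_variation(W, U):
    """Weighted graph total variation of a labeling.

    W : (n, n) symmetric nonnegative affinity matrix
    U : (n, K) labeling matrix, one column per community
    Returns the sum over communities k and undirected edges (i, j)
    of W[i, j] * |U[i, k] - U[j, k]|.
    """
    tv = 0.0
    n = W.shape[0]
    for i in range(n):
        for j in range(i + 1, n):        # count each undirected edge once
            if W[i, j] > 0:
                tv += W[i, j] * np.abs(U[i] - U[j]).sum()
    return tv
```

For a single unit-weight edge crossing the cut between two communities, the energy contributes one unit per label column, so cutting a high-affinity edge is expensive while cutting a low-affinity edge is cheap.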
It is clear that the optimization model (6) has a trivial solution, where all vertices belong to the same community. One important way to avoid this situation is to integrate prior information into the proposed optimization problem (4); for example, when some benchmark nodes are labeled, separating such a partially labeled graph into multiple independent subgraphs/communities gives rise to a proper semi-supervised graph partition problem [36, 34]. Indeed, the pre-labeled nodes help improve the partition results for the given graph in two ways: first, the labeled nodes provide the starting positions to propagate labels to the other vertices [36]; second, they also reveal essential features to construct graph partitioning hints, in geometry or other respects (see the following section).
Let , , be the benchmark set which represents a sample fraction of the community , and the total benchmark set . To this end, we have
(7) 
With the locations of the pre-labeled nodes, we define the novel measure , and , which characterizes the probability of each vertex belonging to the community such that

(8) 

where the matrix is the corresponding normalized affinity matrix; when the denominator is zero, we set . Therefore, we can integrate the cross-entropy between the probability and the label function into the optimization model (6), which gives rise to the following optimization problem:
(9) 
subject to the constraint (5).
II-B. Convex Relaxation and Dual Optimization
Finding the optimum of the proposed minimization problem (9) over the binary constraint is challenging; it is in fact NP-hard, which means there is no known efficient polynomial-time algorithm for such a combinatorial optimization problem. In practice, we often replace the binary constraint by its convex relaxation instead; hence, we have
(10) 
On the other hand, combining the two constraints and (5) gives
(11) 
which denotes that, for each node , the th row of the matrix belongs to the dim simplex set .
In this sense, we can rewrite the convex optimization problem (10), also in view of (4), as
(12) 
where , subject to
Through variational analysis (see Appendix A for details), we can prove the equivalence between the convex optimization problem (12) and its associated dual model (24):
Proposition 1
The convex optimization problem (10) is mathematically equivalent to the following maximization problem
(13) 
subject to
(14) 
In addition, the optimum , and , to the original convex optimization problem (12) are just the optimal multipliers to the above linear equality constraints.
Proof 1
The proof can be found in Appendix A.
Given the fact that the dual optimization model (13) is equivalent to the studied convex optimization problem (12), where the optimum of (12) acts as the optimal multipliers to the linear equality constraints of the dual problem (13), the energy function of the respective primal-dual model (23) is just the conventional Lagrangian function of (13):
In this paper, we employ the classical augmented Lagrangian method (ALM) [37] to construct a novel, efficient ALM-based algorithm to tackle the linearly equality-constrained dual optimization problem (13), which can resolve both and the additional dual variables simultaneously. Upon the above classical Lagrangian function, we define its augmented Lagrangian function
Therefore, the proposed ALM-based algorithm for the dual optimization problem (13) performs two major steps at each iteration until convergence (see Alg. 2 in Appendix B for details):

fix , compute by maximizing :

update by the computed :
Once the proposed ALM-based algorithm (Alg. 2) converges to some optimum , and , we can simply round into its binary version, such that for each node , when and .
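The multiplier-update and final rounding steps can be sketched as follows (a minimal sketch only: the variable names and the step-size parameter `c` are our own, and the inner maximization over the dual variables is abstracted into the `residual` argument):

```python
import numpy as np

def multiplier_update(u, residual, c):
    """Second ALM step: advance the multipliers (the relaxed labels) u
    along the residual of the dual linear equality constraints, scaled
    by the augmentation parameter c."""
    return u + c * residual

def round_to_binary(u):
    """Round the converged relaxed labels to a binary partition: each
    node joins the community with the largest label value."""
    U = np.zeros_like(u)
    U[np.arange(u.shape[0]), np.argmax(u, axis=1)] = 1.0
    return U
```

At convergence the residual vanishes and the multiplier update becomes a fixed point, which is the usual stopping criterion for ALM-type schemes.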
III. Network Structure Centralities and Two-Stage Community Partition
Clearly, providing the right affinity description for (9) and (8) is the key to partitioning a graph or network accurately. The classical way for most state-of-the-art clustering methods is to employ similarities between nodes, or specified nodal features, to construct the associated affinity matrix; this is, however, unavailable in many cases of network clustering. In this work, we propose a novel method to calculate such a network affinity matrix based upon the inherent structure centrality of the network links and vertices, and introduce a new two-stage optimization strategy (TSOS) to cluster the communities with both efficiency and accuracy.
III-A. Network Structure and Betweenness of Links
In this section, we define the affinity/adjacency matrix directly from the network structure information of link centrality, i.e. the betweenness of network links.
In fact, the betweenness of a network link is defined as the total number of shortest paths, between all pairs of vertices, that pass through it [38, 39], such that
(15) 
where is the total number of all shortest paths from any node to a different node , and is the number of such paths through the link .
The betweenness of a link is actually one of the most important factors for network partition: a link with high betweenness can be taken as a bridge connecting two communities, meaning that once it is removed, the number of isolated network blocks would increase [40]. Indeed, this is exactly the expected edge along which to separate the network. To this end, for an edge , we can define the corresponding weight where is a positive, strictly decreasing function, i.e. the cost of cutting an edge with a high betweenness value is low, hence partitioning would likely happen on this edge. In this paper, we consider the inverse of the betweenness as the definition of the affinity matrix :
(16) 
Hence, the parameter values of the optimization problem (10) can be computed through (8).
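A sketch of this inverse-betweenness affinity construction with networkx (the decreasing function is taken as the plain reciprocal here, which is one admissible choice; the function name is ours):

```python
import networkx as nx

def betweenness_affinity(G):
    """Edge affinity as the inverse of edge betweenness, in the spirit
    of eq. (16): bridges between communities carry high betweenness and
    thus receive low affinity, so cuts preferentially fall on them."""
    eb = nx.edge_betweenness_centrality(G, normalized=True)
    # every edge lies on the shortest path between its own endpoints,
    # so its betweenness is strictly positive and the reciprocal is safe
    return {e: 1.0 / b for e, b in eb.items()}
```

On Zachary's karate club network, for instance, the edges inside each faction receive noticeably larger affinity values than the few bridging edges between the two factions.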
III-B. Nodal Centrality, Benchmark Confidence and Two-Stage Optimization Strategy (TSOS)
Actually, the ‘core members’ or ‘core nodes’ of each network community are closely connected with each other and dominate more nodes within a one-hop range than the other nodes, which motivates the centrality of network nodes. Clearly, such core nodes with the correct label will directly identify the other core nodes with the same label, i.e. the core members of the respective community, once the core nodes are discovered beforehand (see the following section for details).
The topological centrality of each node can be quantified through the concept of the core, which is defined as the largest subnetwork in which every node has at least links, i.e. with degree . As shown in Fig. 2, the k-core of a given network can be obtained by recursively removing all nodes with degree less than , until all the nodes in the remaining network have degree not less than . Repeating this for , finally determines the shell decomposition of the network. Hence, the coreness of each node , , is defined as the integer for which this node belongs to the core but not to the core [41, 42]. In general, a node with a bigger coreness value has a higher centrality. In this paper, we therefore adopt such coreness to characterize the nodal centrality, and the ‘core nodes’, or ‘core members’, are the ones with the highest coreness number.
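The peeling procedure just described can be sketched in a few lines of pure Python (the adjacency-dictionary representation and the function name are our own):

```python
def core_numbers(adj):
    """Coreness of every node by iterative peeling: repeatedly remove a
    node of minimum remaining degree; a node's core number is the largest
    degree threshold in force at the moment it is removed.

    adj : dict mapping node -> set of neighbour nodes
    """
    alive = {v: set(ns) for v, ns in adj.items()}  # mutable working copy
    core, k = {}, 0
    while alive:
        v = min(alive, key=lambda x: len(alive[x]))  # lowest remaining degree
        k = max(k, len(alive[v]))                    # current peeling level
        core[v] = k
        for u in alive.pop(v):                       # detach v from the graph
            alive[u].discard(v)
    return core
```

For a triangle with one pendant node attached, the pendant gets coreness 1 and the three triangle nodes get coreness 2, matching the definition above.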
On the other hand, the coreness of any node defines the closeness of that node to the ‘core nodes’ of the associated community; hence, we define the related ‘confidence’ measure that evaluates the likelihood of any node belonging to the corresponding community :
(17) 
where a bigger , , means a lower betweenness in terms of (16) and a lower likelihood of cutting the associated link , hence a higher ‘confidence’ in associating the two nodes and . In this sense, actually evaluates the ‘confidence’ of merging the node into the benchmark set of the community .
With such a ‘confidence’ measure, we introduce a two-stage optimization framework: it first computes an initial network partition through the proposed ALM-based dual optimization algorithm; once the initial partitioned communities are obtained, the ‘confidence’ measure for each node within its initial partition is calculated by (17), so as to choose new nodes with high ‘confidence’, according to (18), for the related benchmark set ; using the expanded benchmark sets, the proposed ALM-based dual optimization is then applied to recompute the network partition. More details of the two-stage optimization strategy can be found in Alg. 1.
In fact, the proposed two-stage optimization strategy does not require many initial benchmark nodes to ensure the accuracy of the network partition, since the benchmark sets can be expanded with more dominant nodes of high confidence. Meanwhile, the two-stage optimization method, along with the increased number of benchmark nodes, essentially reduces the total number of undetermined graph nodes, which improves the efficiency of the subsequent partition procedure. On the other hand, the alternating steps of optimization and benchmark expansion can be performed not just twice but multiple times, hence a multi-stage optimization method. In practice, we found that the two-stage optimization already reaches sufficiently good results; using more than two optimization stages does not improve the results significantly (see the experiment results of Fig. 4 for details).
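The overall two-stage loop can be sketched as a small skeleton (all four callables are placeholders standing in for the paper's components, not actual implementations):

```python
def two_stage_partition(partition, confidence, select, benchmarks):
    """Skeleton of the two-stage optimization strategy (TSOS).

    partition(benchmarks) -> labels : the ALM-based partition step
    confidence(labels)    -> conf   : per-node confidence, cf. eq. (17)
    select(conf, k)       -> set    : high-confidence nodes chosen for
                                      community k, cf. eq. (18)
    benchmarks : dict mapping community index -> set of benchmark nodes
    """
    labels = partition(benchmarks)               # stage 1: initial partition
    conf = confidence(labels)                    # score every node
    expanded = {k: s | select(conf, k)           # expand each benchmark set
                for k, s in benchmarks.items()}
    return partition(expanded)                   # stage 2: refined partition
```

Running the loop for more stages simply means repeating the confidence/expansion/partition triple, which matches the multi-stage variant discussed above.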
In this work, we pick nodes with high ‘confidence’ for the benchmark set, namely those whose related values satisfy the following condition:
(18) 
where and are the average and standard deviation of all the values , and ; see Sec. IV-A for the choice of the parameter in the experiments of this work.

IV. Experiments
In this work, we explore two artificial networks, GN and LFR, and 5 real-world networks to validate the effectiveness and efficiency of the proposed two-stage optimization strategy (TSOS), see Alg. 1, for partitioning the given network into multiple communities using inherent network structure information. Experiment results are recorded as the average performance of 20 independent trials, and compared with the ground truth.
For the unweighted networks of GN, LFR, Dolphin, Football, and Polbooks, the affinity matrices are calculated by (16). For the data clustering networks of COIL and MNIST, we adopt their given similarity weights to construct the affinity matrices directly by equation (8). In addition, we compare our proposed method with one of the state-of-the-art data clustering approaches, proposed by Yin et al. [34], namely the total-variation-based data clustering algorithm with region force (TVRF).
IV-A. Experiments of Benchmark Expansion, Optimization Stages and Parameter
IV-A1. Benchmark Expansion
In this section, we show that the proposed two-stage optimization strategy (TSOS) significantly improves the community partition results of networks, especially for sparsely connected networks. For the given sparse network of the Zachary karate club, consisting of 34 vertices and 2 communities, as illustrated in Fig. 3, the labeled benchmark nodes dominate very few nodes (see Fig. 3(a)); thus they provide little network structure information and initially result in an inaccurate partition (see Fig. 3(b)). Actually, the shortage of pre-labeled nodes, or of benchmark nodes with sufficient dominance, is often a big challenge for state-of-the-art semi-supervised partition methods, which suffer from the limited network structure information. Expansion of the benchmark nodes is performed by (18) with the benchmark confidence measure (17). New benchmark nodes are selected as shown in Fig. 3(c), where four nodes are inserted into the two respective benchmark sets. The subsequent partition procedure through the ALM-based algorithm significantly improves the accuracy of the community partition by , see Fig. 3(d).
IV-A2. Optimization Stages
The procedures of optimization and benchmark expansion, as in Alg. 1, can be performed not just twice but multiple times, i.e. with multiple optimization stages. Experiment results shown in Fig. 4 indicate that the proposed two-stage optimization strategy reaches sufficient accuracy; performing more than two optimization stages does not improve the results significantly.
IV-A3. Selection of the Parameter in (18)
Nodes with high ‘confidence’ values are the good options for benchmarks. In order to ensure each new benchmark node chosen correctly, the value of should be selected high enough. By Chebyshev’s inequality [43], it confirmed that, for any distribution, the amount of data within times of standard deviations is at least ratio , which means
(19) 
Thus, a high value of , occurring with small probability, stands out as a ’trustable’ selection. Experiment results over five different networks, see Fig. 5, show that most experiments do not obtain noticeable improvements when . In this work, we choose for networks of average degree ; for networks with average degree ; and for networks with average degree , for example the graphs of MNIST and COIL.
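The threshold of (18) and its Chebyshev justification can be sketched as follows (the function name and argument names are our own; `lam` stands for the selection parameter of (18)):

```python
import math

def selection_threshold(values, lam):
    """Benchmark-selection threshold mean + lam * std of eq. (18).

    By Chebyshev's inequality, at most 1 / lam**2 of any distribution
    lies more than lam standard deviations away from the mean, so the
    confidence values exceeding this threshold are rare outliers and
    hence 'trustable' benchmark candidates.
    """
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    return mu + lam * sigma
```

Note that the bound is distribution-free, which is why the same selection rule can be applied to networks with very different degree statistics.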
IV-B. Experiments on Artificial Networks of GN and LFR
Algorithms  Accuracy (%) Classical  Accuracy (%)  Accuracy (%) 
TVRF(3%)  
TSOS(3%)  
TVRF(6%)  
TSOS(6%) 
The GN artificial network [44], proposed by Girvan and Newman, is still one of the most popular benchmarks discussed in the related literature. It provides a basic node set [45] with communities, and the average total degree of each node is fixed to . At the same time, GN provides a flexible network generation mechanism controlled by the number of nodes of each community , the number of communities , the number of internal half-edges per node , the number of external half-edges per node , etc. Studies [46] show that the parameters and determine the detectability of the network communities; a higher value of decreases the detectability of network communities [47]. In this study, we compare our proposed TSOS with TVRF over three different types of GN networks, including the classical GN network and its two variants with different ( and ), for which each community has at least one benchmark node and the fraction of benchmark nodes is set as and .
As the results in Table I show, picking more benchmark nodes results in higher partition accuracy under the same algorithm configuration. For the classical GN network, both algorithms can reach accuracy when only nodes are used as benchmarks. The proposed TSOS can still obtain a completely correct result for the GN network with , and a much higher partition accuracy than TVRF for the difficult case with . This shows the effectiveness of the proposed strategy of incorporating new benchmark nodes into an additional step of network partition refinement.
Algorithms  Accuracy (%)  Accuracy (%)  Accuracy (%)  Accuracy(%)  Accuracy(%) 

TVRF(4%)  
TSOS(4%)  
TVRF(8%)  
TSOS(8%) 
In contrast to the homogeneous GN networks, whose nodes have the same degree and which are therefore not a good proxy of real networks with community structure, the artificial benchmark LFR network, proposed by Lancichinetti, Fortunato and Radicchi [48], has a power-law degree distribution. The LFR benchmark is basically a configuration model with built-in communities [49], which is built by joining stubs at random, once one has established which stubs are internal and which ones are external [45]. The mixing parameter is the ratio between the external degree and the degree of each vertex, i.e. . Obviously, when is low, each community can be better separated from the others. Here, we generate networks including nodes, with the value of ranging from 0.1 to 0.5; the distributions of degree and community size follow respective power laws of and , the average degree is set to and the community sizes , , are set from to .
In the experiments, each community has at least one benchmark node, and and of the nodes are selected as benchmarks for each experiment; thus about and benchmark nodes are picked, which is slightly larger than the total number of communities, i.e. rather small samples. As shown in Tab. II, when increases, our proposed TSOS method still keeps high accuracy and performs much better than the TVRF algorithm, hence it is more robust to an increasing external degree, i.e. a high mixing parameter affects the performance of TSOS less than that of TVRF. On the other hand, choosing more benchmark nodes improves both algorithms’ performance; however, the proposed TSOS improves more significantly.
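LFR benchmark networks of the kind described above can be generated with networkx; the sketch below uses the parameter values from the networkx documentation example rather than the exact configuration of this paper (our paper's node count, exponents and degree settings are partially elided in the text):

```python
import networkx as nx

# LFR benchmark: power-law degree distribution (exponent tau1),
# power-law community-size distribution (exponent tau2), and mixing
# parameter mu controlling the fraction of external edges per node.
G = nx.LFR_benchmark_graph(
    n=250, tau1=3, tau2=1.5, mu=0.1,
    average_degree=5, min_community=20, seed=10,
)
# each node stores its ground-truth community as a node attribute,
# which serves as the reference partition for accuracy evaluation
communities = {frozenset(G.nodes[v]["community"]) for v in G}
```

Increasing `mu` toward 0.5 blurs the planted communities, which is exactly the regime where the robustness comparison of Tab. II is made.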
IV-C. Experiments on Real-World Networks
In this work, five real-world networks are used to validate the proposed TSOS method: three classical networks, namely the Dolphin network [50], the Football network [13] and the Political books network [51], and two data clustering sets, MNIST [34] and COIL [52]. The three classical social networks are widely used in many community detection studies. The COIL-100 (Columbia Object Image Library-100) data set [53] contains many color images of different objects; its related graph network used in this paper includes randomly selected objects (1500 images) from the dataset, and the edge weights of the built NN graph are calculated through the Euclidean distance between two images. The MNIST data set [54] consists of 70000 size-normalized and centered images of handwritten digits ; the images are naturally partitioned into roughly balanced clusters, and a NN graph is constructed from the original MNIST data set with edge weights computed from the Euclidean distance between two images as dim vectors [34]. These networks are considered as undirected and their network parameters are shown in Tab. III.

P  Network  Nodes  Edges  Clusters  Average degree  Clustering coefficient

1  Dolphin  62  159  2  5.129  0.303 
2  Polbooks  105  441  3  8.400  0.488 
3  Football  115  613  12  10.66  0.403 
4  COIL  1500  3750  6  5   
5  MNIST  70000  350000  10  10   
P  Network  Benchmark nodes  TVRF (%)  TSOS (%) 

1  Dolphin  2 ()  
2  Dolphin  6 ()  
3  Polbooks  4 ()  
4  Polbooks  15()  
5  Football  12 ()  
6  COIL  45 (3%)  
7  COIL  150 (10%)  
8  MNIST  70 (0.1%)  
9  MNIST  140 (0.2%) 
Experiment results on the real-world networks are given in Tab. IV. Similar to the other experiments, picking more benchmark nodes clearly improves the network partition accuracy. In addition, the proposed TSOS method performs better in the cases with fewer initial benchmark nodes, while it still obtains a partition accuracy similar to TVRF in the cases with more initial benchmark nodes. Clearly, for a really small ratio of selected benchmark nodes to the total number of network nodes, e.g. MNIST, TSOS achieves much better partition accuracy than TVRF: by TSOS versus by TVRF (with nodes as benchmarks), and by TSOS versus by TVRF (with nodes as benchmarks). This is thanks to the introduced intermediate step of benchmark expansion with a proper confidence criterion.
Particularly, for each network partition, we repeat the experiments times with different initial conditions. In view of Tab. IV and Fig. 6, the results computed through the proposed TSOS often have less variance while keeping a higher accuracy, hence better robustness in numerics. For example, when clustering the MNIST data graph with only nodes as benchmarks, the variance of the experiment results by TSOS is only , which is much less than the variance of the results by TVRF. Detailed performance for each experiment setting can be found in Fig. 6. Moreover, the total numbers of benchmark nodes after expansion are much larger than the initial numbers, as the blue bars vs. yellow bars in Fig. 6 show. Such computational robustness is often the seminal factor in partitioning large-scale networks, especially when only a small portion of the nodes is available as benchmarks.

V. Conclusions and Future Studies
We introduce a novel two-stage optimization strategy for partitioning network communities, which makes use of inherent network structure information, i.e. the new network centrality measure of both links and vertices, to construct the key affinity description of the given network in cases where the direct similarities between graph nodes or nodal features required for the classical affinity matrix are not available. The calculated network centrality information presents an essential measure for detecting network communities, and also a ‘confidence’ criterion for developing new benchmark nodes. We also develop an efficient convex optimization algorithm, under the new variational perspective of primal and dual, to tackle the challenging combinatorial optimization problem of network partitioning. Experiment results demonstrate that the proposed optimization approach largely improves the accuracy of clustering communities in various networks.
It is obvious that obtaining a reasonable affinity matrix is the key factor for most graph or network partition algorithms. One way to improve the effectiveness of the affinity matrix is to take into account pairs of nodes that are not directly connected, for example, the affinity matrix built through the principle of three degrees of influence [55]:
where , or the more generalized affinity matrix given as:
where is the attenuation constant which should be less than for convergence.
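A numerical sketch of such a generalized affinity as a truncated walk series (the symbol `beta` for the attenuation constant and the truncation order are our own; the convergence bound stated in the text applies to the infinite series):

```python
import numpy as np

def influence_affinity(W, beta, order=3):
    """Truncated generalized affinity W + beta*W^2 + beta^2*W^3 + ...,
    summing walks of up to `order` hops so that node pairs that are not
    directly connected also receive a positive affinity."""
    A = np.zeros_like(W)
    P = np.eye(W.shape[0])
    for k in range(order):
        P = P @ W                  # P is now W^(k+1)
        A += (beta ** k) * P
    return A
```

With `order=3` this realizes the three-degrees-of-influence idea, while letting `order` grow recovers the full attenuated series, provided the attenuation constant is small enough for that series to converge.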
The computation of each , and , for any , and , in the introduced ALM-based dual optimization algorithm (Alg. 2) can be implemented edge-wise and node-wise at the same time, which forms the basis for re-implementing the algorithmic steps on modern parallel computing platforms like GPUs or HPC clusters, so as to significantly improve the numerical efficiency and handle super-large-scale network partition problems.
Acknowledgement
This work was supported by the National Natural Science Foundation of China (Grant Nos. 61877046, 61877047 and 11801200), Shaanxi Provincial Natural Science Foundation of China (Grant No.2017JM1001), the Fundamental Research Funds for the Central Universities, and the Innovation Fund of Xidian University.
Appendix
A. Equivalent Convex Optimization Models
By simple convex analysis, we can equivalently express the absolute function as , subject to . In this sense, we have the following equivalent expression for each absolute-function term of (12):
(20) 
We can also reformulate the energy term of (12) along with the constraint such that
(21) 
It is clear that for any , the maximum of reaches infinity when tends to ; for any , its maximum reaches when .
In addition, the linear equality constraint of (11) can be identically rewritten as
(22) 
and each variable is free.
Observing the facts (20), (21) and (22), it is easy to prove that the node-wise simplex-constrained convex optimization problem (12) is mathematically equivalent to the following minimax formulation
(23) 
where the divergence operator is given in (3). In this work, we call the above optimization problem the equivalent primal-dual model.
Minimizing the primal-dual formulation (23) over all , we can easily obtain the following maximization problem
(24) 
Clearly, the optimization formulation (24) is also equivalent to the convex optimization problem (12), which is named as the equivalent dual model in this paper. We actually focus on the optimum , and , to the optimization problem (12), which are the optimal multipliers to the linear equality constraints
(25) 
in the sense of optimizing its identical dual model (24).
B Detailed Augmented Lagrangian Method Based Algorithm to (12)
Details of the proposed augmented Lagrangian method-based algorithm for the linear equality constrained convex optimization problem (12) are listed in Alg. 2.
References
 [1] G. Chen, “Network science research: some recent progress in China and beyond,” National Science Review, vol. 1, no. 3, p. 334, 2014.
 [2] D. J. Watts and S. H. Strogatz, “Collective dynamics of ‘small-world’ networks,” Nature, vol. 393, pp. 440–442, 1998.
 [3] A.-L. Barabási and R. Albert, “Emergence of scaling in random networks,” Science, vol. 286, no. 5439, pp. 509–512, 1999.
 [4] M. E. Newman, Networks: an introduction. Oxford University Press, Oxford, 2010.
 [5] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D.-U. Hwang, “Complex networks: Structure and dynamics,” Physics Reports, vol. 424, no. 4-5, pp. 175–308, 2006.
 [6] M. Mitchell, “Complex systems: Network thinking,” Artificial Intelligence, vol. 170, no. 18, pp. 1194–1212, 2006.
 [7] J. Goldenberg, B. Libai, and E. Muller, “Talk of the network: A complex systems look at the underlying process of word-of-mouth,” Marketing Letters, vol. 12, no. 3, pp. 211–223, 2001.
 [8] G. Palla, I. Derényi, I. Farkas, and T. Vicsek, “Uncovering the overlapping community structure of complex networks in nature and society,” Nature, vol. 435, no. 7043, p. 814, 2005.
 [9] S. Fortunato, “Community detection in graphs,” Physics Reports, vol. 486, no. 3-5, pp. 75–174, 2010.
 [10] L. Yang, X. Cao, D. Jin, X. Wang, and D. Meng, “A unified semi-supervised community detection framework using latent space graph regularization,” IEEE Transactions on Cybernetics, vol. 45, no. 11, pp. 2585–2598, 2015.
 [11] M. Rosvall and C. T. Bergstrom, “An information-theoretic framework for resolving community structure in complex networks,” Proceedings of the National Academy of Sciences, vol. 104, no. 18, pp. 7327–7331, 2007.
 [12] M. E. Newman, “Detecting community structure in networks,” The European Physical Journal B, vol. 38, no. 2, pp. 321–330, 2004.
 [13] M. Girvan and M. E. Newman, “Community structure in social and biological networks,” Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 12, pp. 7821–7826, 2002.
 [14] R. Guimera, M. SalesPardo, and L. A. N. Amaral, “Modularity from fluctuations in random graphs and complex networks,” Physical Review E, vol. 70, no. 2, p. 025101, 2004.
 [15] J. Duch and A. Arenas, “Community detection in complex networks using extremal optimization,” Physical Review E, vol. 72, no. 2, p. 027104, 2005.

 [16] C. Liu, J. Liu, and Z. Jiang, “A multi-objective evolutionary algorithm based on similarity for community detection from signed social networks,” IEEE Transactions on Cybernetics, vol. 44, no. 12, pp. 2274–2287, 2014.
 [17] L. Ma, M. Gong, J. Liu, Q. Cai, and L. Jiao, “Multi-level learning based memetic algorithm for community detection,” Applied Soft Computing, vol. 19, pp. 121–133, 2014.
 [18] Z. Bu, H. J. Li, J. Cao, Z. Wang, and G. Gao, “Dynamic cluster formation game for attributed graph clustering,” IEEE Transactions on Cybernetics, vol. PP, no. 99, pp. 1–14, 2017.

 [19] H. T. Ali and R. Couillet, “Improved spectral community detection in large heterogeneous networks,” The Journal of Machine Learning Research, vol. 18, no. 1, pp. 8344–8392, 2017.
 [20] X. Zhu, Z. Ghahramani, and J. Lafferty, “Semi-supervised learning using Gaussian fields and harmonic functions,” in ICML, 2003, pp. 912–919.
 [21] A. Azran, “The rendezvous algorithm: Multiclass semi-supervised learning with Markov random walks,” in ICML, 2007.
 [22] F. Chung, “Spectral graph theory,” CBMS regional conference series in mathematics, no. 92, 1996.
 [23] U. von Luxburg, A tutorial on spectral clustering. Kluwer Academic Publishers, 2007.
 [24] L. Hagen and A. Kahng, “Fast spectral methods for ratio cut partitioning and clustering,” in IEEE International Conference on Computer-Aided Design (ICCAD-91), Digest of Technical Papers, 1991, pp. 10–13.
 [25] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell, vol. 22, no. 8, pp. 888–905, 2000.
 [26] T. Bühler and M. Hein, “Spectral clustering based on the graph p-Laplacian,” in Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 2009, pp. 81–88.
 [27] M. Hein and S. Setzer, “Beyond spectral clustering: tight relaxations of balanced graph cuts,” in Advances in Neural Information Processing Systems, 2011, pp. 2366–2374.
 [28] A. Blum and S. Chawla, “Learning from labeled and unlabeled data using graph min-cuts,” in ICML, 2001.
 [29] Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” PAMI, vol. 23, pp. 1222 – 1239, 2001.
 [30] T. F. Chan and S. Esedoḡlu, “Aspects of total variation regularized function approximation,” SIAM J. Appl. Math., vol. 65, no. 5, pp. 1817–1837, 2005.
 [31] X. Bresson, X.-C. Tai, T. F. Chan, and A. Szlam, “Multi-class transductive learning based on relaxations of Cheeger cut and Mumford-Shah-Potts model,” Journal of Mathematical Imaging and Vision, vol. 49, no. 1, pp. 191–201, 2014.
 [32] J. Yuan, E. Bae, and X.-C. Tai, “A study on continuous max-flow and min-cut approaches,” in Computer Vision and Pattern Recognition, 2010, pp. 2217–2224.
 [33] J. Yuan, E. Bae, X.-C. Tai, and Y. Boykov, “A continuous max-flow approach to Potts model,” in European Conference on Computer Vision. Springer, 2010, pp. 379–392.
 [34] K. Yin and X.C. Tai, “An effective region force for some variational models for learning and clustering,” Journal of Scientific Computing, vol. 74, no. 1, pp. 175–196, 2018.
 [35] D. Zhou and B. Schölkopf, “Regularization on discrete spaces,” in Joint Pattern Recognition Symposium. Springer, 2005, pp. 361–368.
 [36] X. Zhu and A. B. Goldberg, Introduction to SemiSupervised Learning, ser. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2009.
 [37] D. P. Bertsekas, Nonlinear Programming. Athena Scientific, September 1999.
 [38] Y. Bai, S. Liu, and Z. Zhang, “Effective hybrid linkadding strategy to enhance network transport efficiency for scalefree networks,” International Journal of Modern Physics C, vol. 28, no. 08, p. 1750107, 2017.
 [39] R. Dunn, F. Dudbridge, and C. M. Sanderson, “The use of edge-betweenness clustering to investigate biological function in protein interaction networks,” BMC Bioinformatics, vol. 6, no. 1, p. 39, 2005.
 [40] B. Bollobas, Modern Graph Theory, ser. Graduate Texts in Mathematics. Springer, 1998, vol. 184.
 [41] Y. Yang, T. Nishikawa, and A. E. Motter, “Small vulnerable sets determine large network cascades in power grids,” Science, vol. 358, no. 6365, p. eaan3184, 2017.
 [42] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes, “K-core organization of complex networks,” Physical Review Letters, vol. 96, no. 4, p. 040601, 2006.
 [43] S. Ghahramani, Fundamentals of Probability. Prentice Hall, 2000.
 [44] M. E. Newman and M. Girvan, “Finding and evaluating community structure in networks,” Physical Review E, vol. 69, no. 2, p. 026113, 2004.
 [45] S. Fortunato and D. Hric, “Community detection in networks: A user guide,” Physics Reports, vol. 659, pp. 1–44, 2016.
 [46] R. Guimera and L. A. N. Amaral, “Functional cartography of complex metabolic networks,” Nature, vol. 433, no. 7028, p. 895, 2005.
 [47] A. E. Krause, K. A. Frank, D. M. Mason, R. E. Ulanowicz, and W. W. Taylor, “Compartments revealed in foodweb structure,” Nature, vol. 426, no. 6964, p. 282, 2003.
 [48] A. Lancichinetti, S. Fortunato, and F. Radicchi, “Benchmark graphs for testing community detection algorithms,” Physical Review E, vol. 78, no. 4, p. 046110, 2008.
 [49] B. Bollobás, Extremal graph theory. Courier Corporation, 2004.
 [50] D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson, “The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations,” Behavioral Ecology & Sociobiology, vol. 54, no. 4, pp. 396–405, 2003.
 [51] M. E. Newman, “Modularity and community structure in networks,” in APS March Meeting, 2006, pp. 8577–8582.

 [52] O. Chapelle, B. Schölkopf, and A. Zien, Eds., “Semi-supervised learning (Chapelle, O. et al., Eds.; 2006) [Book reviews],” IEEE Transactions on Neural Networks, vol. 20, no. 3, pp. 542–542, 2009.
 [53] S. Nayar, “Columbia object image library (COIL-100),” 1996.
 [54] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
 [55] J. H. Fowler and N. A. Christakis, “Dynamic spread of happiness in a large social network: Longitudinal analysis of the Framingham Heart Study social network,” BMJ: British Medical Journal, vol. 338, no. 7685, pp. 23–27, 2009.