I Introduction
Community detection methods hold a central place in machine learning, with an extensive range of applications related to sociological behavior, protein interactions, image segmentation, and gene expression analysis [1]. In most of these applications, the actual classes of the nodes in the network are unknown, but pairwise relations between nodes are exploited to identify communities.
Fitting the parameters of a stochastic block model (SBM) [2, 3] to a given graph is a prominent way of searching for communities. The canonical SBM assumes that each node belongs to one block (representing a community) and that the expected number of edges between two nodes depends only on the blocks to which they belong. Thus, the model only assumes that nodes within each block are statistically equivalent in their connectivity patterns. Several variations of the standard SBM were also introduced to overcome some of its limitations. The degree-corrected SBM (DCSBM) introduced by Karrer and Newman [4], in particular, allows nonuniform node degree distributions, making block modeling more representative of real-world networks.
Broadly speaking, a solution for community detection (represented as a partition of the node set into communities) is assortative when connections within communities are more frequent than between communities, disassortative when connections within communities are less frequent than between communities, and nonassortative if no such relation holds across all communities. SBM-based community detection approaches are agnostic to the assortativity of their solutions. They can search for solutions with a predefined number of communities, and can model assortative and disassortative structures alike. This modeling capability can be viewed as an asset but also as a weakness. Indeed, SBMs are often used in contexts in which users expect assortative solutions. In the worst cases, nonassortative solutions might go unnoticed and lead to mistakes of interpretation. In other cases, nonassortative solutions with a better likelihood may replace the assortative solutions which were originally sought (Figure 1). This latter situation is especially prevalent in case studies involving sparse graphs, or with lightly assortative structures which challenge detection algorithms.
In this work, we propose a variant of the DCSBM which includes user knowledge about assortativity. We incorporate this information by setting assortativity constraints on the DCSBM parameter set. Indeed, if the user expects an assortative solution due to the characteristics of the application case, it is plausible to guide the convergence of the model via additional constraints. We show that the resulting constrained likelihood-maximization model can be solved efficiently with an iterative method based on local-search and interior-point algorithms. Our computational experiments show that the assortativity constraints prevent the search from converging towards spurious nonassortative local minima, especially for sparse networks, and that they help identify different solution structures in application cases related to the analysis of the brain cortex. The key contributions of this work are therefore the following.

We introduce a DCSBM variant which incorporates assortativity constraints to represent prior user knowledge;

We propose an efficient solution approach based on local optimization and interior-point algorithms for this model;

Through extensive computational experiments, we discuss the practical implications of this constrained model and identify the regimes in which it helps improve community detection practice.
II Related Works
SBMs are commonly used to extract meaningful information from complex networks. The classical SBM is also a natural modeling choice for community detection [1] and a generalization of modularity maximization [5]. The surveys of Abbe [1] and Lee and Wilkinson [6] discuss key results regarding recovery requirements and solution algorithms. Different types of algorithms can be used to fit SBMs, based on Markov Chain Monte Carlo (MCMC) approaches [7, 3, 8], variational inference [9, 10], belief propagation [11], spectral clustering [12, 13, 14], and semidefinite programming [15, 16], among others.
To date, few works have considered the possibility of incorporating prior information on assortativity. Moore et al. [17] studied an SBM in which the edge probabilities within and between communities follow a Beta prior. The hyperparameters defining the Beta distributions drive the degree of assortativity in the graph. Yet, according to their experiments, these priors dominate only in small or sparse data sets; otherwise, they tend to wash out.
The Assortative Mixed Membership SBM (aMMSB) introduced by Gopalan et al. [18] considers soft node-to-community assignments and includes a latent parameter describing community strength, representing how tightly nodes are connected within each group. Edges are assumed to be drawn from a Bernoulli distribution centered around the community strength if the nodes belong to the same group. Otherwise, the distribution is centered around a small value. A variational inference approach is used to fit the model.
Li et al. [19] pursued this research line by proposing a scalable MCMC method using a stochastic gradient algorithm for posterior inference in the aMMSB.
Lu and Szymanski [20] finally proposed a regularized variant of the DCSBM, using a prior to regularize the observed in-degree ratio of each node. In practice, this adaptation turns out to penalize high-degree nodes with many connections to other communities. The new parameter is adjusted to control the assortativity level, and an MCMC algorithm is used to infer the block assignments.
The aforementioned models aim to better fit assortative networks, but they either depend on ad hoc parameters which are difficult to scale [18, 20], or have a limited effect on larger graphs [17]. In light of these works, we decided to explore a different approach, which consists in guiding the search towards assortative structures via constraints on the SBM parameters. To fit our model, we propose effective algorithms for the resulting constrained maximum-likelihood optimization problem.
III Background and Notations
III-A Degree-Corrected Stochastic Block Model (DCSBM)
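In its most fundamental form, the DCSBM considers $n$ nodes allocated to $K$ groups. We assume that the number of edges between a pair of nodes depends only on the groups to which the nodes belong and on their degrees [5]. Finding the latent membership of nodes corresponds to finding the block-model parameters that best fit the observed graph [1]. For an observed adjacency matrix $A$ representing a graph with (possibly weighted) edges, the log-likelihood function of the DCSBM is calculated, up to additive constants, as [4]:
$$\mathcal{L}(\Omega, Z) = \frac{1}{2} \sum_{i,j} \sum_{r,s} \left[ A_{ij} \log \omega_{rs} - \frac{k_i k_j}{2m}\, \omega_{rs} \right] z_{ir} z_{js} \tag{1}$$
In this equation, $k_i$ is the degree of node $i$, and $m$ is the total number of edges. Variables $z_{ir}$ represent the binary community assignments, in such a way that $z_{ir} = 1$ indicates that node $i$ is assigned to group $r$. $\Omega$ is a symmetric $K \times K$ edge probability matrix. Each element $\omega_{rs}$ of $\Omega$ corresponds to the expected number of edges between any two nodes in groups $r$ and $s$. The expected number of edges between nodes $i$ and $j$ is $\frac{k_i k_j}{2m}\, \omega_{rs}$, for $z_{ir} = 1$ and $z_{js} = 1$. If we fix the assignment $Z$, then the (unconstrained) maximum-likelihood estimate of each parameter $\omega_{rs}$ can be obtained by differentiation:
$$\hat{\omega}_{rs} = \frac{2m\, m_{rs}}{\kappa_r \kappa_s} \tag{2}$$
where $m_{rs} = \sum_{i,j} A_{ij} z_{ir} z_{js}$ is the number of edges between groups $r$ and $s$, and $\kappa_r = \sum_{i} k_i z_{ir}$ is the sum of the degrees of nodes in group $r$. If we substitute in Equation (1), we obtain the following log-likelihood function:
$$\mathcal{L}(Z) = \frac{1}{2} \sum_{r,s} m_{rs} \log \frac{m_{rs}}{\kappa_r \kappa_s} \tag{3}$$
in which we dropped the terms that do not involve $Z$.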
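As an illustration, the block statistics, the unconstrained estimate of Equation (2), and the profile log-likelihood of Equation (3) can be computed in a few lines. The sketch below assumes the standard Karrer and Newman forms, namely an estimate equal to 2m·m_rs/(κ_r κ_s) and a profile log-likelihood equal to ½ Σ m_rs log(m_rs/(κ_r κ_s)); the function names and the toy graph are hypothetical and not taken from the original work.

```python
import math


def block_statistics(adj, labels, num_groups):
    """Return (m, kappa) for a symmetric adjacency matrix and a node labelling.

    m[r][s] sums A_ij over i in group r and j in group s, so each
    within-group edge is counted twice on the diagonal.
    kappa[r] is the sum of the degrees of the nodes in group r.
    """
    m = [[0.0] * num_groups for _ in range(num_groups)]
    for i, row in enumerate(adj):
        for j, a_ij in enumerate(row):
            m[labels[i]][labels[j]] += a_ij
    kappa = [sum(m_row) for m_row in m]
    return m, kappa


def omega_mle(m, kappa):
    """Unconstrained estimate 2m * m_rs / (kappa_r * kappa_s); empty groups get 0."""
    two_m = sum(kappa)
    size = len(kappa)
    return [[two_m * m[r][s] / (kappa[r] * kappa[s]) if kappa[r] and kappa[s] else 0.0
             for s in range(size)] for r in range(size)]


def profile_log_likelihood(m, kappa):
    """Profile log-likelihood, with the convention 0 * log(0) = 0."""
    total = 0.0
    for r, m_row in enumerate(m):
        for s, m_rs in enumerate(m_row):
            if m_rs > 0:
                total += m_rs * math.log(m_rs / (kappa[r] * kappa[s]))
    return 0.5 * total


# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1

m_good, kappa_good = block_statistics(adj, [0, 0, 0, 1, 1, 1], 2)
m_bad, kappa_bad = block_statistics(adj, [0, 1, 0, 1, 0, 1], 2)
print(omega_mle(m_good, kappa_good))
print(profile_log_likelihood(m_good, kappa_good),
      profile_log_likelihood(m_bad, kappa_bad))
```

On this toy graph, the partition that separates the two triangles obtains a higher profile log-likelihood than the alternating partition, as one would expect.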
III-B Planted Partition Model and Modularity
The Planted Partition Model (PPM) is a special case of the standard SBM with only two parameters describing the blocks: $\omega_{rs} = \omega_{\text{in}}$ if $r = s$, and $\omega_{rs} = \omega_{\text{out}}$ if $r \neq s$. Newman [5] shows that maximizing the likelihood of the PPM is equivalent to maximizing modularity. Modularity optimization maximizes the difference between the number of intra-community edges in the observed graph and their expected number in a random graph in which edges are rewired at random while the degree of each node is preserved. As a consequence, it tends to maximize the number of edges within groups, leading to assortative solutions. However, modularity maximization is also subject to strong limitations: beyond its inability to define the number of communities, the model assumes that all communities have similar statistical properties [5]. This is a major issue when the distribution of edges between the blocks varies significantly.
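For reference, the modularity of a given partition can be evaluated directly from the adjacency matrix. The sketch below uses the usual Newman definition, Q = Σ_r [m_rr/(2m) − (κ_r/(2m))²], where m_rr counts each intra-community edge twice and κ_r is the total degree of community r; the function name and the toy graph are hypothetical.

```python
def modularity(adj, labels):
    """Newman modularity of a partition of an undirected graph."""
    two_m = sum(sum(row) for row in adj)
    degree = [sum(row) for row in adj]
    q = 0.0
    for group in set(labels):
        members = [i for i, label in enumerate(labels) if label == group]
        # Sum over ordered pairs, so each internal edge is counted twice.
        within = sum(adj[i][j] for i in members for j in members)
        kappa = sum(degree[i] for i in members)
        q += within / two_m - (kappa / two_m) ** 2
    return q


# Hypothetical toy graph: two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1

print(modularity(adj, [0, 0, 0, 1, 1, 1]))  # two triangles as communities
print(modularity(adj, [0, 0, 0, 0, 0, 0]))  # a single community
```

Grouping all nodes into one community always gives a modularity of zero, which is why the measure rewards partitions with more internal edges than expected under the degree-preserving null model.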
IV Assortative-Constrained SBM
We now introduce the assortative-constrained degree-corrected SBM (ACDCSBM) along with an efficient algorithm to fit it by maximum likelihood. Following Amini et al. [21], two main notions of assortativity can be distinguished for block models:
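Strong assortativity. All diagonal terms of $\Omega$ are greater than or equal to all off-diagonal terms:
$$\min_{r} \omega_{rr} \geq \max_{r \neq s} \omega_{rs} \tag{4}$$
Weak assortativity. Each diagonal term of $\Omega$ is greater than or equal to the other terms in its row:
$$\omega_{rr} \geq \max_{s \neq r} \omega_{rs} \quad \forall r \in \{1, \dots, K\} \tag{5}$$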
Other types of assortativity constraints may be considered with simple adaptations of our algorithm, e.g., imposing a lower bound on the number of blocks satisfying Condition (5). In this study, we will use the strongest definition of assortativity based on Condition (4). With these constraints, the loglikelihood maximization model becomes:
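\begin{align}
\max_{\Omega, Z, \lambda} \quad & \mathcal{L}(\Omega, Z) \tag{6a}\\
\text{s.t.} \quad & \omega_{rr} \geq \lambda && \forall r \tag{6b}\\
& \omega_{rs} \leq \lambda && \forall r \neq s \tag{6c}\\
& \sum_{r} z_{ir} = 1, \quad z_{ir} \in \{0, 1\} && \forall i, r \tag{6d}
\end{align}
where $\mathcal{L}$ is the log-likelihood of Equation (1) and $\lambda$ represents a continuous variable acting as a threshold.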
It is important to note that the assortativity constraints only apply to the block-model parameters $\Omega$. This does not completely eliminate the possibility of a disassortative partition as represented by $Z$, but strongly penalizes its log-likelihood in comparison to assortative solutions.
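Conditions (4) and (5) are straightforward to verify on any fitted block matrix. The following sketch checks both on a matrix given as a list of rows; the function names and the example matrix are hypothetical.

```python
def is_strongly_assortative(omega):
    """Condition (4): the smallest diagonal term dominates every off-diagonal term."""
    size = len(omega)
    min_diag = min(omega[r][r] for r in range(size))
    max_off = max((omega[r][s] for r in range(size) for s in range(size) if r != s),
                  default=float("-inf"))
    return min_diag >= max_off


def is_weakly_assortative(omega):
    """Condition (5): each diagonal term dominates the other terms of its row."""
    size = len(omega)
    return all(omega[r][r] >= omega[r][s]
               for r in range(size) for s in range(size) if r != s)


# Hypothetical matrix that is weakly but not strongly assortative:
# the off-diagonal value 3 exceeds the diagonal value 2 of another row.
omega = [[5.0, 3.0, 1.0],
         [3.0, 4.0, 1.0],
         [1.0, 1.0, 2.0]]
print(is_weakly_assortative(omega), is_strongly_assortative(omega))
```

Any strongly assortative matrix is also weakly assortative, so the two checks can only disagree in the direction shown by this example.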
IV-A Likelihood Maximization
We introduce an iterative algorithm to solve (6a–6d). This algorithm starts with a random initial solution and proceeds by iteratively evaluating each possible relocation of a node to a different community. A relocation is applied only if, combined with an optimal update of $\Omega$, it improves the likelihood. As such, the evaluation of each relocation may require the solution of a small constrained convex optimization subproblem with variables and constraints to find an optimal $\Omega$ for the new partition. For the classical DCSBM, the optimal $\Omega$ is simply obtained via Equation (2). This is, however, no longer true for the ACDCSBM due to the assortativity constraints. As described in Algorithm 1, the overhead associated with this operation can be mitigated by combining two techniques:

an incremental move evaluation approach, using the log-likelihood of the unconstrained subproblem (Lines 9–10) to filter relocation candidates (Line 11), and possibly keeping this solution if it naturally satisfies the assortativity constraints (Lines 12–13);
We use the interior point algorithm of Domahidi et al. [22] for the solution of each subproblem. When the partition is fixed, the constrained maximization subproblem takes the following form:
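\begin{align}
\max_{\Omega, \lambda} \quad & \frac{1}{2} \sum_{r,s} \left[ m_{rs} \log \omega_{rs} - \frac{\kappa_r \kappa_s}{2m}\, \omega_{rs} \right] \tag{7a}\\
\text{s.t.} \quad & \omega_{rr} \geq \lambda && \forall r \tag{7b}\\
& \omega_{rs} \leq \lambda && \forall r \neq s \tag{7c}\\
& \omega_{rs} \geq 0 && \forall r, s \tag{7d}
\end{align}
where $m_{rs}$ represents the number of edges between communities $r$ and $s$ according to the fixed partition, $m$ is the total number of edges, and $\kappa_r = \sum_{s} m_{rs}$.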
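To make the structure of this subproblem concrete, the sketch below solves it on a toy instance without an interior-point solver. It is a swapped-in technique, not the ECOS-based method used in this work. It assumes the objective ½ Σ_rs [m_rs log ω_rs − (κ_r κ_s / 2m) ω_rs] with diagonal terms bounded below and off-diagonal terms bounded above by a common threshold. For a fixed threshold, each term is concave in its own ω_rs, so the best value is the unconstrained estimate clipped at the threshold; the resulting function of the threshold is concave and can be maximized by ternary search. All names are hypothetical, and every group is assumed to be non-empty.

```python
import math


def _term(m_rs, c_rs, w):
    """One summand m_rs * log(w) - c_rs * w, taking 0 * log(0) as 0."""
    if m_rs == 0:
        return -c_rs * w
    if w <= 0:
        return float("-inf")
    return m_rs * math.log(w) - c_rs * w


def _clip_to_threshold(free, lam):
    """Best omega for a fixed threshold: diagonal >= lam, off-diagonal <= lam."""
    size = len(free)
    return [[max(free[r][s], lam) if r == s else min(free[r][s], lam)
             for s in range(size)] for r in range(size)]


def fit_constrained_omega(m, kappa, tol=1e-10, max_iter=200):
    """Return (omega, threshold) maximizing the constrained objective."""
    size = len(m)
    two_m = sum(kappa)
    c = [[kappa[r] * kappa[s] / two_m for s in range(size)] for r in range(size)]
    free = [[m[r][s] / c[r][s] for s in range(size)] for r in range(size)]

    def objective(lam):
        omega = _clip_to_threshold(free, lam)
        return 0.5 * sum(_term(m[r][s], c[r][s], omega[r][s])
                         for r in range(size) for s in range(size))

    # The optimal threshold lies between the extreme unconstrained estimates.
    lo = min(min(row) for row in free)
    hi = max(max(row) for row in free)
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        left = lo + (hi - lo) / 3
        right = hi - (hi - lo) / 3
        if objective(left) < objective(right):
            lo = left
        else:
            hi = right
    lam = 0.5 * (lo + hi)
    return _clip_to_threshold(free, lam), lam


# Hypothetical disassortative block counts: more edges between than within groups.
omega, threshold = fit_constrained_omega([[10, 30], [30, 10]], [40, 40])
print(omega, threshold)
```

On this instance, the unconstrained estimates are 0.5 on the diagonal and 1.5 off the diagonal, and the constraints pull all four entries to the common threshold. When the unconstrained estimates already satisfy the constraints, the procedure returns them unchanged, which mirrors the filtering idea of keeping the unconstrained solution when it is naturally assortative.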
V Empirical Studies
We conduct extensive computational experiments on synthetic and real data sets to analyze three aspects of the proposed assortative-constrained DCSBM (ACDCSBM). Firstly, we wish to know under which conditions the assortativity constraints help to converge to desirable partitions. Secondly, we compare the ACDCSBM, the standard DCSBM and the modularity maximization model in terms of community detection performance. Finally, we apply the ACDCSBM to graphs representing brain cortex data, highlighting structures which were not previously detected, and discuss the implications of the different models.
V-A Networks Generated From a PPM
The standard DCSBM usually finds assortative solutions for assortative networks with a sufficient amount of information. However, it can be trapped in spurious nonassortative local minima on sparse or lightly assortative networks. To limit the number of factors, we conduct this first analysis on data sets generated by a simple PPM (Section III-B) with blocks, nodes, an average degree of , and different values of the ratio $\omega_{\text{out}}/\omega_{\text{in}}$ representing different assortativity levels. Our goal is to evaluate in which regimes the assortativity constraints are meaningful. Figure 2 therefore depicts the performance of the standard DCSBM and of the proposed ACDCSBM in terms of normalized mutual information (NMI) [23]. For each data set and model, we report the results of 100 independent runs from different initial solutions. These results are represented as box plots, where the whiskers extend to 1.5 times the interquartile range.
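The NMI between a detected partition and a reference partition can be computed from label counts alone. Several normalizations exist; the sketch below assumes normalization by the arithmetic mean of the two entropies, which may differ from the exact variant used in the experiments. The function name is hypothetical.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_a, labels_b):
    """NMI with arithmetic-mean normalization; invariant to label permutations."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    entropy_a = -sum(c / n * math.log(c / n) for c in count_a.values())
    entropy_b = -sum(c / n * math.log(c / n) for c in count_b.values())
    mutual_info = sum(c / n * math.log(c * n / (count_a[a] * count_b[b]))
                      for (a, b), c in joint.items())

    denominator = 0.5 * (entropy_a + entropy_b)
    # Two trivial single-group partitions are treated as identical.
    return mutual_info / denominator if denominator > 0 else 1.0


print(normalized_mutual_information([0, 0, 1, 1], [1, 1, 0, 0]))  # same partition, relabelled
print(normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # independent partitions
```

A value close to one indicates that the communities were recovered up to a relabelling, whereas a value close to zero indicates that the detected partition carries no information about the reference.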
For the data sets of Figure 2, detectability is known to be possible for values of this ratio smaller than (see [24]). As expected, as the ratio increases beyond 0.4, both models are unable to recover the communities. In contrast, when this ratio drops below 0.4, the performance of both methods improves, highlighting a phase transition towards a regime where partial recovery is possible. As visible in these experiments, the transition of ACDCSBM occurs before that of the standard DCSBM. For example, when , ACDCSBM achieves an average NMI of , compared to for DCSBM. Similarly, when , ACDCSBM achieves near-perfect recovery on a much larger proportion of the runs. As such, it appears that the assortativity constraints are useful to guide likelihood maximization algorithms in challenging data sets located within the phase transition regime.
V-B Networks Generated From SBMs
We now repeat the previous experiment on general SBMs, characterized by a larger number of parameters. To compare the results of the DCSBM and ACDCSBM, we generate synthetic data sets with nodes and blocks. For each data set, the parameters are uniformly sampled in the following intervals:
(8)  
(9) 
Each node is allocated to one of the four blocks with equal probability. Then, for each node pair $(i, j)$, a number of edges is generated from a Poisson distribution centered in $\omega_{rs}$, where $r$ and $s$ represent the blocks of $i$ and $j$.
Figure 3 compares the NMI obtained with the standard DCSBM and the proposed ACDCSBM on these networks. For each network and model, we conduct independent runs from different initial solutions and report the results as box plots. ACDCSBM obtains a better or equal median NMI than DCSBM on 49 out of 50 data sets. DCSBM appears to be very sensitive to low-quality local minima. This behavior is particularly visible on the first six data sets presented in the figure. A pairwise Wilcoxon test comparing the average NMI of both methods over the 50 data sets confirms the statistical significance of this difference of performance (with ).
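The generative procedure used for these synthetic networks (equiprobable block assignment followed by Poisson edge counts) can be sketched as follows. The sketch assumes that the Poisson mean for a node pair is the block-matrix entry of the two nodes' blocks, and it uses Knuth's multiplication method for Poisson sampling, which is adequate for small means. The function names and the example matrix are hypothetical, and the matrix values are not those of the experiments.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_sbm(num_nodes, omega, seed=0):
    """Return (adjacency matrix, block labels) for an undirected multigraph."""
    rng = random.Random(seed)
    num_blocks = len(omega)
    # Each node joins one of the blocks with equal probability.
    blocks = [rng.randrange(num_blocks) for _ in range(num_nodes)]
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        for j in range(i + 1, num_nodes):
            count = sample_poisson(omega[blocks[i]][blocks[j]], rng)
            adj[i][j] = adj[j][i] = count
    return adj, blocks


# Hypothetical assortative block matrix with two blocks.
omega = [[0.8, 0.1],
         [0.1, 0.6]]
adj, blocks = sample_sbm(20, omega, seed=42)
print(blocks)
print(sum(sum(row) for row in adj) // 2, "edges")
```

Fixing the seed makes the sampled network reproducible, which is convenient when several models are compared on the same data set from different initial solutions.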
In a second part of this analysis, we filter the set of solutions produced by the methods to focus on the top in terms of likelihood for each data set. This corresponds to a typical use case in which multiple independent runs are performed to avoid local minima. Figure 4 displays the relative difference between the NMI of the top solutions of the ACDCSBM and those of the standard DCSBM. For the sake of completeness, we repeat the same analysis with the modularity-maximization algorithm. As visible in these results, the best ACDCSBM solutions still outperform those of the two other approaches on most data sets. The statistical significance of these observations is confirmed by pairwise Wilcoxon tests (with and for DCSBM and modularity maximization, respectively).
Figure 5 finally compares the number of assortative communities found by ACDCSBM and DCSBM. The standard DCSBM produces far fewer assortative communities on average (2.43 compared to 3.76). As discussed earlier in this article, ACDCSBM only enforces constraints on the block-model parameters $\Omega$, and therefore does not systematically guarantee assortative partitions. Yet, nonassortative partitions are heavily penalized from a likelihood perspective and therefore generally avoided. Finally, note that modularity maximization always produces assortative solutions, but its equivalence to the PPM (with only two parameters driving the distribution of the edges) limits its ability to fit more general SBMs. Among these alternatives, ACDCSBM appears to find a trade-off between insufficient and excessive expressiveness.
V-C Brain Cortex Networks
Many real-world networks are known to present assortative structures, e.g., in applications to module or community detection in brain cortex networks, protein-protein interaction, and metabolic networks [25, 26, 27, 28]. We analyze in this section the case of the “cat cortex network”, which is known to have an assortative structure and is divided into four main functional areas: visual, auditory, fronto-limbic, and somatosensory-motor [29]. The network is obtained from the corticocortical connectivity pattern described by [30], based on 1139 corticocortical connections and 65 cortical areas. As in most community detection tasks, the ground truth in this network is not available. In fact, there is no unique “correct” partitioning [31], but different algorithms can highlight different underlying structures.
Figure 6 reports the communities found with the standard DCSBM, the ACDCSBM and modularity maximization models on this dataset. For each model, we performed 100 optimization runs and registered the best solution (in terms of likelihood or modularity).
The best solution obtained with the standard DCSBM is visibly nonassortative. The minimum value found along the diagonal is 1.5060, whereas the maximum off-diagonal value is 1.9050. The size of each group is similar, and one disassortative community acts as a “hub” for edges that flow between groups. In contrast, the partition produced by the ACDCSBM satisfies the strong assortativity conditions. The minimum value of the diagonal is 2.0196, and the maximum off-diagonal value is 1.7152. This solution includes communities of different sizes with edges which are more evenly distributed between groups. Two mutually disconnected community pairs are also identified (green-yellow and dark blue-yellow). Finally, the modularity-maximization approach leads to the most assortative partitioning of this network. Yet, since the model does not take the number of groups $K$ into consideration, this partitioning contains only three groups, contrasting with the four functional areas which were originally expected.
VI Conclusions
Assortativity constraints arise as a natural approach to guide maximumlikelihood algorithms away from spurious local minima on networks which have a presupposed assortative structure. In this work, we have shown that these constraints can be effectively handled with tailored local optimization and interior point methods. Our experiments show that the resulting ACDCSBM significantly outperforms unconstrained community detection methods in lightly assortative graphs, especially in regimes which are close to the detectability threshold. In these circumstances, the classic SBM has a strong tendency to converge towards nonassortative solutions, while the modularity maximization model does not generalize well to graphs in which the number of edges between groups widely varies. On the practical example of a brain cortex network, the proposed ACDCSBM reveals drastically different community structures which were not identified by other algorithms.
The research perspectives related to this work are numerous. We recommend further evaluating the impact of assortativity constraints on known phase transitions and thresholds. We also recommend investigating different algorithmic paradigms to improve the solution of this constrained maximum-likelihood formulation, and pursuing the study of the ACDCSBM in a wider range of application contexts.
References
[1] E. Abbe, “Community detection and stochastic block models: Recent developments,” The Journal of Machine Learning Research, vol. 18, no. 1, pp. 6446–6531, 2017.
[2] P. W. Holland, K. B. Laskey, and S. Leinhardt, “Stochastic blockmodels: First steps,” Social Networks, vol. 5, no. 2, pp. 109–137, 1983.
[3] K. Nowicki and T. A. B. Snijders, “Estimation and prediction for stochastic blockstructures,” Journal of the American Statistical Association, vol. 96, no. 455, pp. 1077–1087, 2001.
[4] B. Karrer and M. E. Newman, “Stochastic blockmodels and community structure in networks,” Physical Review E, vol. 83, no. 1, p. 016107, 2011.
[5] M. E. Newman, “Equivalence between modularity optimization and maximum likelihood methods for community detection,” Physical Review E, vol. 94, no. 5, p. 052315, 2016.
[6] C. Lee and D. J. Wilkinson, “A review of stochastic block models and extensions for graph clustering,” Applied Network Science, vol. 4, no. 1, p. 122, 2019.
[7] A. F. McDaid, T. B. Murphy, N. Friel, and N. J. Hurley, “Improved Bayesian inference for the stochastic block model with application to large networks,” Computational Statistics & Data Analysis, vol. 60, pp. 12–31, 2013.
[8] T. P. Peixoto, “Bayesian stochastic blockmodeling,” Advances in Network Clustering and Blockmodeling, pp. 289–332, 2019.
[9] Y. R. Wang, P. J. Bickel et al., “Likelihood-based model selection for stochastic block models,” The Annals of Statistics, vol. 45, no. 2, pp. 500–528, 2017.
[10] E. M. Airoldi, D. M. Blei, S. E. Fienberg, and E. P. Xing, “Mixed membership stochastic blockmodels,” Journal of Machine Learning Research, vol. 9, no. Sep, pp. 1981–2014, 2008.
[11] A. Decelle, F. Krzakala, C. Moore, and L. Zdeborová, “Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications,” Physical Review E, vol. 84, no. 6, p. 066106, 2011.
[12] J. Lei, A. Rinaldo et al., “Consistency of spectral clustering in stochastic block models,” The Annals of Statistics, vol. 43, no. 1, pp. 215–237, 2015.
[13] T. Qin and K. Rohe, “Regularized spectral clustering under the degree-corrected stochastic blockmodel,” in Advances in Neural Information Processing Systems, 2013, pp. 3120–3128.
[14] K. Rohe, S. Chatterjee, B. Yu et al., “Spectral clustering and the high-dimensional stochastic blockmodel,” The Annals of Statistics, vol. 39, no. 4, pp. 1878–1915, 2011.
[15] T. T. Cai, X. Li et al., “Robust and computationally feasible community detection in the presence of arbitrary outlier nodes,” The Annals of Statistics, vol. 43, no. 3, pp. 1027–1059, 2015.
[16] Y. Chen, S. Sanghavi, and H. Xu, “Clustering sparse graphs,” in Advances in Neural Information Processing Systems, 2012, pp. 2204–2212.
[17] C. Moore, X. Yan, Y. Zhu, J.-B. Rouquier, and T. Lane, “Active learning for node classification in assortative and disassortative networks,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2011, pp. 841–849.
[18] P. K. Gopalan, S. Gerrish, M. Freedman, D. M. Blei, and D. M. Mimno, “Scalable inference of overlapping communities,” in Advances in Neural Information Processing Systems, 2012, pp. 2249–2257.
[19] W. Li, S. Ahn, and M. Welling, “Scalable MCMC for mixed membership stochastic blockmodels,” in Artificial Intelligence and Statistics, 2016, pp. 723–731.
[20] X. Lu and B. K. Szymanski, “Regularized stochastic block model for robust community detection in complex networks,” arXiv preprint arXiv:1903.11751, 2019.
[21] A. A. Amini, E. Levina et al., “On semidefinite relaxations for the block model,” The Annals of Statistics, vol. 46, no. 1, pp. 149–179, 2018.
[22] A. Domahidi, E. Chu, and S. Boyd, “ECOS: An SOCP solver for embedded systems,” in 2013 European Control Conference (ECC). IEEE, 2013, pp. 3071–3076.
[23] T. O. Kvalseth, “Entropy and correlation: Some comments,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 17, no. 3, pp. 517–519, 1987.
[24] A. Decelle, F. Krzakala, C. Moore, and L. Zdeborová, “Inference and phase transitions in the detection of modules in sparse networks,” Physical Review Letters, vol. 107, no. 6, p. 065701, 2011.
[25] Z. J. Chen, Y. He, P. Rosa-Neto, J. Germann, and A. C. Evans, “Revealing modular architecture of human brain structural networks by using cortical thickness from MRI,” Cerebral Cortex, vol. 18, no. 10, pp. 2374–2381, 2008.
[26] A. Kreimer, E. Borenstein, U. Gophna, and E. Ruppin, “The evolution of modularity in bacterial metabolic networks,” Proceedings of the National Academy of Sciences, vol. 105, no. 19, pp. 6976–6981, 2008.
[27] E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A.-L. Barabási, “Hierarchical organization of modularity in metabolic networks,” Science, vol. 297, no. 5586, pp. 1551–1555, 2002.
[28] M. Huss and P. Holme, “Currency and commodity metabolites: their identification and relation to the modularity of metabolic networks,” IET Systems Biology, vol. 1, no. 5, pp. 280–285, 2007.
[29] E. L. Lameu, F. S. Borges, R. R. Borges, K. C. Iarosz, I. L. Caldas, A. M. Batista, R. L. Viana, and J. Kurths, “Suppression of phase synchronisation in network based on cat’s brain,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 26, no. 4, p. 043107, 2016.
[30] J. W. Scannell, C. Blakemore, and M. P. Young, “Analysis of connectivity in the cat cerebral cortex,” Journal of Neuroscience, vol. 15, no. 2, pp. 1463–1483, 1995.
[31] L. Peel, D. B. Larremore, and A. Clauset, “The ground truth about metadata and community detection in networks,” Science Advances, vol. 3, no. 5, p. e1602548, 2017.