Community detection methods hold a central place in machine learning, with an extensive range of applications related to sociological behavior, protein interactions, image segmentation, and gene expression analysis. In most of these applications, the actual classes of the nodes in the network are unknown, but pairwise relations between nodes are exploited to identify communities.
Fitting the parameters of a stochastic block model (SBM) [2, 3] to a given graph is a prominent way of searching for communities. The canonical SBM assumes that each node belongs to one block (representing a community) and that the expected number of edges between two nodes depends only on the blocks to which they belong. Thus, the model only assumes that nodes within each block are statistically equivalent in their connectivity patterns. Several variations of the standard SBM have been introduced to overcome some of its limitations. The degree-corrected SBM (DC-SBM) introduced by Karrer and Newman, in particular, allows non-uniform node degree distributions, making block modeling more representative of real-world networks.
Broadly speaking, a solution for community detection (represented as a partition of the node set into communities) is assortative when connections within communities are more frequent than between communities, disassortative when connections within communities are less frequent than between communities, and non-assortative if no such relation holds across all communities. SBM-based community detection approaches are agnostic to the assortativity of their solutions. They search for solutions with a pre-defined number of communities and can model assortative and disassortative structures alike. This modeling capability can be viewed as an asset but also as a weakness. Indeed, SBMs are often used in contexts in which users expect assortative solutions. In the worst cases, non-assortative solutions might go unnoticed and lead to mistakes of interpretation. In other cases, non-assortative solutions with a better likelihood may replace the assortative solutions which were originally sought (Figure 1). This latter situation is especially prevalent in case studies involving sparse graphs, or with lightly assortative structures which challenge detection algorithms.
In this work, we propose a variant of the DC-SBM which includes user knowledge about assortativity. We incorporate this information by setting assortativity constraints on the DC-SBM parameter set. Indeed, if the user expects an assortative solution due to the characteristics of the application case, it is plausible to guide the convergence of the model via additional constraints. We show that the resulting constrained likelihood-maximization model can be solved efficiently with an iterative method based on local search and interior-point algorithms. Our computational experiments show that the assortativity constraints prevent the search from converging towards spurious non-assortative local minima, especially for sparse networks, and that they help identify different solution structures in application cases related to the analysis of the brain cortex. The key contributions of this work are therefore the following.
We introduce a DC-SBM variant which incorporates assortativity constraints to represent prior user knowledge;
We propose an efficient solution approach based on local optimization and interior-point algorithms for this model;
Through extensive computational experiments, we discuss the practical implications of this constrained model and identify the regimes in which it improves community detection practice.
II Related Works
SBMs are commonly used to extract meaningful information from complex networks. The classical SBM is also a natural modeling choice for community detection and a generalization of modularity maximization. The surveys of Abbe and Lee and Wilkinson discuss key results regarding recovery requirements and solution algorithms. Different types of algorithms can be used to fit SBMs, based on Markov Chain Monte Carlo (MCMC) approaches [7, 3, 8], variational inference [9, 10], belief propagation [11], spectral clustering [12, 13, 14], and semidefinite programming [15, 16], among others.
To date, few works have considered the possibility of incorporating prior information on assortativity. Moore et al. studied an SBM in which the edge probabilities within and between communities follow a Beta prior. The hyperparameters defining the Beta distributions drive the degree of assortativity in the graph. Yet, according to their experiments, these priors dominate only in small or sparse data sets; otherwise, they tend to wash out.
The Assortative Mixed Membership SBM (a-MMSB) introduced by Gopalan et al. considers soft node-to-community assignments and includes a latent parameter describing community strength, representing how tightly nodes are connected within each group. Edges are assumed to be drawn from a Bernoulli distribution centered around the community strength if the nodes belong to the same group. Otherwise, the distribution is centered around a small value. A variational inference approach is used to fit the model. Li et al. pursued this research line by proposing a scalable MCMC method using a stochastic gradient algorithm for posterior inference in the a-MMSB.
Finally, Lu and Szymanski proposed a regularized variant of the DC-SBM, using a prior to regularize the observed in-degree ratio of each node. In practice, this adaptation turns out to penalize high-degree nodes with many connections to other communities. The new parameter is adjusted to control the assortativity level, and an MCMC algorithm is used to infer the block assignments.
The aforementioned models aim to better fit assortative networks, but they are either dependent on ad-hoc parameters which are difficult to scale [18, 20], or of limited effect for larger graphs. In light of these works, we decided to explore a different approach, which consists in guiding the search towards assortative structures via constraints on the SBM parameters. To fit our model, we propose effective algorithms for the resulting constrained maximum-likelihood optimization problem.
III Background and Notations
III-A Degree-Corrected Stochastic Block Model (DC-SBM)
In its most fundamental form, the DC-SBM considers nodes allocated to groups. We assume that the number of edges between a pair of nodes depends only on the groups to which the nodes belong and on their degrees . Finding the latent membership of nodes corresponds to finding the block-model parameters that best fit the observed graph . For an observed adjacency matrix representing a graph with (possibly weighted) edges, the log-likelihood function of the DC-SBM is calculated as :
In this equation, is the degree of node , and is the total number of edges. Variables represent the binary community assignments, in such a way that indicates that node is assigned to group . is a symmetric edge probability matrix. Each element of corresponds to the expected number of edges between any two points in groups and . The expected number of edges between nodes and is , for and . If we fix the assignment , then the (unconstrained) maximum-likelihood for each parameter
can be estimated by differentiation:
where is the number of edges between groups and , and is the sum of the degrees of nodes in group . If we substitute in Equation (1), we obtain the following log-likelihood function:
in which we dropped the terms that do not involve .
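To make the quantities of this subsection concrete, the following minimal sketch computes the block statistics, the unconstrained parameter estimate, and the profile log-likelihood for a fixed partition. It assumes the Karrer and Newman parametrization in which the expected number of edges between nodes i and j of groups r and s is k_i k_j ω_rs / (2m), so that the estimate is ω_rs = 2m · m_rs / (κ_r κ_s) and the profile log-likelihood is ½ Σ_rs m_rs log(m_rs / (κ_r κ_s)) up to additive constants. The symbol names, the ½ normalization, and the helper functions `block_stats`, `omega_mle`, and `profile_loglik` are illustrative assumptions, not taken verbatim from the paper.

```python
import numpy as np

def block_stats(A, labels, K):
    """Return (M, kappa): M[r, s] sums A over node pairs in groups (r, s),
    so within-group edges are counted twice; kappa[r] is the total degree of group r."""
    A = np.asarray(A, dtype=float)
    Z = np.eye(K)[np.asarray(labels)]      # one-hot assignment matrix, n x K
    M = Z.T @ A @ Z
    kappa = Z.T @ A.sum(axis=1)
    return M, kappa

def omega_mle(M, kappa):
    """Unconstrained estimate omega_rs = 2m * m_rs / (kappa_r * kappa_s) (assumed form)."""
    two_m = kappa.sum()
    denom = np.outer(kappa, kappa)
    # Groups without any edge endpoint get omega = 0 instead of a division by zero.
    return np.divide(two_m * M, denom, out=np.zeros_like(M), where=denom > 0)

def profile_loglik(M, kappa):
    """0.5 * sum_rs m_rs * log(m_rs / (kappa_r * kappa_s)), with 0 * log(0) taken as 0."""
    denom = np.outer(kappa, kappa)
    mask = M > 0
    return 0.5 * float(np.sum(M[mask] * np.log(M[mask] / denom[mask])))

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

M, kappa = block_stats(A, [0, 0, 0, 1, 1, 1], K=2)
omega = omega_mle(M, kappa)
print(M, kappa, omega, profile_loglik(M, kappa), sep="\n")
```

On this toy graph the diagonal estimates exceed the off-diagonal ones, which is the pattern that the assortativity conditions discussed later formalize.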
III-B Planted Partition Model and Modularity
The Planted Partition Model (PPM) is a special case of the standard SBM with only two parameters describing the blocks: if , and if . Newman shows that maximizing the likelihood of the PPM is equivalent to maximizing modularity. Modularity optimization maximizes the difference between the observed graph and a random graph where edges are reinserted randomly and the degree of each node is preserved. As a consequence, it results in maximizing the number of edges within groups, leading to assortative solutions. However, modularity maximization is also subject to strong limitations: beyond its inability to define the number of communities, the model assumes that all communities have similar statistical properties. This is a major issue when the distribution of edges between the blocks varies significantly.
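As an illustration of the quantity being maximized, the sketch below evaluates the standard modularity Q = (1/2m) Σ_ij (A_ij − k_i k_j / 2m) δ(g_i, g_j) of a partition. The function name `modularity` and the toy graph are assumptions introduced for this example; the sketch only evaluates the score and does not perform the maximization.

```python
import numpy as np

def modularity(A, labels):
    """Modularity of a partition for an undirected graph with adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)                       # node degrees
    two_m = k.sum()                         # twice the number of edges
    same = labels[:, None] == labels[None, :]
    B = A - np.outer(k, k) / two_m          # observed minus expected edges
    return float(B[same].sum() / two_m)

# Toy graph: two triangles joined by a single edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

print(modularity(A, [0, 0, 0, 1, 1, 1]))    # split along the bridge
print(modularity(A, [0, 0, 0, 0, 0, 0]))    # single community
```

The score depends only on within-group edge counts and group degrees, which reflects the two-parameter nature of the PPM discussed above.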
IV Assortative-Constrained SBM
We now introduce the assortative-constrained degree-corrected SBM (AC-DC-SBM) along with an efficient algorithm to fit it by maximum likelihood. Following Amini et al., two main notions of assortativity can be distinguished for block models:
Strong assortativity. All diagonal terms of the edge probability matrix are greater than or equal to all off-diagonal terms:
Weak assortativity. Each diagonal term of the edge probability matrix is greater than or equal to the other terms in its row:
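The two conditions can be checked directly on a parameter matrix, as in the following minimal sketch. The function names and the example matrices are hypothetical and serve only to illustrate the difference between the two notions.

```python
import numpy as np

def is_strongly_assortative(omega):
    """True if the smallest diagonal term is >= the largest off-diagonal term."""
    omega = np.asarray(omega, dtype=float)
    K = omega.shape[0]
    if K < 2:
        return True
    off_diag = omega[~np.eye(K, dtype=bool)]
    return bool(np.diag(omega).min() >= off_diag.max())

def is_weakly_assortative(omega):
    """True if each diagonal term is >= every other term in its row."""
    omega = np.asarray(omega, dtype=float)
    K = omega.shape[0]
    if K < 2:
        return True
    # Mask the diagonal so that the row-wise maximum only sees off-diagonal terms.
    off = np.where(np.eye(K, dtype=bool), -np.inf, omega)
    return bool(np.all(np.diag(omega) >= off.max(axis=1)))

# Hypothetical matrix: weakly but not strongly assortative,
# since the diagonal term 2.0 is smaller than the off-diagonal term 2.5.
omega_weak = [[5.0, 1.0, 2.5],
              [1.0, 2.0, 1.0],
              [2.5, 1.0, 3.0]]
print(is_weakly_assortative(omega_weak), is_strongly_assortative(omega_weak))
```

Strong assortativity implies weak assortativity, but the converse does not hold, as the example shows.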
Other types of assortativity constraints may be considered with simple adaptations of our algorithm, e.g., imposing a lower bound on the number of blocks satisfying Condition (5). In this study, we will use the strongest definition of assortativity based on Condition (4). With these constraints, the log-likelihood maximization model becomes:
where represents a continuous variable acting as a threshold.
It is important to note that the assortativity constraints apply only to the block-model parameters. This does not completely eliminate the possibility of a disassortative partition, as represented by the assignment variables, but strongly penalizes its log-likelihood in comparison to other assortative solutions.
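For reference, one way to write a threshold-based formulation of this kind is sketched below. The symbols are assumptions, since they are not recoverable from the text: ω_rs denotes the block-model parameters, z_ir the binary assignments, λ the threshold, K the number of groups, n the number of nodes, and 𝓛 the DC-SBM log-likelihood. This is a hedged sketch and not necessarily identical to the model (6a–6d).

```latex
\begin{align}
\max_{\Omega,\, Z,\, \lambda} \quad & \mathcal{L}(A \mid \Omega, Z) \\
\text{s.t.} \quad & \omega_{rr} \geq \lambda, && r \in \{1, \dots, K\}, \\
& \omega_{rs} \leq \lambda, && r, s \in \{1, \dots, K\},\ r \neq s, \\
& \sum_{r=1}^{K} z_{ir} = 1, \quad z_{ir} \in \{0, 1\}, && i \in \{1, \dots, n\}.
\end{align}
```

Under these assumptions, a feasible threshold λ exists if and only if the smallest diagonal term is at least as large as the largest off-diagonal term, i.e., $\min_r \omega_{rr} \geq \max_{r \neq s} \omega_{rs}$, which is the strong assortativity condition expressed with linear constraints.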
IV-A Likelihood Maximization
We introduce an iterative algorithm to solve (6a–6d). This algorithm starts with a random initial solution and proceeds by iteratively evaluating each possible relocation of a node to a different community. A relocation is applied only if, combined with an optimal update of the block-model parameters, it results in an improvement of the likelihood. As such, the evaluation of each relocation may require the solution of a small constrained convex optimization subproblem with variables and constraints to find optimal block-model parameters for the new partition. For the classical DC-SBM, the optimal parameters are simply obtained via Equation (2). This is, however, no longer true for the AC-DC-SBM due to the assortativity constraints. As described in Algorithm 1, the overhead associated with this operation can be mitigated by combining two techniques:
an incremental move evaluation approach, using the log-likelihood of the unconstrained subproblem (Lines 9–10) to filter relocation candidates (Line 11), and possibly keeping this solution if it naturally satisfies the assortativity constraints (Lines 12–13);
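The relocation scheme described above can be sketched as follows. This is a simplified illustration and not Algorithm 1 itself: it recomputes the score from scratch for every candidate move instead of using incremental evaluation, it applies the best improving relocation of each node, and its default score is the unconstrained DC-SBM profile log-likelihood under the assumed form ½ Σ_rs m_rs log(m_rs / (κ_r κ_s)). The `score` argument is a hypothetical hook: for the constrained model it would return the constrained optimum, for which the unconstrained value is an upper bound and can therefore act as a filter.

```python
import numpy as np

def profile_loglik(A, labels, K):
    """Unconstrained DC-SBM profile log-likelihood of a partition (assumed form)."""
    Z = np.eye(K)[labels]                   # one-hot assignments, n x K
    M = Z.T @ A @ Z                         # edge counts between groups
    kappa = M.sum(axis=1)                   # total degree of each group
    denom = np.outer(kappa, kappa)
    mask = M > 0                            # 0 * log(0) is taken as 0
    return 0.5 * float(np.sum(M[mask] * np.log(M[mask] / denom[mask])))

def relocation_local_search(A, labels, K, score=profile_loglik, max_sweeps=100):
    """Move single nodes between groups as long as the score strictly improves."""
    A = np.asarray(A, dtype=float)
    labels = np.array(labels, dtype=int)    # work on a copy
    best = score(A, labels, K)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(labels)):
            home = labels[i]
            best_group = home
            for r in range(K):
                if r == home:
                    continue
                labels[i] = r               # tentative relocation
                candidate = score(A, labels, K)
                if candidate > best + 1e-12:
                    best, best_group = candidate, r
            labels[i] = best_group          # keep the best move, or restore
            if best_group != home:
                improved = True
        if not improved:
            break
    return labels, best

# Toy graph: two 4-cliques joined by the edge (3, 4); node 3 starts misplaced.
A = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0

start = [0, 0, 0, 1, 1, 1, 1, 1]
labels, value = relocation_local_search(A, start, K=2)
print(labels, value)
```

Since every evaluation rebuilds the block statistics, this version is only practical for small graphs; updating the statistics incrementally after each move is what makes the approach efficient.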
We use the interior point algorithm of Domahidi et al.  for the solution of each subproblem. When the partition is fixed, the constrained maximization subproblem takes the following form:
where represents the number of edges between communities and according to the fixed partition and .
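For small instances, the fixed-partition subproblem can also be explored without a conic solver. The sketch below is not the interior-point approach used in this work: it swaps in a one-dimensional golden-section search over the threshold, relying on the observation that, once the threshold λ is fixed, each parameter is obtained by clipping its unconstrained estimate (diagonal terms from below, off-diagonal terms from above), and that the resulting value is concave in λ. It assumes the objective ½ Σ_rs [m_rs log ω_rs − κ_r κ_s ω_rs / (2m)], strong assortativity constraints, at least two groups, and groups with positive total degree; all names are illustrative.

```python
import numpy as np

def _loglik(omega, M, C):
    """0.5 * sum_rs [m_rs * log(omega_rs) - c_rs * omega_rs], skipping m_rs = 0 in the log."""
    mask = M > 0
    return 0.5 * (np.sum(M[mask] * np.log(omega[mask])) - np.sum(C * omega))

def _golden_max(g, lo, hi, tol=1e-10, max_iter=200):
    """Golden-section search for the maximizer of a concave function on [lo, hi]."""
    invphi = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    gc, gd = g(c), g(d)
    for _ in range(max_iter):
        if b - a <= tol:
            break
        if gc >= gd:                        # maximizer lies in [a, d]
            b, d, gd = d, c, gc
            c = b - invphi * (b - a)
            gc = g(c)
        else:                               # maximizer lies in [c, b]
            a, c, gc = c, d, gd
            d = a + invphi * (b - a)
            gd = g(d)
    return 0.5 * (a + b)

def fit_omega_strong(M, kappa):
    """Return (omega, lam) maximizing the assumed objective under strong assortativity."""
    M = np.asarray(M, dtype=float)
    kappa = np.asarray(kappa, dtype=float)
    K = M.shape[0]
    C = np.outer(kappa, kappa) / kappa.sum()    # kappa_r * kappa_s / (2m)
    mle = M / C                                 # unconstrained estimate
    diag = np.eye(K, dtype=bool)
    lo_diag, hi_off = mle[diag].min(), mle[~diag].max()
    if lo_diag >= hi_off:                       # constraints already satisfied
        return mle, 0.5 * (lo_diag + hi_off)

    def project(lam):
        return np.where(diag, np.maximum(mle, lam), np.minimum(mle, lam))

    lam = _golden_max(lambda t: _loglik(project(t), M, C), max(lo_diag, 1e-12), hi_off)
    return project(lam), lam

# Hypothetical disassortative block counts for two groups.
M = np.array([[2.0, 6.0], [6.0, 4.0]])
omega, lam = fit_omega_strong(M, M.sum(axis=1))
print(omega, lam)
```

In this example all constraints are active at the optimum, so every parameter collapses to the threshold. A dedicated convex solver remains preferable when the subproblem must be solved many times or when other constraint types are used.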
V Empirical Studies
We conduct extensive computational experiments on synthetic and real data sets to analyze three aspects of the proposed assortative-constrained DC-SBM (AC-DC-SBM). Firstly, we wish to know under which conditions the assortativity constraints help to converge to desirable partitions. Secondly, we compare the AC-DC-SBM, the standard DC-SBM and the modularity maximization model in terms of community detection performance. Finally, we apply the AC-DC-SBM to graphs representing brain cortex data, highlight structures which were not previously detected, and discuss the implications of the different models.
V-A Networks Generated From a PPM
The standard DC-SBM usually finds assortative solutions for assortative networks with a sufficient amount of information. However, it can be trapped in spurious non-assortative local minima on sparse or lightly assortative networks. To limit the number of factors, we conduct this first analysis on data sets generated by a simple PPM (Section III-B) with blocks, nodes, an average degree of , and different ratio values for representing different assortativity levels. Our goal is to evaluate in which regimes the assortativity constraints are meaningful. Figure 2 therefore depicts the performance of the standard DC-SBM and of the proposed AC-DC-SBM in terms of normalized mutual information (NMI). For each data set and model, we report the results of 100 independent runs from different initial solutions. These results are represented as box plots, where the whiskers extend to 1.5 times the interquartile range.
When the ratio increases beyond 0.4, both models are unable to recover the communities. In contrast, when this ratio diminishes below 0.4, the performance of both methods improves, highlighting a phase transition towards a regime where partial recovery is possible. As visible in these experiments, the transition of AC-DC-SBM occurs before that of the standard DC-SBM. For example, when, AC-DC-SBM achieves an average NMI of , compared to for DC-SBM. Similarly, when , AC-DC-SBM achieves near-perfect recovery on a much larger proportion of the runs. As such, it appears that the assortativity constraints are useful to guide likelihood maximization algorithms in challenging data sets located within the phase transition regime.
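The NMI scores used in such comparisons can be computed from the contingency table of two partitions, as in the minimal sketch below. Several normalizations of mutual information exist; this example assumes normalization by the arithmetic mean of the two entropies, which may differ from the variant used in the experiments. The function name `nmi` is illustrative.

```python
import numpy as np

def nmi(a, b):
    """Normalized mutual information 2 * I(a; b) / (H(a) + H(b)) of two labelings."""
    _, ai = np.unique(np.asarray(a), return_inverse=True)
    _, bi = np.unique(np.asarray(b), return_inverse=True)
    table = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(table, (ai, bi), 1.0)         # contingency counts
    P = table / table.sum()                 # joint distribution
    pa, pb = P.sum(axis=1), P.sum(axis=0)   # marginals, strictly positive
    nz = P > 0
    mi = np.sum(P[nz] * np.log(P[nz] / np.outer(pa, pb)[nz]))
    ha = -np.sum(pa * np.log(pa))
    hb = -np.sum(pb * np.log(pb))
    if ha + hb == 0:                        # both labelings have a single group
        return 1.0
    return float(2.0 * mi / (ha + hb))

print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))     # identical up to relabeling
print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))     # independent labelings
```

The score is invariant to a relabeling of the groups, which is why it is suitable for comparing a detected partition with a planted one.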
V-B Networks Generated From SBMs
We now repeat the previous experiment on general SBMs, characterized by a larger number of parameters. To compare the results of the DC-SBM and AC-DC-SBM, we generate synthetic data sets with nodes and blocks. For each data set, the parameters are uniformly sampled in the following intervals:
Each node is allocated to one of the four blocks with equal probability. Then, for each node pair
, a number of edges is generated from a Poisson distribution centered in, where and represent the blocks of and .
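The generation procedure can be sketched as follows. The sketch assumes that the Poisson mean for a node pair is the block parameter associated with the blocks of its two endpoints, that self-loops are excluded, and that multi-edges are allowed; the function name `sample_sbm` and the parameter values are hypothetical and do not correspond to the sampling intervals used in the experiments.

```python
import numpy as np

def sample_sbm(n, omega, rng):
    """Sample a symmetric multigraph adjacency matrix and the block of each node."""
    omega = np.asarray(omega, dtype=float)
    K = omega.shape[0]
    blocks = rng.integers(K, size=n)            # uniform block allocation
    mean = omega[np.ix_(blocks, blocks)]        # Poisson mean for every node pair
    upper = np.triu(rng.poisson(mean), k=1)     # one draw per unordered pair, no self-loops
    return upper + upper.T, blocks

# Hypothetical assortative parameters for four blocks.
omega = np.full((4, 4), 0.05) + np.diag([0.45, 0.35, 0.25, 0.15])
rng = np.random.default_rng(0)
A, blocks = sample_sbm(200, omega, rng)
print(A.shape, A.sum() // 2, np.bincount(blocks, minlength=4))
```

Passing the generator explicitly keeps each synthetic data set reproducible from its seed.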
Figure 3 compares the NMI obtained with the standard DC-SBM and the proposed AC-DC-SBM on these networks. For each network and model, we conduct independent runs from different initial solutions and report the results as boxplots. On 49 out of 50 data sets, AC-DC-SBM obtains a median NMI that is better than or equal to that of DC-SBM. DC-SBM appears to be very sensitive to low-quality local minima. This behavior is particularly visible on the first six data sets presented in the figure. A pairwise Wilcoxon test comparing the average NMI of both methods over the 50 data sets confirms the statistical significance of this difference of performance (with ).
In a second part of this analysis, we filter the set of solutions produced by the methods to focus on the top in terms of likelihood for each data set. This corresponds to a typical use case in which multiple independent runs are performed to avoid local minima. Figure 4 displays the relative difference between the NMI of the top solutions of the AC-DC-SBM and those of the standard DC-SBM. For the sake of completeness, we repeat the same analysis with the modularity-maximization algorithm. As visible in these results, the best AC-DC-SBM solutions still outperform those of the two other approaches on most data sets. The statistical significance of these observations is confirmed by pairwise Wilcoxon tests (with and for DC-SBM and modularity maximization, respectively).
Finally, Figure 5 compares the number of assortative communities found by AC-DC-SBM and DC-SBM. The standard DC-SBM produces far fewer assortative communities on average (2.43 compared to 3.76). As discussed earlier in this article, AC-DC-SBM only enforces constraints on the block-model parameters, and therefore does not systematically guarantee assortative partitions. Yet, non-assortative partitions are heavily penalized from a likelihood perspective and therefore generally avoided. Note also that modularity maximization always produces assortative solutions, but its equivalence to the PPM (with only two parameters driving the distribution of the edges) limits its ability to fit more general SBMs. Among these alternatives, AC-DC-SBM appears to find a trade-off between insufficient and excessive expressiveness.
V-C Brain Cortex Networks
Many real-world networks are known to present assortative structures, e.g., in applications to module or community detection in brain cortex networks, protein-protein interaction, and metabolic networks [25, 26, 27, 28]. We analyze in this section the case of the “cats cortex network”, which is known to have an assortative structure and is divided into four main functional areas: visual, auditory, frontolimbic, and somatosensory-motor duties. The network is obtained from the cortico-cortical connectivity pattern described by Scannell et al., based on 1139 cortico-cortical connections and 65 cortical areas. As in most community detection tasks, the ground truth in this network is not available. In fact, there is no unique “correct” partitioning, but different algorithms can highlight different underlying structures.
Figure 6 reports the communities found with the standard DC-SBM, the AC-DC-SBM and modularity maximization models on this dataset. For each model, we performed 100 optimization runs and registered the best solution (in terms of likelihood or modularity).
The best solution obtained with the standard DC-SBM is visibly non-assortative. The minimum value found along the diagonal is 1.5060, whereas the maximum value in the off-diagonal is 1.9050. The size of each group is similar, and one disassortative community acts as a “hub” for edges that flow between groups. In contrast, the partition produced by the AC-DC-SBM satisfies the strong assortativity conditions. The minimum value of the diagonal is 2.0196, and the maximum value in the off-diagonal is 1.7152. This solution includes communities of different sizes with edges which are more evenly distributed between groups. Two mutually-disconnected community pairs are also identified (green-yellow and dark blue-yellow). Finally, the modularity-maximization approach leads to the most assortative partitioning of this network. Yet, since the model does not take a prescribed number of communities into consideration, this partitioning contains only three groups, contrasting with the four functional areas which were originally expected.
Assortativity constraints arise as a natural approach to guide maximum-likelihood algorithms away from spurious local minima on networks which have a presupposed assortative structure. In this work, we have shown that these constraints can be effectively handled with tailored local optimization and interior point methods. Our experiments show that the resulting AC-DC-SBM significantly outperforms unconstrained community detection methods in lightly assortative graphs, especially in regimes which are close to the detectability threshold. In these circumstances, the classic SBM has a strong tendency to converge towards non-assortative solutions, while the modularity maximization model does not generalize well to graphs in which the number of edges between groups widely varies. On the practical example of a brain cortex network, the proposed AC-DC-SBM reveals drastically different community structures which were not identified by other algorithms.
The research perspectives related to this work are numerous. We recommend further evaluating the impact of assortativity constraints on known phase transitions and thresholds. We also recommend investigating different algorithmic paradigms to improve the solution of this constrained maximum likelihood formulation, and pursuing the study of the AC-DC-SBM in a wider range of application contexts.
- Abbe  E. Abbe, “Community detection and stochastic block models: Recent developments,” The Journal of Machine Learning Research, vol. 18, no. 1, pp. 6446–6531, 2017.
- Holland et al.  P. W. Holland, K. B. Laskey, and S. Leinhardt, “Stochastic blockmodels: First steps,” Social Networks, vol. 5, no. 2, pp. 109–137, 1983.
- Nowicki and Snijders  K. Nowicki and T. A. B. Snijders, “Estimation and prediction for stochastic blockstructures,” Journal of the American Statistical Association, vol. 96, no. 455, pp. 1077–1087, 2001.
- Karrer and Newman  B. Karrer and M. E. Newman, “Stochastic blockmodels and community structure in networks,” Physical Review E, vol. 83, no. 1, p. 016107, 2011.
- Newman  M. E. Newman, “Equivalence between modularity optimization and maximum likelihood methods for community detection,” Physical Review E, vol. 94, no. 5, p. 052315, 2016.
- Lee and Wilkinson  C. Lee and D. J. Wilkinson, “A review of stochastic block models and extensions for graph clustering,” Applied Network Science, vol. 4, no. 1, p. 122, 2019.
- McDaid et al.  A. F. McDaid, T. B. Murphy, N. Friel, and N. J. Hurley, “Improved Bayesian inference for the stochastic block model with application to large networks,” Computational Statistics & Data Analysis, vol. 60, pp. 12–31, 2013.
- Peixoto  T. P. Peixoto, “Bayesian stochastic blockmodeling,” Advances in Network Clustering and Blockmodeling, pp. 289–332, 2019.
- Wang et al.  Y. R. Wang, P. J. Bickel et al., “Likelihood-based model selection for stochastic block models,” The Annals of Statistics, vol. 45, no. 2, pp. 500–528, 2017.
- Airoldi et al.  E. M. Airoldi, D. M. Blei, S. E. Fienberg, and E. P. Xing, “Mixed membership stochastic blockmodels,” Journal of Machine Learning Research, vol. 9, no. Sep, pp. 1981–2014, 2008.
- Decelle et al. [2011a]  A. Decelle, F. Krzakala, C. Moore, and L. Zdeborová, “Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications,” Physical Review E, vol. 84, no. 6, p. 066106, 2011.
- Lei et al.  J. Lei, A. Rinaldo et al., “Consistency of spectral clustering in stochastic block models,” The Annals of Statistics, vol. 43, no. 1, pp. 215–237, 2015.
- Qin and Rohe  T. Qin and K. Rohe, “Regularized spectral clustering under the degree-corrected stochastic blockmodel,” in Advances in Neural Information Processing Systems, 2013, pp. 3120–3128.
- Rohe et al.  K. Rohe, S. Chatterjee, B. Yu et al., “Spectral clustering and the high-dimensional stochastic blockmodel,” The Annals of Statistics, vol. 39, no. 4, pp. 1878–1915, 2011.
- Cai et al.  T. T. Cai, X. Li et al., “Robust and computationally feasible community detection in the presence of arbitrary outlier nodes,” The Annals of Statistics, vol. 43, no. 3, pp. 1027–1059, 2015.
- Chen et al.  Y. Chen, S. Sanghavi, and H. Xu, “Clustering sparse graphs,” in Advances in Neural Information Processing Systems, 2012, pp. 2204–2212.
- Moore et al.  C. Moore, X. Yan, Y. Zhu, J.-B. Rouquier, and T. Lane, “Active learning for node classification in assortative and disassortative networks,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2011, pp. 841–849.
- Gopalan et al.  P. K. Gopalan, S. Gerrish, M. Freedman, D. M. Blei, and D. M. Mimno, “Scalable inference of overlapping communities,” in Advances in Neural Information Processing Systems, 2012, pp. 2249–2257.
- Li et al.  W. Li, S. Ahn, and M. Welling, “Scalable MCMC for mixed membership stochastic blockmodels,” in Artificial Intelligence and Statistics, 2016, pp. 723–731.
- Lu and Szymanski  X. Lu and B. K. Szymanski, “Regularized stochastic block model for robust community detection in complex networks,” arXiv preprint arXiv:1903.11751, 2019.
- Amini et al.  A. A. Amini, E. Levina et al., “On semidefinite relaxations for the block model,” The Annals of Statistics, vol. 46, no. 1, pp. 149–179, 2018.
- Domahidi et al.  A. Domahidi, E. Chu, and S. Boyd, “ECOS: An SOCP solver for embedded systems,” in 2013 European Control Conference (ECC). IEEE, 2013, pp. 3071–3076.
- Kvalseth  T. O. Kvalseth, “Entropy and correlation: Some comments,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 17, no. 3, pp. 517–519, 1987.
- Decelle et al. [2011b] A. Decelle, F. Krzakala, C. Moore, and L. Zdeborová, “Inference and phase transitions in the detection of modules in sparse networks,” Physical Review Letters, vol. 107, no. 6, p. 065701, 2011.
- Chen et al.  Z. J. Chen, Y. He, P. Rosa-Neto, J. Germann, and A. C. Evans, “Revealing modular architecture of human brain structural networks by using cortical thickness from MRI,” Cerebral Cortex, vol. 18, no. 10, pp. 2374–2381, 2008.
- Kreimer et al.  A. Kreimer, E. Borenstein, U. Gophna, and E. Ruppin, “The evolution of modularity in bacterial metabolic networks,” Proceedings of the National Academy of Sciences, vol. 105, no. 19, pp. 6976–6981, 2008.
- Ravasz et al.  E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A.-L. Barabási, “Hierarchical organization of modularity in metabolic networks,” Science, vol. 297, no. 5586, pp. 1551–1555, 2002.
- Huss and Holme  M. Huss and P. Holme, “Currency and commodity metabolites: their identification and relation to the modularity of metabolic networks,” IET Systems Biology, vol. 1, no. 5, pp. 280–285, 2007.
- Lameu et al.  E. L. Lameu, F. S. Borges, R. R. Borges, K. C. Iarosz, I. L. Caldas, A. M. Batista, R. L. Viana, and J. Kurths, “Suppression of phase synchronisation in network based on cat’s brain,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 26, no. 4, p. 043107, 2016.
- Scannell et al.  J. W. Scannell, C. Blakemore, and M. P. Young, “Analysis of connectivity in the cat cerebral cortex,” Journal of Neuroscience, vol. 15, no. 2, pp. 1463–1483, 1995.
- Peel et al.  L. Peel, D. B. Larremore, and A. Clauset, “The ground truth about metadata and community detection in networks,” Science Advances, vol. 3, no. 5, p. e1602548, 2017.