1 Introduction
Identifying statistically significant dependence between two or more sets of attributes serves as a key first check before further investigation is warranted. The space of possible attributes and their statistical dependencies is enormous, ranging from scalar values with relatively simple linear relationships to data with high dimensions, complex structures, and nonlinear relationships. There are many traditional and more recent procedures for testing dependence on linear, nonlinear, and low-dimensional Euclidean data with satisfactory performance, e.g., [1, 2, 3, 4, 5, 6, 7, 8], while detecting relationships between data of high-dimensional or complex structure remains difficult and less well understood. The large and still growing amount of structured data motivates the development of methods for such data.
In particular, graphs are emerging as a prevalent form of data representation in many scientific areas, ranging from linguistics to neuroscience to sociology. Graphs have complex structures and relationships. One type of question about graphs is whether a given pair of graphs is statistically dependent. For example, one could ask "to what extent are the connectomes (brain graphs) of the left and right hemispheres of a species correlated with each other?", or "is the connectome constructed on chemical synapses statistically dependent on the connectome constructed on gap junctions? If so, how strong is such correlation?". The answers to these questions would reveal the presence or absence of relationships between the objects of interest.
We investigate a popular graph model, the correlated stochastic block model (SBM), and propose a statistical test for testing (conditional) dependence between two sample graphs from correlated SBMs. The test utilizes the adjacency matrices and a block permutation procedure. We prove the validity of the resulting procedure, and demonstrate its effectiveness on both simulated graphs and real brain graphs (connectomes).
2 Preliminary
Correlated Bernoulli Graphs
Let $G = (V, E)$ be a graph-valued random variable with sample $g$. Each graph is defined by a set of vertices $V$, where $|V| = n$, and a set of edges between pairs of vertices, $E \subseteq V \times V$. Let $A$ be an adjacency-matrix-valued random variable taking values $a \in \{0, 1\}^{n \times n}$, identifying which pairs of vertices share edges. Here, the graph is undirected, so the corresponding adjacency matrix is symmetric. The correlated Erdős–Rényi model was proposed as an intuitive way to capture correlations between graphs [9]. The Erdős–Rényi model (ER) is a random graph in which each edge is sampled i.i.d. (independent and identically distributed) from a Bernoulli distribution with some parameter $p$. Let $A_{ij}$ be a random variable denoting whether there is an edge from vertex $i$ to vertex $j$. $G_1$ and $G_2$ are called correlated ER$(p, \rho)$ if both are ER graphs with parameter $p$ and the two random variables $A_{1,ij}$ and $A_{2,ij}$ have Pearson's correlation $\rho$ for all $i, j$. In fact, one can generalize the notion of correlated Bernoulli graphs by allowing different marginal probabilities for the two graphs. Namely, $G_1$ and $G_2$ are called correlated ER$(p, q, \rho)$ if $G_1$ is ER$(p)$, $G_2$ is ER$(q)$, and $A_{1,ij}$ and $A_{2,ij}$ have Pearson's correlation $\rho$. To sample a pair of correlated ER graphs, consider the random variables $A_{1,ij} \sim \mathrm{Bernoulli}(p)$ and $A_{2,ij} \sim \mathrm{Bernoulli}(q)$. By the definition of Pearson's correlation, $\rho$ can be written as
$$\rho = \frac{E[A_{1,ij} A_{2,ij}] - pq}{\sqrt{p(1-p)q(1-q)}}.$$
Given this equation and the marginal probabilities of $A_{1,ij}$ and $A_{2,ij}$, one can solve for the joint probability of each pair of values of $A_{1,ij}$ and $A_{2,ij}$ and obtain the following sampling procedure. First, realize $A_{1,ij} \sim \mathrm{Bernoulli}(p)$. Then, if $A_{1,ij} = 1$, independently realize
$$A_{2,ij} \sim \mathrm{Bernoulli}\Big(q + \rho\sqrt{\tfrac{q(1-q)(1-p)}{p}}\Big);$$
if $A_{1,ij} = 0$, independently realize
$$A_{2,ij} \sim \mathrm{Bernoulli}\Big(q - \rho\sqrt{\tfrac{q(1-q)p}{1-p}}\Big).$$
Note that this is only valid when both conditional probabilities lie in $[0, 1]$. To generate a pair of correlated ER$(p, q, \rho)$ graphs, one can simply follow this procedure for each edge independently.
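The two-step procedure above can be sketched in code. This is an illustrative helper, not the paper's implementation; the function name and `seed` argument are our own, and it assumes $0 < p < 1$ and a feasible $\rho$:

```python
import numpy as np

def sample_correlated_er(n, p, q, rho, seed=None):
    # Illustrative sketch: realize each edge of G1 from Bernoulli(p), then
    # realize the matching edge of G2 from a Bernoulli whose parameter
    # depends on that outcome, so that corr(A1_ij, A2_ij) = rho.
    rng = np.random.default_rng(seed)
    # Conditional edge probabilities for G2 given the edge of G1.
    p1 = q + rho * np.sqrt(q * (1 - q) * (1 - p) / p)  # given A1_ij = 1
    p0 = q - rho * np.sqrt(q * (1 - q) * p / (1 - p))  # given A1_ij = 0
    assert 0 <= p0 <= 1 and 0 <= p1 <= 1, "rho infeasible for these marginals"
    # Sample the strict upper triangle only, then symmetrize (undirected).
    iu = np.triu_indices(n, k=1)
    e1 = rng.binomial(1, p, size=len(iu[0]))
    e2 = rng.binomial(1, np.where(e1 == 1, p1, p0))
    a1 = np.zeros((n, n)); a2 = np.zeros((n, n))
    a1[iu], a2[iu] = e1, e2
    return a1 + a1.T, a2 + a2.T
```

The empirical edge correlation of a sampled pair concentrates around the requested $\rho$ as $n$ grows.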
The Stochastic Block Model (SBM) is a generalization of ER. An SBM is parametrized by the block probability matrix $B \in [0, 1]^{K \times K}$, where $K$ is the number of blocks [10]. Each community is labeled $1, \dots, K$. The entry $B_{kl}$ gives the probability of an edge from a node in community $k$ to a node in community $l$, for all $k, l \in \{1, \dots, K\}$. The community assignment of each node is given by the community membership function $\tau$. For any node $v$, $\tau(v) = k$ means that node $v$ is a member of block $k$. ER is an SBM with $K = 1$, so the block probability matrix is $B = [p]$. A block refers to a submatrix of the adjacency matrix formed by the edges connecting every node in community $k$ to every node in community $l$. Every edge within a block is sampled i.i.d. from $\mathrm{Bernoulli}(B_{kl})$.
One can generalize ER to correlated SBM in the same way one generalizes ER to SBM. Assuming the same community assignment for both graphs (but possibly different block probability matrices), to generate a pair of correlated SBMs, one can follow the procedure for generating correlated ER within each block separately.
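The block-wise generalization can be sketched as follows. This is a hypothetical helper reusing the correlated-Bernoulli step above per block; it assumes known community labels and block probabilities strictly between 0 and 1:

```python
import numpy as np

def sample_correlated_sbm(tau, B1, B2, rho, seed=None):
    # tau: community label of each node; B1, B2: block probability matrices.
    # Both graphs share tau; each aligned edge pair in block (k, l) is a
    # correlated Bernoulli pair with marginals B1[k, l] and B2[k, l] and
    # correlation rho, sampled exactly as in the correlated-ER procedure.
    rng = np.random.default_rng(seed)
    tau = np.asarray(tau)
    n = len(tau)
    a1 = np.zeros((n, n)); a2 = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            p, q = B1[tau[i], tau[j]], B2[tau[i], tau[j]]
            e1 = rng.binomial(1, p)
            # Conditional Bernoulli parameter that preserves corr = rho.
            if e1:
                q_cond = q + rho * np.sqrt(q * (1 - q) * (1 - p) / p)
            else:
                q_cond = q - rho * np.sqrt(q * (1 - q) * p / (1 - p))
            e2 = rng.binomial(1, q_cond)
            a1[i, j] = a1[j, i] = e1
            a2[i, j] = a2[j, i] = e2
    return a1, a2
```

Within each block, the edge density matches the corresponding entry of the block probability matrix, while aligned edges across the two graphs carry correlation $\rho$.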
A further generalization of SBM is the Independent Edge graph (IE). An IE is parametrized by the edge probability matrix $P \in [0, 1]^{n \times n}$, where $n$ is the number of vertices. The probability of an edge from vertex $i$ to vertex $j$ is given by $P_{ij}$. An ER is an IE with $P_{ij} = p$ for all $i, j$, and an SBM is an IE with $P_{ij} = B_{\tau(i)\tau(j)}$. $G_1$ and $G_2$ are called correlated Bernoulli graphs if both are IE and the random variables $A_{1,ij}$ and $A_{2,ij}$ have Pearson's correlation $\rho$ for all $i, j$. Under the setting of correlated Bernoulli graphs, the null hypothesis of the graph independence test is $H_0\colon \rho = 0$, and the alternative is $H_A\colon \rho \neq 0$.
Correlated Gaussian Graphs
Correlated Bernoulli graphs are binary by definition. To sample correlated weighted graphs, we leverage the joint normal distribution and introduce correlated Gaussian graphs. $G_1$ and $G_2$ are called correlated Gaussian ER$(\mu, \sigma^2, \rho)$ if every pair of edges $(A_{1,ij}, A_{2,ij})$ is sampled from a joint normal distribution with mean $\mu$, variance $\sigma^2$, and covariance $\rho\sigma^2$. One can generalize correlated Gaussian ER to have different marginal distributions. Namely, $G_1$ and $G_2$ are called correlated Gaussian ER$(\mu_1, \sigma_1^2, \mu_2, \sigma_2^2, \rho)$ if $A_{1,ij} \sim \mathcal{N}(\mu_1, \sigma_1^2)$, $A_{2,ij} \sim \mathcal{N}(\mu_2, \sigma_2^2)$, and $\mathrm{corr}(A_{1,ij}, A_{2,ij}) = \rho$ for all $i, j$. One can further generalize correlated Gaussian ER to correlated Gaussian SBM by following the procedure for generating correlated Gaussian ER within each block separately. In the rest of this paper, we refer to correlated Bernoulli graphs and correlated Gaussian graphs together as correlated graphs.
Conditional independence testing
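The weighted case reduces to drawing each aligned edge pair from a bivariate normal. A minimal sketch (helper name and `seed` argument are ours, not the paper's):

```python
import numpy as np

def sample_correlated_gaussian_er(n, mu1, var1, mu2, var2, rho, seed=None):
    # Each aligned pair of edge weights is drawn from a bivariate normal
    # with the given marginal means/variances and correlation rho.
    rng = np.random.default_rng(seed)
    cov = rho * np.sqrt(var1 * var2)
    sigma = np.array([[var1, cov], [cov, var2]])
    iu = np.triu_indices(n, k=1)
    w = rng.multivariate_normal([mu1, mu2], sigma, size=len(iu[0]))
    a1 = np.zeros((n, n)); a2 = np.zeros((n, n))
    a1[iu], a2[iu] = w[:, 0], w[:, 1]
    return a1 + a1.T, a2 + a2.T
```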
Conditional independence testing, also referred to as partial testing, is the testing of independence between conditional distributions [11]. Conditional independence is important if one is interested in identifying correlation given some known structure in the data. As a concrete example, given the connectomes of two individuals, we might observe the same block structure in both graphs because the brain of each individual is segmented into two hemispheres. The correlation due to such inherent structure might be interesting, but often we are more interested in any correlation that might exist in addition to the structural correlation, rendering partial testing an important problem.
A conditional independence testing problem can be formulated under the setting of correlated graphs. Let $G_1$ and $G_2$ be two correlated SBMs, with corresponding adjacency matrices $A_1$ and $A_2$ jointly sampled from a distribution $F_{A_1 A_2}$. Since for correlated graphs both graphs have the same set of uniquely labeled vertices, all variability is in the adjacency matrices, that is, $F_{G_1 G_2} = F_{A_1 A_2}$. For notational simplicity, we will use the same notation to refer to the graphs and the adjacency matrices for the rest of this paper. Furthermore, let $F_{A_1 | \tau}$ and $F_{A_2 | \tau}$ be the marginal distributions of the adjacency matrices conditioned on the community assignments $\tau$. To determine whether $A_1$ and $A_2$ are independent given the community assignments, the following hypothesis is tested: $H_0\colon F_{A_1 A_2 | \tau} = F_{A_1 | \tau} F_{A_2 | \tau}$ and $H_A\colon F_{A_1 A_2 | \tau} \neq F_{A_1 | \tau} F_{A_2 | \tau}$.
Given a pair of correlated SBMs, without conditioning on the community assignments, even when $\rho = 0$, both graphs still have the same block structure, which leads to some correlation between them. But when conditioning on the community assignments, any remaining correlation must be due to the correlation between edges. Therefore, in the paradigm of conditional testing, the null hypothesis asserts that $\rho = 0$.
Distance correlation (DCorr)
Distance correlation is a generalization of the classical Pearson correlation to random vectors of arbitrary dimension, or to random variables in metric spaces [3]. Consider semimetric spaces $(\mathcal{X}, d_x)$ and $(\mathcal{Y}, d_y)$ with distance functions $d_x$ and $d_y$, and random variables $X \in \mathcal{X}$ and $Y \in \mathcal{Y}$. The distance covariance $\mathrm{dCov}(X, Y)$ is defined as the positive square root of the following ($X', X''$ and $Y', Y''$ are independent copies of $X$ and $Y$, respectively) [12]:
$$\mathrm{dCov}^2(X, Y) = E[d_x(X, X')\, d_y(Y, Y')] + E[d_x(X, X')]\, E[d_y(Y, Y')] - 2\, E[d_x(X, X')\, d_y(Y, Y'')].$$
$\mathrm{dCov}(X, Y)$ is zero if and only if $X$ and $Y$ are independent, and is nonzero otherwise. Usually it is assumed that $\mathcal{X} = \mathbb{R}^p$, $\mathcal{Y} = \mathbb{R}^q$, and the metric is the Euclidean distance, but the setting considered herein is slightly different. Given the fact that the distance covariance characterizes whether $X$ and $Y$ are independent, the graph independence testing problem can be described under this formulation.
Let $V$ be the common vertex set of graphs $G_1$ and $G_2$, let the function $\tau$ denote the community assignment of each vertex in $V$, and let the distances $d_1$ and $d_2$ be functions of the entries of $A_1$ and $A_2$, respectively (we introduce the notion of kernel-induced distance explicitly in Section 3). Then consider random variables $X \in (V, d_1)$ and $Y \in (V, d_2)$. Informally, the two metric spaces both consist of the same set of vertices with the same community assignments, and use the kernel-induced distances as distance functions. Given the definitions above, the more formal formulation of the hypothesis under testing is the following: $H_0\colon F_{XY|\tau} = F_{X|\tau} F_{Y|\tau}$ and $H_A\colon F_{XY|\tau} \neq F_{X|\tau} F_{Y|\tau}$.
For notational simplicity, we drop the subscripts. The distance covariance can be normalized to the distance correlation as:
$$\mathrm{dCorr}(X, Y) = \frac{\mathrm{dCov}(X, Y)}{\sqrt{\mathrm{dVar}(X)\, \mathrm{dVar}(Y)}},$$
where $\mathrm{dVar}(X) = \mathrm{dCov}(X, X)$. Given samples jointly drawn from $F_{XY}$, an unbiased estimate of $\mathrm{dCorr}$ based solely on the sample distance matrices is described in [13]. This sample test statistic is used for DCorr in Algorithm 1 in the Appendix.
Multiscale Graph Correlation (MGC)
Multiscale Graph Correlation (MGC) builds upon distance correlation by exploring all local distance correlations and efficiently searching for the optimal scale. The algorithm is described in detail in [14], where it is demonstrated that, compared to distance correlation, MGC loses almost no power on monotonic dependencies and achieves better finite-sample power on high-dimensional data with non-monotonic relationships. Many theoretical properties that hold for distance correlation also hold for MGC [15]. Similar to distance correlation, the MGC sample statistic takes as input two sample distance matrices and outputs a test statistic indicating the strength of correlation.
3 Methods
Graph Correlation using Induced Metric from Adjacency Matrix
One can view the adjacency matrix of an undirected graph as a similarity or kernel matrix, where the similarity of nodes $i$ and $j$ is the weight of the edge between them. A kernel can be converted into a distance metric using the bijective transformation between metrics and kernels introduced in [16]. To that end, each adjacency entry is normalized by the maximum entry of the matrix (which is $1$ for unweighted graphs), and the diagonal of the adjacency matrix is tweaked to ensure the transformed distance satisfies the identity property, i.e., the distance of each node to itself is $0$. The transformed metric used in this paper is $D = \mathbf{J} - \mathbf{I} - A$, where $A$ is the normalized adjacency matrix with zero diagonal, $\mathbf{I}$ is the identity matrix, and $\mathbf{J}$ is the matrix of ones. After computing the distance matrices for both graphs, one can use any proper correlation measure that takes distance matrices as inputs, such as MGC and DCorr. The sample test statistic described in [13] is used for DCorr and the test statistic in [14] is used for MGC.
Pearson Correlation
Alternatively, one can ignore the graph structure entirely and calculate the Pearson correlation between the vectorized adjacency matrices as a measure of their correlation. The test statistic can be expressed as follows ($\bar{A}_1$ and $\bar{A}_2$ denote the overall means of the adjacency matrices):
$$r(A_1, A_2) = \frac{\sum_{i,j} (A_{1,ij} - \bar{A}_1)(A_{2,ij} - \bar{A}_2)}{\sqrt{\sum_{i,j} (A_{1,ij} - \bar{A}_1)^2}\, \sqrt{\sum_{i,j} (A_{2,ij} - \bar{A}_2)^2}}.$$
Vectorization is necessary since Pearson correlation only operates on one-dimensional data. Pearson assumes the samples are i.i.d., but this assumption is violated for correlated Bernoulli graphs in general, except in the special setting of ER. For example, under an SBM, each pair of edges is sampled independently but potentially from different distributions (namely, edges in different blocks are sampled from Bernoulli distributions with different parameters). We investigate how this violation of the i.i.d. assumption affects the correlation-based procedure in Section 5. Both Algorithm 1 and Algorithm 2 in the Appendix are procedures to compute a test statistic measuring correlation between graphs, and they are referred to as GCorr (graph correlation) in subsequent algorithms.
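The Pearson statistic on a pair of undirected graphs can be sketched as follows (the helper name is ours; since the graphs are undirected, the strict upper triangles carry one copy of each edge):

```python
import numpy as np

def pearson_graph_corr(a1, a2):
    # Pearson correlation between the vectorized adjacency matrices,
    # using the strict upper triangles of the symmetric matrices.
    iu = np.triu_indices(a1.shape[0], k=1)
    x, y = a1[iu].astype(float), a2[iu].astype(float)
    x -= x.mean(); y -= y.mean()
    return (x @ y) / np.sqrt((x @ x) * (y @ y))
```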
Computing p-values
Computing a p-value requires a permutation test to estimate the distribution of the test statistic under the null. Under the setting of correlated graphs, a naive permutation of the row-column pairs of the adjacency matrix results in an invalid test for SBM (the power converges to 1 under the null), because the distribution of the permuted matrices does not approximate the null distribution at $\rho = 0$. Intuitively, since the block structure is the same in both graphs, one implicitly desires a conditional independence test, which the usual permutation procedure does not provide.
In general, it is not clear what a valid permutation test would be for correlated graphs. Such a permutation should preserve the inherent graph structure while destroying the edge correlations. Under SBM, however, since the inherent graph structure is captured completely by the block structure, one can perform a block permutation test (Algorithm 3 in the Appendix) [11]. Namely, given the community assignments of the nodes, the edges within each block are permuted, which preserves the block structure and is thus able to approximate the null distribution.
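A minimal sketch combining block permutation with the p-value computation (roughly Algorithms 3 and 5), assuming known community assignments; the function names and the two-sided p-value convention here are illustrative, not the exact implementation:

```python
import numpy as np

def block_permutation_pvalue(a1, a2, tau, stat, n_perm=500, seed=None):
    # Permute the edges of a1 within each block, recompute the statistic,
    # and compare the observed value against the permutation distribution.
    # `stat` is any correlation statistic on two adjacency matrices
    # (e.g., Pearson on the vectorized upper triangles).
    rng = np.random.default_rng(seed)
    tau = np.asarray(tau)
    labels = np.unique(tau)

    def block_permute(a):
        out = np.zeros_like(a)
        for k in labels:
            for l in labels[labels >= k]:
                r, c = np.where(tau == k)[0], np.where(tau == l)[0]
                sub = a[np.ix_(r, c)]
                if k == l:  # symmetric on-diagonal block: permute triangle
                    iu = np.triu_indices(len(r), k=1)
                    new = np.zeros_like(sub)
                    new[iu] = rng.permutation(sub[iu])
                    out[np.ix_(r, c)] = new + new.T
                else:       # off-diagonal block: permute all entries
                    new = rng.permutation(sub.ravel()).reshape(sub.shape)
                    out[np.ix_(r, c)] = new
                    out[np.ix_(c, r)] = new.T
        return out

    observed = stat(a1, a2)
    null = np.array([stat(block_permute(a1), a2) for _ in range(n_perm)])
    # Two-sided: how extreme is the observed statistic relative to the
    # (possibly nonzero) center of the permutation null?
    extreme = np.abs(null - null.mean()) >= abs(observed - null.mean())
    return (extreme.sum() + 1) / (n_perm + 1)
```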
In practice, we usually do not know the community assignment of each node. Assuming the vertices of both graphs are matched (there is a bijection between the vertex sets), we can use a Joint Random Dot Product Graph (JRDPG) model to estimate the community assignments jointly. JRDPG is a procedure to embed multiple graphs sampled from some joint distribution. It works by finding the adjacency spectral embedding (ASE) of a matrix formed by concatenating the ASEs of each of the jointly sampled graphs. If the graphs are sampled from an SBM, one can recover the communities by clustering the embeddings [17]. The procedure to estimate the community assignments is Algorithm 4, and the procedure to compute the p-value is Algorithm 5 (both algorithms are in the Appendix). In Algorithm 4, one needs to choose the parameters $d$, the number of dimensions of the latent positions, and $\hat{K}$, the prior estimate of the number of communities. $d$ is chosen via the scree plot of the singular values [18], while $\hat{K}$ can be chosen with prior knowledge about the graph structure, or one can select the optimal number of clusters in a Gaussian Mixture Model (GMM) by choosing the clustering with the best Bayesian Information Criterion (BIC).
4 Theoretical Results
To derive the theoretical properties of the proposed test, we operate under the setting of correlated SBM (which can be unweighted correlated Bernoulli graphs or weighted correlated Gaussian graphs), as stated in the following assumption:
Assumption 1.
The adjacency matrices are sampled jointly from a correlated SBM distribution.
First, we show that the test is valid, i.e., it properly controls the type I error. We prove validity by showing that block permutation results in a test statistic that is equal in distribution to the test statistic under the null. To simplify the proof, we write the test statistic as $T(A_1, A_2)$, where $A_1$ and $A_2$ are the adjacency matrices used as kernels. $T(A_1, A_2)$ is the same statistic as before in the sense that the distance correlation is computed by first computing a distance matrix for each of $A_1$ and $A_2$, using the kernel-induced distances.
Proposition 2.
Let $A_1^\pi$ denote the adjacency matrix obtained by block permutation of $A_1$. Under Assumption 1 and the null hypothesis of conditional independence, $T(A_1^\pi, A_2)$ is equal in distribution to $T(A_1, A_2)$.
In conditional testing, the Pearson, DCorr, and MGC statistics are no longer zero under conditional independence. Instead, the test statistics under the null converge to nonzero constants that depend on the actual distributions. Moreover, since the adjacency matrix is not positive semidefinite, the constant can be negative (whereas in the case of Euclidean data and Euclidean distance, DCorr is asymptotically nonnegative).
Theorem 3.
For any of the Pearson, DCorr, and MGC statistics, the population statistic equals $\rho$ for ER.
For SBM with fixed marginals, there exists a unique constant $c$ such that the population statistic equals $c$ if and only if $\rho = 0$.
Therefore, any of the three sample correlations using block permutation is consistent against all possible alternatives under SBM.
Note that the theorem holds for either the binary ER / SBM from Bernoulli, or the weighted ER / SBM from Gaussian.
5 Simulated Experiments
Test statistics
We corroborate the theory using simulations with correlated graphs, for which we can sample a pair of graphs while controlling their correlation exactly. For this experiment, we compare the test statistics of Pearson and DCorr with the correlation $\rho$ used to generate different settings of correlated Bernoulli graphs. We use Algorithm 2 for Pearson and Algorithm 1 for DCorr (both algorithms are in the Appendix). The simulation settings are: (a) ER, with both graphs sharing the same edge probability; (b) ER, with different edge probabilities for the two graphs; (c) SBM, with both graphs sharing the same block probability matrix, with one probability for within-block edges and another for between-block edges; (d) SBM, with different block probability matrices for the two graphs. All community assignments are given instead of being estimated.
Figure 1 shows that for ER, both the Pearson and DCorr test statistics estimate $\rho$ perfectly. In particular, they are zero only when $\rho = 0$. This aligns with Theorem 3. For SBM, both test statistics are still identical to each other, but they are no longer zero when $\rho = 0$. This is also expected based on Theorem 3. Intuitively, the test statistics differ from $\rho$ because they capture not only the correlation between pairs of edges, but also the correlation due to the block structure of SBMs. The behavior of both DCorr and Pearson motivates a two-sided test of conditional independence, which we describe in the next section.
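As a quick numerical illustration of the ER behavior (an informal sketch, not the exact simulation code), one can sample correlated ER edges directly and verify that the Pearson statistic estimates $\rho$:

```python
import numpy as np

# Under correlated ER(p, p, rho), the Pearson statistic on the vectorized
# upper triangles should estimate rho itself.
rng = np.random.default_rng(0)
n, p, rho = 300, 0.5, 0.3
iu = np.triu_indices(n, k=1)
e1 = rng.binomial(1, p, size=len(iu[0]))
# Conditional probabilities simplify when both marginals equal p:
# q + rho*sqrt(q(1-q)(1-p)/p) = p + rho*(1-p), and
# q - rho*sqrt(q(1-q)p/(1-p)) = p - rho*p.
p_cond = np.where(e1 == 1, p + rho * (1 - p), p - rho * p)
e2 = rng.binomial(1, p_cond)
stat = np.corrcoef(e1, e2)[0, 1]
print(round(stat, 2))  # close to rho = 0.3
```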
Power
We use a power experiment to check that the test has the following properties: (1) validity: under the null, the power converges to a value below the type I error level; (2) consistency: the power increases to 1 as the sample size, i.e., the number of vertices $n$, increases. We compare the power of Pearson, DCorr, and MGC on ER and SBM, both when the two graphs are sampled from distributions with the same probability matrix and when the distributions have different probability matrices. For MGC, we use Algorithm 1 but substitute the DCorr sample statistic with the MGC sample statistic. For SBM, we compute the power when the community assignments are given, when the assignments are unknown and estimated, and when the block sizes are different. The algorithm for calculating the power on correlated SBM is Algorithm 6 in the Appendix. The simulation settings are the following: each of the three columns corresponds to a different value of the correlation $\rho$; rows 1-4 correspond to settings (a)-(d) in Section 5; row 5 is the same as row 4, except the community assignments are estimated instead of given; row 6 is the same as row 5, except the block sizes are different: for each graph, 70 percent of nodes are in the first community and 30 percent in the second. Visualizations of some simulation settings are in Figure 2. All simulation results for correlated Bernoulli graphs are in Figure 3.
The results show that all the tests using block permutation (Pearson, DCorr, and MGC) are valid and consistent, and have similar power under all settings. For comparison, the power of Pearson using the exact analytic p-value instead of block permutation is also computed (the samples are pairs of edges in the vectorized upper triangles of both graphs [19], and the null is rejected if the p-value is less than the type I error level). Without block permutation, the test is invalid for SBM. The same results also hold for the weighted correlated Gaussian ER and SBM, shown in Figure 4. For implementation, we use the Python package GraSPy for graph generation and manipulation [20] and mgcpy for various functionalities related to independence testing.
6 Real Data Experiments
We consider the application of the graph conditional independence testing procedure to connectomes, also known as "brain graphs". The connectome of the nematode Caenorhabditis elegans (C. elegans) is the only known whole-animal connectome of an organism, including not only neurons but also body cells. It was constructed based on a series of electron microscopy studies, and has been updated and made more complete over the years [21, 22, 23, 24, 25]. The connectome is constructed at the level of individual synapses, and is constructed for both chemical synapses and gap junctions. The graphs are directed and weighted. The nodes of the graphs are individual cells, and the edges represent the strength of synapses from one cell to another. The data provide an invaluable source of information to study the coordination of the nervous system within the entirety of an organism.

Given such data, one natural initial question to ask is whether the graph constructed from chemical synapses and that constructed from gap junctions are statistically dependent on each other, and if so, how strong the correlation is. To answer this question with the proposed testing procedure, we chose the chemical and gap junction connectomes of the hermaphrodite, one of the two sexes of adult C. elegans. However, the graph independence testing procedure cannot be directly applied to the original graphs for two reasons: (1) not all the cells that are present in the chemical synapse connectome are present in the gap junction connectome, and vice versa; (2) the original graphs are directed. To address (1), since each vertex is labeled with a unique cell name, we take the intersection of the cells in the two graphs, and ensure the vertices of the two graphs are matched. After taking the intersection, each graph has 448 nodes, which includes all neurons and the cells that the neurons synapse onto. To address (2), the average weight of the two directed edges between a pair of nodes is used as the edge weight between them, rendering the graphs symmetric and thus undirected. The weighted and unweighted graphs after preprocessing are shown in Figure 5.
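The two preprocessing steps can be sketched as follows (a hypothetical helper with illustrative names; it assumes each graph is given as a node-name list plus a weighted, directed adjacency matrix):

```python
import numpy as np

def preprocess_pair(nodes1, a1, nodes2, a2):
    # (1) Restrict both graphs to their common cells, matched by name.
    # (2) Symmetrize by averaging the two directed edge weights.
    common = sorted(set(nodes1) & set(nodes2))
    idx1 = [nodes1.index(c) for c in common]
    idx2 = [nodes2.index(c) for c in common]
    b1 = np.asarray(a1, dtype=float)[np.ix_(idx1, idx1)]
    b2 = np.asarray(a2, dtype=float)[np.ix_(idx2, idx2)]
    sym = lambda m: (m + m.T) / 2.0
    return common, sym(b1), sym(b2)
```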
To derive a p-value from the graph conditional independence test, the number of blocks $\hat{K}$ used in block permutation needs to be set. By default, we pick the $\hat{K}$ that results in the GMM clustering with the lowest BIC in Algorithm 4. The estimated community assignment of the unweighted graphs is shown in Figure 6.
Since graphs derived from real data do not arise from an exact SBM, we do not expect any given $\hat{K}$ to perfectly estimate the community assignment of each node. But given a relatively large number of nodes, one can obtain a better SBM fit by increasing $\hat{K}$ while still maintaining the validity of block permutation. Figure 7 shows that for any reasonably chosen $\hat{K}$, the test detects strong dependency between the connectomes of chemical synapses and gap junctions, meaning that, over and above the block structure, one could in principle predict the presence of a gap junction given the presence of a chemical synapse, and vice versa.
7 Conclusion and Future Work
We presented a statistical test for conditional independence between a pair of undirected graphs. This is the first approach that we know of that addresses the question of whether two graphs are conditionally independent. The question of independence is very general and may be of interest in many distinct areas of scientific inquiry. In the field of connectomics, the proposed approach is especially relevant in helping researchers draw new connections among various types of brain imaging data and ask novel questions about the relationships among these data.
The proposed test relies on the theory of the Stochastic Block Model (SBM). We note that this poses a certain limitation on the effectiveness of the test in settings where the number of nodes is relatively small and the graph can only be fitted poorly by an SBM with a small number of blocks, because it is then difficult for block permutation to approximate the null distribution. In such cases, if one chooses a larger $\hat{K}$, one can better approximate the graph with an SBM, but a larger number of nodes is then required for block permutation to approximate the null distribution; on the other hand, if one chooses a smaller $\hat{K}$, the SBM might not fit the graph well enough for block permutation to approximate the null. With this in mind, we hypothesize that a generalization to the Degree-corrected SBM [26] or the Random Dot Product Graph [27] might allow better modeling in such settings, if one can design a valid permutation procedure under these graph models. Such a generalization of the procedure is a potential future direction of this work.
Authors, Affiliations, and Acknowledgements
Johns Hopkins University; University of Delaware
References and Notes
 [1] K. Pearson, “Notes on regression and inheritance in the case of two parents,” Proceedings of the Royal Society of London, vol. 58, pp. 240–242, 1895.
 [2] M. G. Kendall, Rank Correlation Methods. London: Griffin, 1970.
 [3] G. J. Székely, M. L. Rizzo, N. K. Bakirov et al., “Measuring and testing dependence by correlation of distances,” The Annals of Statistics, vol. 35, no. 6, pp. 2769–2794, 2007.

 [4] A. Gretton, R. Herbrich, A. Smola, O. Bousquet, and B. Schölkopf, “Kernel methods for measuring independence,” Journal of Machine Learning Research, vol. 6, pp. 2075–2129, 2005.
 [5] K. Fukumizu, A. Gretton, X. Sun, and B. Schölkopf, “Kernel measures of conditional dependence,” in Advances in Neural Information Processing Systems, 2007.
 [6] A. Gretton and L. Gyorfi, “Consistent nonparametric tests of independence,” Journal of Machine Learning Research, vol. 11, pp. 1391–1423, 2010.
 [7] R. Heller, Y. Heller, and M. Gorfine, “A consistent multivariate test of association based on ranks of distances,” Biometrika, vol. 100, no. 2, pp. 503–510, 2013.
 [8] C. Shen, C. E. Priebe, and J. T. Vogelstein, “From distance correlation to multiscale graph correlation,” Journal of the American Statistical Association, 2019.
 [9] D. E. Fishkind, L. Meng, A. Sun, C. E. Priebe, and V. Lyzinski, “Alignment strength and correlation for graphs,” arXiv preprint arXiv:1808.08502, 2018.
 [10] P. Holland, K. Laskey, and S. Leinhardt, “Stochastic blockmodels: First steps,” Social Networks, vol. 5, no. 2, pp. 109–137, 1983.
 [11] T. Hothorn, K. Hornik, M. A. Van De Wiel, A. Zeileis et al., “Implementing a class of permutation tests: the coin package,” 2008.
 [12] D. Sejdinovic, B. Sriperumbudur, A. Gretton, K. Fukumizu et al., “Equivalence of distance-based and RKHS-based statistics in hypothesis testing,” The Annals of Statistics, vol. 41, no. 5, pp. 2263–2291, 2013.

 [13] G. J. Székely and M. L. Rizzo, “The distance correlation t-test of independence in high dimension,” Journal of Multivariate Analysis, vol. 117, pp. 193–213, 2013.
 [14] J. T. Vogelstein, E. W. Bridgeford, Q. Wang, C. E. Priebe, M. Maggioni, and C. Shen, “Discovering and deciphering relationships across disparate data modalities,” eLife, vol. 8, p. e41690, 2019.
 [15] C. Shen, C. E. Priebe, and J. T. Vogelstein, “From distance correlation to multiscale graph correlation,” Journal of the American Statistical Association, pp. 1–39, 2018.
 [16] C. Shen and J. T. Vogelstein, “The exact equivalence of distance and kernel methods for hypothesis testing,” arXiv preprint arXiv:1806.05514, 2018.
 [17] D. Sussman, M. Tang, D. Fishkind, and C. Priebe, “A consistent adjacency spectral embedding for stochastic blockmodel graphs,” Journal of the American Statistical Association, vol. 107, no. 499, pp. 1119–1128, 2012.
 [18] M. Zhu and A. Ghodsi, “Automatic dimensionality selection from the scree plot via the use of profile likelihood,” Computational Statistics & Data Analysis, vol. 51, no. 2, pp. 918–930, 2006.
 [19] Student, “Probable error of a correlation coefficient,” Biometrika, pp. 302–310, 1908.
 [20] J. Chung, B. D. Pedigo, E. W. Bridgeford, B. K. Varjavand, and J. T. Vogelstein, “Graspy: Graph statistics in python,” arXiv preprint arXiv:1904.05329, 2019.
 [21] J. G. White, E. Southgate, J. N. Thomson, and S. Brenner, “The structure of the ventral nerve cord of caenorhabditis elegans,” Philosophical Transactions of the Royal Society of London. B, Biological Sciences, vol. 275, no. 938, pp. 327–348, 1976.
 [22] J. White, “Neuronal connectivity in caenorhabditis elegans,” Trends in Neurosciences, vol. 8, pp. 277–283, 1985.
 [23] J. G. White, E. Southgate, J. N. Thomson, and S. Brenner, “The structure of the nervous system of the nematode caenorhabditis elegans,” Philos Trans R Soc Lond B Biol Sci, vol. 314, no. 1165, pp. 1–340, 1986.
 [24] L. R. Varshney, B. L. Chen, E. Paniagua, D. H. Hall, and D. B. Chklovskii, “Structural properties of the caenorhabditis elegans neuronal network,” PLoS computational biology, vol. 7, no. 2, p. e1001066, 2011.
 [25] S. J. Cook, T. A. Jarrell, C. Brittin, Y. Wang, A. E. Bloniarz, M. A. Yakovlev, K. C. Q. Nguyen, L. T.-H. Tang, E. A. Bayer, J. S. Duerr, H. Buelow, O. Hobert, D. H. Hall, and S. W. Emmons, “Whole-animal connectomes of both C. elegans sexes,” Nature, in press, 2019.
 [26] B. Karrer and M. E. J. Newman, “Stochastic blockmodels and community structure in networks,” Physical Review E, vol. 83, p. 016107, 2011.
 [27] S. Young and E. Scheinerman, “Random dot product graph models for social networks,” in Algorithms and Models for the WebGraph. Springer Berlin Heidelberg, 2007, pp. 138–149.
8 Appendix
Algorithms
There are several things to note about Algorithm 3. (1) Since the adjacency matrix is not necessarily given with the nodes of the same community next to each other, the vertices are first sorted based on the community assignment (referred to as sort-vertex). Given that the vertices of the graphs are exchangeable, the sorted graph and the original graph are equivalent in distribution. (2) Since the graph is undirected, only the upper triangular part of the adjacency matrix needs to be permuted. The upper triangular matrix is then reflected across the diagonal to form a symmetric matrix (this procedure is symmetrize in Algorithm 3). (3) In the implementation, the on-diagonal blocks and off-diagonal blocks are permuted differently. Since the blocks on the diagonal are symmetric, only their upper triangular parts are permuted. This is done by vectorizing the upper triangular matrix, permuting the elements of the vector, then reshaping the vector back into the upper triangle (referred to as permute-on-diag). For the off-diagonal blocks, since the submatrix is not necessarily symmetric, the entire submatrix is vectorized. The vector is then permuted and reshaped back into a matrix (referred to as permute-off-diag).
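The two permutation subroutines can be sketched as follows (hypothetical helper names mirroring permute-on-diag and permute-off-diag in the text; `rng` is a NumPy random generator):

```python
import numpy as np

def permute_on_diag(sub, rng):
    # Symmetric on-diagonal block: vectorize the strict upper triangle,
    # permute it, place it back, then reflect across the diagonal.
    m = sub.shape[0]
    iu = np.triu_indices(m, k=1)
    new = np.zeros_like(sub)
    new[iu] = rng.permutation(sub[iu])
    return new + new.T

def permute_off_diag(sub, rng):
    # Off-diagonal block: vectorize the whole sub-matrix, permute,
    # and reshape back to the original shape.
    return rng.permutation(sub.ravel()).reshape(sub.shape)
```

Both helpers preserve the multiset of edge weights within the block, which is what keeps the block structure intact.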
In Algorithm 4, let $U$ and $S$ denote the matrix of the $d$ leading eigenvectors of the adjacency matrix and the diagonal matrix of the corresponding $d$ leading eigenvalues, respectively. Compute the singular value decomposition of the concatenated embedding matrix, and let $V$ be the matrix of the $d$ leading left singular vectors. In Algorithm 6, the function percentile returns the given percentile of its input array.
Mathematical proofs
Proof of Proposition 2
Proof.
We need to show that $A_1^\pi$ and $A_2$ are conditionally independent. Let $\pi(i, j)$ be the edge after permutation of edge $(i, j)$; they lie in the same block. Note that in an SBM, edges within a block are sampled i.i.d., namely, $A_{ij} \overset{\text{i.i.d.}}{\sim} \mathrm{Bernoulli}(B_{kl})$ if $\tau(i) = k$ and $\tau(j) = l$, where $\tau$ is the block membership function.
Given that $A_1$ and $A_2$ are conditionally independent under the null, we know $F_{A_{1,kl} A_{2,ij}} = F_{A_{1,kl}} F_{A_{2,ij}}$ for all edges $(k, l)$ and $(i, j)$ within the same block. And since under the model of correlated SBMs every pair of edges within a block is sampled i.i.d., we have $F_{A_{1,\pi(i,j)}} = F_{A_{1,ij}}$, where $\pi(i, j)$ lies in the same block as $(i, j)$. Since $\pi$ permutes within each block, we have $F_{A_{1,\pi(i,j)} A_{2,ij}} = F_{A_{1,\pi(i,j)}} F_{A_{2,ij}}$ for all $i, j$. Therefore, we have $F_{A_1^\pi A_2} = F_{A_1^\pi} F_{A_2}$. Hence, $A_1^\pi$ and $A_2$ are conditionally independent with the same marginals as before, and thus $T(A_1^\pi, A_2)$ is equal in distribution to $T(A_1, A_2)$ under the null. ∎
Proof of Theorem 3
Proof.
We shall first prove the result for binary graphs, and show that the population statistic equals $\rho$ under ER for population Pearson, DCorr, and MGC. For the Pearson correlation, it is clear that the population statistic equals $\rho$ by basic probability on Bernoulli random variables. For vertex set $V_1$ with adjacency matrix $A_1$ and vertex set $V_2$ with adjacency matrix $A_2$, we denote edges between vertices in $V_1$ as $A_{1,ij}$ and $A_{1,kl}$, and edges between vertices in $V_2$ as $A_{2,ij}$ and $A_{2,kl}$. For the population distance covariance, denoted $\mathrm{dCov}$, we have
$$\begin{aligned}
\mathrm{dCov} &= E[(1 - A_{1,ij})(1 - A_{2,ij})] + E[1 - A_{1,ij}]\, E[1 - A_{2,ij}] - 2\, E[(1 - A_{1,ij})(1 - A_{2,kl})] \\
&= E[A_{1,ij} A_{2,ij}] - E[A_{1,ij}]\, E[A_{2,ij}] = \mathrm{Cov}(A_{1,ij}, A_{2,ij}),
\end{aligned} \qquad (1)$$
where the second line follows because $A_{1,ij}$ and $A_{2,kl}$ are independent for $(i, j) \neq (k, l)$ under ER or SBM. Therefore dCov is the same as the Pearson covariance, and similarly the distance variance is the same as the Pearson variance, so that DCorr is the same as the Pearson correlation. By [8], the population MGC statistic is the same as the population DCorr under a linear relationship, so the same holds for MGC as well. As the sample correlations all converge to their respective population versions, all three sample correlations using the adjacency kernel are consistent estimators of $\rho$ under ER.
The above equivalence among Pearson, DCorr, and MGC also holds for SBM. Thus it suffices to investigate the numerator and denominator of DCorr. The numerator satisfies
$$\mathrm{dCov} = \sum_{k,l} w_{kl}\, \mathrm{Cov}\big(A_{1,ij}, A_{2,ij} \mid \tau(i) = k, \tau(j) = l\big) + c_0, \qquad (2)$$
where $w_{kl}$ is the proportion of vertex pairs in block $(k, l)$ and $c_0$ is a constant depending only on the marginal block probabilities. This is a weighted average of the within-block covariances plus a constant, and is thus linear with respect to $\rho$, as $\rho$ controls the covariance within each block. Similarly, the distance variances $\mathrm{dVar}(A_1)$ and $\mathrm{dVar}(A_2)$ in the denominator are fixed constants for given marginals. Therefore, for any SBM with given marginals, $\mathrm{dCorr}$ is a linear function of $\rho$. Note that at $\rho = 0$, $\mathrm{dCorr}$ is not necessarily $0$ due to the block structure. However, when using the block permutation test, all three sample correlations are consistent because there is a unique constant $c$ such that $\mathrm{dCorr} = c$ if and only if $\rho = 0$.
Finally, the above arguments extend directly to the weighted Gaussian graphs. For correlated Gaussian ER, by design the Pearson correlation must equal $\rho$; Equation (1) still holds since the adjacency matrix is generated the same way, other than the weights no longer being binary, thus DCorr and MGC still equal Pearson. For correlated Gaussian SBM, dCov in Equation (2) is also linear with respect to $\rho$ (and still a weighted average over the blocks plus a constant), and the variance terms are still fixed constants for given marginals. ∎