1 Introduction
Complex network analysis provides a novel approach to examining the basic principles by which networked systems in nature originate and evolve and, armed with such principles, to constructing efficient, robust and flexible man-made networked systems under different constraints. Among all studies of complex networks, structure analysis is the most fundamental, and the ability to discover and visualize the underlying structure of a real-world network greatly helps both the topological and the dynamic analysis applied to it [1]. So far, scientists have uncovered a multitude of structural patterns that exist ubiquitously in social, biological, ecological and technological networks; they may be microscopic, such as motifs [2], mesoscopic, such as modularity [3], or macroscopic, such as the small-world [4] and scale-free phenomena [5]. These structural patterns, observed at different levels of granularity, may collectively unveil the secrets hidden in complex networked systems. These works have attracted broad interest and boosted the progress of exploring complex networks in both scientific and engineering domains. However, the topological structure analysis of complex networks, even when restricted to the mesoscopic level, remains one of the major challenges in network theory, mainly because most real-world networks result from a combination of heterogeneous mechanisms that collaboratively shape their nontrivial structures. More specifically, one can raise the following issues.
First, beyond modularity, the most extensively studied pattern at the mesoscopic level [3, 6], a wide spectrum of structural patterns has been reported in the literature, including bipartites or, more generally, multipartites [7, 8, 9], hubs, authorities and outliers [10, 11, 12], bow ties [13, 14, 15] and others. Moreover, these miscellaneous patterns may coexist in the same network, and they may even overlap with each other, as communities do [16]. Fig. 1 shows an illustrative example, in which a social network encoding the co-appearances of 77 characters in the novel “Les Miserables” is visualized in both network and matrix representations. One observes two hubs and a number of outliers coexisting with four well-formed communities. The two hubs, corresponding to Valjean and Javert, are the main characters of the novel, and their links reach all other clusters, connecting to about 48% of all characters. This indicates that these two characters, who interact with different characters in different chapters, are the two main threads running through the whole story. The four detected communities can be seen as four relatively independent social cliques. As an example, consider group 4, which is almost separated from the rest of the network. Interestingly, this small social clique consists of four Parisian students, Tholomyes, Listolier, Fameuil and Blacheville, together with their respective lovers, Fantine, Dahlia, Zephine and Favourite. Group 5 consists of 39 outliers, which connect to either the two hubs or one of the four communities by only a few links. Correspondingly, these outliers are the supporting characters of the novel. Besides this example, more complex structural patterns in real-world networks are demonstrated and discussed in the rest of this article.
Second, such multiple structural patterns may be nested. That is, real-world networks can contain hierarchical organizations with heterogeneous patterns at different levels. In the literature, hierarchical structures are usually studied in a homogeneous way, in which the patterns observed in each layer of the hierarchy are of the same kind, in terms of either the fractal property [18] or, more generally, modularity [19, 20]. A recent study reveals that the patterns in each layer of a dendrogram can be assortative (modular) or disassortative (bipartite) [21]. The ability to observe heterogeneous patterns at different levels brings new clues for understanding the dynamics of real-world complex networks.
For these two reasons, when exploring a network about which one knows little, it is very hard to know which structural patterns to expect and which tools can recover them. An inappropriate tool yields biased results; and even if something is known beforehand, a tool designed exclusively for one specific pattern, say modularity, can hardly uncover multiple coexisting patterns that may overlap or even be nested within each other (we call them multiplex structures in this work). To the best of our knowledge, no study in the literature addresses both of the above issues. On the other hand, human beings are able to discover multiplex and significant structural patterns simultaneously for various objectives. This capability is believed to be an important form of human cognitive and intelligent function [22].
Returning to the matrix shown in Fig. 1, one observes an intuitive phenomenon: any significant pattern contained in the underlying structure of the network is statistically highlighted by a group of homogeneous individuals with identical or very similar connectivity profiles. For instance, individuals from the same community interact intensively with each other but rarely with the rest; hubs connect to many individuals from different parts of the network; and outliers seldom interact with others, emitting only a few connections. Based on this simple observation, if one can group the majority of individuals into reasonable clusters according to their connectivity profiles, the coexisting structural patterns can be unveiled by further inferring the linkage among clusters. In this way, the first issue listed above can be addressed.
The idea of grouping nodes into equivalent clusters according to their connection patterns is quite similar to the philosophy of blockmodeling, first proposed by Lorrain and White [23], in which nodes with structural equivalence (defined in terms of local neighborhood configurations), or more generally regular equivalence [24] (defined in terms of globally physical connections to all others), or more softly stochastic equivalence [25, 26] (defined in terms of linking probabilities between groups) are grouped into the same blocks.
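As a minimal illustration of the strictest of these notions, the sketch below groups nodes whose outgoing and incoming connection profiles are exactly identical. It is a simplification made for illustration only: the function name `structural_equivalence_classes` is hypothetical, and ties between the two nodes being compared are not given the special treatment that formal definitions of structural equivalence usually apply.

```python
from collections import defaultdict


def structural_equivalence_classes(adjacency):
    """Group node indices whose rows AND columns in `adjacency` are identical.

    `adjacency` is a square list of lists of 0/1 entries.
    This exact-profile test is a simplification of structural equivalence.
    """
    n = len(adjacency)
    groups = defaultdict(list)
    for i in range(n):
        out_profile = tuple(adjacency[i])                       # row i
        in_profile = tuple(adjacency[j][i] for j in range(n))   # column i
        groups[(out_profile, in_profile)].append(i)
    return sorted(groups.values())


# A star: node 0 is linked to nodes 1, 2 and 3 (undirected).
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
classes = structural_equivalence_classes(star)
```

In the star example the three leaves share one profile and the center has its own, which mirrors the hub and outlier roles discussed above. Stochastic equivalence relaxes this exact match to a match between linking probabilities.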
Based on the same idea, a closely related study was recently proposed by Newman and Leicht, which is the first (to our knowledge) to show the motivation to detect structural patterns that are not predefined in exploratory networks [27]. From the perspective of machine learning, their method can be seen as a version of the naive Bayes algorithm applied to networks, whose objective is to group nodes with similar connection features into a predefined number of clusters. Although their work only shows the ability to determine whether an exploratory network is assortative or disassortative by manually analyzing the obtained clusters, it provides good evidence that grouping nodes into equivalent clusters can be an initial step of the whole process of unveiling multiplex structures from networks.
In this work, we propose a novel model that introduces the concept of granularity into stochastic connection profiles in order to properly model multiplex structures, and then show how the task of recognizing multiplex structures can be reduced to a simplified version of the isomorphism subgraph matching problem. To test the ideas and strategies proposed here, networks from different sources have been analyzed. Encouragingly, our methods perform well and are capable of uncovering multiplex structures from the tested networks in a fully automatic way.
2 The Model
2.1 Granular couplings
We model the connection profiles of nodes in terms of probabilities instead of physical connectivity. In this way, we expect not only to find multiplex structures but also to provide an explicit probabilistic interpretation of these findings within the Bayesian framework. We call such probabilities couplings because they are not merely mathematical measures defined subjectively for modeling or computing; they do exist in many real-world systems, encoding different physical meanings such as social preferences in societies, predation habits in ecosystems, co-expression regularities in gene networks or co-occurrence likelihoods of words in languages, and they are valuable for predicting the behavior of the systems in which they are situated.
Here the notion of granularity should be interpreted in terms of resolution and precision. On one hand, our model defines two kinds of couplings with different resolutions, i.e., node couplings and block couplings. Formally, we define a node feedforward-coupling matrix and a node feedback-coupling matrix, whose entries respectively denote the probabilities that one node expects to couple with, or to be coupled by, another node. For undirected networks the two matrices are equal (see SI for proofs). We assume that nodes couple with others independently, regulated by such couplings. Nodes with similar feedforward as well as feedback coupling distributions are clustered into the same blocks. In terms of matrices, homogeneous feedforward and feedback couplings guarantee homogeneous row and column connection profiles, respectively. Correspondingly, given the number of blocks, we define a block feedforward-coupling matrix and a block feedback-coupling matrix, whose entries respectively denote the probabilities that one block expects to couple with, or to be coupled by, another block. We will show later that block couplings can be inferred from node couplings and vice versa. Node couplings, with a fine granularity, are used to model networks so as to capture as much of their local information as possible, while block couplings, with a coarser granularity, are used to define and recognize global structural patterns by intentionally neglecting trivial details. On the other hand, in nested patterns, node couplings and block couplings at different levels of the hierarchy have different precisions in order to properly abstract and construct hierarchical organizations. In our model, the couplings on higher layers are approximations of the related ones on lower layers. Therefore, as the layers move from the bottom (corresponding to the original network) to the top (corresponding to the finally reduced network) of the nested organization, the precision of node or block couplings gradually degenerates.
2.2 Defining multiplex structures
The main steps of our strategy for detecting multiplex structures from networks are as follows: 1) simultaneously estimate all kinds of couplings mentioned above and cluster all nodes into nested blocks with the proposed granular blocking algorithm; and 2) in each layer of the nested blocks, recognize structural patterns by matching predefined isomorphism subgraphs in a reduced blocking model in which trivial couplings are neglected, as illustrated in Fig. 1d.
Multiplex structures can be defined in terms of blocks and their couplings. Fig. 2 shows a schematic illustration by means of some conceptual networks. By referring to them, we give the following definitions. A community is a self-coupled block. An authority is a self-coupled block that is coupled by a number of other blocks. A hub is a self-coupled block that couples with a number of other blocks. An outlier is a block without self-coupling that is coupled by a hub or couples with an authority. A bow tie is a subgraph consisting of a block and two sets of blocks and , which satisfy: 1) is coupled by and couples with the blocks of and , respectively; 2) the intersection of and is empty or ; and 3) there are no couplings between and . A multipartite is a subgraph consisting of a set of blocks without self-couplings that reciprocally couple with each other. As a special case of a multipartite, a bipartite is a subgraph consisting of two blocks without self-couplings that are unilaterally or bilaterally coupled with each other. A hierarchical organization is a set of nested blocks, in which block subgraphs in lower layers are directly or indirectly nested in the bigger blocks on higher layers.
The above definitions imply that different patterns may overlap, in the sense that the same block can be involved in several subgraphs simultaneously. For example, a block identified as a community can also be a hub, an authority or the core of a bow tie. Moreover, beyond the predefined patterns, users can define novel and even more complex patterns by designing new subgraphs of blocks, which can be identified by matching their isomorphic counterparts in blocking models.
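The definitions of community, hub, authority and outlier above can be read as simple tests on a reduced blocking model. The following sketch is one possible reading, not the procedure used in this work: the function name `label_blocks` is hypothetical, the input is assumed to be a Boolean matrix in which `coupled[a][b]` is true when block `a` couples with block `b`, and the phrase "a number of other blocks" is interpreted through an assumed parameter `min_partners`.

```python
def label_blocks(coupled, min_partners=2):
    """Assign pattern labels to blocks of a reduced blocking model.

    coupled[a][b] is True when block a couples with block b.
    min_partners is an ASSUMED reading of "a number of other blocks".
    Returns a dict mapping block index to a set of labels, so that
    one block may carry several overlapping labels.
    """
    k = len(coupled)
    labels = {a: set() for a in range(k)}
    # Self-coupled blocks: community, possibly also hub and/or authority.
    for a in range(k):
        if not coupled[a][a]:
            continue
        labels[a].add("community")
        out_partners = sum(1 for b in range(k) if b != a and coupled[a][b])
        in_partners = sum(1 for b in range(k) if b != a and coupled[b][a])
        if out_partners >= min_partners:
            labels[a].add("hub")
        if in_partners >= min_partners:
            labels[a].add("authority")
    # Blocks without self-coupling: outlier if tied to a hub or an authority.
    for a in range(k):
        if coupled[a][a]:
            continue
        by_hub = any("hub" in labels[b] and coupled[b][a]
                     for b in range(k) if b != a)
        to_authority = any("authority" in labels[b] and coupled[a][b]
                           for b in range(k) if b != a)
        if by_hub or to_authority:
            labels[a].add("outlier")
    return labels


# Toy reduced model: block 0 couples with itself and with blocks 1 and 2.
toy = [
    [True, True, True],
    [False, False, False],
    [False, False, False],
]
toy_labels = label_blocks(toy)
```

Returning a set of labels per block reflects the overlap described above: in the toy model block 0 is both a community and a hub, while blocks 1 and 2 are outliers.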
2.3 Granular blocking model
Consider a directed and binary network defined by a set of nodes and a set of directed links. An undirected network is treated as having two directed links, one in each direction, between each pair of connected nodes. The network is represented by its adjacency matrix, a square matrix whose size is the number of nodes.
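The representation described here can be sketched as follows. The helper name `adjacency_matrix` and the use of integer node indices starting at zero are assumptions made for illustration.

```python
import numpy as np


def adjacency_matrix(n, links, directed=True):
    """Build the binary adjacency matrix of a network with n nodes.

    links is an iterable of (i, j) pairs meaning "a link from i to j".
    An undirected link is stored as two directed links, one each way.
    """
    a = np.zeros((n, n), dtype=int)
    for i, j in links:
        a[i, j] = 1
        if not directed:
            a[j, i] = 1
    return a


undirected = adjacency_matrix(3, [(0, 1), (1, 2)], directed=False)
directed = adjacency_matrix(3, [(0, 1), (1, 2)], directed=True)
```

With this convention the adjacency matrix of an undirected network is symmetric, which is the condition used in the appendix to show that the feedforward and feedback couplings coincide.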
Suppose all nodes of the network are divided into blocks, recorded in a binary membership matrix whose entry is one if a node is in a block and zero otherwise. When each block is considered inseparable, the granularity of the network can be measured by the average block size. As this average size increases from one to the number of nodes, the granularity degenerates from the finest to the coarsest. Given the blocking model at a certain granularity, we expect to cluster all its blocks into a reasonable number of clusters so that the nodes of blocks within the same cluster demonstrate homogeneous coupling distributions. A second binary matrix records these clusters: its entry is one if a block is labeled by a cluster and zero otherwise. Since the coupling distributions of nodes within the same cluster are expected to be homogeneous, one can characterize such distributions for each cluster instead of for each node. Given the number of clusters, we define three sets of parameters: the probability that any node of a given cluster expects to couple with a given node; the probability that any node of a given cluster expects to be coupled by a given node; and the prior probability that a randomly selected node belongs to a given cluster. It is easy to show and (see SI).
Let a model be defined with respect to these parameters and the number of clusters. We expect to select an optimal model from its hypothesis space that properly fits the observed network and precisely predicts its behavior at the given granularity, in terms of the node couplings it characterizes. According to the MAP (maximum a posteriori) principle, the optimal model for a given network at a given granularity is the one with the maximum posterior probability. Moreover, by Bayes' rule, the posterior of the model given the network and the granularity is proportional to the product of the likelihood of the network given the model and the granularity and the prior of the model given the granularity.
3 Learning Methods
3.1 Likelihood maximization
We first consider the simplest case by assuming that all the priors of given are equal. In this case, we have: . Let , we have (see SI):
(1) 
where .
(2) 
Considering the expectation of on , we have:
(3) 
where , i.e., the probability that block will be labeled as cluster given and .
Let be a Lagrangian function constructed for maximizing with a constraint , we have:
(4) 
According to Bayes' theorem, we have (see SI):
(5) 
Using a treatment similar to that proposed by Dempster et al. [28], we can prove that a local optimum of Eq. 10 is guaranteed by recursively calculating Eqs. 4 and 12 (see SI). The time complexity of this iterative process is , where is the number of iterations required for convergence, which is usually quite small. An approximate but much faster version with a time is given in the SI.
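To make the alternation between an expectation step and a maximization step concrete, the sketch below implements the simpler mixture model of Newman and Leicht [27], on which the present approach builds. It is explicitly not the granular blocking model of this section: it clusters individual nodes, uses feedforward profiles only, and ignores blocks and granularity. The function name `mixture_em`, the fixed iteration count, the random initialization and the smoothing constant `eps` are all assumptions.

```python
import numpy as np


def mixture_em(a, k, iterations=200, seed=0, eps=1e-12):
    """EM for the Newman-Leicht mixture model (NOT the granular model).

    a : (n, n) binary adjacency matrix, a[i, j] = 1 for a link i -> j.
    k : number of clusters (fixed in advance).
    Returns (pi, theta, q):
      pi[r]       prior probability of cluster r,
      theta[r, j] probability that a link from cluster r points to node j,
      q[i, r]     posterior probability that node i belongs to cluster r.
    """
    rng = np.random.default_rng(seed)
    n = a.shape[0]
    pi = np.full(k, 1.0 / k)
    theta = rng.random((k, n)) + eps
    theta /= theta.sum(axis=1, keepdims=True)
    for _ in range(iterations):
        # E-step: log q[i, r] = log pi[r] + sum_j a[i, j] * log theta[r, j]
        log_q = np.log(pi + eps)[None, :] + a @ np.log(theta + eps).T
        log_q -= log_q.max(axis=1, keepdims=True)  # numerical stability
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)
        # M-step: re-estimate the priors and the link distributions.
        pi = q.mean(axis=0)
        theta = q.T @ a + eps  # proportional to sum_i q[i, r] * a[i, j]
        theta /= theta.sum(axis=1, keepdims=True)
    return pi, theta, q


# Two disconnected triangles as a toy network.
toy_net = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
pi, theta, q = mixture_em(toy_net, k=2)
```

As with any EM procedure, only a local optimum is guaranteed, so in practice several random restarts (different values of `seed`) would be compared by their likelihood.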
3.2 Priori approximation
If the prior of the model is ignored, the likelihood maximization algorithm proposed above suffers from overfitting. That is, will monotonically increase as approaches . In this section, we discuss how to approximately estimate the prior by means of information theory.
Note that , which implies that the coarser the granularity, the smaller . It will be shown in the following that a smaller indicates a lower complexity of . So a coarser granularity prefers simpler models, which can be mathematically written as , where the function measures the complexity of in terms of its parameters. In this work, we set , where is the prior of in the hypothesis space in which can take any value from 1 to . According to Shannon and Weaver [29], is the minimum description length of with a prior in its model space. Let denote the optimal coding schema for , and let be the minimum description length of under this schema. We have: . Now, estimating the prior amounts to designing a coding (or compression) schema as close to as possible.
In terms of the parameters of , i.e., , , and , we have (see SI):
(6) 
where .
Note that matrices and can be compressed into a map , where if the entry of is equal to one. Given , and , node couplings and can be measured by:
(7) 
Eq. 7 says that all node couplings can be approximately characterized by , and . In other words, the compression schema close to that we have found is:
(8) 
Now, we have,
Moreover, we have:
(9) 
Eq. 9 seeks a good trade-off between the accuracy of a model (the precision with which it fits the observed data), measured by the likelihood of the network, and the complexity of the model (which governs its ability to generalize to new data), measured by its optimal coding length.
3.3 Model selection
For a given , the penalty term is a constant, and thus to maximize is to maximize . In our algorithm, is checked systematically from 1 to , and the model with the minimum sum of negative likelihood and penalty is returned as the optimal one. In the landscape of and , a well-shaped curve emerges during the search process (see SI). So, in practice, rather than mechanically checking every candidate value, the search can be stopped once it has safely passed the bottom of the well. This greedy strategy greatly improves the efficiency of our algorithm. The complete algorithm is given in the SI.
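The early-stopping search described above can be sketched as follows. This is an illustration under stated assumptions rather than the algorithm given in the SI: the function name `search_cluster_number` is hypothetical, `cost` stands for any caller-supplied function returning the sum of negative log-likelihood and penalty for a candidate cluster number, and "safely passed the well bottom" is interpreted as an assumed `patience` count of consecutive non-improving candidates.

```python
def search_cluster_number(cost, k_max, patience=3):
    """Scan cluster numbers 1..k_max and stop early past the minimum.

    cost(k)  : penalized cost of the best model with k clusters
               (negative log-likelihood plus penalty), lower is better.
    patience : ASSUMED number of consecutive non-improving values of k
               after which the search is considered past the well bottom.
    Returns (best_k, best_cost).
    """
    best_k, best_cost = None, float("inf")
    since_best = 0
    for k in range(1, k_max + 1):
        c = cost(k)
        if c < best_cost:
            best_k, best_cost, since_best = k, c, 0
        else:
            since_best += 1
            if since_best >= patience:
                break  # the curve has turned upward for long enough
    return best_k, best_cost


# Toy well-shaped cost with its bottom at k = 4.
evaluated = []

def toy_cost(k):
    evaluated.append(k)
    return (k - 4) ** 2 + 1

best = search_cluster_number(toy_cost, k_max=20)
```

A larger `patience` makes the search more robust to small bumps in the curve at the price of evaluating more candidates; the strategy is greedy and assumes that the curve has a single well.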
3.4 Hierarchy construction
Assume that a number of layers have been constructed, each corresponding to a blocking model. We now want to construct the next layer by selecting the model with the maximum posterior given the blocking models on all existing layers. We have shown (see SI) that this posterior depends only on the most recent layer. This Markov-like property indicates that the new model to be selected for the next layer is based only on the information of the layer immediately below it. So, for a given network, its hierarchical organization can be constructed incrementally as follows.
First, we construct the first layer by taking each node as one block and clustering these blocks by selecting a model with the maximum posterior. Next, we encapsulate each cluster on the first layer as one block and cluster these blocks by selecting a new model with the maximum posterior, which forms the second layer. We repeat the same procedure until it converges (the number of clusters obtained stays fixed). In this way, the number of layers of the hierarchical organization is determined automatically. The above procedure can be seen as a semi-supervised learning process: when the granularity is greater than one, we already know a priori which nodes must be within the same clusters. As the layer in the constructed hierarchy increases, the homogeneity of the connection profiles of the nodes within the same cluster keeps degenerating, and correspondingly more global patterns can be observed by tolerating such increasing diversity.
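The incremental construction above amounts to a simple loop around whatever clustering step is used on each layer. The sketch below shows only that control flow; the clustering step itself is passed in as a hypothetical callable `cluster_blocks`, and the name `build_hierarchy` and the safety limit `max_layers` are likewise assumptions.

```python
def build_hierarchy(n_nodes, cluster_blocks, max_layers=50):
    """Build nested layers bottom-up until the cluster number stops changing.

    cluster_blocks(blocks) must return one cluster label per block, where
    blocks is a list of node lists. It stands in for the model-selection
    step of the previous sections and is supplied by the caller.
    Returns a list of layers; each layer is a list of clusters (node lists).
    """
    blocks = [[i] for i in range(n_nodes)]  # first layer: one node per block
    layers = []
    for _ in range(max_layers):
        labels = cluster_blocks(blocks)
        merged = {}
        for block, label in zip(blocks, labels):
            merged.setdefault(label, []).extend(block)
        clusters = [sorted(nodes) for nodes in merged.values()]
        if layers and len(clusters) == len(blocks):
            break  # converged: the cluster number stays fixed
        layers.append(clusters)
        blocks = clusters  # each cluster becomes one block of the next layer
    return layers


def pair_up(blocks):
    """Toy clustering rule: merge consecutive blocks until two remain."""
    if len(blocks) <= 2:
        return list(range(len(blocks)))
    return [i // 2 for i in range(len(blocks))]


layers = build_hierarchy(8, pair_up)
```

Because the blocks of a higher layer are never split again, nodes grouped together on one layer stay together on all higher layers, which is the prior knowledge referred to in the semi-supervised interpretation above.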
3.5 Isomorphism subgraph matching
Based on the blocks obtained at a given level of the hierarchy, all potential patterns hidden in that level can be revealed through an isomorphism subgraph matching procedure, whose input is the block feedforward-coupling matrix. First, we construct a reduced blocking model by taking each block as one node and, for each ordered pair of blocks, inserting a link from the first to the second if the corresponding block coupling is above a computed threshold (see SI). Then, for each block, we pick out the matched isomorphism subgraphs in which it is involved and put them into categorized reservoirs labeled by pattern. During this procedure, a subgraph to be put into a reservoir is discarded if it is a subgraph of an existing one, as illustrated in Fig. 3d.
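The first step, building the reduced blocking model, can be sketched as a thresholding of the block coupling matrix. How the threshold is computed is described in the SI and is not reproduced here, so the sketch simply takes it as an argument; the function name `reduced_block_graph` and the numerical values in the example are made up for illustration.

```python
import numpy as np


def reduced_block_graph(block_coupling, threshold):
    """Keep only the non-trivial couplings between blocks.

    block_coupling[p, q] is the coupling from block p to block q.
    threshold is ASSUMED to be given (its computation is in the SI).
    Returns the directed links (p, q) of the reduced blocking model,
    including self-links (p, p), which mark self-coupled blocks.
    """
    b = np.asarray(block_coupling, dtype=float)
    rows, cols = b.shape
    return [(p, q) for p in range(rows) for q in range(cols)
            if b[p, q] > threshold]


# Illustrative couplings between two blocks (values are invented).
couplings = [
    [0.50, 0.40],
    [0.02, 0.03],
]
reduced_links = reduced_block_graph(couplings, threshold=0.1)
```

In this invented example the first block keeps a self-link and a link to the second block, so under the definitions of Section 2.2 it is self-coupled and couples with another block, while the second block has no self-coupling.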
4 Applications
4.1 Exploring the cash flow patterns of the world trade system
The discovered multiplex structures and their granular couplings can be used to understand some of the dynamics of networks. Here we give one example showing how cash may flow through a world trade network. Fig. 3a shows a directed network encoding the trade relationships among eighty countries in 1994, originally constructed by de Nooy et al. [30] from the commodity trade statistics published by the United Nations. Nodes denote countries, and each arc denotes that one country imported high-technology products or heavy manufactures from another. Analogous to the “structural world system positions” initially suggested by Smith and White based on their analysis of international trade from 1965 to 1980 [31], the eighty countries in 1994 were categorized into three classes according to their economic roles in the world: core, semiperiphery and periphery [30]. Accordingly, in the visualized network, the countries with these labels are distributed along three circles from inside to outside, respectively.
A two-layer hierarchical organization has been constructed by our system, as illustrated in Fig. 3d. The first layer shows a reduced blocking model, obtained by neglecting trivial couplings, in which six blocks are found. Referring to the matrix of the network presented in Fig. 3b, one can observe that the nodes within the same block demonstrate homogeneous row as well as column connection profiles. From their couplings, ten isomorphism subgraphs of the patterns defined in Fig. 2 are recognized: one authority, four communities, three hubs and two outliers, as shown in Fig. 3e. Quite a few interesting things can be read from these uncovered multiplex structures. Globally, the countries near the center tend to have larger out-degrees, while those far from the center have smaller or even zero out-degrees. Specifically, (1) ranked by coupling strength from strong to weak, the three detected hubs are blocks 4, 1 and 3. The first two hubs constitute the “core” of the trade system except for Spain and Denmark; (2) the countries of blocks 3, 2 and 6 constitute the backbone of the “semiperiphery”; (3) more than half of the “periphery” countries (10 of 17) have zero out-degree; (4) interestingly, the detected blocks are also geography-related, as illustrated in Fig. 3c. Most countries of hubs 4 and 1 are from western Europe, except for America, China and Japan; most of hub 3 are from southeastern Asia; most of community block 6 are from North or South America; most of outlier block 2 are from eastern Europe; most of outlier block 5 are from Africa or some areas of Asia.
In the second layer, a macroscopic hub-outlier pattern with strong couplings is recognized. Hub blocks 4 and 1 in the first layer collectively form a more global hub of the whole network, the core of the entire trade system; the other blocks form a more global outlier of the network, corresponding to the semiperiphery and periphery of the trade system. These nested hub-outlier patterns perhaps give some evidence of how cash flowed through the world at different levels in 1994. Note that an arc denotes that one country imported commodities from another, which also indicates that the cash spent has flowed from the importing country to the exporting one. In this way, one can imagine cash flowing along these arcs from one country to another. According to the global pattern in the second layer, the dominant cash flux flows from the core countries to themselves with a probability of 0.6, and to the rest with a probability of 0.57. Locally, the blocking model in the first layer shows the backbone of the cash flow through the entire world, with the respective strengths given by the block couplings, as illustrated in Fig. 3d.
4.2 Mining granular association rules from networks
When a network encodes the co-occurrence of events, its underlying node-coupling and block-coupling matrices imply the probabilistic associations among these events at different granular levels, respectively. Formally, we have: node association rule (NAR): , and block association rule (BAR): . An NAR means that one event would happen with a certain probability given that another event happens. A BAR means that any event of one block would happen with a certain probability given that any event of another block happens. Such association rules with different granularities can be used for prediction in a wide range of applications, such as online recommender systems. We demonstrate this idea with a political book co-purchasing network compiled by V. Krebs, shown in Fig. 4(a), where nodes represent books about US politics sold by the online bookseller Amazon.com, and edges connect pairs of books that are frequently co-purchased, as indicated by the “customers who bought this book also bought these other books” feature on Amazon. Moreover, these books have been labeled as “liberal”, “neutral” or “conservative” according to their stated or apparent political alignments, based on the descriptions and reviews of the books posted on Amazon.com [32].
A two-layer hierarchical organization has been detected by our system, as shown in Fig. 4. The blocking model of the first layer is shown in Fig. 4(b). By matching isomorphism subgraphs in its reduced blocking model, nine patterns are recognized, which respectively are five communities (blocks 1, 2, 4, 6, 7), two cores (blocks 2 and 7), two outliers (blocks 3 and 5) and a bow tie (blocks 1, 2, 3). Note that, in undirected networks, the core of a bow tie (block 2 in this case) can be seen as the overlapping part of its two wings (blocks 1 and 3 in this case) when the direction of links is neglected. In the second layer, a macroscopic community structure is recognized, as shown in Fig. 4(c). Interestingly, the left and right communities can be globally labeled as “left-wing” and “right-wing” according to the types of books they respectively contain. Such a global pattern can be seen as one macroscopic BAR: books with the same label are co-purchased with a high probability (about 15%), while those with different labels are rarely co-purchased (with a probability of only 1%).
Zooming in on the two global communities in the second layer, one obtains the mesoscopic BARs encoded by the block-coupling matrix , as illustrated by the weighted arrows in Fig. 4(b). As an example, we list the BARs related to block 2 in decreasing order of association strength. ; ; ; ; ; ; . Such mesoscopic association rules would help booksellers adjust their selling strategies adaptively, determining which kinds of stock to increase or decrease based on the statistics of past sales. For example, if they find that the books of block 2 sell well, they may correspondingly increase their orders of books of blocks 1 and 3 in addition to block 2, while decreasing their orders of books of blocks 5 or 7.
More specifically, with the aid of NARs, booksellers would be able to estimate the chance that customers will spend money on one book if they have already bought another, by referring to the corresponding node coupling. Such microscopic rules would suggest to booksellers which specific books should be listed, and in what order, in the recommendation area of the web page advertising a book. For example, for the book “The Price of Loyalty”, the books most worth recommending are listed as follows in order of their coupling strength to it: Big Lies; Bushwhacked; Plan of Attack; The Politics of Truth; The Lies of George W. Bush; American Dynasty; Bushwomen; The Great Unraveling; Worse Than Watergate.
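Turning node couplings into a recommendation list is a matter of sorting one row of the coupling matrix. The sketch below assumes that such a row is already available; the function name `top_recommendations`, the placeholder titles and the coupling values are hypothetical and are not taken from the book network analyzed here.

```python
def top_recommendations(coupling_row, titles, source_index, k=3):
    """Rank other items by their coupling strength to a source item.

    coupling_row[j] : coupling from the source item to item j
                      (one row of a node-coupling matrix, ASSUMED given).
    titles[j]       : display name of item j.
    source_index    : index of the source item, excluded from the result.
    Returns the k titles with the strongest couplings, strongest first.
    """
    candidates = [j for j in range(len(titles)) if j != source_index]
    ranked = sorted(candidates, key=lambda j: coupling_row[j], reverse=True)
    return [titles[j] for j in ranked[:k]]


# Placeholder data: four items and invented coupling values.
titles = ["Book A", "Book B", "Book C", "Book D"]
row_for_a = [0.0, 0.1, 0.4, 0.3]
suggested = top_recommendations(row_for_a, titles, source_index=0, k=2)
```

The same ranking applied to a row of the block-coupling matrix would give the coarser, block-level ordering used for the BARs above.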
5 Conclusions
In this work, we have shown through examples that the structural patterns coexisting in the same real-world complex network can be miscellaneous, overlapping or nested, and that they collaboratively shape a heterogeneous hierarchical organization. We have proposed a framework based on the concept of granular couplings and the granular blocking model to uncover such multiplex structures from networks. From the patterns, hierarchies and granular couplings generated by our approach, one can analyze or even predict some of the dynamics of networks, which is helpful for both theoretical studies and practical applications.
Moreover, based on the rationale behind this work, we suggest that the evolution of a real-world network may be driven by the co-evolution of its structural patterns and its underlying couplings. On one hand, statistically significant patterns would be gradually highlighted and emergently shaped by the aggregation of individuals that are homogeneous in terms of their couplings. On the other hand, individuals would adaptively adjust their respective couplings according to the currently evolved structural patterns.
References
 [1] Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang DU(2006) Complex networks: Structure and dynamics. Physics Reprots 424:175308.
 [2] Milo R, Orr SS, Itzkovitz S, Kashtan N, Chklovskii D, Alon U(2002) Network Motifs: simple building blocks of complex networks. Science 298:824827.
 [3] Girvan M, Newman MEJ(2002) Community structure in social and biological networks. Proc Natl Acad Sci USA 9:78217826.
 [4] Watts DJ, Strogatz SH(1998) Collective Dynamics of SmallWorld Networks. Nature 393:440442.
 [5] Barabasi AL, Albert R(1999) Emergence of Scaling in Random Networks. Science 286:509512.
 [6] Fortunato S (2010) Community detection in graphs. Physics Reports 486:75-174.
 [7] Holme P, Liljeros F, Edling CR, Kim BJ (2003) Network bipartivity. Phys Rev E 68:056107.
 [8] Guillaume JL, Latapy M (2004) Bipartite structure of all complex networks. Inform Process Lett 90:215-221.
 [9] Brady A, Maxwell K, Daniels N, Cowen LJ (2009) Fault Tolerance in Protein Interaction Networks: Stable Bipartite Subgraphs and Redundant Pathways. PLoS ONE 4:e5364.
 [10] Kleinberg JM (1999) Authoritative Sources in a Hyperlinked Environment. Journal of the ACM 46:604-632.
 [11] Albert R, Jeong H, Barabasi AL (2000) The Internet’s Achilles heel: Error and attack tolerance of complex networks. Nature 406:378-382.
 [12] Sporns O, Honey C, Kotter R (2007) Identification and classification of hubs in brain networks. PLoS ONE 2:e1049.
 [13] Broder A, Kumar R, Maghoul F, Raghavan P, Rajagopalan S, Stata R, Tomkins A, Wiener J (1999) Graph structure in the Web. Comput Netw 33:309-320.
 [14] News Feature (2000) The web is a bow tie. Nature 405:113.
 [15] Ma HW, Zeng AP (2003) The connectivity structure, giant strong component and centrality of metabolic networks. Bioinformatics 19:1423-1430.
 [16] Palla G, Derenyi I, Farkas I, Vicsek T (2005) Uncovering the overlapping community structures of complex networks in nature and society. Nature 435:814-818.
 [17] Knuth DE (1993) The Stanford GraphBase: A Platform for Combinatorial Computing (Addison-Wesley, Reading, MA).
 [18] Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabasi AL (2002) Hierarchical organization of modularity in metabolic networks. Science 297:1551-1555.
 [19] Zhou C, Zemanova L, Zamora G, Hilgetag CC, Kurths J (2006) Hierarchical Organization Unveiled by Functional Connectivity in Complex Brain Networks. Phys Rev Lett 97:238103.
 [20] Pardo MS, Guimera R, Moreira AA, Amaral LAN (2007) Extracting the hierarchical organization of complex systems. Proc Natl Acad Sci USA 104:7821-7826.
 [21] Clauset A, Moore C, Newman MEJ (2008) Hierarchical structure and the prediction of missing links in networks. Nature 453:98-101.
 [22] Kemp C, Tenenbaum JB (2008) The discovery of structural form. Proc Natl Acad Sci USA 105:10687-10692.
 [23] Lorrain F, White HC (1971) Structural equivalence of individuals in social networks. J Math Sociol 1:49-80.
 [24] White DR, Reitz KP (1983) Graph and semigroup homomorphism on networks of relations. Social Networks 5:193-235.
 [25] Fienberg SE, Wasserman S (1981) Categorical data analysis of single sociometric relations. Sociological Methodology 12:156-192.
 [26] Holland PW, Laskey KB, Leinhardt S (1983) Stochastic blockmodels: Some first steps. Social Networks 5:109-137.
 [27] Newman MEJ, Leicht EA (2007) Mixture models and exploratory analysis in networks. Proc Natl Acad Sci USA 104:9564-9569.
 [28] Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Statist Soc B 39:185-197.
 [29] Shannon CE, Weaver W (1949) The mathematical theory of communication (University of Illinois Press, Urbana).
 [30] Nooy WD, Mrvar A, Batagelj V (2004) Exploratory social network analysis with Pajek (Cambridge University Press).
 [31] Smith DA, White DR (1992) Structure and Dynamics of the Global Economy: Network Analysis of International Trade 1965-1980. Social Forces 70:857-893.

 [32] Newman MEJ (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74:036104.
Appendix A Proofs and algorithms
Proposition 1.
For an undirected network, the feedforward-coupling matrix is equal to the feedback-coupling matrix.
Proof.
where denote the event that node couples with , and denote the event that is labeled by cluster ; if is labeled by cluster , otherwise . So we have
.
where denotes the event that node expects to be coupled by . So we have
.
If the adjacency matrix is symmetric, then from Eq. 4 in the article we have
.
So we have . ∎
Proposition 2.
(10) 
where .
Proof.
Let denote the event that a node with linkage structure is observed in network . Let denote the event that the cluster label assigned to a node is equal to . Let denote the event that node links to node or not, depending on . Let denote the event that node is linked by node or not, depending on . We have:
∎
Proposition 3.
(11) 
Proof.
Let denote the cluster label assigned to node under the given partition , we have:
.
∎
Note that in the proofs of Propositions 2 and 3, all probabilities, such as and , are conditioned on and . We omit these conditions to simplify the equations, without loss of correctness.
Proposition 4.
(12) 
Proof.
Let be the probability that node belongs to cluster given and . We have:
where is the probability of selecting node from block .
According to Bayes' theorem, we have:
.
Based on the proof of Prop.2, we have:
.
So, we have Eq.12. ∎
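For orientation, the Bayes step used in this proof has the generic shape sketched below. The symbols are placeholders chosen here and are not the article's notation: $z_i$ is the cluster label of node $i$, $x_i$ its observed linkage pattern, and $\omega_k$ the prior weight of cluster $k$.

```latex
% Generic Bayes posterior for a cluster label (illustrative notation only)
P(z_i = k \mid x_i)
  = \frac{\omega_k \, P(x_i \mid z_i = k)}
         {\sum_{l} \omega_l \, P(x_i \mid z_i = l)}
```

The denominator is the marginal probability of the observed linkage pattern, so the posteriors over all clusters sum to one for each node.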
As an approximate version of Eq.12, we have:
(13) 
where denotes a node selected at random from block .
That is, instead of averaging over all nodes in block , the true value of can be approximated using a single randomly selected node from block .
Correspondingly, an approximate version of the log-likelihood of Eq. 10 is given by:
(14) 
where denotes the size of block .
The time needed to compute Eqs. 12 and 10 is bounded by . By contrast, the time needed to compute Eqs. 13 and 14 is bounded by . The approximation is therefore much more efficient for constructing the hierarchical organizations of networks.
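The trade-off between the exact and approximate versions can be illustrated with a minimal sketch. It assumes a hypothetical array `node_values` holding some per-node quantity (a stand-in for the term averaged in Eq. 12, whose exact form is not reproduced here) and a hypothetical index array `block`; neither name comes from the article.

```python
import numpy as np

def block_average(node_values, block):
    """Exact block statistic: average over every node in the block, O(|block|)."""
    return float(np.mean(node_values[block]))

def block_estimate(node_values, block, rng):
    """Approximate block statistic from one randomly selected node, O(1)."""
    representative = rng.choice(block)
    return float(node_values[representative])

rng = np.random.default_rng(0)
node_values = rng.random(1000)   # placeholder per-node quantities (assumption)
block = np.arange(200, 260)      # placeholder block of 60 node indices (assumption)

exact = block_average(node_values, block)
approx = block_estimate(node_values, block, rng)
```

A single draw is an unbiased but noisy estimate of the block average; the approximation is most accurate when the nodes of a block have similar values, which is what a good blocking is meant to achieve.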
Theorem 1.
Iteratively computing Eqs. 4 and 5 in the article is guaranteed to converge to a local optimum of Eq. 10.
Proof.
From Proposition 2, we have:
(by Jensen’s inequality)
.
Furthermore, we have:
.
Let , we have:
.
So, we have:
.
Recall that the , and of can be computed in terms of by Eq. 4 in the article. So we have:
.
Recalling that , we have:
.
That is to say, the obtained in the current iteration is no worse than the obtained in the previous iteration. This proves the theorem. ∎
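The monotone improvement stated by the theorem is a general property of EM iterations, and it can be checked numerically. The sketch below does not implement Eqs. 4 and 5 of the article, which are not reproduced here. As a stand-in it uses the simpler mixture model of Newman and Leicht [27], and the function name `em_mixture` and the toy graph are illustrative assumptions.

```python
import numpy as np

def em_mixture(A, n_clusters, n_iter=30, seed=0):
    """EM for the Newman-Leicht mixture model on adjacency matrix A.

    Returns the soft memberships q (n x c) and the log-likelihood
    recorded at every iteration.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    q = rng.random((n, n_clusters))
    q /= q.sum(axis=1, keepdims=True)          # random soft assignment
    trace = []
    for _ in range(n_iter):
        # M-step: cluster weights and per-cluster link preferences
        pi = q.mean(axis=0)                    # shape (c,)
        theta = q.T @ A                        # sum_i q_ir * A_ij, shape (c, n)
        theta /= theta.sum(axis=1, keepdims=True)
        # log of pi_r * prod_j theta_rj^A_ij for every node i and cluster r
        log_theta = np.log(np.maximum(theta, 1e-300))
        log_joint = np.log(pi) + A @ log_theta.T          # shape (n, c)
        # log-likelihood of the current parameters (stable log-sum-exp)
        m = log_joint.max(axis=1, keepdims=True)
        log_norm = m + np.log(np.exp(log_joint - m).sum(axis=1, keepdims=True))
        trace.append(float(log_norm.sum()))
        # E-step: posterior membership of each node
        q = np.exp(log_joint - log_norm)
    return q, trace

# Toy graph (assumption): two 4-node cliques joined by a single edge.
n = 8
A = np.zeros((n, n))
for group in (range(0, 4), range(4, 8)):
    for i in group:
        for j in group:
            if i != j:
                A[i, j] = 1.0
A[3, 4] = A[4, 3] = 1.0

q, trace = em_mixture(A, n_clusters=2, n_iter=20)
```

Each entry of `trace` should be no smaller than the previous one, up to floating-point rounding, which mirrors the inequality chain of the proof: the E-step makes the Jensen bound tight and the M-step maximizes that bound.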
Proposition 5.
In terms of the parameter of , , , and , we have:
(15) 
where .
Proof.
We have
where denotes that node is in a cluster of size , and is the probability of selecting node from cluster . Furthermore, we have:
.
Similarly, we have:
.
So, we have
.
∎
Algorithm 1.
Searching for the optimal model given and
=GBM(,)
01. ;
02. ;
03. for =2: