Constant State of Change: Engagement Inequality in Temporal Dynamic Networks

10/03/2019 ∙ by Hadar Miller, et al.

The temporal changes in complex systems of interactions have excited the research community in recent years, as they offer insight into the systems’ dynamics and evolution. From the collective dynamics of organizations and online communities to the spreading of information and fake news, temporal dynamics are fundamental to the understanding of complex systems. In this work, we quantify the level of engagement in dynamic complex systems of interactions, modeled as networks. We focus on interaction networks for which the dynamics of the interactions are coupled with that of the topology, such as online messaging, forums, and emails. We define two indices to capture the temporal level of engagement: the Temporal Network (edge) Intensity index, and the Temporal Dominance Inequality index. Our surprising results are that these measures are stationary for most measured networks, regardless of vast fluctuations in the size of the networks over time. Moreover, more than 80% of the weekly changes in the index values are bounded by less than 10%. The indices are stable along the temporal evolution of a network but differ between networks, and a classifier can determine the network the temporal indices belong to with high success. We find an exception in the Enron management email exchange during the year before its disintegration, in which both indices show high volatility throughout the inspected period.

1 Introduction

Dynamic complex systems of interactions are often modeled as a sequence of snapshots of networks in time holme2012temporal . While this is a rather simplistic representation, it is widely accepted that the structural properties of a network play a significant role in determining its actors’ behavior granovetter1983strength ; burt2000network ; haynie2001delinquent ; spencer2003global ; snijders2005models ; kossinets2006empirical ; perra2012activity . The last decade’s abundance of temporal information paved the path to a further understanding of the dynamics of networks lazer2009life ; artime2017dynamics and the effect on their actors fowler2008dynamic ; phelps2010longitudinal ; hellmann2014evolution ; ilany2015topological .

The intensity of interactions, also referred to as tie strength, has long been recognized as a fundamental property barrat2007architecture . Human contacts are of different durations barabasi2005origin ; onnela2007structure ; human relationships are of varying strength granovetter1973strength ; human flight fluxes differ across routes opsahl2008prominence ; and more. The heterogeneity in edge intensity, i.e., the duration, strength, or capacity of the above interactions, has been modeled utilizing edge weights and weighted networks barrat2007architecture ; barrat2004architecture ; newman2004analysis ; opsahl2010node . The intensity of interactions opsahl2008prominence has been used in a variety of applications, such as aiding the assessment of the level of conflict within organizations nelson1989strength , and the understanding of human communication patterns gilbert2009predicting ; miritello2011dynamical .

Here, we utilize weighted networks modeling to research temporal indices of engagement, such as average intensity and participation inequality in online person-to-person interaction networks, termed connection networks holme2012temporal . Connection networks may refer to organizational email networks, online forums and messaging apps, and online discussions eckmann2004entropy ; sun2016predicting .

Temporal measures of engagement are of interest as they give a measure of member participation, interest, influence, dominance, and more. In organizations, where frequent changes were found to be the norm burke2017organization , following the temporal intensity and dominance of the interactions can help in identifying fluctuations in involvement and engagement before, during, and after a planned organizational change, as well as in assessing the reactions to a shock. These temporal measures are also of interest in the case of engagement in online social networks, where participation was found to be dominated by a few nielsen2006participation . Recent studies, however, found that participants change their active role in the network and their engagement over time sonnenbichler2010community . Currently, it is unclear whether these changes affect the temporal measures of network activity.

To study the temporal behavior of a network, we define indices of average connection intensity and nodal dominance inequality in temporal networks and measure these quantities over several real-world networks. Surprisingly, we find stationary behavior of networks over time, regardless of massive fluctuations in their size. Our results demonstrate that networks converge to a steady state of engagement, regardless of significant variations in the number of participants. Deviations from the steady state are rare and do not correlate with a change in size.

Of specific interest is the case of the Enron managers email network. The dataset was released by court order after the company had disintegrated, and has recently been used, together with the known set of events, for change point detection (CPD) schemes Peel2015 ; miller2018size . Unlike anomaly detection techniques that scan for temporal fluctuations from the norm, CPD schemes try to infer the points in time when networks change their norm; these are termed points of change. We find that throughout the period inspected in the Enron managers email network, both indices cannot be seen as stationary, and the fluctuations in the network’s temporal indices are significantly higher than the ones we found for all other networks.

Our results show that networks differ in the engagement indices we defined, and can be differentiated by them. To further verify this result, we ran a classification experiment over the weekly indices, and found that the classifier can assign the index tuples to their corresponding network with high accuracy.

Our surprising yet robust results have implications for the inference of the behavior of complex systems over time and the dynamics of networks. Of interest is the understanding of the origin of the different engagement indices between networks, and whether they can be utilized to characterize networks. The robustness of the result across size changes in the networks is of importance for the understanding of stationary properties in networks, and their implications for dynamic systems and collective behavior.

2 Related

Complex systems of interacting elements, from human (social and organizational) to physical and biological ones, can be modeled as interaction networks, with nodes representing the elements and edges representing their interactions. When the interactions are dynamic, e.g., human and social interactions, a complete model that captures the longitudinal evolution of the system comprises a sequence of networks, each portraying a snapshot of the system at a single point in time. Other models do exist holme2012temporal . In this work, we follow the modeling of temporal sequential periods similarly to pan2011path .

Temporal networks have been viewed in recent years as a natural way to investigate dynamical systems utilizing networks holme2012temporal ; artime2017dynamics ; sekara2016fundamental ; li2017fundamental , where “the system under study should consist of agents that interact pairwise, so that the interactions have both some degree of randomness and some regularity” gautreau2009microdynamics ; holme2012temporal . Dynamic online interactions have been studied to model conflicts yasseri2012dynamics , temporal ego networks, and the strength of links over time karsai2014time .

In this work, we model the dynamics of electronic one-to-one communication such as emails and instant messages. The case of online forums can be considered one-to-many communication holme2012temporal , yet in this work it is modeled utilizing the replies and hence also as a form of one-to-one communication.

Temporal networks of electronic messages have been investigated mainly in the context of information spreading and contagion rodriguez2011uncovering ; gomez2012inferring ; rosvall2014memory ; nadini2018epidemic . Structural dynamics and properties of temporal networks have also received much attention, such as temporal path lengths, centrality, community, and motif measures pan2011path ; perra2012activity ; kovanen2011temporal ; taylor2017eigenvector .

Complex networks of interactions are dynamic and heterogeneous by nature corrado2019dynamics . One of the cornerstones of heterogeneity is the nodal degree, or in weighted networks, node intensity. Intensity patterns are heterogeneous, with a few nodes having a significantly higher degree or intensity level, and hence being more dominant in the network barrat2004architecture ; barrat2007architecture ; opsahl2008prominence ; corrado2019dynamics . Dominance in systems mostly refers to the dominant role of its members. In social networks of interactions, groups of roles are inferred by analyzing the structure of networks rossi2015role ; gupte2017role ; costa2018mining . Studies found that in online social networks the most prominent group is that of active influencers, estimated at merely 1% of the members, while accounting for almost all the network activity nielsen2006participation . Role groups differ in size. Nielsen nielsen2006participation found that most online communities have highly unequal role group sizes, with 90% of members never contributing, 9% that contribute little, and 1% that account for almost all network activity. Interestingly, roles are temporal and members often transition between roles sonnenbichler2010community .

Hence, we continue to define measures of engagement in networks, and explore their temporal nature. For a suggested organizational change, for example, such measures can determine levels of engagement in the change: If communication inequality is low, then many participate in discussions. If inequality is high, only a few dominate the conversation and are actively involved. The intensity of the conversations can be identified by comparing to the intensity in other periods.

3 Network Intensity Measures

Figure 1: Two networks with three nodes and weighted edges. The size of the edges corresponds to their weights. The panels present two different interaction patterns and edge intensities.

We are interested in capturing both the average intensity of interactions, regardless of the number of different interactions, and the variance of interactions in a network, which is a measure of inequality. A measure of the average intensity of the edge interactions in a network differs from the average node strength, as the measure should not favor the number of active connections a node has. Measures of nodal strength favor nodes that have many active connections. Additionally, we suggest measuring the inequality of nodal interaction in a network. Figure 1 illustrates two networks, each consisting of three nodes and their interactions. In the examples illustrated, on the left (a) node B interacts intensively with A and C, while on the right (b) all three nodes communicate with each other at the same intensity. An estimate of a network's average intensity level should account for the number of active connections in the network. In the case of Network (a) there are only two such connections, and in the case of Network (b) there are three. We devise indices that would show that in Network (a) the average intensity is four, while the nodes show high inequality, and in Network (b) the average edge intensity is two, and the nodes engage equally. To the best of our knowledge, current measures do not capture the intensity and inequality of Network (a) as described here.

3.1 Average Interaction Intensity

We describe here a measure for deriving a network average edge intensity level. To compute the average edge intensity in a network, we build upon a measure devised for nodes in a weighted network opsahl2010node . This measure allows considering for each node not only the number of nodes in the network it interacts with but also the intensity level of these interactions:

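$$C_D^{w\alpha}(i) = k_i \times \left(\frac{s_i}{k_i}\right)^{\alpha} = k_i^{(1-\alpha)} \times s_i^{\alpha} \tag{1}$$

Where $\alpha$ is the tuning parameter, $k_i$ is the number of nodes the focal node $i$ is connected to, and $s_i$ is its weighted degree, computed by:

$$s_i = \sum_{j=1}^{N} w_{ij} \tag{2}$$

Where $N$ is the total number of nodes in this network, and $w_{ij}$ is a non-zero value for the strength of the edge that emanates from the focal node $i$ to node $j$ (taken as zero when no such edge exists).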

The tuning parameter, $\alpha$, determines the relative importance of each of these parts. When $\alpha = 0$, the edge strength is ignored and only the edge's existence is taken into account, resulting in a measure that is similar to the one in freeman1978centrality . Conversely, when $\alpha = 1$, only the edge weights are considered, while the binary structure is not opsahl2010node .
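To make the role of the tuning parameter concrete, the following is a minimal sketch of the node-level measure, assuming the form used by Opsahl et al. (number of neighbours multiplied by the mean edge weight raised to the tuning parameter). The function name `tuned_degree` and the example weights are hypothetical and chosen only for illustration.

```python
def tuned_degree(weights, alpha):
    """Degree centrality with a tuning parameter for one node.

    weights: the non-zero weights of the edges attached to the node.
    alpha:   0 counts neighbours only, 1 sums edge weights only.
    """
    k = len(weights)          # number of neighbours (binary degree)
    if k == 0:
        return 0.0            # an isolated node has no engagement
    s = sum(weights)          # weighted degree (node strength)
    return k * (s / k) ** alpha


# Hypothetical node with three neighbours and edge weights 1, 2 and 5.
node_weights = [1, 2, 5]
degree_only = tuned_degree(node_weights, alpha=0)    # 3 neighbours
strength_only = tuned_degree(node_weights, alpha=1)  # total weight 8
balanced = tuned_degree(node_weights, alpha=0.5)     # sqrt(3 * 8)
```

Intermediate values of the tuning parameter trade off the two views: at 0.5 the measure is the geometric mean of the binary degree and the weighted degree.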

Taking a network-wide approach, we continue and define the weighted sum of the node degrees given the tuning parameter as follows:

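$$S^{\alpha}(G) = \sum_{i=1}^{N} C_D^{w\alpha}(i) = \sum_{i=1}^{N} k_i^{(1-\alpha)} s_i^{\alpha} \tag{3}$$

$S^{\alpha}(G)$, with $k_i$ the binary degree and $s_i$ the weighted degree of node $i$, is a metric that, depending on the chosen value of the tuning parameter $\alpha$, describes with a scalar the weighted sum of the network degrees. Specifically, when the tuning parameter is set to zero the metric corresponds to the number of edges in the graph; alternatively, when the tuning parameter is set to one the metric corresponds to the sum of all edge weights in the network, that is, the overall intensity of interactions in a network. In an undirected graph each edge is counted once from each endpoint in both cases.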

We then propose a level of intensity index for networks that is the ratio between the overall intensity of edge interactions in the network and the binary number of edges. We formally propose the following index:

$$I(G) = \frac{S^{1}(G)}{S^{0}(G)} \tag{4}$$

That is, the index is the weighted sum of node degrees with the tuning parameter set to one, divided by the same sum with the tuning parameter set to zero. $I(G) \geq 1$ holds for all graphs whose edge weights count interactions. In the case where edge weights are based on a ratio scale opsahl2009clustering , $I(G)$ is bounded by that ratio. Otherwise, it is unbounded. When $I(G) \to 1$, the network intensity level is very low, and the vast majority of edges have a low weight. In social networks of interactions, low intensity corresponds to a low number of interactions between any two members in the network. Accordingly, when $I(G) \gg 1$, the network intensity level is high. High intensity, in this case, implies the existence of edges representing interactions of high volume, also referred to as strong ties.
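As an illustration, the sketch below computes the intensity index as the ratio described above (sum of weighted degrees over sum of binary degrees) for the two networks of Figure 1. The edge weights (4 on each edge of Network (a), 2 on each edge of Network (b)) are assumptions chosen to match the averages quoted in the text, and the function name `network_intensity` is hypothetical.

```python
def network_intensity(edges):
    """Average edge intensity: total edge weight / number of edges.

    edges: dict mapping an undirected pair (u, v) to a positive weight.
    """
    if not edges:
        raise ValueError("intensity is undefined for a graph with no edges")
    strength = {}   # weighted degree of each node
    degree = {}     # number of neighbours of each node
    for (u, v), w in edges.items():
        for node in (u, v):
            strength[node] = strength.get(node, 0) + w
            degree[node] = degree.get(node, 0) + 1
    # Tuning parameter 1 gives the sum of strengths, 0 the sum of degrees.
    # Every edge is counted twice in both sums, so the factor cancels.
    return sum(strength.values()) / sum(degree.values())


# Assumed weights for the two panels of Figure 1.
network_a = {("A", "B"): 4, ("B", "C"): 4}
network_b = {("A", "B"): 2, ("B", "C"): 2, ("A", "C"): 2}

intensity_a = network_intensity(network_a)  # 4.0
intensity_b = network_intensity(network_b)  # 2.0
```

Both networks carry a similar total volume of interaction, yet the index separates them because it normalizes by the number of active connections rather than by the number of nodes.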

In this work, we did not take the direction of the interactions into account, yet clearly, the intensity index can be computed for in-degrees and out-degrees separately. In organizations, for example, it corresponds to those disseminating information and those on the receiving side; in online forums to conversation initiators and responders, correspondingly.

3.2 Temporal Network Intensity Index

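Intensity level can be computed over the entire timeline of a network. To understand how the intensity changes with time or in response to events, a temporal definition is needed. We therefore propose a temporal index of intensity. Formally, we propose a measure of Temporal Network Intensity as follows:

$$I(G_t) = \frac{S^{1}(G_t)}{S^{0}(G_t)}, \qquad G_t \in \mathcal{G} = \{G_1, G_2, \ldots, G_T\} \tag{5}$$

Where $\mathcal{G}$ is a sequence of graphs representing consecutive network snapshots in a period $T$, and $S^{1}$ and $S^{0}$ denote the weighted sum of node degrees with the tuning parameter set to one and to zero, respectively.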

Interactions indicate how information flows in a network. Understanding the flow of information in a network over time is fundamental in the research of social networks and organizations. The proposed temporal intensity metric adds a layer of knowledge on the flow of information, as it gives a measure of volume. It captures interactions occurring during a measured period that do not change the structure but still carry additional information on the behavior of the complex system. It thus enriches our understanding of the network’s temporal complexity. For example, today’s organizations are in a constant state of change burke2017organization . Following the temporal intensity of the interactions in an organization can help in identifying fluctuations in the level of intensity in the organization before, during, and after a planned organizational change.
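A temporal version of the index can be sketched as follows. The snippet assumes that the input is a log of timestamped one-to-one interactions, that snapshots are consecutive seven-day windows, and that the weight of an edge in a snapshot is the number of interactions between the two nodes in that window; the function name, the record layout, and the toy log are all hypothetical.

```python
from collections import defaultdict

SECONDS_PER_WEEK = 7 * 24 * 3600


def weekly_intensity(interactions, start):
    """Temporal intensity per weekly snapshot.

    interactions: iterable of (timestamp_seconds, sender, receiver).
    start: timestamp at which week 0 begins.
    Returns {week_index: average number of interactions per active edge}.
    """
    snapshots = defaultdict(lambda: defaultdict(int))
    for ts, u, v in interactions:
        week = int((ts - start) // SECONDS_PER_WEEK)
        edge = tuple(sorted((u, v)))   # direction is ignored, as in the text
        snapshots[week][edge] += 1     # assumed weight: interaction count
    return {
        week: sum(edges.values()) / len(edges)
        for week, edges in sorted(snapshots.items())
    }


day = 24 * 3600
log = [
    (0 * day, "ann", "bob"), (1 * day, "bob", "ann"), (2 * day, "ann", "cat"),
    (8 * day, "ann", "bob"), (9 * day, "cat", "dan"),
]
series = weekly_intensity(log, start=0)
```

In the toy log, week 0 has two active edges carrying three interactions and week 1 has two edges carrying two interactions, so the series is 1.5 followed by 1.0.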

3.3 Network Dominance Inequality Index

Complex networks are heterogeneous with a few dominant nodes. We explore here the measure and extent of this inequality. Measuring the disparity in the level of communication, for example, enables an understanding of the variance in the level of members’ engagement in a network.

We continue to study the inequality in nodal dominance in a graph while considering the intensity of the nodes’ interactions. In organizations, for example, when a change is introduced, intense interactions can be found among its supporters and opponents. Members that have yet to make up their minds would exhibit less intensity in their interactions burke2017organization . In this case, understanding the level of inequality in the intensity of participation can aid in understanding the balance between members who are involved in the change and those who are not.

We measure the inequality in nodal interaction dominance utilizing the Gini inequality index gini1921measurement ; atkinson1970measurement , originally devised for measuring income inequality. The Gini index is based on the mean absolute difference between values; in our case, the difference is in nodal engagement, i.e., weighted degree. To follow the temporal changes of dominance in a network, we compute this index per period, and term the resulting temporal measure Dominance Inequality.
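The sketch below shows one way to compute such an index, assuming the standard mean-absolute-difference form of the Gini index applied to the weighted degrees of the nodes active in a snapshot. The text does not spell out the exact normalization used, so this is an illustration rather than the authors' implementation; the function names and the Figure 1 edge weights are assumptions.

```python
def gini(values):
    """Gini index as half the relative mean absolute difference."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    abs_diff_sum = sum(abs(x - y) for x in values for y in values)
    # sum|x_i - x_j| / (2 * n^2 * mean) simplifies to the expression below.
    return abs_diff_sum / (2 * n * total)


def dominance_inequality(edges):
    """Gini index over the weighted degrees of the nodes in one snapshot.

    edges: dict mapping an undirected pair (u, v) to a positive weight.
    """
    strength = {}
    for (u, v), w in edges.items():
        strength[u] = strength.get(u, 0) + w
        strength[v] = strength.get(v, 0) + w
    return gini(list(strength.values()))


# Assumed weights for the two panels of Figure 1.
network_a = {("A", "B"): 4, ("B", "C"): 4}
network_b = {("A", "B"): 2, ("B", "C"): 2, ("A", "C"): 2}

inequality_a = dominance_inequality(network_a)  # B dominates: 1/6
inequality_b = dominance_inequality(network_b)  # equal engagement: 0.0
```

The quadratic loop is adequate for small snapshots; for networks with tens of thousands of weekly nodes a sort-based formulation of the same quantity would be preferable.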

4 Measuring Temporal Intensity and Dominance Inequality in Real Networks

To measure our temporal indices in real networks we collect six different datasets of real networks of interactions. We concentrate on contact networks, i.e., organizational emails, online forums, and messaging applications, as detailed in Table 1. To capture the longitudinal evolution of the system, we follow the modeling of temporal sequential periods similarly to pan2011path and divide the temporal information into a sequence of networks, each portraying a snapshot of the system during one week.

Name                                               | #Nodes (min, max) | #Edges (min, max) | Duration (weeks)
---------------------------------------------------+-------------------+-------------------+-----------------
AskUbuntu paranjape2017motifs                      | 1458, 2832        | 2108, 4325        | 198
Facebook Wall Posts viswanath2009evolution         | 1566, 11325       | 1514, 13384       | 124
Wikipedia Conflict brandes2009network              | 2011, 7250        | 3749, 33623       | 156
Wikipedia Talk sun2016predicting                   | 15154, 53236      | 26494, 73356      | 132
Manufacturing Emails michalski2011matching         | 104, 148          | 587, 1335         | 38
EU Research Institutional Emails paranjape2017motifs | 52, 667         | 46, 3197          | 74
Enron Management Emails Klimt2004                  | 33, 107           | 27, 212           | 78

Table 1: Real Datasets Descriptions

4.1 Robustness of the Temporal Network Intensity

Figure 2: Temporal average intensity for the six different datasets, denoted by the strong blue line. The values are denoted by the y-axis to the left. The light grey dashed line denotes the number of participating nodes during each period, i.e., the temporal size of the network. The size scale (i.e., number of nodes) is given by the right y-axis.

For each of the datasets described in Table 1 (the Enron Management dataset is discussed in detail in Section 5) we calculate the weekly temporal network intensity, as defined in Equation 5. The x-axis denotes the timeline, which is different for each network. The blue dots correspond to the calculated temporal network intensity, and their values are denoted by the left y-axis. The network size, as measured by the number of weekly nodes, is denoted in light grey, and its scale is given by the right y-axis. Surprisingly, the networks exhibit a rather stable temporal behavior in their intensity, regardless of the fluctuations in size. The Facebook network, in the lower left panel, shows a steady increase in network size from several hundred up to more than 10,000 weekly participants. Still, the average temporal intensity is quite robust. In the upper left panel we see the temporal intensity of the AskUbuntu forum over time. The average intensity hardly changes in time, despite large fluctuations in the number of participants in the discussions over the different periods. Similar results are seen for the Wikipedia Conflict dataset, in the middle left panel. Interestingly, in the Wikipedia Talk network, in the upper right panel, weeks that have spikes in the number of participants are somewhat less intense, on average: despite the spike in general interest during these weeks, the average intensity of conversation does not increase, and is even lower. It is also interesting to note that although the intensity is not bounded in value, in all these networks the average intensity is low (minimal intensity is calculated from zero as explained above).

Figure 3: Aggregated network measures: (A) The PDF of the weekly Temporal Network Intensity for each dataset. (B) The cumulative distribution of the relative change in the measured Temporal Network Intensity between every two consecutive weeks for each dataset

To understand the exact nature of the temporal network intensity, we plot the PDF of the Temporal Network Intensity as described in Equation 5, i.e., with a minimal intensity of 1, for each network, as shown in Figure 3(A). The measured networks display a clearly normal-like distribution of intensity levels over the consecutive weeks of activity. To understand the typical temporal change, we compute the relative change between every two consecutive weeks. Figure 3(B) shows the cumulative distribution of this relative change in the measured Temporal Network Intensity for each dataset. For both AskUbuntu and Facebook, a change of less than 0.05 (5%) in the temporal activity accounts for more than 90% of the consecutive weeks. The network of EU emails is almost as stable. In all networks, more than 80% of the changes are smaller than 15%.
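The week-to-week stability analysis can be sketched as below. The text does not give the exact formula for the relative change, so the snippet assumes the absolute difference between consecutive weeks divided by the earlier week's value; the function names and the weekly values are made up for illustration.

```python
def relative_changes(series):
    """Absolute relative change between consecutive values of a series."""
    return [abs(b - a) / a for a, b in zip(series, series[1:])]


def share_within(changes, bound):
    """Fraction of changes not exceeding `bound`.

    This is one point of the empirical cumulative distribution.
    """
    return sum(c <= bound for c in changes) / len(changes)


# Made-up weekly intensity values for a single hypothetical network.
weekly = [1.50, 1.53, 1.47, 1.80, 1.78]
changes = relative_changes(weekly)
stable_share = share_within(changes, 0.05)
```

Here three of the four consecutive-week changes are below 5%, so `stable_share` is 0.75; evaluating `share_within` over a grid of bounds traces the cumulative curve of the kind shown in Figure 3(B).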

4.2 Robustness of Temporal Dominance Inequality

Figure 4: Temporal Dominance Inequality for the six different datasets, denoted by the strong blue line. The values are denoted by the y-axis to the left. The light grey dashed line denotes the number of participating nodes during each period, i.e., the temporal size of the network. The size scale (i.e., number of nodes) is given by the right y-axis

Similarly to the Temporal Network Intensity we plot here the values for the Temporal Dominance inequality for the networks. First, it is interesting to note that the measured inequality is in the range of for all measured networks. Intuitively, an Erdös-Rényi (ER) random network ER60 would yield very low inequality values, as all nodes have a similar chance for communicating, and a pure Preferential Attachment (PA) BA99 network would give a very high inequality value. As the measured networks are known to be heterogeneous, we expect the inequality to be rather high. The somewhat low value of inequality can be attributed to the fact that we measure the undirected graph. Indeed, when measured over a directed graph the inequality results were indeed higher. More importantly, though, is that also for this metric, temporal results are mostly stationary for each network, and differ between the networks.

Figure 5: Aggregated network measures: (A) The PDF of the weekly Temporal Dominance Inequality for each dataset. (B) The cumulative distribution of the relative change in the measured Temporal Dominance Inequality between every two consecutive weeks for each dataset

5 A Network Nearing its End: Enron Emails

Figure 6: The cumulative distribution of the relative change between every two consecutive weeks for each dataset as measured for (a) Temporal Network Intensity and (b) Dominance Inequality.

An important question is how the devised metrics behave for the Enron dataset. We digress here for a paragraph to give the needed background on the once billion-dollar company known for its bankruptcy in December 2001 and its disintegration in the following year. Enron, originally a gas company, “created Enron Online (EOL) in October 1999, an electronic trading website that focused on commodities. Enron was the counterparty to every transaction on EOL; it was either the buyer or the seller.. When the recession hit in 2000, Enron had significant exposure to the most volatile parts of the market..By the fall of 2000, Enron was starting to crumble under its own weight” enron2019 . Shortly after its demise, the company’s entire email exchange was released by a judge’s order.

Given the known set of events and their timeline, and the availability of the entire management email corpus, Enron’s emails are used for evaluating change point detection algorithms, which compare the events they find with the actual ones Peel2015 ; miller2018size .

We checked our indices over the Enron management emails, on a weekly basis, from August 2000 to March 2002. Our intriguing results, presented in Figure 6(A) show that the Enron network is different from the other networks examined in terms of the range of Temporal Network Intensity index and the percentage of changes measured in the index. The network displays Temporal Network Intensity in the range of , well above the index range for the other networks. In addition, the index volatility is very high and the changes between weekly measurements are high. The Temporal Dominance Inequality, as presented in Figure 6(B), while is similar in range to that of other networks, also shows high volatility compared to the other networks.

Overall, during the entire checked period both the Temporal Network Intensity and the Dominance Inequality indices exhibit high weekly changes and, unlike in the rest of the networks, cannot be defined as stationary.

6 Predicting a Network from its Engagement

The networks examined were characterized by stability in two selected indices, the activity index and the Gini index. This stability comes both in the range of the values measured for each network over time and in the level of changes within the indices between successive periods. In measuring the distribution of the percentage of changes between successive periods it appears that volatility of up to in the Temporal Network Intensity covers over 90% of the network’s operating time. Similar results were obtained for the Dominance Inequality index. The values measured for the indices between networks, however, differ by 0.4 to 0.7.

To examine how typical the Temporal Network Intensity and the Temporal Dominance Inequality indices are for each network, we perform a classification task over the temporal indices, with the target of identifying the class (dataset) that produced them. We perform the experiment over all seven datasets that appear in Table 1.

Classification methodology:

Our target function classifies into seven different classes, each corresponding to a network dataset. Hence, we decided to avoid binary classifiers, such as support vector machines, which would require additional schemes like “one-vs-one” or “one-vs-all” to compare classification efficiency and calculate the overall confusion matrix. Also, as our features have only two dimensions (Inequality, Activity), we skipped classifiers that focus on dimension reduction, such as neural networks. We therefore chose three multiclass classifiers, all well known, robust, yet simple. The first is the K-Nearest Neighbors (KNN) classifier. Several well-known algorithms implement KNN, such as brute force, K-D tree, and more. We utilized muja2009fast to automate the algorithm configuration. In addition, we ran a Decision Tree breiman2017classification and Random Forests breiman2001random . We ran the classification using Python Scikit-learn pedregosa2011scikit with five-fold cross-validation kohavi1995study and calculated the Precision, Recall, Accuracy, and F1. We summarize the results in Table 2.

Our classes (datasets) were not equal in size when considering the number of weeks (see Table 1). We therefore employ two known balancing techniques. The first is multiplying the small datasets to balance the scale of each class; the other is Stratified Folds, which preserves the probability distribution of each class in all folds kohavi1995study .
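The following is a self-contained sketch of stratified five-fold cross-validation with a nearest-neighbour classifier over (inequality, intensity) tuples. It is a plain-Python illustration of the idea, not the Scikit-learn pipeline used in the experiments: the fold assignment, the value of k, the network names, and the synthetic weekly tuples are all assumptions made for the example.

```python
from collections import Counter, defaultdict
import math


def stratified_folds(labels, n_folds):
    """Assign sample indices to folds, round-robin within each class,
    so every fold keeps roughly the class proportions of the full data."""
    folds = [[] for _ in range(n_folds)]
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


def knn_predict(train_x, train_y, point, k):
    """Majority label among the k nearest training points (Euclidean)."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], point))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cross_validated_accuracy(x, y, n_folds=5, k=3):
    """Accuracy on each held-out fold of a stratified split."""
    scores = []
    for held_out in stratified_folds(y, n_folds):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        hits = sum(knn_predict(train_x, train_y, x[i], k) == y[i]
                   for i in held_out)
        scores.append(hits / len(held_out))
    return scores


# Synthetic weekly (inequality, intensity) tuples for three made-up networks.
centers = {"forum": (0.45, 1.3), "email": (0.60, 2.5), "wall": (0.70, 1.1)}
x, y = [], []
for name, (g, i) in centers.items():
    for week in range(10):
        jitter = 0.01 * (week % 5 - 2)   # small deterministic wobble
        x.append((g + jitter, i + jitter))
        y.append(name)

scores = cross_validated_accuracy(x, y)
```

Because the synthetic clusters are well separated, every fold is classified perfectly here; real weekly indices overlap more, which is why the reported scores are below one.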

Classification Results:

We present the results for each classification algorithm and each balancing method in Table 2.

Classifier    | Balancing           | Accuracy (Avg., Std) | Precision (Avg., Std) | Recall (Avg., Std) | F1 (Avg., Std)
--------------+---------------------+----------------------+-----------------------+--------------------+---------------
KNN           | Stratified Fold     | 0.83, 0.04           | 0.83, 0.03            | 0.85, 0.04         | 0.86, 0.01
KNN           | Data multiplication | 0.87, 0.01           | 0.88, 0.01            | 0.87, 0.01         | 0.87, 0.01
Decision Tree | Stratified Fold     | 0.84, 0.03           | 0.84, 0.03            | 0.85, 0.02         | 0.85, 0.01
Decision Tree | Data multiplication | 0.90, 0.01           | 0.90, 0.01            | 0.90, 0.01         | 0.89, 0.01
Random Forest | Stratified Fold     | 0.86, 0.03           | 0.86, 0.03            | 0.87, 0.03         | 0.86, 0.03
Random Forest | Data multiplication | 0.95, 0.01           | 0.95, 0.01            | 0.95, 0.01         | 0.95, 0.01

Table 2: Classification results of network datasets according to their weekly engagement indices

All algorithms were able to infer the correct network dataset from its weekly indices with high accuracy over all folds, with average scores between 0.83 and 0.95 across all measures (Table 2). To test the dependency of the success on each class, we repeatedly re-ran the tests while excluding one class (dataset) at a time, and compared the overall results. The difference in the results was insignificant across all experiments, showing that the overall result is robust across the datasets.

Figure 7 visualizes the weekly indices for each dataset, coloring each dataset differently over a planar space. The visualization indicates a limited center of mass for most networks. The Enron dataset again shows a very high variety in the indices between the weeks, and is typically much more intense than the rest of the networks.

Figure 7: Planar view of the weekly measured metrics, Activity Intensity (axis x) and Dominance Inequality (axis y) for all datasets

7 Discussion and Conclusions

In this work we set out to understand how temporal engagement in networks changes with time. To that end we defined two indices to capture the temporal network activity. The first, Temporal Network Intensity, can be roughly described as the average edge intensity in the network over a period. The second, the Dominance Inequality, is a measure of the engagement variance. Our surprising results are that for most email and forum networks checked, the indices were stationary, implying a steady state. For a network known to be nearing disintegration, Enron, the indices were volatile.

A similar stationary value was found in gautreau2009microdynamics for the average degree of the flux of people from airports. However, airports’ physical limitations may give a plausible explanation for this measure. In the datasets examined in this work these limitations do not exist. Interestingly, both our indices can be derived utilizing the average degree. We believe that these findings need to be further researched over a wider variety of networks exhibiting different dynamics.

The robustness of the indices, regardless of significant size changes of the underlying network in time, is itself intriguing. For example, when the size of the network decreases in a process of preferential detachment, it is expected that the level of engagement, and hence the indices, would also be affected. We intend to further research this counterintuitive result.

We focus here on the complex temporal interactions and utilize them to gain an understanding of the system’s temporal behavior. By moving from a nodal-centric view to an interaction-centric view, we suggest a novel understanding of the dynamics of complex networks. Lastly, our results show that the indices we devised fluctuated significantly in a network that was dealing with a shaky situation that led to the company’s disintegration. In future research, we intend to further understand the behavior of the indices for different network models and dynamics.

References

  • (1) P. Holme, J. Saramäki, Physics reports 519(3), 97 (2012)
  • (2) M. Granovetter, Sociological theory 1(1), 201 (1983)
  • (3) R.S. Burt, Research in organizational behavior 22, 345 (2000)
  • (4) D.L. Haynie, American journal of sociology 106(4), 1013 (2001)
  • (5) J.W. Spencer, Journal of International Business Studies 34(5), 428 (2003)
  • (6) T.A. Snijders, Models and methods in social network analysis 1, 215 (2005)
  • (7) G. Kossinets, D.J. Watts, Science 311(5757), 88 (2006)
  • (8) N. Perra, B. Gonçalves, R. Pastor-Satorras, A. Vespignani, Scientific reports 2, 469 (2012)
  • (9) D. Lazer, A.S. Pentland, L. Adamic, S. Aral, A.L. Barabasi, D. Brewer, N. Christakis, N. Contractor, J. Fowler, M. Gutmann, et al., Science (New York, NY) 323(5915), 721 (2009)
  • (10) O. Artime, J.J. Ramasco, M. San Miguel, Scientific reports 7, 41627 (2017)
  • (11) J.H. Fowler, N.A. Christakis, et al., Bmj 337, a2338 (2008)
  • (12) C.C. Phelps, Academy of Management Journal 53(4), 890 (2010)
  • (13) T. Hellmann, M. Staudigl, European Journal of Operational Research 234(3), 583 (2014)
  • (14) A. Ilany, A.S. Booms, K.E. Holekamp, Ecology letters 18(7), 687 (2015)
  • (15) A. Barrat, M. Barthelemy, A. Vespignani, in Large scale structure and dynamics of complex networks: from information technology to finance and natural science (World Scientific, 2007), pp. 67–92
  • (16) A.L. Barabasi, Nature 435(7039), 207 (2005)
  • (17) J.P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, J. Kertész, A.L. Barabási, Proceedings of the national academy of sciences 104(18), 7332 (2007)
  • (18) M.S. Granovetter, American journal of sociology pp. 1360–1380 (1973)
  • (19) T. Opsahl, V. Colizza, P. Panzarasa, J.J. Ramasco, Physical review letters 101(16), 168702 (2008)
  • (20) A. Barrat, M. Barthelemy, R. Pastor-Satorras, A. Vespignani, Proceedings of the national academy of sciences 101(11), 3747 (2004)
  • (21) M.E. Newman, Physical review E 70(5), 056131 (2004)
  • (22) T. Opsahl, F. Agneessens, J. Skvoretz, Social networks 32(3), 245 (2010)
  • (23) R.E. Nelson, Academy of Management Journal 32(2), 377 (1989)
  • (24) E. Gilbert, K. Karahalios, in Proceedings of the SIGCHI conference on human factors in computing systems (ACM, 2009), pp. 211–220
  • (25) G. Miritello, E. Moro, R. Lara, Physical Review E 83(4), 045102 (2011)
  • (26) J.P. Eckmann, E. Moses, D. Sergi, Proceedings of the National Academy of Sciences 101(40), 14333 (2004)
  • (27) J. Sun, J. Kunegis, S. Staab, in 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW) (IEEE, 2016), pp. 128–135
  • (28) W.W. Burke, Organization Change: Theory and Practice (SAGE Publications, 2017)
  • (29) J. Nielsen, http://www.useit.com/alertbox/participation_inequality.html (2006)
  • (30) A.C. Sonnenbichler, arXiv preprint arXiv:1006.4271 (2010)
  • (31) L. Peel, A. Clauset, 29th AAAI Conference on Artificial Intelligence (AAAI) pp. 1–11 (2015). URL http://arxiv.org/abs/1403.0989
  • (32) H. Miller, O. Mokryn, arXiv preprint arXiv:1809.09613 (2018)
  • (33) R.K. Pan, J. Saramäki, Physical Review E 84(1), 016105 (2011)
  • (34) V. Sekara, A. Stopczynski, S. Lehmann, Proceedings of the national academy of sciences 113(36), 9977 (2016)
  • (35) A. Li, S.P. Cornelius, Y.Y. Liu, L. Wang, A.L. Barabási, Science 358(6366), 1042 (2017)
  • (36) A. Gautreau, A. Barrat, M. Barthélemy, Proceedings of the National Academy of Sciences 106(22), 8847 (2009)
  • (37) T. Yasseri, R. Sumi, A. Rung, A. Kornai, J. Kertész, PloS one 7(6), e38869 (2012)
  • (38) M. Karsai, N. Perra, A. Vespignani, Scientific reports 4, 4001 (2014)
  • (39) M.G. Rodriguez, D. Balduzzi, B. Schölkopf, arXiv preprint arXiv:1105.0697 (2011)
  • (40) M. Gomez-Rodriguez, J. Leskovec, A. Krause, ACM Transactions on Knowledge Discovery from Data (TKDD) 5(4), 21 (2012)
  • (41) M. Rosvall, A.V. Esquivel, A. Lancichinetti, J.D. West, R. Lambiotte, Nature communications 5, 4630 (2014)
  • (42) M. Nadini, K. Sun, E. Ubaldi, M. Starnini, A. Rizzo, N. Perra, Scientific reports 8(1), 2352 (2018)
  • (43) L. Kovanen, M. Karsai, K. Kaski, J. Kertész, J. Saramäki, Journal of Statistical Mechanics: Theory and Experiment 2011(11), P11005 (2011)
  • (44) D. Taylor, S.A. Myers, A. Clauset, M.A. Porter, P.J. Mucha, Multiscale Modeling & Simulation 15(1), 537 (2017)
  • (45) A.J. Corrado, Dynamics of complex systems (CRC Press, 2019)
  • (46) R.A. Rossi, N.K. Ahmed, IEEE Transactions on Knowledge and Data Engineering 27(4), 1112 (2015)
  • (47) P.V. Gupte, B. Ravindran, S. Parthasarathy, in 2017 IEEE 33rd International Conference on Data Engineering (ICDE) (IEEE, 2017), pp. 771–782
  • (48) G. Costa, R. Ortale, ACM Transactions on Knowledge Discovery from Data (TKDD) 12(2), 18 (2018)
  • (49) L.C. Freeman, Social networks 1(3), 215 (1978)
  • (50) T. Opsahl, P. Panzarasa, Social networks 31(2), 155 (2009)
  • (51) C. Gini, The Economic Journal 31(121), 124 (1921)
  • (52) A.B. Atkinson, Journal of economic theory 2(3), 244 (1970)
  • (53) A. Paranjape, A.R. Benson, J. Leskovec, in Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (ACM, 2017), pp. 601–610
  • (54) B. Viswanath, A. Mislove, M. Cha, K.P. Gummadi, in Proceedings of the 2nd ACM workshop on Online social networks (ACM, 2009), pp. 37–42
  • (55) U. Brandes, P. Kenis, J. Lerner, D. van Raaij, in Proceedings of the 18th international conference on World wide web (ACM, 2009), pp. 731–740
  • (56) R. Michalski, S. Palus, P. Kazienko, in International Conference on Business Information Systems (Springer, 2011), pp. 197–206
  • (57) B. Klimt, Y. Yang, Machine Learning (2004)
  • (58) P. Erdös, A. Rényi, Publ Math Inst Hungar Acad Sci 5, 17 (1960)
  • (59) A.L. Barabási, R. Albert, SCIENCE 286, 509 (1999)
  • (60) T. Segal. Enron scandal: The fall of a wall street darling (2019). URL https://www.investopedia.com/updates/enron-scandal-summary/
  • (61) M. Muja, D.G. Lowe, VISAPP (1) 2(331-340), 2 (2009)
  • (62) L. Breiman, Classification and regression trees (Routledge, 2017)
  • (63) L. Breiman, Machine learning 45(1), 5 (2001)
  • (64) F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, et al., Journal of machine learning research 12(Oct), 2825 (2011)
  • (65) R. Kohavi, et al., in IJCAI, vol. 14 (Montreal, Canada, 1995), pp. 1137–1145
  • (66) J. Leskovec, A. Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data (2014)

List of abbreviations

CPD Change Point Detection
GHRG Generalized Hierarchical Random Graph
GBTER Generalized Block Two-Level Erdos-Renyi
KPGM Kronecker Product Graph Model
CDF Cumulative Distribution Function
KS Kolmogorov-Smirnov
PA Preferential Attachment
ER Erdos-Renyi

Acknowledgments

This work was supported by the Israel Science Foundation Grant 328/17.