Emergence through Selection: The Evolution of a Scientific Challenge

One of the most interesting scientific challenges today is the analysis and understanding of the dynamics of complex networks, and of how their processes lead to emergence through the interactions among their components. In this paper we work toward new methodologies for visualizing and exploring the dynamics at play in real dynamic social networks. We present a recently introduced formalism called TVG (for time-varying graphs), which was initially developed to model and analyze highly dynamic and infrastructure-less communication networks such as mobile ad-hoc networks, wireless sensor networks, or vehicular networks. We discuss its applicability to complex networks in general, and social networks in particular, by showing how it enables the specification and analysis of complex dynamic phenomena in terms of temporal interactions, and makes it easy to switch perspective between local and global dynamics. As an example, we chose the case of scientific communities, analyzing a portion of the ArXiv repository (ten years of publications in physics) and focusing on the social determinants (e.g. goals and potential interactions among individuals) behind the emergence and the resilience of scientific communities. We consider that scientific communities are communities of practice (through co-authorship) and, at the same time, exist as representations in the scientists' minds, since a reference to another scientist's work is not merely an objective link to a relevant work: it reveals social objects that one manipulates, selects and refers to. In the paper we show the emergence/selection of a community as a goal-driven preferential attachment toward a set of authors among which there are some key scientists (Nobel laureates).




1 Introduction

One of the most interesting scientific challenges today is the analysis and understanding of the dynamics of social networks, and of how their processes lead to emergence through the interactions at play among their components. Research efforts in this area strive to understand what the driving forces behind the evolution of social networks are and how they combine with social dynamics, e.g., opinion dynamics, epidemic or innovation diffusion, team formation and so on (amblard01 ; Moore2000 ; Lelarge09 ; Carley02 ; Powell05 ; Guimera05 ; QuattrociocchiPC09 ; castellano07 ; quattrociocchi2010e ). In this paper we work toward new methodologies for visualizing and exploring the dynamics within real dynamic social networks. As an example, we chose the case of scientific communities, analyzing a portion of the ArXiv repository (ten years of publications in physics) and focusing on the social determinants (e.g. goals and potential interactions among individuals) behind the emergence and the resilience of scientific communities. In particular, the analysis addresses the co-existence of the co-authorship and citation behaviors of scientists by focusing on the interaction patterns of the most proficient and cited authors and, in turn, on how these patterns are affected by the selection process of citations. Such a “social” selection a) produces self-organization, because it is carried out by a group of individuals who act, compete and collaborate in a common environment in order to advance Science, and b) determines the success (emergence) of both topics and the scientists working on them.

On the one hand, studies of scientific network dynamics aim to understand the factors that play a significant role in their evolution, not all of which are objective or rational: for example, the existence of a star system (Wagner2005 , Newman2001 , Newman2004 , Barabasi2002 ), blind imitation in citations (MacRoberts96 ), and reputation and community affiliation bias (Gilbert77 ). On the other hand, having some elements to understand such dynamics could enable better detection of hot topics and lively subfields, and of how scientific production advances with respect to the selection process inside the community itself. Among the data available to analyze such a system, the most frequently used is a subset of the publications in a given field, as in Solla1965 , Newman2001a , Newman2004a , and Radicchi2009 . Scientific publications are the output of such a system and clearly identify who the producers are (the authors), which institutions they belong to (the affiliations), which funded projects they are working on (the acknowledgements) and what the related publications are (the citations). The fact that these data are publicly accessible most of the time also partly explains their frequent use in analyses of the scientific field. Classical analyses concern either the co-authorship network (Barabasi2002 ; Newman2001 ) or the citation network (Hummon89 ; Redner05 ), and more rarely the institutional network (Powell05 ). Moreover, such networks are often considered static, and their structure is rarely analyzed over time (an exception is the analysis performed by Radicchi2009 on Physical Review).

The illustrative analysis presented in the paper passes through different data transformations aimed at providing different perspectives on the scientific network and its evolution. On a first level, we discuss the dataset by means of both static and temporal analyses of the citation and co-authorship networks. A second level of analysis consists in transforming the data in order to make explicit the interdependencies between co-authorships and citations, by analyzing the scientists’ representations of the collaboration structure within the scientific field. Such a representation is captured through the network of cited collaborations (QA2010a ): a publication contains several references to other papers, and each reference corresponds to a promotion of the scientists authoring the cited work.

One of the problems when trying to characterize such a dynamic structure is that classical indicators from either graph theory or social network analysis cannot be applied directly. Therefore, we used an algebra, the Time-Varying Graphs (TVG) (CFQS2010 ), that makes it possible to take into account the dynamical aspects of networks and allows for the definition of temporal indicators (ACFQS2010a ) to characterize patterns in evolving structures.

Through our approach, we capture the attraction exerted by famous authors on co-authorship behaviors and on the structural evolution of sub-communities.

2 Context

In Newman2001 the network of scientific collaborations, explored across several databases, shows a clustered and small-world structure. Moreover, several differences between the collaboration patterns of the different fields studied are captured. Such differences have been examined in more depth in Newman2004 with respect to the number of papers produced by a given group of authors, the number of collaborations and the topological distances between scientists. Peltomaki and Alava in peltomaki2006 propose a new emulative model aimed at approximating the growth of scientific networks by incorporating bipartition and sub-linear preferential attachment. A model for the self-assembly of creative teams based on three parameters (team size, the rate of newcomers in the scientific production and the tendency of authors to collaborate with the same group) has been outlined in Guimera05 . Connectivity patterns in a citations network have been studied in relation to the development of the DNA theory in Hummon89 . Klemm and Eguiluz (Klemm02 ) observed that the growth patterns of real networks (e.g. movie actors, co-authorship in science, and word synonyms) are characterized by a clustering trend that reaches an asymptotic value larger than that of regular lattices with the same average connectivity.

In the field of social network analysis, several works have approached the problem of temporal metrics (Holme05 ; Kostakos09 ; KosKW08 ). The focus is on the definition of instruments able to capture the intrinsic properties of the evolution of complex systems, that is, to characterize the interdependencies and the co-existence between local behaviors (interactions) and their global effects (emergence) (Davidsen2002 ; Mataric92 ; Woolley1994 ; amblard01 ; quattrociocchi2010e ). The research approach to characterizing the evolution patterns of social networks was at the very beginning mainly based on simulations, while in the past few years, due to the large availability of real datasets, both the methodology of analysis and the object of research have changed (Roth10b ; LES07 ; KosKW08 ; castellano07 ; LESK10 ). In particular, in the latter paper Leskovec states as the central problem, for social networks in general and for the analysis of scientific community networks in particular, the definition of mathematical models able to capture and reproduce all the properties of dynamical real-world networks, such as the shrinking diameter (Les05 ) or the “small world” effect (WATTS99 ). Instruments and paradigms addressing this challenge are mainly based on stochastic definitions (Les05b ) or conceptualized as a sequence of snapshots of the network at different times (TSM+09 ).

3 Preliminaries

3.1 Time-Varying Graphs

The time-varying graph (TVG) formalism, recently introduced in CFQS2010 , is a graph formalism based on an interaction-centric point of view; it offers a concise and elegant formulation of temporal concepts and properties (ACFQS2010a ).

Let us consider a set of entities $V$ (or nodes), a set of relations $E$ among entities (edges), and an alphabet $L$ labeling any property of a relation (label); that is, $E \subseteq V \times V \times L$. The set $E$ enables multiple relations between any given pair of entities, as long as these relations have different properties, that is, for any $e_1 = (x_1, y_1, \lambda_1) \in E$ and $e_2 = (x_2, y_2, \lambda_2) \in E$, $(x_1 = x_2 \wedge y_1 = y_2 \wedge \lambda_1 = \lambda_2) \implies e_1 = e_2$.

Relationships between entities are assumed to occur over a time span $\mathcal{T} \subseteq \mathbb{T}$, namely the lifetime of the system. The temporal domain $\mathbb{T}$ is assumed to be $\mathbb{N}$ for discrete-time systems or $\mathbb{R}^{+}$ for continuous-time systems. The time-varying graph structure is denoted by the set $\mathcal{G} = (V, E, \mathcal{T}, \rho, \zeta)$, where $\rho : E \times \mathcal{T} \to \{0, 1\}$, called presence function, indicates whether a given edge is present at a given time, and $\zeta : E \times \mathcal{T} \to \mathbb{T}$, called latency function, indicates the time it takes to cross a given edge if starting at a given date. As in this paper the focus is on the temporal and structural analysis of a social network, we will deliberately omit the latency function and consider TVGs described as $\mathcal{G} = (V, E, \mathcal{T}, \rho)$.

TVGs as a sequence of footprints.

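Given a TVG $\mathcal{G} = (V, E, \mathcal{T}, \rho)$, one can define the footprint of this graph from $t_1$ to $t_2$ as the static graph $G^{[t_1, t_2)} = (V, E^{[t_1, t_2)})$ such that $\forall e \in E$, $e \in E^{[t_1, t_2)} \iff \exists t \in [t_1, t_2), \rho(e, t) = 1$. In other words, the footprint aggregates the interactions over a given time window into a static graph. Let the lifetime $\mathcal{T}$ of the time-varying graph be partitioned into consecutive sub-intervals $\tau = [t_0, t_1), [t_1, t_2), \ldots, [t_i, t_{i+1}), \ldots$, where each $[t_k, t_{k+1})$ can be noted $\tau_k$. We call sequence of footprints of $\mathcal{G}$ according to $\tau$ the sequence $SF(\tau) = G^{\tau_0}, G^{\tau_1}, \ldots$.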
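The presence function and the footprints described above can be sketched in a few lines of Python. This is only an illustrative data structure, not the authors' implementation: the class name `TimeVaryingGraph`, its methods, and the choice of half-open windows `[t1, t2)` over integer dates are all assumptions made for the example, and the latency function is omitted as in the text.

```python
from dataclasses import dataclass, field


@dataclass
class TimeVaryingGraph:
    """Minimal TVG without latency (hypothetical helper class)."""
    nodes: set = field(default_factory=set)
    # Maps an edge (u, v, label) to the set of discrete dates at which it is present.
    presence: dict = field(default_factory=dict)

    def add_contact(self, u, v, t, label=None):
        """Record that the edge (u, v, label) is present at date t."""
        self.nodes.update((u, v))
        self.presence.setdefault((u, v, label), set()).add(t)

    def rho(self, edge, t):
        """Presence function: 1 if the edge is present at date t, else 0."""
        return 1 if t in self.presence.get(edge, ()) else 0

    def footprint(self, t1, t2):
        """Edges present at least once in the half-open window [t1, t2)."""
        return {e for e, dates in self.presence.items()
                if any(t1 <= t < t2 for t in dates)}

    def footprint_sequence(self, boundaries):
        """Footprints for the consecutive windows [b0, b1), [b1, b2), ..."""
        return [self.footprint(a, b) for a, b in zip(boundaries, boundaries[1:])]


# Toy example with yearly windows (the data are made up).
g = TimeVaryingGraph()
g.add_contact("a", "b", 1992)
g.add_contact("b", "c", 1993)
g.add_contact("a", "b", 1994)
yearly = g.footprint_sequence([1992, 1993, 1994, 1995])
```

Each element of `yearly` is the edge set of one static footprint, so any classical static indicator can then be computed window by window.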

Expressing other Temporal Concepts

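A sequence of couples $\mathcal{J} = \{(e_1, t_1), (e_2, t_2), \ldots, (e_k, t_k)\}$, such that $\{e_1, e_2, \ldots, e_k\}$ is a walk in $G$, is a journey in $\mathcal{G}$ if and only if $\rho(e_i, t_i) = 1$ for all $i \leq k$ and $t_{i+1} \geq t_i$ for all $i < k$. The dates $departure(\mathcal{J})$ and $arrival(\mathcal{J})$ are respectively the starting date $t_1$ and the last date $t_k$ of a journey $\mathcal{J}$.

Journeys can be thought of as paths over time from a source to a destination and therefore have both a topological and a temporal length. The topological length of $\mathcal{J}$ is the number $|\mathcal{J}| = k$ of couples in $\mathcal{J}$ (i.e., the number of hops); its temporal length is its end-to-end duration: $arrival(\mathcal{J}) - departure(\mathcal{J})$.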
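As a rough illustration of the journey concept, the sketch below checks whether a sequence of (edge, date) couples is a journey and computes its two lengths. It assumes directed edges written as `(u, v)` pairs, a `presence` dictionary mapping each edge to its set of dates, and, since latency is omitted, dates that only need to be non-decreasing along the walk. The function names are hypothetical.

```python
def is_journey(steps, presence):
    """Check whether steps = [((u, v), t), ...] form a journey.

    presence maps a directed edge (u, v) to the set of dates at which it exists.
    Latency is ignored, so crossing dates only need to be non-decreasing.
    """
    for i, ((u, v), t) in enumerate(steps):
        if t not in presence.get((u, v), ()):
            return False  # the edge is absent at its crossing date
        if i > 0:
            (_, prev_v), prev_t = steps[i - 1]
            if prev_v != u or t < prev_t:
                return False  # not a walk, or the journey goes back in time
    return True


def topological_length(steps):
    """Number of hops of the journey."""
    return len(steps)


def temporal_length(steps):
    """End-to-end duration: last date minus starting date (non-empty journey)."""
    return steps[-1][1] - steps[0][1]


# Toy data (made up for the example).
presence = {("a", "b"): {1, 4}, ("b", "c"): {3}, ("c", "d"): {5}}
good = [(("a", "b"), 1), (("b", "c"), 3), (("c", "d"), 5)]
bad = [(("a", "b"), 4), (("b", "c"), 3)]  # second hop happens before the first
```

Here `good` is a journey with three hops and a duration of four time units, while `bad` follows a valid walk but violates the time ordering.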

4 Exploring the Dataset

4.1 The Dataset

As mentioned in the introduction, the scientific community analyzed in this work has been extracted from the hep-th (High Energy Physics – Theory) portion of the arXiv website, an on-line repository available at http://arxiv.org/.

The dataset is composed of a collection of papers and their related citations over the period from January 1992 to May 2003. For each paper, the set of authors, the date of on-line publication on arXiv.org, and the references are provided. There are 352 807 citations among a total of 29 555 papers written by 59 439 authors. The breadth of the time window covered allows us to explore the dataset in order to extract, capture and characterize the evolution of the interaction patterns within the community by means of different data transformations.

4.2 The Networks Description

From the dataset, we can easily derive two graphs. The first, the co-authorships network, has authors as nodes and undirected links standing for the relation of co-authoring a paper. The second, the citations network, has papers as nodes and directed links standing for the references among papers. More formally, the derived graphs can be defined as:

  • the co-authorship network, where the nodes are the authors and links connect nodes co-authoring a paper.

  • the citations network, where the nodes are the papers and each edge corresponds to a reference to another paper.
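The two derivations can be sketched as follows. The record layout (a dictionary with `authors` and `refs` fields per paper) is an assumption made for the example, not the actual format of the hep-th dataset, and the function names are hypothetical.

```python
from itertools import combinations

# Toy records (made up): paper id -> authors and referenced paper ids.
papers = {
    "p1": {"authors": ["A", "B"], "refs": []},
    "p2": {"authors": ["B", "C", "D"], "refs": ["p1"]},
    "p3": {"authors": ["A"], "refs": ["p1", "p2"]},
}


def coauthorship_edges(papers):
    """Undirected author-author links: one per pair of co-authors of a paper."""
    edges = set()
    for record in papers.values():
        for u, v in combinations(sorted(set(record["authors"])), 2):
            edges.add((u, v))  # stored once, with u < v
    return edges


def citation_edges(papers):
    """Directed paper-paper links (citing, cited), restricted to known papers."""
    return {(pid, ref)
            for pid, record in papers.items()
            for ref in record["refs"]
            if ref in papers}


coauthors = coauthorship_edges(papers)
citations = citation_edges(papers)
```

Single-author papers such as `p3` contribute no co-authorship link but still appear as citing nodes in the citations network.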

Network Indicators
Network Diameter 26 37
Network Modularity 0,706 0.617
Network Average Clustering Coefficient 0.5006 0.156
Table 1: Co-authorships and Citations Graph Measures

In Table 1 we provide measures about the citations and collaborations networks.

The diameters, i.e. the longest shortest paths between any pair of nodes (respectively authors and papers), have high values in both networks. The modularity, which measures how well a network can be partitioned into modules or subparts, is also high for both graphs, whereas the average clustering coefficient of the collaborations graph is higher than that of the citations graph.

The networks are composed of several connected islands with few interconnections between them, and the co-authorships network is more clustered than the citations graph.
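Two of the indicators discussed above can be computed with plain Python, as in the sketch below. It assumes an undirected graph given as a list of edges and, because the networks are made of several islands, takes the diameter as the longest shortest path found inside any connected component; that convention is an assumption, as the text does not say how disconnected parts were handled. Modularity is left out because it requires a community partition, which is beyond a short sketch.

```python
from collections import deque


def adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count the links that exist among the neighbours of the node.
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


def diameter(adj):
    """Longest shortest path, taken over all connected components (BFS)."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        best = max(best, max(dist.values()))
    return best


# Toy graph (made up): a triangle A-B-C with a pendant node D attached to C.
adj = adjacency([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
clustering = average_clustering(adj)
diam = diameter(adj)
```

The all-pairs BFS is quadratic in the worst case, which is fine for a toy graph but would need a more careful implementation on a network of the size described here.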

5 Temporalizing the Dataset

In this section we make explicit the temporal aspects (i.e. the structural evolution) of the citations and co-authorships networks. The transformation is performed through the time-varying graph formalism defined in Section 3.1.

We derive two time-varying graphs: the temporal co-authorships network, with authors as nodes and undirected edges, where a link stands for the relation of co-authoring a paper; and the temporal citations network, with papers as nodes and directed links representing the citations from one paper to another. The temporal dimension of both networks is derived from the papers’ submission dates. The temporal co-authorships network has edges labeled with the date of submission, while the temporal citations network has nodes labeled with the publication date of the citing papers.

More formally, we can define

  • the temporal co-authorships network as a quadruplet where the nodes are the authors and links connect pairs of scientists co-authoring a paper. The temporal domain of the presence function is the lifetime of each node, which in this context is assumed to be the submission date of the paper, on a discrete time domain.

  • the temporal citations network as a quadruplet where the nodes are the papers and each edge corresponds to a citation of another paper. As for the co-authorships network, the temporal dimension of the presence function is defined by the submission dates of the papers, on a discrete time domain.

5.1 Citations Network Evolution

In this section we show the evolution of the temporal citations network by using the sequence of footprints defined in Section 3.1. The values are computed by aggregating the interactions occurring in each sub-interval, with the sub-interval length fixed to one year. Figure 1(a) shows the evolution of the clustering coefficient: the curve is characterized by a stable trend that settles on low values. The density evolution, shown in Figure 1(b), presents the same low and decreasing behavior, meaning that both the distances and the interconnections among nodes (citations between papers) are stable over all the time windows observed. The modularity, shown in Figure 1(c), also has a decreasing but stable trend.
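The yearly aggregation can be sketched as below: dated contacts are grouped into one footprint per year and an indicator (here the density) is computed on each footprint. The sketch uses undirected edges for simplicity and counts as nodes of a footprint only the nodes that have at least one edge in that year; both choices are assumptions, since the text does not specify them.

```python
from collections import defaultdict


def yearly_footprints(dated_edges):
    """Group (u, v, year) contacts into one undirected edge set per year."""
    footprints = defaultdict(set)
    for u, v, year in dated_edges:
        footprints[year].add(frozenset((u, v)))
    return dict(sorted(footprints.items()))


def density(edges):
    """Undirected density 2m / (n (n - 1)) over the nodes touched by the edges."""
    nodes = set().union(*edges) if edges else set()
    n = len(nodes)
    return 2 * len(edges) / (n * (n - 1)) if n > 1 else 0.0


# Toy contacts (made up for the example).
dated_edges = [
    ("A", "B", 1992), ("B", "C", 1992),
    ("A", "B", 1993), ("A", "C", 1993), ("B", "C", 1993),
]
footprints = yearly_footprints(dated_edges)
density_by_year = {year: density(edges) for year, edges in footprints.items()}
```

Plotting `density_by_year` against the year gives a curve of the kind shown in the density panels of the figures.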

(a) Average Clustering Coefficient
(b) Density
(c) Modularity
Figure 1: Citations Graph Evolution

5.2 Co-authorships Network Evolution

The temporal co-authorships graph presents a different structural evolution from that of the temporal citations graph analyzed in the previous section. As before, the values are computed by aggregating the interactions in each sub-interval, with the sub-interval length fixed to one year. The evolution of the average clustering coefficient over the time interval observed, shown in Figure 2(a), has an oscillating trend with higher values than those reached by the temporal citations graph. In addition, the co-authorships graph has a more modular and denser structure, as shown in Figure 2(c) and Figure 2(b). The captured trends are decreasing (and not stable).

(a) Average Clustering Coefficient
(b) Density
(c) Modularity
Figure 2: Co-authorships Graph Evolution

6 Making Interactions Explicit

In this section we provide an additional data transformation in order to capture more details about the evolution of our scientific network. The dynamics in scientific communities are based on competition and collaboration among authors and groups of scientists. In the analysis we want to capture a) the emerging effects caused by these two opposite motivations and b) how they are expressed in terms of collaboration and citation patterns. The dataset analyzed in the current paper presents two explicit interactions: the papers’ co-authorships and the citations between papers. In addition, there is an implicit level of interaction that depends on the goals behind any research paper: the quality and, at the same time, the necessity to be highly cited (competition). Hence, both the collaboration and the citation strategies are often optimized in order to have the highest impact with respect to the problem addressed and to collect the highest number of citations. How do such processes affect scientific production and the structural evolution of scientific communities?

6.1 Deriving the Interactions Graph

We transform the data in order to make explicit the co-authorship and cited-collaboration patterns and their interdependencies. The dataset is transformed into an undirected graph, which we call the interaction network, with authors as nodes and weighted links representing the co-authorship of a paper; when a paper is cited by another work, the weights of the links connecting the authors of the referenced paper are incremented.

More formally, the graph of the cited co-authorships is defined as a quadruplet on discrete time. The nodes are the authors, and the set of edges represents the collaborations on the production of a paper. A node appears in the graph the first time a paper by that author is published, and each interaction is weighted with a variable, namely the strength value of a collaboration, that is incremented at each citation received by a paper produced by a given pair of nodes.

In the following sections we analyze the behaviors and interaction strategies within the network of the most cited authors; this graph is a subset of the global interaction network.

In particular, the nodes considered in the analysis are only the authors whose links have strength values above 150, that is, all the groups having more than 150 citations on a work. At its maximum expansion during the 10-year temporal window observed, this network is composed of 12 583 nodes and 84 512 edges.
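The construction of the interaction network and the selection of the most cited groups can be sketched as follows. The record layout, the function names, and the initial strength of zero for a new co-authorship link are assumptions made for the example; the source only states that the strength is incremented at each citation and that the analysis keeps strengths above 150.

```python
from itertools import combinations


def interaction_strengths(papers):
    """Return {frozenset({a, b}): strength} for every pair of co-authors.

    papers maps a paper id to {"authors": [...], "refs": [...]}.
    A link is created at co-authorship (strength 0, an assumption) and its
    strength grows by one each time a paper by that pair is cited.
    """
    strength = {}
    for record in papers.values():
        for pair in combinations(sorted(set(record["authors"])), 2):
            strength.setdefault(frozenset(pair), 0)
    for record in papers.values():
        for ref in record["refs"]:
            cited = papers.get(ref)
            if cited is None:
                continue  # reference to a paper outside the dataset
            for pair in combinations(sorted(set(cited["authors"])), 2):
                strength[frozenset(pair)] += 1
    return strength


def most_cited_subgraph(strength, threshold):
    """Keep only the links whose strength is strictly above the threshold."""
    return {pair: w for pair, w in strength.items() if w > threshold}


# Toy records (made up); the analysis in the text uses a threshold of 150.
papers = {
    "p1": {"authors": ["A", "B"], "refs": []},
    "p2": {"authors": ["B", "C"], "refs": ["p1"]},
    "p3": {"authors": ["C"], "refs": ["p1", "p2"]},
}
strength = interaction_strengths(papers)
top = most_cited_subgraph(strength, threshold=1)
```

With a threshold of 1 on this toy data only the pair A, B survives, since their joint paper is cited twice.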

6.2 The Phase Transition

Figure 3(b) shows the density values for each element of the temporal sequence of footprints of the interaction network of the most proficient scientists. The time interval is fixed to one year.

The density starts at very low values, and the graph then becomes increasingly sparse during its evolution, with a very slight counter-trend during the period between 1999 and 2000.

(a) Average Clustering Coefficient
(b) Density
(c) Modularity
Figure 3: Collaborations Graph Evolution

The growth rate of the modularity of this network is shown in Figure 3(c). It increases until 1993 and then reaches its highest values during the period between 1999 and 2000, but at a smoother rate. As far as we can see from the evolution of the modularity, the interconnection among separated groups of authors starts in 1993 and then continues, but at a more stable rate. Looking at the curve of the average clustering coefficient shown in Figure 3(a), we can see a phase transition occurring between 1999 and 2000 and separating a monotone trend from a decreasing one.

We can interpret the modularity trend as showing that, during the first phase, nodes are divided into several separated groups, while after the phase transition of the clustering coefficient the connections among these groups start to become denser, producing a network structure with a smaller number of larger communities (modules), i.e. the network tends toward structural homogeneity.

6.3 Zooming on Interconnections

In order to characterize the phenomena behind the phase transition outlined in the previous section, in Figure 4 we show the trend of the average number of edges per node in both the whole interaction network (in black) and the network of the most proficient scientists (in red). As we can see, the phase transition, evident in the clustering coefficient evolution in Figure 3(a), is not caused by an increase in the number of authors in the period between 1999 and 2000, nor is it a pattern of the whole dataset.

Figure 4: The average number of edges per node for the entire network of the cited co-authorships (in black) and for the network of the most proficient authors (in red)

In Table 2 we present the evolution of the average degree, the average path length and the degree power law within the temporal window observed. As for the previous indicators, these values are computed on the sequence of footprints of the network, with the sub-interval length fixed to one year.

Year   Average Degree   Average Path Length   Power Law
1992   0.0095           1                     0
1993   0.0176           1                     -1.386
1994   0.012            1                     -1.79
1995   0.0135           1.16                  -2.16
1996   0.132            1.13                  -2.27
1997   0.0118           1.12                  -2.5
1998   0.106            1.12                  -2.5
1999   0.066            3.92                  -5.08
2000   0.64             3.79                  -5.27
2001   0.6              3.82                  -5.25
Table 2: Other interaction network’s measurements

The values for 1999 and 2000 correspond to the phase transition. None of the three indicators is immune to it: neither the average path length, which indicates the average distance among nodes, nor the power law degree, which measures how closely the degree distribution of the network follows a power law, nor the average degree, which counts the average number of connections of each node.

7 Goals and Preferential Attachment

Time-varying graphs are an interaction-centric formalism. In this section we show how such a modeling approach is compatible with one of the most widely used platforms for network analysis, and how it makes it possible to show the step-by-step evolution of the temporal networks, as in a movie.

The most interesting phenomenon emerging from the previous section is a phase transition in the evolution of the structure of the most proficient authors’ network, occurring between 1999 and 2000. According to the temporal analysis, such changes in the network are caused by a particular trend in the interconnections among nodes (authors) of the most cited co-authorships graph.

In this section we outline the network evolution by showing the formation of the biggest community from the beginning of the phase transition (1998) until its maximum expansion (2002). In Figure 5 we provide a sequence of screenshots showing the nodes’ aggregation patterns. The pictures are obtained through the Gephi platform (Gephi09 ). At the beginning (Figure 5(a)) there are several separated components, which then start to connect (Figure 5(b)).

(a) Several separated connected components
(b) that start to connect with each other
Figure 5: Connections within the islands

Notice that the edges are emphasized according to their strength value, which counts the number of citations of each pair of nodes. The component (group of authors) in the center is highly cited and acts as an attractor for the neighboring nodes, as shown in Figure 6(a), until the maximum level of connections in the group is reached, as shown in Figure 6(b).

(a) The number of connections within authors continues to increase.
(b) The maximum level of connectivity is reached.
Figure 6: The growing phase of interaction among authors

7.1 Zooming on the Attractors

In this section we provide a more detailed view of this process of aggregation toward the attractors. Let us start with Figure 7, which shows the number of citations received in each semester by the most cited paper in our dataset.

Figure 7: the citations trend of the most cited paper for each semester

The citation rate increases strongly after two semesters. The third semester coincides with the interval (1999-2000) of the phase transition captured in the previous section.

Hence, in order to understand the effect of this paper, in the following we show a sequence of snapshots of the network structure in the neighborhood of the authors of the most cited paper at the time it appears in our database. Notice that in the following pictures the nodes’ diameters are proportional to the total number of citations received by their papers.

At the beginning there are only separated components, as shown in Figure 8(a). Then a large node appears (Figure 8(b)), and near it appears a node with a smaller diameter but a higher number of links. The biggest node is one of the authors of the most cited paper and, as we can see, it has a very low number of connections (collaborations) in that time interval.

(a) Before
(b) One of the authors (the biggest node) of the most cited paper and a smaller node with a higher degree appear
Figure 8: The appearance of one of the most cited authors
(a) The group of authors of the most cited papers appears. The authors are the two big nodes and the smaller hub
(b) The portion of the graph becomes denser
Figure 9: Densification through the hub node

In Figure 9(a) the large node (a Nobel laureate) and the hub node are connected: they publish a paper together with another node with a large diameter. Several islands start to link to the clique thus formed and, as we can see in Figure 9(b), the process of diffusion continues by means of new hubs.

The sequence of snapshots shows the interactions patterns behind the formulation of the “String Theory” and of its consequent developments.

Authors in the same community start to migrate toward the “island” of the authors of the most cited paper. The increasing densification of the community causes the formation of a giant component around the group authoring the most cited paper, which in turn makes the network denser and more homogeneous, as emerged from the analysis in the previous sections.

It is a goal-driven preferential attachment (i.e., the mechanism used to explain the power law degree distributions in social networks) due to the number of citations (representing the emergence through selection) received by a given group. Authors tend to join highly cited groups to satisfy both the quality requirement and the possibility of being highly cited. Moreover, considering that at the beginning there are several separated groups, the phenomenon can be interpreted as a three-fold process: a first phase in which ideas are explored by means of separate works carried out by separate groups, a second one in which some of the ideas explored start to be cited more than the others, and a third one in which authors tend to join the groups that have produced highly cited works. This tripartition resembles the phases of natural selection, i.e. exploration, selection and migration. In this context such a (social) selection a) produces self-organization, because it is carried out by a group of individuals who act, compete and collaborate in order to advance science, and b) determines the success (emergence) of a topic and of the scientists working on it.

7.2 Characterizing the Community Evolution

Table 3 summarizes the network evolution by means of a) basic indicators (the number of nodes, the number of edges and the community’s diameter) and b) aggregated indicators (the cyclomatic number and the alpha, beta, and gamma indices). The cyclomatic number counts the number of cycles in the graph; its magnitude characterizes the development of the nodes’ accessibility. The alpha index is the ratio between the number of cycles in the graph and their maximum possible number. It ranges from 0 to 1, that is, from no cycles to a completely interconnected network. The beta index is a simple measure of connectivity: it relates the total number of edges to the total number of nodes, and the higher its value, the greater the connectivity. The gamma index measures the ratio between the number of edges in the network and the maximum number of possible edges among nodes. It ranges from 0 to 100, respectively indicating the minimum and the maximum number of edges between nodes.

As we can see from the evolution of these parameters, the aggregation pattern among separated components is evident in each of the metrics proposed. As nodes join the community and connect with each other, the diameter passes through a phase of expansion and then tends to stabilize.
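The aggregated indicators can be computed from the numbers of vertices and edges alone, as sketched below. The text does not spell out the formulas, so the ones used here are assumptions: the cyclomatic number is taken as edges - vertices + components with a single component, alpha divides it by the maximum number of independent cycles of a simple graph, and gamma divides the edges by the planar maximum 3(v - 2), expressed as a percentage. These choices were picked because they reproduce the October 00 and April 01 columns of Table 3; the function name is hypothetical.

```python
def network_indices(vertices, edges, components=1):
    """Cyclomatic number and alpha, beta, gamma indices (assumed formulas)."""
    if vertices < 3:
        raise ValueError("the indices need at least three vertices")
    cyclomatic = edges - vertices + components
    # Maximum number of independent cycles in a simple connected graph.
    max_cycles = vertices * (vertices - 1) // 2 - (vertices - 1)
    return {
        "cyclomatic": cyclomatic,
        "alpha": cyclomatic / max_cycles,
        "beta": edges / vertices,
        # Percentage of the planar maximum number of edges, 3 (v - 2).
        "gamma": 100 * edges / (3 * (vertices - 2)),
    }


# October 00 column of Table 3: 51 vertices and 75 edges.
october_00 = network_indices(vertices=51, edges=75)
# April 01 column of Table 3: 65 vertices and 99 edges.
april_01 = network_indices(vertices=65, edges=99)
```

Rounded to the precision of the table, `october_00` gives a cyclomatic number of 25, alpha 0.02, beta 1.47 and gamma 51.02.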

Measures      April 00   October 00   April 01   October 01   April 02   October 02   April 03
Vertices:     23         51           65         66           67         70           72
Edges:        29         75           99         100          106        110          114
Diameter:     6          10           10         10           8          8            8
Cyclomatic:   17         25           35         35           40         41           43
Alpha:        0.73       0.02         0.017      0.016        0.018      0.017        0.017
Beta:         1.69       1.47         1.52       1.51         1.58       1.57         1.58
Gamma:        61.9       51.02        52.38      52.08        54.3       53.92        54.28
Table 3: Network measurements of the biggest community

8 Conclusions

In this paper we characterize the evolution of a scientific community extracted by the ArXiv’s hep-th (High Energy Physics – Theory) repository. The analysis starts with a static vision on the dataset by showing the structure of the citations and co-authorships graphs derived by the dataset. Then by adding the temporal dimension on both networks we characterize the structural changes of the co-authorships and citations graphs. The temporal dimension and the metrics used for the analysis were formalized using Time-Varying Graphs (TVG), a mathematical framework designed to represent the interactions and their evolution in dynamically changing environments.

Since we are interested in the relationships between collaborations and citations behaviors of scientists, we focus on the network of most cited authors and on its structural evolution where several interesting aspects emerge. The network evolves toward a denser structure, a phase transition occurs in the 1999-2000 time interval causing the homogenization of communities.

Through our approach, we capture the role played by famous authors on co-authorship behaviors. They act as attractors on the community. The driving force is a sort of preferential attachment driven by the number of citations received by a given group, that in terms of the goal of any scientific community indicates a strategy oriented to the community belonging.

Furthermore, the evolution of the network from a sparse and modular structure to a denser and more homogeneous one can be interpreted as a three-fold process reflecting natural selection. The first phase is the exploration of ideas by means of separate works. Once some ideas start to be cited (selected) more than others, authors tend to join the groups that have produced highly cited works. The selection is performed by individuals in a goal-oriented environment, and such a (social) selection produces self-organization because it is carried out by a group of individuals who act, compete and collaborate in order to advance science. In fact, the driving force is an emergent effect of the interdependencies between citations and the goal of scientific production: social selection determines the emergence of a topic and of the scientists working on it by inducing the so-called preferential attachment toward groups and topics with a high potential for citations. Finally, we show that the migration of authors toward the most cited authors (attractors) takes place through a hub node, that is, a node with few citations and several co-authorships.

In the near future we are going to outline the behavior of the most proficient scientists in terms of their aggregation patterns and of how their works are diffused within the community, that is, to characterize the reasons behind the selection process causing the structural evolution of the network. These aspects will be addressed both with new analyses on different datasets and by means of multi-agent simulations. The former stream will be devoted to the definition of new patterns; the latter will be used to understand how changing some parameters of the network influences the evolution, and consequently the quality, of scientific production.

9 Acknowledgments

This work was partially supported by the Future and Emerging Technologies programme FP7-COSI-ICT of the European Commission through project QLectives (grant no.: 231200).

