Data-driven Analysis of Complex Networks and their Model-generated Counterparts
Data-driven analysis of complex networks has been in the focus of research for decades. An important question is to discover the relation between various network characteristics in real-world networks and how these relationships vary across network domains. A related research question is to study how well the network models can capture the observed relations between the graph metrics. In this paper, we apply statistical and machine learning techniques to answer the aforementioned questions. We study 400 real-world networks along with 6 x 400 networks generated by five frequently used network models with previously fitted parameters to make the generated graphs as similar to the real network as possible. We find that the correlation profiles of the structural measures significantly differ across network domains and the domain can be efficiently determined using a small selection of graph metrics. The goodness-of-fit of the network models and the best performing models themselves highly depend on the domains. Using machine learning techniques, it turned out to be relatively easy to decide if a network is real or model-generated. We also investigate what structural properties make it possible to achieve a good accuracy, i.e. what features the network models cannot capture.
READ FULL TEXT