Tracking network dynamics: a survey of distances and similarity metrics
From longitudinal biomedical studies to social networks, graphs have emerged as a powerful framework for describing evolving interactions between agents in complex systems. In such studies, after pre-processing, the data can be represented by a set of graphs, each representing a system's state at different points in time. The analysis of the system's dynamics depends on the selection of the appropriate analytical tools. After characterizing similarities between states, a critical step lies in the choice of a distance between graphs capable of reflecting such similarities. While the literature offers a number of distances that one could a priori choose from, their properties have been little investigated and no guidelines regarding the choice of such a distance have yet been provided. In particular, most graph distances consider that the nodes are exchangeable and do not take into account node identities. Accounting for the alignment of the graphs enables us to enhance these distances' sensitivity to perturbations in the network and detect important changes in graph dynamics. Thus the selection of an adequate metric is a decisive --yet delicate--practical matter. In the spirit of Goldenberg, Zheng and Fienberg's seminal 2009 review, the purpose of this article is to provide an overview of commonly-used graph distances and an explicit characterization of the structural changes that they are best able to capture. We use as a guiding thread to our discussion the application of these distances to the analysis of both a longitudinal microbiome dataset and a brain fMRI study. We show examples of using permutation tests to detect the effect of covariates on the graphs' variability. Synthetic examples provide intuition as to the qualities and drawbacks of the different distances. Above all, we provide some guidance for choosing one distance over another in certain types of applications.
READ FULL TEXT