# Discovering underlying dynamics in time series of networks

Understanding dramatic changes in the evolution of networks is central to statistical network inference, as underscored by recent challenges of predicting and distinguishing pandemic-induced transformations in organizational and communication networks. We consider a joint network model in which each node has an associated time-varying low-dimensional latent vector of feature data, and connection probabilities are functions of these vectors. Under mild assumptions, the time-varying evolution of the constellation of latent vectors exhibits low-dimensional manifold structure under a suitable notion of distance. This distance can be approximated by a measure of separation between the observed networks themselves, and there exist consistent Euclidean representations for underlying network structure, as characterized by this distance, at any given time. These Euclidean representations permit the visualization of network evolution and transform network inference questions such as change-point and anomaly detection into a classical setting. We illustrate our methodology with real and synthetic data, and identify change points corresponding to massive shifts in pandemic policies in a communication network of a large organization.

• 15 publications
• 3 publications
• 25 publications
• 91 publications
11/14/2019

### Estimation of dynamic networks for high-dimensional nonstationary time series

This paper is concerned with the estimation of time-varying networks for...
08/24/2012

### Changepoint detection for high-dimensional time series with missing data

This paper describes a novel approach to change-point detection when the...
02/10/2022

### Learning Latent Causal Dynamics

One critical challenge of time-series modeling is how to learn and quick...
09/02/2008

### From Data to the p-Adic or Ultrametric Model

We model anomaly and change in data by embedding the data in an ultramet...
09/12/2019

### DyANE: Dynamics-aware node embedding for temporal networks

Low-dimensional vector representations of network nodes have proven succ...
07/02/2020

### Laplacian Change Point Detection for Dynamic Graphs

Dynamic and temporal graphs are rich data structures that are used to mo...
11/16/2021

### Quantification of fracture roughness by change probabilities and Hurst exponents

The objective of the current study is to utilize an innovative method ca...

## 1 Introduction

The structure of many organizational and communication networks underwent a dramatic shift during the disruption of the COVID-19 pandemic in 2020, a visualization of which can be seen in Figure 1

. Such transformations give rise to several important statistical questions: how to construct useful measures of dissimilarity across networks; how to estimate any such measure of dissimilarity from random network realizations; and how to identify loci of change. Our goal in this paper is to build a robust methodology to model and infer important characteristics of network time series.

To this end, we focus on a class of time series of random networks. We define an intuitive distance between the evolution of certain random variables that govern the behavior of nodes in the networks and prove that this distance can be consistently estimated from the observed networks. When this distance is sufficiently similar to a Euclidean distance, multidimensional scaling extracts a curve in low-dimensional Euclidean space that mirrors the structure of the network dynamics. This permits a visualization of network evolution and identification of change points. Figure

2 is the result of an end-to-end case study using these techniques for a time series of communication networks in a large corporation in the months around the start of pandemic work-from-home protocols: see the dramatic change in both panels beginning in Spring 2020.

Analysis of multiple networks is a key emerging subdiscipline of network inference, with approaches ranging from joint spectral embedding [1, 5, 7, 8, 12, 14]

, tensor decompositions

[7, 10, 22], least-squares methods [11, 15], and multiscale methods via random walks on graphs [9]. Asymptotic properties of these methods depend on particular model assumptions for how the networks evolve over time and relate to one another, and rigorous performance guarantees are challenging and limited in scope.

On the one hand, both single and multiple-network inference problems often have related objectives. For example, if data include multiple network realizations from the same underlying model on the same set of aligned vertices, we may wish to effectively exploit these additional realizations for more accurate estimation of common network parameters—that is, use the replications in a multiple-network setting to refine parameter estimates that govern any single network in the collection. On the other hand, multiple network inference involves statistically distinct questions, such as identifying loci of change across networks or detecting anomalies in a time series of networks.

To establish principled results on inference for network time series, we focus on a particular subclass of random graph models, known as latent position random graphs [6]. Such networks assign to each vertex a typically unobserved vector in some low-dimensional Euclidean space ; edges between vertices then arise independently. The probability of an edge between vertex and vertex is some fixed function , called the link function or kernel, of the two associated latent positions for the respective vertices.

Latent position random graphs have the appealing characteristic of modeling network connections as functions of inherent features of the vertices themselves—these features are encoded in the latent positions—and transforming network inference into the recovery of lower-dimensional structure. More specifically, if we have a series of time-indexed latent position graphs on a common aligned vertex set, then associated to each network is a matrix whose rows are the latent vectors of the vertices. Since the edge formation probabilities are a function of pairs of rows of , the probabilistic evolution of the network time series is completely determined by the evolution of the rows of . As such, the natural object of study for inference about a time series of latent position graphs are the rows of . In particular, anomalies or change-points in the time-series of networks correspond to changes in the process. For example, a change in a specific network entity is associated to a change in its latent position, which can then be estimated.

The evolution of the rows of can be deterministic, as is the case when features of the nodes in a network follow some predictable time-dependent pattern; but it can also be random, as is the case when the actors in a network have underlying preferences that are subject to random shocks. When the latent position vector for some individual vertex is a random variable, we have, as varies, a stochastic process. This collection of random variables can be endowed with a metric, which under certain conditions is Euclidean realizable; that is, the random variables at each time have a representation as points in for some dimension , where the metric space distances between them are equal to the Euclidean distance between these points (see [4] for more on Euclidean realizability of dissimilarity matrices). This allows us to visualize the time evolution of this stochastic process as the image of a map from an interval into .

We use this idea to formulate a novel approach to network time series. We demonstrate methods for consistently estimating a Euclidean representation, or mirror, of the evolution of the latent position distributions from the observed networks. This mirror can reveal important underlying structure of the network dynamics, as we demonstrate in both simulated and real data, the latter of which is drawn from organizational and communication networks, revealing the change-point corresponding to the start of pandemic work-from-home orders.

We organize the paper as follows. In Section 2, we define our network model, our distance measure between pairs of networks in the time series, and precise notions of Euclidean realizability. We show that, under mild assumptions, we can associate the network evolution to a one-dimensional manifold that preserves key structure. We detail several concrete examples of network time-series models, exploring the geometric structure under various model assumptions. In Section 3, we prove that our distance measure can be consistently estimated from the observed networks, obtaining probabilistic concentration as the network size grows. We use this concentration to demonstrate that the Euclidean mirror may be consistently recovered through multidimensional scaling applied to the approximated distance matrix, producing a visualization that captures the true structure of the network evolution. We then test these methods in Section 4, applying our results to both simulated and real data examples. As a particular case study illustrating our approach, we consider a time series of communication networks for a large corporation during the start of pandemic work-from-home orders. In particular, we are able to clearly locate and visualize the pandemic shift. We discuss these results and possible future work in Section 5. The appendices contain proofs and supplemental results.

## 2 Model and Geometric Results

In order to model the intrinsic characteristics of the entities in our network, we consider latent position random graphs, which associate a vector of features in to each vertex in the network. The connections between vertices in the network are independent given the latent positions, with connection probabilities depending on the latent position vectors of the two vertices in question.

###### Definition 1 (Latent Position Graph, Random Dot Product Graph, and Generalized Random Dot Product).

We say that the random graph with adjacency matrix is a latent position random graph (LPG) with latent position matrix having rows and link function if

 P[A|X]=∏i

When , we say that is a random dot product graph (RDPG) and we call the connection probability matrix. When , we say that is a generalized random dot product graph (GRDPG) and we call the generalized edge connection probability matrix, where .

###### Remark 1 (Orthogonal nonidentifiability in RDPGs).

Note that if is a matrix of latent positions and is orthogonal, and give rise to the same distribution over graphs. Thus, the RDPG model has a nonidentifiability up to orthogonal transformation. Analogously, the GRDPG model has a nonidentifiability up to indefinite orthogonal transformations.

Since we wish to model randomness in the underlying features of each vertex, we will consider latent positions that are themselves random variables defined on a probability space . For a particular sample point , let be the realization of the associated latent position for this vertex at time . On the one hand, for fixed , as varies, is the realized trajectory of a -dimensional stochastic process. On the other hand, for a given time , the random variable represents the constellation of possible latent positions at this time. In order for the inner product to be a well-defined link function, we require that the distribution of follow an inner-product distribution:

###### Definition 2.

Let

be a probability distribution on

. We say that is a -dimensional inner product distribution if for all . We will suppose throughout this work that for a -dimensional inner product distribution and , has rank .

We wish to quantify the difference between the random vectors and . Suppose that the graphs come from an RDPG or GRDPG model, where at each time , the latent positions of each graph vertex are drawn independently from a common inner product latent position distribution . Because is a latent position, we necessarily have ; for notational simplicity, we will use and interchangeably. We define a norm, which we call the maximum directional variation norm, on this space of random variables; this norm leads to a natural metric , both of which are described below.

###### Definition 3 (Maximum directional variation norm and metric).

For a random variable , we define

 ∥X∥MV=maxuE[⟨X,u⟩2]1/2=∥E[XX⊤]∥1/22,

where the maximization is over with . We define an associated metric by minimizing the norm of the difference between the random variables over all orthogonal transformations, which aligns these distributions.

 (1)

where the matrix norm on the right hand side is the spectral norm.

Note that the norm only captures the directional variation when is centered. In the cases of interest, we wish to capture the drift in the latent position, : this is the origin of the name for this metric and its associated norm.

###### Remark 2.

The metric is not properly a metric on , since if

a.s. for some orthogonal matrix

, then . However, if we consider the equivalence relation defined by whenever for some orthogonal matrix, this is a metric on the corresponding set of equivalence classes. This means that we are able to absorb the non-identifiability from the original parameterization, obtaining a new parameter space with a metric space structure where the underlying distribution is identifiable.

One of the central contributions of this paper is that captures important features of the time-varying distributions . To describe a family of networks indexed by time, each of which is generated by a matrix of latent positions that are themselves random, we consider a latent position stochastic process.

###### Definition 4 (Latent position process).

Let be a filtration of . A latent position process is an -adapted map such that for each , has an inner product distribution. We say that a latent position process is nonbacktracking if implies for all .

Once we have the latent position stochastic process, we can construct a time series of latent position random networks whose vertices have independent, identically distributed latent positions given by .

###### Definition 5 (Time Series of LPGs).

Let be a latent position process, and fix a given number of vertices and collection of times . We draw an i.i.d. sample for , and obtain the latent position matrices for by appending the rows , . The time series of LPGs (TSG) are conditionally independent LPGs with latent position matrices .

We emphasize that each vertex in the TSG corresponds to a single , which induces dependence between the latent positions for that vertex across time points, but the latent position trajectories of any two distinct vertices are independent of one another across all times. Since these trajectories form an i.i.d. sample from the latent position process, it is natural to measure their evolution over time using the metric on the corresponding random variables, namely . In the definition of this distance, the expectation is over

, which means that it depends on the joint distribution of

and . In particular, depends on more than just the marginal distributions of the random vectors and individually, but takes into account their dependence inherited from the latent position process .

One key question is whether the image has useful geometric structure when equipped with the metric . It turns out that, under mild conditions, this image is a manifold. In addition, the map admits a Euclidean analogue, called a mirror, which is a finite-dimensional curve that retains important signal from the generating stochastic process for the network time series. To make this precise, we define the notions of Euclidean realizability and approximate Euclidean realizability, below, and provide several examples of latent position processes that satisfy these requirements.

###### Definition 6 (Notions of Euclidean realizability).

Let be a latent position process.

We say that is approximately (Lipschitz) Euclidean -realizable with mirror and realizability constant if there exists a Lipschitz continuous curve such that

 ∣∣dMV(φ(t),φ(t′))−∥ψ(t)−ψ(t′)∥∣∣≤C|t−t′| for all t,t′∈[0,T].

For a fixed , we say that is approximately -Hölder Euclidean -realizable if is -Hölder continuous, and there is some such that

 ∣∣dMV(φ(t),φ(t′))−∥ψ(t)−ψ(t′)∥∣∣≤C|t−t′|α for all t,t′∈[0,T].

Rather than -realizable, we may simply say realizable if there is some for which this holds; we simply say that is Hölder Euclidean realizable if the condition holds for some .

Approximate Euclidean realizability is sufficient for the image of to be a manifold.

###### Theorem 1 (Manifold properties of a nonbacktracking latent position process).

Let be a nonbacktracking latent position process which is approximately Euclidean realizable. Then is homeomorphic to an interval . In particular, it is a topological 1-manifold with boundary. If is injective and approximately -Hölder Euclidean realizable, the same conclusion holds.

If we suppose that the trajectories of satisfy a certain degree of smoothness, it turns out that the function also has this degree of smoothness. That is, the sample-path smoothness implies smoothness of the map into the space of random variables, when equipped with our metric.

###### Theorem 2.

(Smooth trajectories and smooth latent position processes) Suppose is -Hölder continuous with some for almost every , such that

 ∥X(t,ω)−X(s,ω)∥≤L(ω)|t−s|α,

where the random variable Then is Hölder continuous with this same

###### Remark 3.

In the above definitions of realizability, regularity conditions are imposed on , which takes values in , rather than on , which gives random variables as output. Moreover, is the Euclidean realization of the manifold in the space of random variables; this as an approximately distance-preserving representation of those random variables, each of which captures the full state of the system with all of the given entities at any time . As we show, estimates of this Euclidean mirror, derived from observations of graph connectivity structure at a collection of time points, can recover important features of the time-varying latent positions.

There are several natural classes of latent position processes that are approximately Lipschitz or -Hölder Euclidean realizable. The next theorem demonstrates approximate

-Hölder Euclidean realizability for any latent position process expressible as the sum of a deterministic drift and a martingale term whose increments have well-controlled variance.

###### Theorem 3 (Approximate Holder realizability of variance-controlled martingale-plus-drift processes).

Suppose is an -martingale with respect to the filtration , and suppose is Lipschitz continuous. Let . Then

 dMV(Xt,Xs)2≤∥Cov(Mt−Ms)∥2+∥γ(t)−γ(s)∥2.

When satisfies , and for some and Lipschitz continuous , then is approximately -Hölder Euclidean realizable with and .

###### Example 1.

Consider , where , is a -dimensional Brownian motion, and is a function describing the mean of over time, with . Then each sample path of is continuously differentiable in , and is as well. If is the canonical filtration generated by Brownian motion, then is not an -martingale. Then

 dMV(Xt,Xt′)2=a2(t−t′)2∥v∥2+σ2[(t−t′)2(t+2t′)/3],

so is approximately Euclidean realizable with .

###### Example 2.

Consider , where is a -dimensional Brownian motion, and is a Lipschitz continuous function of the form . The is approximately -Hölder Euclidean realizable, with , and the Euclidean mirror is .

The latent positions for the vertices in our network are not typically observed—instead, we only see the connectivity between the nodes in the network, from which a given realization of the latent positions can, under certain model assumptions, be accurately estimated. In order to compare the networks at times and , we can consider estimates of the networks’ latent positions at these two times as noisy observations from the joint distribution of , and deploy these estimates in an approximation of the distances . Using these approximate distances, we can then estimate the curve , giving a visualization for the evolution of all of the latent positions in the random graphs over time.

Suppose that is a random dot product graph with latent position matrix , where the rows of are independent, identically distributed draws from a latent position distribution on . Let be the adjacency matrix for this graph. As shown in [17], a spectral decomposition of the adjacency matrix yields consistent estimates for the underlying matrix of latent positions. We introduce the following definition.

###### Definition 7 (Adjacency Spectral Embedding).

Given an adjacency matrix , we define the adjacency spectral embedding with dimension as , where is the matrix of

top eigenvectors of

and is the diagonal matrix with the

largest eigenvalues of

on the diagonal.

As we show in the next section, we will use the ASE of the observed adjacency matrices in our TSG to estimate the distance between latent position random variables over time, and in turn, to estimate the Euclidean mirror, which records important underlying structure for the time series of networks.

## 3 Statistical Estimation of Euclidean Mirrors

Given a time series of graphs with approximately -Hölder Euclidean realizable latent position process , our goal is to estimate the Euclidean mirror . Since this curve belongs to some finite-dimensional Euclidean space, the distances can be used to recover this mirror (up to rigid transformations) from classical multidimensional scaling (CMDS). As such, the crucial estimation problem is one of accurately estimating the distances .

To this end, we define the estimated pairwise distances between any two such latent position matrices and as follows:

 ^dMV(^Xt,^Xs):=minW∈Od×d1√n∥^Xt−^XsW∥2, (2)

where is the set of real orthogonal matrices of order , and denotes the spectral norm. Our central result is that, when our networks have a sufficiently large number of vertices , provides a consistent estimate of .

###### Theorem 4.

With overwhelming probability,

 ∣∣^dMV(^Xt,^Xs)2−dMV(Xt,Xs)2∣∣≤log(n)√n.

In order to characterize the evolution of time series of networks, consider a finite subset of times with , and define the matrices

 Dφ=[dMV(Xs,Xt)]s,t∈T,Dψ=[∥ψ(s)−ψ(t)∥2]s,t∈T, and D^ψ=[^dMV(^Xs,^Xt)]s,t∈T

These are dissimilarity matrices; the first records the pairwise distances between the latent position random variables at times and ; the second records the differences between the Euclidean mirror at these times; and the third records the estimate for the first distance based on the observed network realizations at times and . Suppose is the matrix of squared entries of . We derive the following corollary:

###### Corollary 1.

For fixed , with overwhelming probability,

 ∥D(2)^ψ−D(2)φ∥F≤mTlog(n)√n.

It turns out that for large networks, the invariant subspace associated to is an accurate approximation to the corresponding subspace of , which matches that of when we have approximate Euclidean realizability. This suggests that applying CMDS to the estimated dissimilarity matrix can recover the original curve up to a rotation. We recall that CMDS computes the scaled eigenvectors of the matrix , where is a projection matrix, and is the matrix of all ones. Since this matrix may be written as for some matrix with orthonormal columns and diagonal matrix , this means that , the Euclidean mirror associated to , is simply the th row of the matrix . We will analogously denote the th row of , the output of CMDS applied to , by .

###### Theorem 5.

Suppose is approximately Euclidean -realizable. Let be the top eigenvectors, and be the diagonal matrices with diagonal entries equal to the top eigenvalues of and , respectively, where , and is the all-ones matrix of order . Then there is a constant such that with overwhelming probability, there is a real orthogonal matrix such that

 ∥^U−UR∥F≤23/2λc(D(2)φ)(mTlog(n)√n+mT∑i=c+1λi(D(2)φ)),

and the CMDS output satisfies

 ∥^U^S1/2−US1/2R∥F≤Cλc(D(2)φ)(mTlog(n)√n+mT∑i=c+1λi(D(2)φ)).

In particular, we have

 mT∑i=1∥^ψ(ti)−Rψ(ti)∥2≤Cλ2c(D(2)φ)(mTlog(n)√n+mT∑i=c+1λi(D(2)φ))2.

If all but the top eigenvalues of are sufficiently small—as is the case when is rank — Theorem 5 ensures that the Euclidean mirror can be consistently estimated. As such, if the important aspects of the latent position process, such as changepoints or anomalies, are reflected in low-dimensional Euclidean space, then we recover this mirror consistently through CMDS applied to the estimated distance matrix. We summarize these connections between true distances, their estimates, and associated Euclidean mirrors in Figure 3.

When we have exact Euclidean realizability or when the tail eigenvalues of can be bounded directly, we obtain the following two corollaries of Theorem 5.

###### Corollary 2.

Suppose is a Euclidean distance matrix with dimension . There is a constant such that with overwhelming probability, there is a real orthogonal matrix such that the CMDS output satisfies

 ∥^U^S1/2−US1/2R∥F≤CmTlog(n)√nλc(D(2)ψ).

In particular, we have

 mT∑i=1∥^ψ(ti)−Rψ(ti)∥2≤Cm2Tlog2(n)nλ2c(D(2)ψ).

Suppose that is Lipschitz continuous with constant and has realizability constant . If we further assume that there exists a constant such that for all , then we can bound the sum of the tail eigenvalues of , turning the approximate Lipschitz Euclidean realizability assumption into an eigenvalue bound. Note that we can can always choose from the realizability assumptions, but in certain cases, may be smaller, and in particular, not grow linearly with T. While this corollary is stated for the Lipschitz case, a version of it may be formulated for the -Hölder case as well.

###### Corollary 3.

Suppose is approximately Lipschitz Euclidean -realizable with realizability constant . Suppose for all . Suppose that for . Then

 ∥D(2)φ−D(2)ψ∥F≤2ABT√(m2T−1)/6≤0.82ABTmT.

Then there is a rotation matrix such that the CMDS output satisfies

 ∥^U^S1/2−US1/2R∥F≤CmTλc(D(2)φ)(log(n)√n+0.82ABT).

Together, these theorems ensure that time-dependent underlying low-dimensional structure associated to network evolution can be consistently recovered. In what follows, we will see how this methodology can be employed in real and synthetic data to reveal important structural features and potential anomalies in network time series.

## 4 Real and Synthetic Data Experiments

We start with a more detailed discussion of the communication network example shown in Figure 2. We consider a time series of weighted communication networks, arising from the email communications between 32277 entities in a large organization, with one network being generated each month from January 2019 to December 2020, a period of 24 months. This data was studied through the lens of modularity in [23]. We obtain a clustering by applying Leiden clustering [20] to the January 2019 network, obtaining 33 clusters that we retain throughout the two year period. We make use of this clustering to compute the Graph Encoder Embedding (GEE) of [16], which produces spectrally-derived estimates of invertible transformations of the original latent positions. For each time , we obtain a matrix , each row of which provides an estimate of these transformed latent positions. Constructing the distance matrix , we apply CMDS to obtain the estimated curve shown in the left panel of Figure 2, where the choice of dimension is based on the scree plot of . The nonlinear dimensionality reduction technique ISOMAP [19], which relies on a spectral decomposition of geodesic distances, can be applied to these points to extract an estimated 1-dimensional curve, which we plot against time in the right panel of Figure 2. This one-dimensional curve exhibits some changes from the previous trend in Spring 2020 and a much sharper qualitative transformation in July 2020. What is striking is that both these qualitative shifts correspond to policy changes: in Spring 2020, there was an initial shift in operations, widely regarded at the time as temporary. In mid-summer 2020, nearly the peak of the second wave of of COVID-19, it was much clearer that these organizational shifts were likely permanent, or at least significantly longer-lived.

In Figure 4, we plot the result of our methods applied to the induced subgraphs corresponding to each of the 33 communities; these are represented by the grey trajectories. The trajectories of two subcommunities have been highlighted in the plot above: the green curve shows a constant rate of change throughout the two-year period, and does not exhibit a noticeable pandemic effect. The blue curve, on the other hand, shows a significant flattening in early 2020, followed by rapid changes in summer. Thus, we can see a differential effect of the pandemic on different work groups within the organization.

In Figure 5, both panels show methods for identifying changepoints over the 24 months, with consistent results. In the left panel, for each time starting in June 2019, we plot the sigmage of its ISOMAP embedding relative to the previous 5 months: that is, we measure the distance of the ISOMAP embedding to the mean of the previous 5 months’ embeddings, relative to the standard deviation of those embeddings. Note that since the computation of the sigmages require a window of time-points, we are only able to produce these estimates starting in June 2019. We see apparent outliers in March and April, and again in July-September 2020. The right panel shows the ISOMAP curve with a moving prediction confidence interval of width 5 standard deviations, generated from linear regression applied to the previous 5 time points (which is why we again only have an interval starting in June). This method indicates the same set of outliers as the previous one, but allows for some more detailed analysis: In March and April 2020, it appears that the behavior is anomalous because the network stopped drifting, while the behavior in the summer of that year is anomalous because it made a significant jump from its previous position.

### 4.1 Synthetic Data and Bootstrapping

In the previous section, we apply the GEE embedding to obtain estimates , which are then used for estimates of pairwise distances . Although the GEE differs slightly from the adjacency spectral embedding, it is computationally more tractable and yields similarly useful output. To further illustrate our underlying theory, however, we consider synthetic data. That is, we use real data to obtain a distribution from which we may resample. Such a network bootstrap permits us to test our asymptotic results through replicable simulations that are grounded in actual data. To this end, we consider the true latent position distribution at each time to be equally likely to be any row of the GEE-obtained estimates from the real data, , for Given a sample size , for each time, we sample these rows uniformly and with replacement to get a matrix of latent positions . We treat this matrix as the generating latent position matrix for independent adjacency matrices . Note that if for sample , we choose row of at time , then the same row of will be used for all times for that sample, so that the original dependence structure is preserved. We may now apply the methods described in our theorems, namely ASE of the adjacency matrices followed by Procrustes alignment, to obtain the estimates , along with the associated distance estimates. In Figure 6, we see that ISOMAP applied to the CMDS embedding of the bootstrapped data converges to the original ISOMAP curve, as predicted by our theorems.

We also wish to check whether this procedure consistently demonstrates the pandemic effect. In Figure 7, we show the sigmages for each month, plotted over 100 replicates of this experiment, with for each replicate. The pandemic effect in summer of 2020 is clearly visible in all but a few replicates, while the effect in March-April is still identified in the majority of replicates. We also observe the dramatic change in variance for certain months, over the different replicates: this might indicate the discrepancies between the pandemic effect on different network entities, rendering the final estimate much more sensitive to the sample of rows used to generate the network.

## 5 Discussion

To effectively model time series of networks, it is natural to consider network evolution governed by underlying low-dimensional dynamics. In this paper, we examine latent position networks in which the vertex latent positions follow a stochastic process. Under mild conditions, we can associate to this stochastic process certain geometric structure, and understanding how that structure changes with time allows us to identify transformations in network behavior across multiple scales.

To make this intuition precise, we define the maximum directional variation norm and metric on the space of random latent positions. We describe notions of Euclidean realizability for this metric, characterizing how closely this metric can be approximated by a Euclidean distance metric. If the latent position process is sufficiently regular, its range is a one-dimensional manifold, and we detail several useful examples of such processes.

Of course, the latent position process is typically unobserved; what we have instead is a time series of networks from which these latent positions must be estimated. One of our key results is that the pairwise dissimilarity matrix of maximum directional variation distances between latent positions and at pairs of time points can be consistently estimated by spectrally embedding the network adjacencies at these different pairs of times and computing spectral norm distances between these embeddings. When the latent position process is such that the maximum directional variation metric between any pair of latent positions is approximately Euclidean realizable, we find that classical multidimensional scaling applied to the estimated distances gives us an inferentially valuable low-dimensional representation of network dissimilarities across time. Further dimension reduction techniques, such as ISOMAP, can further clarify changes in network dynamics.

To this last point, ISOMAP is a manifold-learning algorithm; detailed analysis of its effect on embeddings of estimated pairwise network distances can bring us closer to provable guarantees for change-point detection. More broadly, the interplay between the probabilistic structure of the underlying latent position process and the geometric structure of the Euclidean mirror is a key component of the estimated Euclidean representation of relationships between networks across time.

We consider two estimates for the maximum directional variation distance , namely the spectral norm applied to the GEE estimate, and the Procrustes-aligned ASEs. However, these are far from the only options, and it is an open question whether or another metric on the space of random variables is best for downstream inference tasks under certain model assumptions. It is also an open question whether there is a better estimate for the distance itself, either in terms of computational complexity or statistical properties. Of particular interest is the spectral norm distance applied to the omnibus embeddings [12] for the adjacency matrices: this likely converges to another distance on the space of random variables, potentially highlighting different features in the final CMDS embedding. Results quantifying the distribution of the errors in the CMDS embedding are key to formulating hypothesis tests for changepoint detection.

The perspective described in Figure 3, which connects distance metrics for generative processes of networks to their estimates, and in turn translates manifold geometry into Euclidean geometry, is an important theoretical contribution to time series analysis for networks. It provides mathematical formalism for network dynamics; asymptotic properties of estimates of manifold structure; and rigorous guarantees on when time-varying networks can be well-represented in low-dimensional space. Our family of latent position networks are interpretable, estimable, and flexible enough to capture the important features of diverse real-world network time series. As such, this canonical framework invites and accommodates future approaches to joint network inference.

## 6 Acknowledgements

All authors gratefully acknowledge funding from Microsoft Research and the Naval Engineering Education Consortium.

## References

• [1] J. Arroyo, A. Athreya, J. Cape, G. Chen, Carey E. Priebe, and J. T. Vogelstein. Inference for multiple heterogeneous networks with a common invariant subspace.

Journal of Machine Learning Research

, 22(142):1–49, 2021.
• [2] A. Athreya, D. E. Fishkind, K. Levin, V. Lyzinski, Y. Park, Y. Qin, D. L. Sussman, M. Tang, J. T. Vogelstein, and C. E. Priebe. Statistical inference on random dot product graphs: a survey. Journal of Machine Learning Research, 18, 2018.
• [3] Avanti Athreya, Michael Kane, Bryan Lewis, Zachary Lubberts, Vince Lyzinski, Youngser Park, Carey E Priebe, and Minh Tang. Numerical tolerance for spectral decompositions of random matrices. Journal of Computational and Graphical Statistics, to appear, 2022.
• [4] I. Borg and P. J. F. Groenen. Modern multidimensional scaling: Theory and applications. Springer Science & Business Media, 2005.
• [5] Ian Gallagher, Andrew Jones, and Patrick Rubin-Delanchy. Spectral embedding for dynamic networks with stability guarantees. Advances in Neural Information Processing Systems, 34, 2021.
• [6] P. D. Hoff, A. E. Raftery, and M. S. Handcock. Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460):1090–1098, 2002.
• [7] Bing-Yi Jing, Ting Li, Zhongyuan Lyu, and Dong Xia. Community detection on mixture multilayer networks via regularized tensor decomposition. The Annals of Statistics, 49(6):3181–3205, 2021.
• [8] Andrew Jones and Patrick Rubin-Delanchy. The multilayer random dot product graph. arXiv preprint arXiv:2007.10455, 2020.
• [9] Jason D Lee and Mauro Maggioni. Multiscale analysis of time series of graphs. In International Conference on Sampling Theory and Applications (SampTA)

. Citeseer, 2011.

• [10] Jing Lei, Kehui Chen, and Brian Lynch. Consistent community detection in multi-layer network data. Biometrika, 107(1):61–73, 2020.
• [11] Jing Lei and Kevin Z Lin.

Bias-adjusted spectral clustering in multi-layer stochastic block models.

Journal of the American Statistical Association, pages 1–13, 2022.
• [12] K. Levin, A. Athreya, M. Tang, V. Lyzinski, and C. E. Priebe. A central limit theorem for an omnibus embedding of random dot product graphs. arXiv preprint arXiv:1705.09355, 2017.
• [13] V. Lyzinski, M. Tang, A. Athreya, Y. Park, and C. E. Priebe. Community detection and classification in hierarchical stochastic blockmodels. IEEE Transactions in Network Science and Engineering, 4:13–26, 2017.
• [14] Konstantinos Pantazis, Avanti Athreya, Jesús Arroyo, William N Frost, Evan S Hill, and Vince Lyzinski. The importance of being correlated: Implications of dependence in joint spectral inference across multiple networks. Journal of Machine Learning Research, to appear., 2022.
• [15] Marianna Pensky. Dynamic network models and graphon estimation. The Annals of Statistics, 47(4):2378–2403, 2019.
• [16] Cencheng Shen, Qizhe Wang, and Carey E. Priebe. Graph encoder embedding. CoRR, abs/2109.13098, 2021.
• [17] D. L. Sussman, M. Tang, D. E. Fishkind, and C. E. Priebe. A consistent adjacency spectral embedding for stochastic blockmodel graphs. Journal of the American Statistical Association, 107:1119–1128, 2012.
• [18] M. Tang, A. Athreya, D. L. Sussman, V. Lyzinski, Y. Park, and C. E. Priebe. A semiparametric two-sample hypothesis testing problem for random dot product graphs. Journal of Computational and Graphical Statistics, 26:344–354, 2017.
• [19] J. B. Tenenbaum, V. de Silva, and J.C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290:2319—2323, 2000.
• [20] Vincent A Traag, Ludo Waltman, and Nees Jan Van Eck. From louvain to leiden: guaranteeing well-connected communities. Scientific reports, 9(1):1–12, 2019.
• [21] Y. Yu, T. Wang, and R. J. Samworth. A useful variant of the Davis-Kahan theorem for statisticians. Biometrika, 102:315–323, 2015.
• [22] A. Zhang and D. Xia. Tensor svd: Statistical and computational limits. IEEE Transactions on Information Theory, 64(11):7311–7338, 2018.
• [23] Tiona Zuzul, Emily Cox Pahnke, Jonathan Larson, Patrick Bourke, Nicholas Caurvina, Neha Parikh Shah, Fereshteh Amini, Youngser Park, Joshua Vogelstein, Jeffrey Weston, et al. Dynamic silos: Increased modularity in intra-organizational communication networks during the covid-19 pandemic. arXiv preprint arXiv:2104.00641, 2021.

## Appendix A Proofs and supporting results for Section 2

###### Lemma 1.

The function is a metric on the space of random variables, up to the equivalence relation where if there is some such that almost surely.

###### Proof.

Recall that is defined as

 dMV(X,Y)=minW∥E[(X−WY)(X−WY)⊤]∥1/22=minW∥E[(W⊤X−Y)(W⊤X−Y)⊤]∥1/22.

Clearly, this is symmetric and nonnegative. The triangle inequality holds, since for any and , we have

 dMV(X,Y)2 =minWmaxuu⊤E[(X−QZ+QZ−WY)(X−QZ+QZ−WY)⊤]u =minWmaxuE[⟨u,X−QZ⟩2+2⟨u,X−QZ⟩⟨u,QZ−WY⟩+⟨u,QZ−WY⟩2] ≤minWmaxu(E[⟨u,X−QZ⟩2]1/2+E[⟨u,QZ−WY⟩2]1/2)2,

by the Cauchy-Schwartz inequality applied to the -inner product. This is further bounded as

 dMV(X,Y) ≤minW(maxuE[⟨u,X−QZ⟩2]1/2)+(maxvE[⟨v,QZ−WY⟩2]1/2) =(maxuE[⟨u,X−QZ⟩2]1/2)+minW(maxvE[⟨v,Z−(Q⊤W)Y⟩2]1/2).

Since when , the latter term is just . Since this upper bound holds for any , it must also hold for the minimizer.

Now suppose that . Since the spectral norm is a norm, this tells us that for the achieving the minimum, . Since is positive semidefinite for every , this implies that almost surely. ∎

Suppose is nonbacktracking, so that implies for all , and that is approximately Euclidean realizable. Recall that Theorem 1 states that under these conditions, is homeomorphic to an interval . In particular, it is a topological 1-manifold with boundary. If is injective, remains a 1-manifold with boundary even when is only -Hölder Euclidean realizable.

Proof of Theorem 1. We consider the case where is injective first, since this avoids some of the technical details of the more general case. We may first observe that

 dMV(φ(t),φ(t′))≤∥ψ(t)−ψ(t′)∥+c|t−t′|α≤(L+c)|t−t′|α,

so is also -Hölder continuous (or Lipschitz if ). Since is defined to be , it is apparent that is bijective. Now any closed subset is compact, so is compact, hence closed in , and is a closed map. In other words, is continuous, so is itself the required homeomorphism.

In the case that is nonbacktracking and , we define for any and such that . This is finite and bounded above by from the first part of the proof. We also define , which is a finite, nonnegative number bounded above by . We make the following observations, which are easily proved: (1) is lower semicontinuous in , for any ; (2) is integrable.

Now we define via , where Since is surjective, this allows us to define via for all . We now show that is well-defined and Lipschitz continuous. Let , and given , choose points such that , and for each . Now we observe that

 dMV(μ(γ(t)),μ(γ(t′))) =dMV(φ(t),φ(t′)) ≤k−1∑i=0dMV(φ(si),φ(si+1)) ≤k−1∑i=0L(si,δ)(si+1−si).

Letting the partition of become arbitrarily fine, we see that this upper bound converges to the corresponding integral, giving

 dMV(μ(γ(t)),μ(γ(t′)))≤∫t′tL(s,δ)ds.

Now taking an infimum over and applying dominated convergence, we see that

 dMV(μ(γ(t)),μ(γ(t′)))≤∫t′tL(s)ds=|γ(t′)−γ(t)|.

Observe that is injective: if for some , then for with and , we have that

 φ(t)=μ(γ(t))=μ(x)=μ(y)=μ(γ(t′))=φ(t′),

so for all . Then for , we can take small enough that , and since for all , we see that , and thus , too. Now from the definition of ,

 y−x=γ(t′)−γ(t)=∫t′tL(s)ds=0,

which contradicts the assumption that .

Since is a continuous bijection, it is easy to see that is in fact a homeomorphism. When the trajectories satisfy a Hölder condition with square-integrable constant, then also satisfies this continuity condition, as Theorem 2 states.

Proof of Theorem 2: To show that sufficiently smooth trajectories imply continuity for , note that

 dMV(Xt,Xs) =minW∥E[(Xt−WXs)(Xt−WXs)T]∥1/22 ≤∥E[(Xt−Xs)(Xt−Xs)T]∥1/22 ≤E[∥Xt−Xs∥2]1/2 ≤E[L(ω)2|t−s|2α]1/2 =∥L∥L2(Ω)|t−s|α.

Additional constraints on the probabilistic structure of the stochastic process can render the distance simpler to compute. In Theorem 3, we show that if , where is Lipschitz continuous and is a martingale with certain variance constraints, then is approximately -Hölder Euclidean realizable.

Proof of Theorem 3: Suppose , where , and is Lipschitz continuous with ; is a martingale satisfying We expand using the decomposition

 Xt−WXs=Mt−Ms+γ(t)−Wγ(s)+Ms−WMs.

Since the increment is conditionally mean-zero given , and has mean zero, all cross terms vanish when we take the expected value. This leaves

 E[(Xt−WXs)(Xt−WXs)⊤] =E[(Mt−Ms)(Mt−Ms)⊤]+(γ(t)−Wγ(s))(γ(t)−Wγ(s))⊤ +(I−W)E[MsM⊤s](I−W)⊤

Plugging in and using the triangle inequality guarantees that

 dMV(Xt,Xs)2≤∥Cov(Mt−Ms)∥2+∥γ(t)−γ(s)∥2

as required.

Since , we use the fact that the first and last terms are positive semidefinite to obtain the lower bound , so

 ∥γ(t)−γ(s)∥2≤dMV(Xt,Xs)2≤∥γ(t)−γ(s)∥2+C(t−s),

which completes the proof.

We now demonstrate the properties of the stochastic processes describe in Examples 1 and 2.

Proof for Example  1: Since with , observe that

 dMV(Xt,Xt′)2 =minW∥E[(Xt−WXt′)(Xt−WXt′)