# Persistent Homology of Complex Networks for Dynamic State Detection

In this paper we develop a novel Topological Data Analysis (TDA) approach for studying graph representations of time series of dynamical systems. Specifically, we show how persistent homology, a tool from TDA, can be used to yield a compressed, multi-scale representation of the graph that can distinguish between dynamic states such as periodic and chaotic behavior. We show the approach for two graph constructions obtained from the time series. In the first approach the time series is embedded into a point cloud which is then used to construct an undirected k-nearest neighbor graph. The second construct relies on the recently developed ordinal partition framework. In either case, a pairwise distance matrix is then calculated using the shortest path between the graph's nodes, and this matrix is utilized to define a filtration of a simplicial complex that enables tracking the changes in homology classes over the course of the filtration. These changes are summarized in a persistence diagram—a two-dimensional summary of changes in the topological features. We then extract existing as well as new geometric and entropy point summaries from the persistence diagram and compare to other commonly used network characteristics. Our results show that persistence-based point summaries yield a clearer distinction of the dynamic behavior and are more robust to noise than existing graph-based scores, especially when combined with ordinal graphs.

## Authors

• 2 publications
• 19 publications
• 12 publications
• ### Using Zigzag Persistent Homology to Detect Hopf Bifurcations in Dynamical Systems

Bifurcations in dynamical systems characterize qualitative changes in th...
09/18/2020 ∙ by Sarah Tymochko, et al. ∙ 0

• ### Learning metrics for persistence-based summaries and applications for graph classification

Recently a new feature representation and data analysis methodology base...
04/27/2019 ∙ by Qi Zhao, et al. ∙ 0

• ### Sinkhorn Divergence of Topological Signature Estimates for Time Series Classification

Distinguishing between classes of time series sampled from dynamic syste...
02/14/2019 ∙ by Colin Stephen, et al. ∙ 0

• ### Cycle Registration in Persistent Homology with Applications in Topological Bootstrap

In this article we propose a novel approach for comparing the persistent...
01/03/2021 ∙ by Yohai Reani, et al. ∙ 0

• ### Topological Regularization via Persistence-Sensitive Optimization

Optimization, a key tool in machine learning and statistics, relies on r...
11/10/2020 ∙ by Arnur Nigmetov, et al. ∙ 0

• ### Persistent Homology of Geospatial Data: A Case Study with Voting

A crucial step in the analysis of persistent homology is transformation ...
01/29/2019 ∙ by Michelle Feng, et al. ∙ 0

• ### Homological Time Series Analysis of Sensor Signals from Power Plants

In this paper, we use topological data analysis techniques to construct ...
06/03/2021 ∙ by Luciano Melodia, et al. ∙ 0

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

There has been extensive work on understanding the behavior of the underlying dynamical system given only a time series [Kantz2004, Bradley2015]. The revolutionary work of Takens [Takens1981] extended by Sauer et al. [Sauer1991] showed that, given most choices of parameters, the state-space of the dynamical system can be reconstructed through the Takens’embedding. Computationally, this arises as the following procedure. Given a time series , a choice of dimension and lag give rise to a point cloud . Then the goal is to analyze this point cloud, which really is a sampling of the full state space, in a way that the dynamics can be understood. Of course, for practical purposes, not all parameter choices are equally desirable. While some effort has gone into mathematical justification of “best” choices [Casdagli1991]

, we are largely left with heuristics that work quite well in practice

[Fraser1986, Kennel1992].

A first method for analyzing this point cloud arose in the form of a recurrence plot [Eckmann1987, Marwan2007]. Fixing , this is a binary, symmetric matrix where is 1 if , and 0 otherwise. Of course, this can be equivalently viewed as the adjacency matrix of a network111In this paper, we use the words network and graph interchangeably., often called the -recurrence network in this literature [Gao2009, Gao2009a, Marwan2009, Donner2010]. From this observation, a large literature grew on methods to convert time series into networks; see Donner et al. [Donner2011] for an extensive survey.

In this paper, we focus on two of these options. First, given the point cloud , we can construct the (undirected) -nearest neighbor graph, commonly called the -NN graph. This is built by adding a vertex to represent each , and for each , adding an edge to the closest points . This construction, and in particular the investigation of motifs in the resulting graph, has been extensively studied [Shimada2008, Khor2016, Xu2008].

The second network construction method we work with is the recently developed ordinal partition network [Small2013, McCullough2015]. It can be viewed as a special case of the class of transition networks built from time series data, where vertices represent subsets of the state space, and edges are included based on temporal succession. This construction arose as a generalization of the concept of permutation entropy [Bandt2002]. The basic idea of the construction is to replace each with a permutation of the set so that if is the th entry in the sorted order of the coordinates. That is, satisfies . Then we build a graph with vertex set equal to the set of encountered permutations, and an edge included if the ordered point cloud passes from one permutation to the other222While the ordinal partition network can be defined as a directed, weighted graph, we work with the undirected, unweighted analogue. . Utilizing the transition network vantage point, each permutation represents a subspace of given by the intersection of inequalities, and an edge is included based on passing from one of these subspaces to the other in one time step. What makes this construction particularly useful is its robustness to noise [Amigo2007]. See [Amigo2010] for a more extensive introduction.

Many have observed qualitatively that these networks encode the structure of the underlying system [Donner2011]. In particular, periodic time series tend to create networks with overarching circular structure, while those arising from chaotic systems have more in common with a hairball (see, e.g., Fig. 1). However, quantification of this behavior is lacking. Much of the literature to date has focused on using standard quantification methods from network theory such as local measures like degree-, closeness-, and betweenness-centrality, or the local clustering coefficient. Global measures are also used, e.g., the global clustering coefficient, transitivity, and average path length. However, these measures can only do so much to measure the overarching structure of the graph.

It was for this reason that topological data analysis (TDA) [Carlsson2009, Ghrist2014, Ghrist2008, Munch2017, Perea2018] has proven to be quite useful for time series analysis. TDA is a collection of methods arising from the mathematical field of algebraic topology [Hatcher, Munkres2] which provide concise, quantifiable, comparable, and robust summaries of the shape of data. The main observation is that we can encode higher dimensional structure than the 1-dimensional information of a network by passing to simplicial complexes. Like graphs, simplicial complexes are combinatorial objects with vertices and edges, but also allow for higher dimensional analogues like triangles, tetrahedra, etc. To date, the interaction of time series analysis with TDA has only focused on a generalization of the -recurrence network called a Rips complex. The Rips complex for parameter includes a simplex iff for all . That is, it is the largest simplicial complex which has the -recurrence graph as its 1-skeleton.

Unlike the time series analysis literature, where one works hard to find the perfect to construct a single network, TDA likes to work with the Rips complex over all scales in a construction called a filtration. Then, one can analyze the structure of the overall shape by looking at how long features of interest persist over the course of this filtration. One particularly useful tool for this analysis is 1-dimensional persistent homology [Edelsbrunner2002, Zomorodian2004], which encodes how circular structures persist over the course of a filtration in a topological signature called a persistence diagram. This and its variants have been quite successful in applications, particularly for the analysis of periodicity [Robinson2014, Perea2014, Berwald2014a, Perea2016, Xu2018, Sanderson2017, Robinson2015, Tempelman2019], including for parameter selection [Garland2016], machining dynamics [Khasawneh2015, Khasawneh2014, Khasawneh2014a, Khasawneh2017, Khasawneh2018b], gene regulatory systems [Berwald2014b, Perea2015], financial data [Gidea2017, Gidea2017a, Gidea2018], wheeze detection [Emrani2014], sonar classification [Robinson2018], video analysis [Tralie2016a, Tralie2017, Tralie2018a], and annotation of song structure [Bendich2018, Tralie2019].

Unfortunately, while persistence diagrams are powerful tools for summarizing structure, their geometry is not particularly amenable to the direct application of standard statistical or machine learning techniques

[Mileyko2011, Turner2014, Munch2015]. To circumvent the problem, a common trick to deal with persistence diagrams particularly when we are interested in classification tasks is to choose a method for featurizing the diagrams; that is, constructing a map from persistence diagrams into Euclidean space via some method that preserves enough of the structure of persistence diagrams to be reasonably useful. Many of these exist in the literature [Bubenik2015, Adams2017, Adcock2016, CarlssonVerovsek2016, Kalisnik2018, DiFabio2015, Berry2018, Chevyrev2018, Chen2015a, Chazal2014b, Padellini2017, Perea2019], however in this work, we focus on the simplest of these realizations, namely point summaries of persistence diagrams which extract a single number for each diagram to be used as its representative. One summary which we will use in this paper is persistent entropy. This was defined by Chintakunta et al. [Chintakunta2015] and later Rucco, Atienza, et al. [Rucco2017, Atienza2018a] proved that the construction is continuous. This construction, a modification of Shannon entropy, has found use in several applications [Merelli2015, Rucco2016, Piangerelli2016].

In this paper we move away from the standard application of TDA to time series analysis (namely the combination of the Rips complex with 1-D persistence) to implement the following new pipeline and use it to differentiate between chaotic and periodic systems; see Fig. 2. Given a time series, determine a good choice of embedding parameters, and use these to build an embedding of the time series. Then, obtain a graph either by constructing the nearest neighbor graph for the points of the embedding, or by building the ordinal partition network. Construct a filtration of a simplicial complex using this information, compute its persistence diagram, then return one of several point summaries of the diagram. We show experimentally on both synthetic and real data that this pipeline, particularly using persistent entropy, is quite good at differentiating between chaotic and periodic time series.

## 2 Background

### 2.1 Graphs

A graph is a collection of vertices with edges . In this paper, we assume all graphs are simple (no loops or multiedges) and undirected. The complete graph on the vertex set is the graph with all edges included, i.e. .

We will reference a few special graphs. The cycle graph on vertices is the graph with , and ; i.e. it forms a closed path (cycle) where no repetition occurs except for the starting and ending vertex. The complete graph on vertices is the graph with , and . That is, it is the graph with vertices and all possible edges are included.

We will also work with weighted graphs, where gives a weight for each edge in the graph. In this paper, we assume all weights are non-negative, . Given an ordering of the vertices , a graph can be stored in an adjacency matrix where entry if there is an edge and 0 otherwise. This can be edited to store the weighting information by setting if and 0 otherwise.

A path in a graph is an ordered collection of non-repeated vertices where for every . The length of the path is the number of edges used, namely in the above notation. The distance between two vertices and is the minimum length of all paths from to and is denoted . Given an ordering of the vertices, this information can be stored in a distance matrix where . Thus an unweighted graph gives rise to a weighted complete graph on the vertex set by setting the weight .

### 2.2 k-Nearest Neighbor Graph

Given a collection of points in , the -nearest neighbor graph, or -NN, is a commonly used method to build a (weighted) graph. Fix . The (undirected) -NN graph has a vertex set in - correspondence with the point cloud, so we abuse notation and write for both the point , and for the vertex . An edge is included if is among the th nearest neighbors of .

### 2.3 Embedding of time series

Takens’ theorem forms one of the theoretical foundations for the analysis of time series corresponding to nonlinear, deterministic dynamical systems [Takens1981]. It basically states that in general it is possible to obtain an embedding of the attractor of a deterministic dynamical system from one-dimensional measurements of the system’s evolution in time. An embedding is a smooth map between the manifolds and that diffeomorphically maps to .

Specifically, assume that the state of a system is described for any time by a point on an -dimensional manifold . The flow for this system is given by a map which describes the evolution of the state for any time . In reality, we typically do not have access to , but rather have measurements of via an observation function . The observation function has a time evolution , and in practice it is often a one-dimensional, discrete and equi-spaced time series of the form .

Although the state can lie in a higher dimension, the time series is one-dimensional. Nevertheless, Takens’ theorem states that by fixing an embedding dimension , where is the dimension of a compact manifold , and a time lag , then the map given by

 Φϕ,β =(β(x),β(ϕ(x)),…,β(ϕd−1(x))) =(β(xt),β(xt+τ,β(xt+2τ,…,β(xt+(d−1)τ))),

is an embedding of , where is the composition of times and is the value of at time .

Theoretically, any time lag can be used if the noise-free data is of infinite precision; however, in practice, the choice of is important in the delay reconstruction. In this paper we use the first minimum of the mutual information function for determining a proper value for [Fraser1986]. Figure 3 shows pictorially the quantities needed to compute the mutual information function for a fixed . Specifically, the range of the data values is divided into equi-spaced bins with resolution and a specific delay value

is chosen and fixed. The joint probability

of finding point in the th bin and point in the th bin is then computed by counting the number of points laying in the cell indexed by in Fig. 3 and dividing this count by the total number of transitions. In this example, we see that for instance , while the marginal probability density for is given by . Using the probabilities described in Fig. 3, the mutual information function can be obtained according to

 Iϵ(τ)=∑i,jpij(τ)lnpij(τ)−2∑ipilnpi,

where is the natural logarithm function. By plotting for a range of delays , an embedding delay can be chosen by observing the first minimum in . This minimum indicates the first value of at which minimum information is shared between and . We note that the implementation that we describe here is the original implementation described in Kennel et al. [Kennel1992]; however, an adaptive implementation can be found in Darbellay et al. [Darbellay1999] while an entropy-based approach can be found in Krasokov et al. [Kraskov2004].

The other component in Takens’ embedding is the embedding dimension , which must be large enough to unfold the attractor. If this dimension in not sufficient, then some points can falsely appear to be neighbors at a smaller dimension due to the projection of the attractor onto a lower dimension. One of the classical time series analysis tools for finding a proper embedding dimension is the method of false nearest neighbors [Kennel1992]. In this method, it is assumed that an appropriate embedding dimension is given by . Any embedding dimension is therefore a projection from into a lower dimensional space. Consequently, some of the coordinates are lost in this projection and points that are not neighbors in appear to be neighbors in . The idea is therefore to embed the time series in spaces with increasingly larger dimension while keeping track of false neighbors in successive embeddings, i.e., points that appear to be neighbors due to insufficient embedding dimension. If the ratio of the false neighbors falls below a certain threshold at some embedding dimension , then we set .

### 2.4 Ordinal Partition Graph

The ordinal partition graph [Small2013, McCullough2015] is another method for constructing a graph from a time series. Using a sufficiently sampled time series, an ordered list of permutations is collected by first finding a set of

-dimensional embedded vectors

using a similar algorithm to Takens’ embedding. More specifically, the set of -dimensional vectors use an embedding delay and motif dimension . An example of this embedding is shown in Fig. 4 (a) with and . These parameters were chosen as they simplify the demonstration and visualization of the method. However, to automate the method, we suggest that both and are selected by using permutation entropy (PE) as defined by Bandt and Pompe [Bandt2002]. The specific method for selecting and is explained in Section 3. By applying over the entire length of the time series, a sequence of vectors is generated.

For each vector , the associated permutation is the permutation of the set that satisfies . Each vector in is translated into its associated permutation symbol to generate a sequence of permutations , where . An example of this process is shown for the first three vectors in Fig. 4 (b). Finally, using the array of permutations , a directional network is formed by transitioning from one permutation, represented by the graph in Fig. 4-(c), to another in the sequential order. If we want to build an unweighted version of this graph, we include the edge if there is a transition from one permutation to the next. If we want this graph to be weighted, we set to be the number of times this transition occurs.

### 2.5 Simplicial complexes

A simplicial complex can be thought of as a generalization of the concept of a graph to higher dimensions. Given a vertex set , a simplex is simply a collection of vertices. The dimension of a simplex is . The simplex is a face of , denoted if . A simplicial complex is a collection of simplices such that if and , then . Equivalently stated, is a collection of simplices which is closed under the face relation. The dimension of a simplicial complex is the largest dimension of its simplices, . The -skeleton of a simplicial complex is all simplices of with dimension at most , .

Given a graph , we can construct the clique complex

 K(G)={σ⊆V∣uv∈E for all u≠v∈σ}.

This is sometimes called the flag complex. The clique complex of the complete graph on vertices is called the complete simplicial complex on vertices.

A filtration is a collection of nested simplicial complexes

 K1⊆K2⊆⋯⊆KN.

See the bottom row of Fig. 5 for an example of a filtration. A weighted graph gives rise to a filtration we will make use of extensively. Given a weighted graph and , we set

 Ka={σ∈K(G)∣ω(uv)≤a for all u≠v∈σ}.

Since for , this can be viewed as a filtration

 Ka1⊆Ka2⊆⋯⊆KaN

for any collection .

In particular, for this paper, we will build a filtration from an unweighted graph by the following procedure. First, construct the pairwise distance matrix for the vertices of using shortest paths. This can be viewed as a weighting on the complete graph with the same vertex set as . Thus, it induces a filtration on the complete simplicial complex where the 1-skeleton of includes edges between any pair of vertices and for which . See Fig. 5 for an example.

### 2.6 Homology

Traditional homology [Hatcher, Munkres2] counts the number of structures of a particular dimension in a given topological space, which in our context will be a simplicial complex. In this context, the structures measured can be connected components (0-dimensional structure), loops (1-dimensional structure), voids (2-dimensional structure), and higher dimensional analogues as needed.

For the purposes of this paper, we will only ever need 0- and 1-dimensional persistent homology so we provide the background necessary in these contexts. Further, as a note for the expert, we always assume homology with coefficients which removes the need to be careful about orientation.

We start by describing homology. Assume we are given a simplicial complex . Say the -dimensional simplices in are denoted . A -dimensional chain is a formal sum of the -dimensional simplices . We assume the coefficients and addition is performed mod 2. For two chains and , . The collection of all -dimensional chains forms a vector space denoted . The boundary of a given -simplex is

 ∂d(σ)=∑τ≺σ,dim(τ)=d−1τ.

That is, it is the formal sum of the simplices of exactly one lower dimension. If , that is, if is a vertex, then we set . The boundary operator is given by

 ∂d(α)=∂d(ℓ∑i=1aiσi)=∑ai∂d(σi).

A -chain is a cycle if ; it is a boundary if there is a -chain such that . The group of -dimensional cycles is denoted ; the boundaries are denoted .

In particular, any -chain is a -cycle since for any . A -chain is a -cycle iff the 1-simplices (i.e., edges) with a coefficient of 1 form a closed loop. It is a fundamental exercise in homology to see that and therefore that . The -dimensional homology group is . An element of is called a homology class and is denoted for where . We say that the class is represented by , but note that any element of can be used as a representative so this choice is by no means unique.

In the particular case of 0-dimensional homology, there is a unique class in for each connected component of . For -dimensional homology, we have one homology class for each “hole” in the complex.

### 2.7 Persistent homology

Persistent homology is a tool from topological data analysis which can be used to quantify the shape of data. The main idea behind persistent homology is to watch how the homology changes over the course of a given filtration.

Fix a dimension . First, note that if we have an inclusion of one simplicial complex to another , we have an obvious inclusion map on the -chains by simply viewing any chain in the small complex as one in the larger. Less obviously, this extends to a map on homology by sending to the class in with the same representative. That this is well defined is a non-trivial exercise in the definitions [Munkres2].

Given a filtration

 K1⊆K2⊆⋯⊆KN

we have a sequence of maps on the homology

 Hd(K1)→Hd(K2)→⋯→Hd(KN).

A class is said to be born at if it is not in the image of the map . The same class dies at if in but in .

Given all this information, we construct a persistence diagram as follows. For each class that is born at and dies at , we add a point in at . For this reason, we often write a persistence diagram by its collection of off-diagonal points, . See the top right of Fig. 5 for an example. Note that the farther a point is from the diagonal, the longer that class persisted in the filtration, which signifies large scale structure. The lifetime or persistence of a point in a persistence diagram is given by .

Note that it is possible to have multiple points in a single diagram of the same form if there are multiple classes that are born at and die at . In this case, we sometimes employ histograms or annotations to emphasize that a single point seen in a diagram is actually multiple points overlaid.

Further, any filtration has a persistence diagram for each dimension . In this paper, when we wish to emphasize dimension, we write the diagram as ; here we will only use the 1-dimensional diagram.

### 2.8 A first example

Here, we begin to construct the pipeline we will use via an example shown in Fig. 5. We start with a graph with as shown in the top left. We can construct a distance between each pair of vertices using the path length. Then, a filtration on the full simplicial complex with vertices is constructed using the clique complex method described at the end of Section 2.5. Finally, the 1-dimensional persistence diagram is given at the top right. The existence of two points in the persistence diagram means that two circular structures existed over the course of the filtration. The first is the large left loop that can be seen in and persists until it gets filled in at . This is represented by the point in the diagram. The other smaller loop is the right loop that appears in , but is filled in before . This is represented by the point in the diagram.

### 2.9 Point summaries of persistence diagrams

A common issue with persistence diagrams is that they are notoriously difficult to work with as a summary of data. While they are quantitative in nature, determining differences in structure such as “has a point far from the diagonal” is often a qualitative procedure. Metrics for persistence diagrams exist, namely the bottleneck and -Wasserstein333

This metric is closely related to but not the same as the eponymous metric from probability theory.

distances, however these objects are not particularly easy to work with in a statistical or machine learning context. Thus, we will pass to working with the simplest of featurizations, namely point summaries of a given diagram, which we also call scores.

#### Maximum persistence

The first very simple but extremely useful point summary is maximum persistence. Given a persistence diagram , the maximum persistence is simply

 maxpers(D)=maxx∈Dpers(x).

In the example of Fig. 5 with , we have . While this is obviously a very lossy point summary for a persistence diagram, it is quite useful in that, particularly for applications where the existence of a large circle is of interest, it often does what we need. See, e.g., [Khasawneh2015, Tymochko2019].

#### Periodicity Score

Next, we set out to build a point summary which we can use to measure the similarity of our weighted graph to a cycle graph which is independent of the number of nodes. If is an unweighted cycle graph with vertices, then following the procedure of Section 2.8 using the shortest path metric, we have that there is exactly one cycle which is born at 1, and fills in at . See the examples of Fig. 6. This means the persistence diagram has exactly one point , and so we denote the maximum persistence of this diagram as

 Ln=maxpers(D′)=⌈n3⌉−1.

Then, assume we are given another unweighted graph with and persistence diagram . We define the network periodicity score

 P(D)=1−maxpers(D)Ln. (1)

This score is an extension of the periodicity score in [Perea2015] to unweighted networks, and it has the property that , with iff the input graph is a cycle graph. In the example of Fig. 5, we have , so .

#### The ratio of the number of homology classes to the graph order

The next point summary we define is

 M(D)=|D||V|, (2)

which is the reciprocal of the ratio between the number of vertices in the network , i.e., the order of the graph, and the number of classes in the persistence diagram . In the example of Fig. 5, this is .

We can think of this number as an approximation of the reciprocal of the number of vertices in each class, however, this is only an approximation because some classes in -D persistence diagram may share vertices in the network. Note that for a network with nodes, the 0-dimensional persistence diagram will always have points, and so this metric is not particularly useful. In this paper, we only use this summary for -dimensional persistence diagrams.

The logic behind this heuristic is that for a periodic signal we would expect to see a small number of -D homology classes in comparison to a chaotic time series. Therefore, for two networks of similar order but with different dynamic behavior, i.e., one is chaotic and one is periodic, the ratio for the periodic time series will be smaller than its chaotic counterpart.

#### Normalized Persistent Entropy

Persistent entropy is a method for calculating the entropy from the lifetimes of the points in a persistence diagram, inspired by Shannon entropy. This summary function, first given by Chintakunta et al. [Chintakunta2015], is defined as

 E(D)=−∑x∈Dpers(x)L(D)log2(pers(x)L(D)), (3)

where is the sum of lifetimes of points in the diagram. In the example of Fig. 5 and Section 2.8, , so . Thus for this example, .

However, we cannot easily compare this value across different diagrams with different numbers of points. To deal with this issue, we provide the following normalization heuristic. Specifically, we normalize as

 E′(D)=E(D)log2(L(D)). (4)

This normalization allows for an accurate measurement of the entropy even when there are few significant lifetimes. Returning to the example of Fig. 5, .

## 3 Method

In this section, we discuss the specifics of the method studied for turning a time series into a persistence diagram following Fig. 2.

We have two initial choices for how to turn a time series into a network. In the case of the Takens’ embedding, we determine the embedding dimension using false nearest neighbors [Kennel1992], and determine the lag using the mutual information function [Fraser1986]. We then construct the -NN graph for these points. Following Khor and Small [Khor2016], we use .

The second method for constructing a network is the ordinal partition network. As mentioned in Section 2.4, this also requires a decision of dimension and lag, which we determine following Reidl et al. [Riedl2013]. Specifically, we initially fix , and plot the permutation entropy , where is the probability of a permutation , for a range of values. In the resulting versus curve we choose the value of at the location of the first prominent peak as the lag parameter. The dimension is used because it was shown in [Riedl2013] that the first peak of occurs at approximately the same value of independent of the dimension for  [Riedl2013]. The logic behind this approach is that when the time series points are strongly correlated due to the insufficient unfolding of the trajectories, only few regions of the state space are visited resulting in low values for . As is increased, increases and it reaches a maximum when the trajectory unfolding leads to the appearance of a large subset of the possible motifs. We only include vertices in the graph for permutations which have been visited, which keeps us from needing to work with the full set of which quickly becomes computationally intractable.

Using the identified delay at the first maximum of , we then define the permutation entropy per symbol

 h′(d)=1d−1H(d), (5)

where we make a free parameter that we are seeking to determine. The dimension for the ordinal partition network is obtained by plotting for and choosing the value of that maximizes .

Once the graph is constructed, we compute shortest paths using all_pairs_shortest_path_length from the python NetworkX package. Finally, we compute the persistence diagram using the python wrapper Scikit-TDA [scikittda2019] for the software package Ripser [ripser].

### 3.1 Rössler System Example

We demonstrate the method on the Rössler system and the ordinal partition network representation. The Rössler system is defined as

 dxdt=−y−x,dydt=x+ay,dzdt=b+z(x−c). (6)

Equation (6) was solved at a rate of 20 Hz for 1000 seconds with parameters , , and , which results in a 3-period, periodic response. Only the last 200 seconds (see Fig. 7A) are used to avoid the transients.

We form a permutation sequence from the time series using a time delay and dimension , which were found using MPE as described in Section 3. The resulting permutation sequence is shown in Fig. 7B. Next, we form the unweighted ordinal partition network shown in Fig. 7C. Note that this graph is drawn using the electrical-spring layout function provided by NetworkX since the permutations do not have a natural embedding into Euclidean space. Using the network, we build the distance matrix in Fig. 7D. Finally, by applying persistent homology to the distance matrix, we obtain the persistence diagram in Fig. 7E. However, Fig. 7E does not show the possibility of point multiplicity in the persistence diagram. To demonstrate this occurrence we utilize a histogram of the number of classes at each lifetime as shown in Fig 7F. This shows there are actually two points in the persistence diagram with lifetime 1. The point summaries described in Section 2.9 are calculated as , , and .

## 4 Results

This section compares the persistence-based point summaries and the standard network scores, and illustrates the ability of these scores to detect dynamic state changes. Specifically, we compare the point summaries , , and to some commonly used network quantitative characteristics such as the mean out degree

, the out degree variance

, and the number of vertices . These comparisons are shown in Section 4.1 for a family of trajectories from the Rössler system, while Section 4.2 tabulates the different scores for a variety of dynamical systems. In Section 4.3 we contrast the noise robustness of our approach to the standard network scores for ordinal partition networks.

### 4.1 Dynamic State Change Detection on the Rössler System

Letting the parameter in Eq. 6 vary in the range in steps of and setting and , we obtain time series of length seconds for the state . We only retain the last seconds of the simulation to allow the trajectory to settle on an attractor. For the construction of the corresponding -NN networks, we sample the time series at Hz in order to capture a sufficient number of oscillations while avoiding overly large point clouds for computing persistence. For the Takens’ embedding we use the mutual information function approach and the nearest neighbor method, respectively, to choose the parameters and .

For constructing the ordinal partition networks we go with the higher sampling frequency of Hz, and we use MPE to select and . We found that a higher sampling rate for ordinal partition networks and the resulting longer time series is not an issue due to the maximum number of vertices not being dependent on the length of the time series, but rather on the motif dimension . Furthermore, for these networks a higher sampling rate tends to improve the detection of periodic and chaotic time series.

The resulting point summaries were found for both ordinal partition networks (Fig. 8A) and -NN of Takens’ embedding networks (Fig. 8B). The top two graphs in Fig. 8 show the bifurcation diagram depicting the local extrema of and the Lyapunov exponent [Benettin1980], respectively. The shaded regions in the figures represent periodic windows while the unshaded regions represent chaotic response. These windows were identified by investigating the bifurcation digram and the Lyapunov exponent plots.

For the ordinal networks, Figure 8A shows a significant drop in all six scores for the large periodic window corresponding to . There are also less pronounced drops in these scores for the other shorter periodic windows. These drops are especially evident for , , and where the scores significantly decrease in comparison to their surrounding values. However, some scores such as are not normalized, e.g., so that . Given one time series, and not a parameterized set of series, this makes it difficult or even impossible to distinguish between chaotic and periodic regions. On the other hand, the normalized scores that we introduce in this paper, and , suggest periodic regions when and . It should be noted that the difference between chaotic and periodic regions, as shown in Section 4.3, starts degrading as noise levels are increased.

For the -NN Takens’ embedding networks, Figure 8B shows a significant drop in , , and during periodic windows. However, for the traditional graph scores and this drop does not clearly correspond to the beginning and end of the periodic window. Further, for the smaller periodic windows interspersed with the chaotic regions we found that , , and are too noisy to identity the dynamic state changes in these areas. In contrast, our scores and retain the ability to distinguish between dynamics regimes, and for -NN networks of Takens’ embedding we suggest tagging the time series as periodic when and .

### 4.2 Tabulated Results

This section uses a variety of dynamical systems to validate the observations we made for the Rössler system in Section 4.1 related to the point summaries , , and that we introduced in Section 2.9. The results for each system when using ordinal partition networks and the -NN network from Takens’ embedding are provided side by side in Table 1. The model and time series information for all of these systems are provided in Appendix A. The table can be categorized into three types of dynamical systems: (1) systems of differential equations (Chua circuit, Lorenz, Rössler, coupled Lorenz-Rössler, bi-directional Rössler, and Mackey-Glass equations), (2) discrete-time dynamical systems (Logistic map, and Hénon map), and (3) ECG and EEG signals. The paragraphs below discuss the results for each one of these systems.

#### Systems of differential Equations:

As shown in Table 1, our point summaries from both networks yield distinguishable differences between periodic and chaotic time series. The -NN graph results in Table 1 show that periodic time series have , , and . Similarly, the ordinal partition graph scores in Table 1 show that periodic time series have , , and .

#### Discrete dynamical systems:

The results for the discrete dynamical equations in Table 1 show distinguishable differences between periodic maps in comparison to chaotic maps when using ordinal partition networks. Takens’ embedding was not applied to the discrete dynamical systems, and only the ordinal partition network results are reported here because working with these networks is more natural for maps.

#### EEG and ECG Results:

The point summary results from real world data sets (ECG and EEG) shown in Table 1 have inherent noise, which causes the differences between the compared states to be less significant as shown in Fig. 9. The -NN graph results in Table 1 do not show a significant difference between the two groups for either ECG and EEG data. This is most likely due to the sensitivity of Takens’ embedding to noise and perturbations. However, we did find a difference between epileptic and healthy patients through the networks formed by ordinal partitions for ECG [Moody1992] and EEG [Andrzejak2001] data. Section 4.3 discusses the effect of additive noise on the point summaries in more detail. As a note, there have been other methods for characterizing EEG data using TDA and persistent entropy [Piangerelli2016], but our method is different from prior works because we apply persistent homology to the generated networks.

### 4.3 Effects of Additive Noise

In this section we investigate the noise robustness of the point summaries in comparison to some common network parameters—mean out degree , out degree variance , and the number of vertices . The ordinal partition networks are based on time series from the Rössler system defined in Eq. (6) with , , and either or for a periodic or chaotic response, respectively.

To make comparisons on the noise robustness we add Gaussian noise to the signal and calculate the point summaries and network parameters at various Signal-to-Noise Ratios (SNR) for both periodic and chaotic Rössler systems. The chosen SNR values were all the integers from to , and at each SNR value we obtain realizations of noisy signals.

To determine the confidence interval at each SNR, we repeat the calculation of the point summaries and network parameters for all noise realizations at each SNR level, and we set our confidence interval to where and

are the sample average and sample standard deviation, respectively, at a specific SNR value. Figure

9 shows the mean values and confidence intervals for each SNR. To assess the ability of point summaries to assign a distinguishing score to a periodic versus a chaotic system in the presence of noise, we check for an overlap in the confidence intervals for the periodic and chaotic results at each SNR. If for a particular point summary there is an overlap between the scores for periodic and the chaotic time series, then that point summary is not effective in distinguishing the dynamics at that specific SNR. Table 2 summarizes the noise robustness by providing the lowest SNR at which each point summary and network parameter no longer has an overlap between the periodic and chaotic confidence intervals. This result shows a lower capable SNR for the persistence based point summaries than the mean out degree and variance . Another trend that should be noted is the reduction in difference between periodic and chaotic time series for high levels of noise. This should be taken into account when applying the point summaries to real world data with intrinsic noise.

## 5 Conclusions

In this paper we develop a new framework for time series analysis using TDA. We investigate two methods for embedding a time series into an unweighted graph: (1) utilizing standard Takens’ theorem techniques, then building a -NN graph; and (2) using ordinal partition networks to turn (visited) parts of the state-space into symbols, and obtaining a graph by tracking sequential transitions between these subspaces. We then describe how to obtain the -D persistence diagram corresponding to the graph by defining a filtration on the full simplicial complex using the pairwise distances between the graph vertices. The obtained persistence diagram then allows the application of tools from TDA to gain insights into the system’s underlying dynamics. Specifically, a graph embedding of a periodic time series is long connected network loops, while a chaotic time series has many short loops. These characteristics allow persistent homology to accurately distinguish periodic and chaotic time series by measuring the shape of the networks.

In addition to describing this novel approach for time series analysis, another contribution of this work is the introduction of new point summaries for extracting information about the dynamic state (periodic or chaotic) from time series measurements. Specifically, we extend the periodicity score , which was defined on in Ref. [Perea2015], to abstract graph spaces. We also define a heuristic which represents an approximation of the ratio of the number of homology classes to the graph order. The last point summary we define is a normalized version of the persistence entropy [Chintakunta2015].

We found that these point summaries outperform standard graph scores, see Fig. 8. Specifically, our point summaries are more capable of distinguishing shifts in the dynamic behavior than their traditional graph scores counterpart. Further our point summaries, especially the two normalized scores and , enable making inferences about the dynamic behavior from isolated time series, as opposed to tracking changes in the scores of parameterized time series some of which belong to a known dynamic regime. For example, applying our point summaries to ordinal partition networks from a variety of dynamical systems in Table 1, we found that that periodic time series typically have , , and . Similarly, using the networks obtained from -NN embedding shows that periodic time series have , , and , see Table 1. However, for both discrete dynamical systems as well as ECG and EEG data, only the persistent homology of the ordinal partition network was able to distinguish between the two data sets. Additionally, we showed in Fig. 9 that the point summaries of the ordinal partition networks are noise robust down to an SNR of approximately with additive Gaussian noise.

## Appendix A Dynamical System Examples

### a.1 Chua Circuit

The Chua Circuit used is defined as

 dxdt=α(y−x−f(x)),dydt=γ(x−y+z),dzdt=−βy, (7)

where is defined as , with and . Additionally, the Chua circuit had a sampling rate of 50 Hz with parameters , , and for a periodic response or for a chaotic response. This system was solved for 500 seconds and the last 100 seconds were used. The generated time series were downsampled to 7 Hz for -NN networks.

### a.2 Lorenz System

The Lorenz system used is defined as

 dxdt=σ(y−x),dydt=x(ρ−z)−y,dzdt=xy−βz. (8)

The Lorenz system had a sampling rate of 100 Hz with parameters , , and for a periodic response or for a chaotic response. This system was solved for 100 seconds and the last 24 seconds were used. The generated time series were downsampled to 35 Hz for -NN networks.

### a.3 Coupled Lorenz-Rössler System

The coupled Lorenz-Rössler system used is defined as

 dx1dt=−y1−z1+k1(x2−x1),dy1dt=x1+a2y1+k2(y2−y1),dz1dt=b2+z1(x1−c2)+k3(z2−z1),dx2dt=σ(y2−x2),dy2dt=λx2−y2−x2z2,dz2dt=x2y2−b1z2, (9)

with , , , , , , , , and for a periodic response or for a chaotic response. This was solved for 400 seconds with a sampling rate of 50 Hz. Only the last 200 seconds of the x-solution were used in the analysis. The generated time series were downsampled to 2 Hz for -NN networks.

### a.4 Bi-Directional Coupled Rössler System

The Bi-directional Rössler system is defined as

 dx1dt=−w1y1−z1+k(x2−x1),dy1dt=w1x1+0.165y1,dz1dt=0.2+z1(x1−10),dx2dt=−w2y2−z2+k(x1−x2),dy2dt=w2x2+0.165y2,dz2dt=0.2+z2(x2−10), (10)

with , , and for a periodic response or for a chaotic response. This was solved over 4000 seconds with a sampling rate of 10 Hz. Only the last 400 seconds of the x-solution were used in the analysis. The generated time series were downsampled to 1 Hz for -NN networks. The generated time series were downsampled to 2 Hz for -NN networks.

### a.5 Mackey-Glass Delayed Differential Equation

The Mackey-Glass Delayed Differential Equation is defined as

 x(t)=−γx(t)+βx(t−τ)1+x(t−τ)n (11)

with , , , and for a periodic response or for a chaotic response. This was solved for 400 seconds with a sampling rate of 100 Hz. Only the last 300 seconds of the x-solution were used in the analysis. The generated time series were downsampled to 2.5 Hz for -NN networks and 25 Hz for ordinal partition networks.

### a.6 EEG Data

The EEG signal was taken from andrzejak et al. [Andrzejak2001]. More specifically, the first 2000 data points from the EEG data of a healthy patient from set A, file Z-093 was used as the periodic series, and the first 2000 data points from the EEG data of a patient during active seizures from set E, file S-056 was used as the chaotic series. The generated time series were downsampled to 80 Hz for -NN networks.

### a.7 ECG Data

The Electrocardoagram (ECG) data was taken from SciPy’s misc.electrocardiogram data set. This ECG data was originally provided by the MIT-BIH Arrhythmia Database [Moody1992]. We used data points 3000 to 4500 during normal sinus rhythm as the periodic time series and data points 8500 to 10000 during ventricular contractions as the chaotic time series.

### a.8 Logistic Map

The logistic map was generated as

 xn+1=rxn(1−xn), (12)

with and for periodic results or for chaotic results. Equation 12 solved for the first 500 data points.

### a.9 Hénon Map

The Hénon map was solved as

 xn+1=1−ax2n+yn,yn+1=bxn, (13)

where , , , and for a periodic response and for a chaotic response. This system was solved for the first 500 data points of the x-solution.