# The persistence landscape and some of its properties

Persistence landscapes map persistence diagrams into a function space, which may often be taken to be a Banach space or even a Hilbert space. In the latter case, it is a feature map and there is an associated kernel. The main advantage of this summary is that it allows one to apply tools from statistics and machine learning. Furthermore, the mapping from persistence diagrams to persistence landscapes is stable and invertible. We introduce a weighted version of the persistence landscape and define a one-parameter family of Poisson-weighted persistence landscape kernels that may be useful for learning. We also demonstrate some additional properties of the persistence landscape. First, the persistence landscape may be viewed as a tropical rational function. Second, in many cases it is possible to exactly reconstruct all of the component persistence diagrams from an average persistence landscape. It follows that the persistence landscape kernel is characteristic for certain generic empirical measures. Finally, the persistence landscape distance may be arbitrarily small compared to the interleaving distance.


## 1. Introduction

A central tool in topological data analysis is persistent homology [36, 64] which summarizes geometric and topological information in data using a persistence diagram (or equivalently, a bar code).

For topological data analysis, one wants to subsequently perform statistics and machine learning. There are some approaches to doing so directly with persistence diagrams [17, 15, 10, 51, 58]. However, using the standard metrics for persistence diagrams (bottleneck distance and Wasserstein distance) it is difficult to even perform such a basic statistical operation as averaging [60, 52].

The modern approach to alleviating these difficulties and to permit the easy application of statistical and machine learning methods is to map persistence diagrams to a Hilbert space. One way to do so is the persistence landscape [14]. It has the advantages of being invertible, so it does not lose any information, having stability properties, and being parameter-free and nonlinear (see Section 2.2).

The persistence landscape may be efficiently computed either exactly or using a discrete approximation [16]. Since it loses no information (or little information in the case of the discrete approximation) it can be a large representation of the persistence diagram. Nevertheless, subsequent statistical and machine learning computations are fast, which has allowed a wide variety of applications. These include the study of: electroencephalographic signals [62, 63], protein binding [43], microstructure analysis [34, 35], swarm behavior [31], nanoporous materials [46, 47], fMRI data [59, 7], coupled oscillators [59], brain geometry [39, 40], detecting financial crashes [41], shape analysis [53], histology images [28], music audio signals [49], and the microbiome [54].

In this paper we introduce a weighted version of the persistence landscape (Section 3). In some applications it has been observed that it is not the longest bars that are the most relevant, but those of intermediate length [6, 53]. The addition of a weighting allows one to tune the persistence landscape to emphasize the feature scales of greatest interest. Since arbitrary weights allow perhaps too much flexibility, we introduce the Poisson-weighted persistence landscape kernel, which has one degree of freedom.

Next we show that persistence landscapes are highly compatible with Kalisnik’s tropical rational function approach to summarizing persistent homology [42]. In fact, we show that persistence landscapes are tropical rational functions (Section 4).

In the most technical part of the paper (Section 5), we prove that for certain finite sets of persistence diagrams, it is possible to recover these persistence diagrams exactly from their average persistence landscape (Theorem 5.10). Furthermore, we show that this situation is in some sense generic (Theorem 5.16). This implies that the persistence landscape kernel is characteristic for certain generic empirical measures (Theorem 5.11).

It is known that the distance between the two persistence landscapes associated to two persistence diagrams is upper bounded by the corresponding bottleneck distance [14, Theorem 13]. In the other direction, we show that this distance is not lower bounded by some fixed positive scalar multiple of the corresponding bottleneck distance (Section 6).

#### Related work

There are also many other ways to map persistence diagrams to a vector space or Hilbert space. These include the Euler characteristic curve [61], the persistence scale-space map [56], complex vectors [33], pairwise distances [21], silhouettes [25], the longest bars [6], the rank function [57], the affine coordinate ring [2], the persistence weighted Gaussian kernel [44], topological pooling [12], the Hilbert sphere [5], persistence images [1], replicating statistical topology [3], tropical rational functions [42], death vectors [53], persistence intensity functions [26][55, 50], the sliced Wasserstein kernel [20], the smooth Euler characteristic transform [32], the accumulated persistence function [9], the persistence Fisher kernel [45], and persistence paths [27]. Perhaps since the persistence diagram is such a rich invariant, it seems that any reasonable way of encoding it in a vector works fairly well.

#### Outline of the paper

In Section 2 we recall necessary background information. The next three sections contain our main results. In Section 3 we define the weighted persistence landscape and the Poisson-weighted persistence landscape kernel. In Section 4 we show that the persistence landscape may be viewed as a tropical rational function. In Section 5 we show that in a certain generic situation we are able to reconstruct a family of persistence diagrams from their average persistence landscape. From this it follows that the persistence landscape kernel is characteristic for certain generic empirical measures. Finally in Section 6 we show that the landscape distance is not lower bounded by a fixed positive scalar multiple of the bottleneck distance.

## 2. Background

### 2.1. Persistence modules, persistence diagrams, and bar codes

A persistence module [18] consists of a vector space $M(a)$ for each real number $a$, and for each $a \le b$ a linear map $M(a \le b): M(a) \to M(b)$ such that for $a \le b \le c$, $M(b \le c) \circ M(a \le b) = M(a \le c)$. Persistence modules arise in topological data analysis from homology (with coefficients in some field) of a filtered simplicial complex (or a filtered topological space).

In many cases, a persistence module can be completely represented by a collection of intervals called a bar code [30]. Another representation of the bar code is the persistence diagram [29] consisting of pairs $(a_j, b_j)$ which are the end points of the intervals in the bar code.

In computational settings there are always only finitely many points in the persistence diagram and it is usually best to truncate intervals in the bar code that persist until the maximum filtration value at that value. Thus we make the following assumption.

###### Assumption 2.1.

Throughout this paper, we will assume that persistence diagrams consist of finitely many points $(a, b)$ with $-\infty < a < b < \infty$.

One way of measuring distance between persistence modules is the interleaving distance [22]. Similarly, one way of measuring distance between persistence diagrams is the bottleneck distance [29]. The two distances are related by the isometry theorem [22, 48, 18]. These distances induce a topology on the space of persistence modules and the space of persistence diagrams [19].

Sometimes we will consider sequences of persistence diagrams $D_1, \ldots, D_n$ for fixed $n$. When we do so, we will consider this sequence to be a point in the product space of persistence diagrams with the product metric. That is,

$$d((D_1,\ldots,D_n),(D'_1,\ldots,D'_n)) = \max\{d_B(D_1,D'_1),\ldots,d_B(D_n,D'_n)\}. \tag{2.2}$$

This metric induces the product topology.

### 2.2. Persistence landscapes and average persistence landscapes

We give three equivalent definitions of the persistence landscape [14].

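Given a persistence module, $M$, we may define the persistence landscape as the function $\lambda: \mathbb{N} \times \mathbb{R} \to \mathbb{R}$ given by

$$\lambda(k,t) = \sup\{h \ge 0 \mid \operatorname{rank} M(t-h \le t+h) \ge k\}.$$

More concretely, for a bar code, $B = \{I_j\}$, we can define the persistence landscape by

$$\lambda(k,t) = \sup\{h \ge 0 \mid [t-h,t+h] \subset I_j \text{ for at least } k \text{ distinct } j\}.$$

For a persistence diagram $D = \{(a_i,b_i)\}_{i \in I}$, we can define the persistence landscape as follows. First, for $a < b$, define

$$f_{(a,b)}(t) = \max(0, \min(t-a, b-t)),$$

which is the largest $h \ge 0$ with $[t-h,t+h] \subset [a,b]$, in agreement with the bar code definition above.

Then

$$\lambda(k,t) = \operatorname{kmax}\{f_{(a_i,b_i)}(t)\}_{i \in I},$$

where $\operatorname{kmax}$ denotes the $k$th largest element.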

The persistence landscape may also be considered to be a sequence of functions $\lambda_1, \lambda_2, \ldots : \mathbb{R} \to \mathbb{R}$, where $\lambda_k(t) = \lambda(k,t)$ is called the $k$th persistence landscape function. The function $\lambda_k$ is piecewise linear with slope either $0$, $1$, or $-1$. The critical points of $\lambda_k$ are those values of $t$ at which the slope changes. The set of critical points of the persistence landscape $\lambda$ is the union of the sets of critical points of the functions $\lambda_k$. A persistence landscape may be computed by finding its critical points and also encoded by the sequences of critical points of the persistence landscape functions [16].
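As an illustration, the following minimal Python sketch evaluates $\lambda(k,t)$ directly from a persistence diagram. It uses the bar code definition, in which an interval $[a,b]$ contributes the largest $h \ge 0$ with $[t-h,t+h] \subset [a,b]$, namely $\max(0, \min(t-a, b-t))$. The function names `tent` and `landscape` and the example diagram are hypothetical, and this is not the algorithm of [16].

```python
def tent(a, b, t):
    """Largest h >= 0 with [t - h, t + h] inside [a, b]; 0.0 if t is outside (a, b)."""
    return max(0.0, min(t - a, b - t))


def landscape(diagram, k, t):
    """lambda(k, t): the k-th largest tent value at t, with k counted from 1."""
    values = sorted((tent(a, b, t) for a, b in diagram), reverse=True)
    # Fewer than k points in the diagram means lambda(k, t) = 0.
    return values[k - 1] if k <= len(values) else 0.0


# Hypothetical example diagram with two nested intervals.
diagram = [(0.0, 4.0), (1.0, 3.0)]
print([landscape(diagram, k, 2.0) for k in (1, 2, 3)])
```

Sorting all tent values at every $t$ is the simplest choice for a sketch; an exact computation would instead track the critical points described above.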

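The average persistence landscape [14, 25] of the persistence landscapes $\lambda^{(1)}, \ldots, \lambda^{(N)}$ is given by

$$\bar{\lambda}(k,t) = \frac{1}{N}\sum_{j=1}^{N} \lambda^{(j)}(k,t).$$

We can also consider $\bar{\lambda}$ to be given by a sequence of functions $\bar{\lambda}_k$.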
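The pointwise average can be sketched in a few lines of Python. This is only an illustration under the assumption that each landscape is evaluated from its diagram via the bar code definition; the names `tent`, `landscape` and `average_landscape` are hypothetical.

```python
def tent(a, b, t):
    """Largest h >= 0 with [t - h, t + h] inside [a, b]; 0.0 if t is outside (a, b)."""
    return max(0.0, min(t - a, b - t))


def landscape(diagram, k, t):
    """lambda(k, t): the k-th largest tent value at t, with k counted from 1."""
    values = sorted((tent(a, b, t) for a, b in diagram), reverse=True)
    return values[k - 1] if k <= len(values) else 0.0


def average_landscape(diagrams, k, t):
    """Pointwise mean of lambda^(j)(k, t) over a non-empty list of diagrams."""
    return sum(landscape(d, k, t) for d in diagrams) / len(diagrams)


# Two hypothetical one-point diagrams.
diagrams = [[(0.0, 4.0)], [(0.0, 2.0)]]
print(average_landscape(diagrams, 1, 3.0))
```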

### 2.3. Feature maps and kernels

Let $S$ be a set. A function $F: S \to H$ where $H$ is a Hilbert space is called a feature map. A kernel on $S$ is a symmetric map $K: S \times S \to \mathbb{R}$ such that for every $n$ and all $x_1, \ldots, x_n \in S$ and $a_1, \ldots, a_n \in \mathbb{R}$, $\sum_{i,j=1}^{n} a_i a_j K(x_i, x_j) \ge 0$. A reproducing kernel Hilbert space (RKHS) on a set $S$ is a Hilbert space of real-valued functions on $S$ such that the pointwise evaluation functional is continuous.

Given a feature map $F$ there is an associated kernel given by

$$K(x,y) = \langle F(x), F(y) \rangle_H.$$

Given a kernel, $K$, there is an associated reproducing kernel Hilbert space (RKHS), $H_K$, which is the completion of the span of the functions $K_x: S \to \mathbb{R}$ given by $K_x(y) = K(x,y)$, for all $x \in S$, with respect to the inner product given by $\langle K_x, K_y \rangle = K(x,y)$.

Now assume that we have a $\sigma$-algebra on $S$. One can map measures on $S$ to $H_K$ via the map $\mu \mapsto \int_S K_x \, d\mu(x)$ (when this is well defined). This map is called the kernel mean embedding. Let $\mathcal{M}$ be a set of measures on $S$. The kernel $K$ is said to be characteristic over $\mathcal{M}$ if the kernel mean embedding is injective on $\mathcal{M}$.

### 2.4. Properties of the persistence landscape

We recall some established properties of the persistence landscape.

#### 2.4.1. Invertibility

The following is given informally in [14, Section 2.3]. It is proved more formally and precisely in [8] where it is shown that the critical points of the persistence landscapes are obtained from a graded version of the rank function via Möbius inversion.

###### Theorem 2.3.

The mapping from persistence diagrams to persistence landscapes is invertible.

#### 2.4.2. Stability

The persistence landscape is stable in the following sense.

###### Theorem 2.4 ([14, Theorem 13]).

Let $D$ and $D'$ be two persistence diagrams and let $\lambda$ and $\lambda'$ be their persistence landscapes. Then for all $k$ and all $t$,

$$|\lambda_k(t) - \lambda'_k(t)| \le d_B(D,D'),$$

where $d_B$ denotes the bottleneck distance.

More generally, we have the following.

###### Theorem 2.5 ([14, Theorem 17]).

Let $M$ and $M'$ be two persistence modules and let $\lambda$ and $\lambda'$ be their persistence landscapes. Then for all $k$ and all $t$,

$$|\lambda_k(t) - \lambda'_k(t)| \le d_i(M,M'),$$

where $d_i$ denotes the interleaving distance.

As a special case of Theorem 2.4, we have the following.

###### Corollary 2.6.

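Given persistence diagrams $D = \{(a_j,b_j)\}_{j=1}^{n}$ and $D' = \{(a'_j,b'_j)\}_{j=1}^{n}$, let $\lambda$ and $\lambda'$ be the associated persistence landscapes. Then

$$\|\lambda - \lambda'\|_\infty \le \|(a_1,b_1,\ldots,a_n,b_n) - (a'_1,b'_1,\ldots,a'_n,b'_n)\|_\infty.$$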

In [23] it is shown that the average persistence landscape is stable.

#### 2.4.3. The persistence landscape kernel

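Since the persistence landscape is a feature map from the set of persistence diagrams to $L^2(\mathbb{N} \times \mathbb{R})$, there is an associated kernel we call the persistence landscape kernel [56], given by

$$K(D^{(1)},D^{(2)}) = \langle \lambda^{(1)}, \lambda^{(2)} \rangle = \sum_k \int \lambda^{(1)}_k \lambda^{(2)}_k = \sum_{k=1}^{\infty} \int_{-\infty}^{\infty} \lambda^{(1)}_k(t)\, \lambda^{(2)}_k(t)\, dt.$$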
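For small diagrams the kernel can be approximated numerically. The sketch below is an illustration only: it assumes non-empty diagrams, evaluates each landscape through the bar code definition, and replaces the integral by a midpoint rule on a uniform grid. The names `tent` and `landscape_kernel` and the `steps` parameter are hypothetical.

```python
def tent(a, b, t):
    """Largest h >= 0 with [t - h, t + h] inside [a, b]; 0.0 if t is outside (a, b)."""
    return max(0.0, min(t - a, b - t))


def landscape_kernel(d1, d2, steps=2000):
    """Midpoint-rule approximation of sum_k integral lambda1_k(t) * lambda2_k(t) dt."""
    points = list(d1) + list(d2)
    lo = min(a for a, _ in points)   # both landscapes vanish outside [lo, hi]
    hi = max(b for _, b in points)
    dt = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * dt
        v1 = sorted((tent(a, b, t) for a, b in d1), reverse=True)
        v2 = sorted((tent(a, b, t) for a, b in d2), reverse=True)
        # zip stops at the shorter list; beyond it one factor is zero anyway.
        total += sum(x * y for x, y in zip(v1, v2)) * dt
    return total


# For the single interval [0, 2] the exact value is 2/3.
print(landscape_kernel([(0.0, 2.0)], [(0.0, 2.0)]))
```

Because the landscapes are piecewise linear, the products are piecewise quadratic and an exact evaluation from the critical points is also possible; the grid is used here only to keep the sketch short.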

#### 2.4.4. The persistence landscapes and parameters

One advantage of the persistence landscape is that its definition involves no parameters. So there is no need for tuning and no risk of overfitting.

#### 2.4.5. Nonlinearity of persistence landscapes

Another important advantage of the persistence landscape for statistics and machine learning is its nonlinearity. Call a summary of persistence diagrams in a vector space linear if for two persistence diagrams and , . The persistence landscape is highly non-linear.

#### 2.4.6. Computability of the persistence landscape

There are fast algorithms and software for computing the persistence landscape [16]. In practice, computing the persistence diagram seems to always be slower than computing the associated persistence landscape. The methods are also available in an R package [13].

#### 2.4.7. Convergence results for the persistence landscape

From the point of view of statistics, we assume that data has been obtained by sampling from a random variable. Applying our persistent homology constructions, we obtain a random persistence landscape.

This is a Banach space valued random variable. Assume that its norm has finite expectation and variance. If we take an (infinite) sequence of samples from this random variable then the average landscapes converge (almost surely) to the expected value of the random variable [14, Theorem 9]. This is known as a (strong) law of large numbers.

Now if we consider the difference between the average landscapes and the expectation (suitably normalized), it converges pointwise to a Gaussian random variable [14, Theorem 10]. This result was extended in [25] to prove uniform convergence. These are central limit theorems.

#### 2.4.8. Confidence bands for the persistence landscape

The bootstrap can be used to compute confidence bands [24] and adaptive confidence bands [25] for the persistence landscape. There is an R package that has implemented these computations [37].

#### 2.4.9. Subsampling and the average persistence landscape

A useful and powerful method in large data settings is to subsample many times and compute the average persistence landscape [23, 53]. In [23] it is shown that this average persistence landscape is stable and that it converges.

### 2.5. Tropical rational functions

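The max-plus algebra is the semiring over $\mathbb{R} \cup \{-\infty\}$ with the binary operations given by

$$x \oplus y = \max(x,y), \qquad x \odot y = x + y.$$

If $x_1, \ldots, x_n$ are variables representing elements in the max-plus algebra, then a product of these variables (with repetition allowed) is a max-plus monomial.

$$x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n} = x_1^{a_1} \odot x_2^{a_2} \odot \cdots \odot x_n^{a_n}$$

A max-plus polynomial is a finite linear combination of max-plus monomials.

$$p(x_1,\ldots,x_n) = a_1 \odot x_1^{a_{11}} x_2^{a_{12}} \cdots x_n^{a_{1n}} \oplus a_2 \odot x_1^{a_{21}} x_2^{a_{22}} \cdots x_n^{a_{2n}} \oplus \cdots \oplus a_m \odot x_1^{a_{m1}} x_2^{a_{m2}} \cdots x_n^{a_{mn}}$$

We also call this a tropical polynomial. A tropical rational function [42] is a quotient $p \odot q^{-1}$ where $p$ and $q$ are tropical polynomials. Note that if $r$ and $s$ are tropical rational functions, then so is $r \odot s^{-1}$.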
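In ordinary arithmetic, a max-plus monomial is a coefficient plus a weighted sum of the variables, a max-plus polynomial is a maximum of such terms, and a tropical quotient is a difference. The following minimal Python sketch makes this concrete. Representing a polynomial as a list of `(coefficient, exponents)` pairs is a hypothetical encoding chosen for illustration, as are the function names.

```python
def maxplus_monomial(coeff, exponents, x):
    """coeff (.) x_1^e_1 (.) ... (.) x_n^e_n, i.e. coeff + sum(e_i * x_i)."""
    return coeff + sum(e * xi for e, xi in zip(exponents, x))


def maxplus_polynomial(terms, x):
    """Tropical sum of monomials, i.e. the maximum over all terms."""
    return max(maxplus_monomial(c, e, x) for c, e in terms)


def tropical_rational(p_terms, q_terms, x):
    """p (.) q^{-1}, i.e. the ordinary difference p(x) - q(x)."""
    return maxplus_polynomial(p_terms, x) - maxplus_polynomial(q_terms, x)


# p(x, y) = 1 (.) x^2 (+) 3 (.) y = max(1 + 2x, 3 + y)
p = [(1, (2, 0)), (3, (0, 1))]
# q(x, y) = x (+) y = max(x, y)
q = [(0, (1, 0)), (0, (0, 1))]
print(maxplus_polynomial(p, (2, 0)), tropical_rational(p, q, (2, 0)))
```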

## 3. Weighted persistence landscapes

In this section we introduce a class of norms and kernels for persistence landscapes. As a special case we define a one-parameter family of norms and kernels for persistence landscapes which may be useful for learning algorithms.

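Recall that for real-valued functions on $\mathbb{R}$ we have a $p$-norm for $1 \le p \le \infty$. For persistence landscapes, we have for $1 \le p < \infty$,

$$\|\lambda\|_p = \sum_{k=1}^{\infty} \left[ \int_{-\infty}^{\infty} \lambda_k(t)^p \, dt \right]^{\frac{1}{p}},$$

and for $p = \infty$,

$$\|\lambda\|_\infty = \sup_{k,t} \lambda_k(t).$$

We also have the persistence landscape kernel given by the inner product on $L^2(\mathbb{N} \times \mathbb{R})$,

$$K(D^{(1)},D^{(2)}) = \langle \lambda^{(1)}, \lambda^{(2)} \rangle = \sum_k \int \lambda^{(1)}_k \lambda^{(2)}_k = \sum_{k=1}^{\infty} \int_{-\infty}^{\infty} \lambda^{(1)}_k(t)\, \lambda^{(2)}_k(t)\, dt.$$

We observe that one may use weighted versions of these norms and inner products. That is, given any nonnegative function $w: \mathbb{N} \times \mathbb{R} \to \mathbb{R}$, we have

$$\|\lambda\|_{p,w} = \|w\lambda\|_p,$$

and

$$K_w(D^{(1)},D^{(2)}) = \langle w^{\frac{1}{2}} \lambda^{(1)}, w^{\frac{1}{2}} \lambda^{(2)} \rangle.$$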

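For example, consider the following one-parameter family of kernels,

$$K_\nu(D^{(1)},D^{(2)}) = \sum_{k=1}^{\infty} P_\nu(k-1) \int_{-\infty}^{\infty} \lambda^{(1)}_k(t)\, \lambda^{(2)}_k(t)\, dt,$$

where $P_\nu(k) = \frac{\nu^k e^{-\nu}}{k!}$ is the Poisson distribution with parameter $\nu > 0$. Call this the Poisson-weighted persistence landscape kernel. This additional parameter may be useful for training classifiers using persistence landscapes. It has an associated one-parameter family of norms given by

$$\|\lambda\|_\nu = \left[ \sum_{k=1}^{\infty} P_\nu(k-1) \int_{-\infty}^{\infty} \lambda_k(t)^2 \, dt \right]^{\frac{1}{2}}.$$

Note that the distribution $P_\nu(k)$ is unimodal with maximum at $k = \lceil \nu \rceil - 1$ and $k = \lfloor \nu \rfloor$. So by varying $\nu$ one increases the weighting of a particular range of persistence landscape functions.

We may consider the kernel $K_\nu$ to be associated to the feature map $D \mapsto \lambda(D)$ which maps to the Hilbert space with inner product $\langle \lambda, \lambda' \rangle_\nu = \sum_{k} P_\nu(k-1) \int \lambda_k \lambda'_k$ or the feature map $D \mapsto \left( P_\nu(k-1)^{\frac{1}{2}} \lambda_k(D) \right)_{k \ge 1}$ which maps to the usual Hilbert space $L^2(\mathbb{N} \times \mathbb{R})$.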
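A numerical sketch of the Poisson-weighted kernel follows. It assumes the standard Poisson probability mass function $P_\nu(k) = \nu^k e^{-\nu}/k!$, non-empty diagrams, landscapes evaluated through the bar code definition, and a midpoint rule for the integral. All function names and the `steps` parameter are hypothetical.

```python
import math


def tent(a, b, t):
    """Largest h >= 0 with [t - h, t + h] inside [a, b]; 0.0 if t is outside (a, b)."""
    return max(0.0, min(t - a, b - t))


def poisson_pmf(nu, k):
    """Standard Poisson probability mass function with parameter nu > 0."""
    return nu ** k * math.exp(-nu) / math.factorial(k)


def poisson_weighted_kernel(d1, d2, nu, steps=2000):
    """Approximate sum_k P_nu(k - 1) * integral lambda1_k(t) * lambda2_k(t) dt."""
    points = list(d1) + list(d2)
    lo = min(a for a, _ in points)
    hi = max(b for _, b in points)
    dt = (hi - lo) / steps
    depth = min(len(d1), len(d2))
    # weights[0] multiplies lambda_1, weights[1] multiplies lambda_2, and so on.
    weights = [poisson_pmf(nu, k) for k in range(depth)]
    total = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * dt
        v1 = sorted((tent(a, b, t) for a, b in d1), reverse=True)
        v2 = sorted((tent(a, b, t) for a, b in d2), reverse=True)
        total += sum(w * x * y for w, x, y in zip(weights, v1, v2)) * dt
    return total


# For the single interval [0, 2] and nu = 1 the exact value is exp(-1) * 2/3.
print(poisson_weighted_kernel([(0.0, 2.0)], [(0.0, 2.0)], 1.0))
```

Increasing `nu` shifts the weight toward deeper landscape functions, which is the tuning behaviour described above.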

## 4. Persistence landscapes as tropical rational functions

In this section we will show that the persistence landscape is a tropical rational function.

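Let $D = \{(a_i,b_i)\}_{i=1}^{n}$ be a persistence diagram. Recall (Section 2.2) that the $k$th persistence landscape function is given by $\lambda_k(t) = \operatorname{kmax}_{1 \le i \le n} f_{(a_i,b_i)}(t)$, where $f_{(a,b)}(t) = \max(0, \min(t-a, b-t))$ is the largest $h \ge 0$ with $[t-h,t+h] \subset [a,b]$.

First rewrite $f_{(a,b)}(t)$ as a tropical rational expression in one variable, $t$, as follows.

$$\begin{aligned}
f_{(a,b)}(t) &= \max(0, \min(t-a, b-t)) \\
&= \max(0, -\max(a-t, t-b)) \\
&= \max(0, -\max(a \odot t^{-1}, t \odot b^{-1})) \\
&= \max(0, [(a \odot t^{-1}) \oplus (t \odot b^{-1})]^{-1}) \\
&= 0 \oplus [(a \odot t^{-1}) \oplus (t \odot b^{-1})]^{-1}
\end{aligned}$$

We may simplify the right hand term by using the usual rules for adding fractions. That is, $x \odot y^{-1} \oplus z \odot w^{-1} = (x \odot w \oplus y \odot z) \odot (y \odot w)^{-1}$. So

$$f_{(a,b)}(t) = 0 \oplus t \odot b \odot (a \odot b \oplus t^{2})^{-1},$$

which is the same as

$$f_{(a,b)}(t) = \max(0,\; t + b - \max(a+b,\, 2t)).$$

Next consider max-plus polynomials in $n$ variables, $x_1, \ldots, x_n$. The elementary symmetric max-plus polynomials, $\sigma_1, \ldots, \sigma_n$, are given by

$$\sigma_k(x_1,\ldots,x_n) = \bigoplus_{\pi \in S_n} x_{\pi(1)} \odot \cdots \odot x_{\pi(k)},$$

where the sum is taken over elements of the symmetric group $S_n$. So $\sigma_k$ is the sum of the $k$ largest elements of $x_1, \ldots, x_n$. Therefore, with the convention $\sigma_0 = 0$,

$$\operatorname{kmax}_{1 \le i \le n} x_i = \sigma_k(x_1,\ldots,x_n) - \sigma_{k-1}(x_1,\ldots,x_n).$$

Thus,

$$\lambda_k(t) = \sigma_k(f_i(t)) \odot \sigma_{k-1}(f_i(t))^{-1},$$

where we have written $f_i$ for $f_{(a_i,b_i)}$ and $\sigma_k(f_i(t))$ for $\sigma_k(f_1(t),\ldots,f_n(t))$. Hence, for a fixed persistence diagram $D$, we have $\lambda_k$ as a tropical rational function in one variable $t$.

However, we really want to consider $t$ as fixed and the persistence diagram as the variable. Let us change to this perspective. To start, consider

$$f_t(a,b) = 0 \oplus t \odot b \odot (a \odot b \oplus 2t)^{-1},$$

a tropical rational function in the variables $a$ and $b$, in which $2t = t \odot t$ is a constant. Next,

$$\sigma_k(f_t(a_1,b_1),\ldots,f_t(a_n,b_n)) = \bigoplus_{\pi \in S_n} f_t(a_{\pi(1)},b_{\pi(1)}) \odot \cdots \odot f_t(a_{\pi(k)},b_{\pi(k)})$$

is a 2-symmetric max-plus tropical rational function in the variables $a_1,b_1,\ldots,a_n,b_n$. Finally,

$$\lambda_{k,t}(a_1,b_1,\ldots,a_n,b_n) = \sigma_k(f_t(a_1,b_1),\ldots,f_t(a_n,b_n)) \odot \sigma_{k-1}(f_t(a_1,b_1),\ldots,f_t(a_n,b_n))^{-1}$$

is also a 2-symmetric tropical rational function in the variables $a_1,b_1,\ldots,a_n,b_n$.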

By the stability theorem for persistence landscapes (Section 2.4.2), these tropical rational functions are 1-Lipschitz functions from $\mathbb{R}^{2n}$ with the sup-norm to $\mathbb{R}$.
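The key identity of this section, that the $k$th largest element is the tropical quotient of consecutive elementary symmetric max-plus polynomials, can be checked by brute force. The sketch below is illustrative: it computes $\sigma_k$ as a maximum over $k$-element subsets, which equals the maximum over permutations used in the definition, and it assumes the convention $\sigma_0 = 0$. The names `sigma` and `kmax` are hypothetical.

```python
from itertools import combinations


def sigma(k, xs):
    """Elementary symmetric max-plus polynomial: max over k-subsets of their ordinary sum."""
    if k == 0:
        return 0  # assumed convention: the empty tropical product is 0
    return max(sum(subset) for subset in combinations(xs, k))


def kmax(k, xs):
    """The k-th largest element of xs, with k counted from 1."""
    return sorted(xs, reverse=True)[k - 1]


xs = [3, -1, 4, 1, 5]
for k in range(1, len(xs) + 1):
    # Tropical division is ordinary subtraction.
    assert sigma(k, xs) - sigma(k - 1, xs) == kmax(k, xs)
print([sigma(k, xs) for k in range(len(xs) + 1)])
```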

Since the mapping from persistence diagrams to persistence landscapes is invertible [14], the persistence landscape gives us a collection of tropical rational functions from which we can reconstruct the persistence diagrams.

In practice, we do not need to use all of the . If the values of and are only known up to some or if they lie on a grid of step size , then it suffices to use and , where is the maximal dimension of the persistence module (i.e. the maximum number of overlapping intervals in the bar code), and the interval contains all of the and .

## 5. Reconstruction of diagrams from an average persistence landscape

In this section we will show that for certain generic finite sets of persistence diagrams, it is possible to reconstruct these sets of persistence diagrams exactly from their average persistence landscapes. This implies that the persistence landscape kernel is characteristic for certain generic empirical measures.

Let $D_1, \ldots, D_n$ be a sequence of persistence diagrams (Section 2.1). Recall that we assume that our persistence diagrams consist of finitely many points $(a,b)$ where $-\infty < a < b < \infty$ (Assumption 2.1). Let $\lambda(D_1), \ldots, \lambda(D_n)$ denote their corresponding persistence landscapes (Section 2.2) and let $\bar{\lambda}$ denote their average landscape. We can summarize this construction as a mapping

$$(D_1,\ldots,D_n) \mapsto \bar{\lambda} = \bar{\lambda}(D_1,\ldots,D_n). \tag{5.1}$$

We will show that in many cases, this map is invertible.

### 5.1. Noninvertibility and connected persistence diagrams

We start with a simple example where the map in (5.1) is not one-to-one and hence not invertible.

Consider , , , and . Then . So the average landscape of equals the average landscape of .

The map (5.1) fails to be invertible because the union of the intervals in the bar code (Section 2.1) corresponding to the persistence diagram is disconnected. However, in many applications we claim that this behavior is atypical. To make this claim precise we need the following definition.

###### Definition 5.2.

Let $B$ be a bar code consisting of intervals $\{I_j\}$. Define the graph of $B$ to be the graph whose vertices are the intervals $I_j$ and whose edges consist of pairs of intervals $\{I_j, I_k\}$ with nonempty intersection, $I_j \cap I_k \neq \emptyset$.

For many geometric processes [4, Figure 2.2] and in applications [38, Figure 5], as the number of intervals in the bar code increases, the corresponding graphs seem to have a giant component [11, Chapter 6].

### 5.2. Bipartite graph of a persistence diagram

Let $D = \{(a_j,b_j)\}_{j=1}^{n}$ be a persistence diagram.

###### Definition 5.3.

Say that the persistence diagram $D$ is generic if for each $j \neq k$, the four numbers $a_j, b_j, a_k, b_k$ are distinct.

###### Definition 5.4.

Let $D$ be a generic persistence diagram. Let $\Gamma(D)$ be the bipartite graph of $D$ consisting of the disjoint vertex sets $\{a_j\}$ and $\{b_j\}$ and edges consisting of $\{a_j, b_j\}$ for each $j$ and $\{a_k, b_j\}$ for each pair $j, k$ satisfying $a_j < a_k < b_j < b_k$.

###### Proposition 5.5.

We can reconstruct a generic persistence diagram from its bipartite graph.

###### Proof.

Let $D = \{(a_j,b_j)\}$ be a generic persistence diagram. Let $\Gamma(D)$ be its bipartite graph. Let $A$ and $B$ be the disjoint vertex sets of $\Gamma(D)$. By definition, $A$ consists of the set of first coordinates of the points in $D$, and $B$ consists of the set of second coordinates of the points in $D$. By assumption, these coordinates are unique. Let $a_k \in A$. By the definition of $\Gamma(D)$, there exists $b_k \in B$ such that $\{a_k, b_k\}$ is an edge in $\Gamma(D)$ and $(a_k, b_k) \in D$. Also by definition, for all $b \in B$ such that $\{a_k, b\}$ is an edge in $\Gamma(D)$, $b \le b_k$. Thus, for all $a \in A$, let $b$ be the maximum element of $B$ such that $\{a, b\}$ is an edge in $\Gamma(D)$. The resulting pairs $(a, b)$ are exactly $D$. ∎
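The reconstruction in this proof can be sketched in Python. The edge rule used here is an assumption made for illustration: besides the edge joining $a_j$ and $b_j$ for each point, an edge joins $a_k$ and $b_j$ whenever $a_j < a_k < b_j < b_k$. Edges are stored as ordered pairs (first coordinate, second coordinate), and the function names are hypothetical.

```python
def bipartite_edges(diagram):
    """Edges (a, b) of the bipartite graph of a generic diagram (assumed edge rule)."""
    edges = {(a, b) for a, b in diagram}
    for aj, bj in diagram:
        for ak, bk in diagram:
            # Overlapping, non-nested intervals: [a_k, b_j] is their intersection.
            if aj < ak < bj < bk:
                edges.add((ak, bj))
    return edges


def reconstruct(edges):
    """Pair each first coordinate with the largest second coordinate adjacent to it."""
    best = {}
    for a, b in edges:
        best[a] = max(b, best.get(a, b))
    return sorted(best.items())


diagram = [(0, 3), (1, 4), (2, 6)]  # hypothetical generic diagram
edges = bipartite_edges(diagram)
print(sorted(edges))
print(reconstruct(edges))
```

Under the assumed edge rule, every extra edge at $a_k$ leads to some $b_j < b_k$, which is why taking the maximum recovers the original pairing.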

###### Definition 5.6.

Say that a persistence diagram is connected if the graph (Definition 5.2) of its barcode is connected.
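Connectedness in the sense of Definitions 5.2 and 5.6 can be tested with a simple graph search. In this sketch the intervals are taken to be closed, so two intervals intersect when the larger left end point is at most the smaller right end point, and an empty diagram is treated as connected; both choices are assumptions. The name `is_connected` is hypothetical.

```python
def is_connected(diagram):
    """True if the intersection graph of the intervals [a, b] in the diagram is connected."""
    intervals = list(diagram)
    if not intervals:
        return True  # assumed convention for the empty diagram
    seen, stack = {0}, [0]
    while stack:
        aj, bj = intervals[stack.pop()]
        for k, (ak, bk) in enumerate(intervals):
            # Closed intervals intersect when max of lefts <= min of rights.
            if k not in seen and max(aj, ak) <= min(bj, bk):
                seen.add(k)
                stack.append(k)
    return len(seen) == len(intervals)


print(is_connected([(0, 2), (1, 3)]), is_connected([(0, 1), (2, 3)]))
```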

###### Lemma 5.7.

A generic persistence diagram is connected if and only if its bipartite graph is connected.

###### Proof.

Let be a generic persistence diagram. If we set in , the is isomorphic to the graph of the bar code corresponding to . By definition, is connected, if and only if is connected. ∎

### 5.3. Critical points of persistence landscapes

We observe that it is easy to list the critical points of a persistence landscape from its corresponding persistence diagram.

###### Lemma 5.8.

Let $D = \{(a_j,b_j)\}$ be a persistence diagram. Consider the intervals $[a_j,b_j]$ in the corresponding bar code. The critical points in the corresponding persistence landscape consist of

1. the left end points of the intervals;

2. the right end points of the intervals;

3. the midpoints of the intervals; and

4. the midpoints of intersections of pairs of intervals $[a_j,b_j] \cap [a_k,b_k]$ where $a_j < a_k < b_j < b_k$.

Let $C(D)$ denote this set.

###### Proof.

Recall that the critical points of the persistence landscape of $D$ consist of the critical points of the functions $f_{(a_j,b_j)}$ and the points $t$ for which there exist $j$ and $k$ such that $f_{(a_j,b_j)}(t) = f_{(a_k,b_k)}(t) > 0$, and the two functions have slopes of opposite sign at $t$. The former are exactly the points in (1), (2), and (3). The latter are exactly the points in (4). ∎
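Lemma 5.8 translates directly into a short routine. The sketch below lists the critical points of a diagram, assuming that the pairs contributing an intersection midpoint are those with $a_j < a_k < b_j < b_k$, so that the intersection is $[a_k, b_j]$. The function name and the example diagram are hypothetical.

```python
def critical_points(diagram):
    """Sorted critical points of the landscape of a diagram, following Lemma 5.8."""
    points = set()
    for a, b in diagram:
        # Left end point, right end point and midpoint of each interval.
        points.update((a, b, (a + b) / 2))
    for aj, bj in diagram:
        for ak, bk in diagram:
            # Midpoint of the intersection [a_k, b_j] of overlapping, non-nested intervals.
            if aj < ak < bj < bk:
                points.add((ak + bj) / 2)
    return sorted(points)


print(critical_points([(0, 4), (3, 9)]))
```

For nested intervals the two tent functions never cross at a positive height, so no intersection midpoint is added.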

In the set $C(D)$ we have the following three-term arithmetic progressions,

$$a_j,\ \frac{a_j+b_j}{2},\ b_j \quad \text{and} \quad a_k,\ \frac{a_k+b_j}{2},\ b_j,$$

which we call interval triples and intersection triples, respectively. Note that we have one interval triple for each point in the persistence diagram and one intersection triple for each pair of points $(a_j,b_j)$, $(a_k,b_k)$ in the persistence diagram that satisfies $a_j < a_k < b_j < b_k$.

### 5.4. Arithmetically independent sets of persistence diagrams

In this section we introduce assumptions for a set of persistence diagrams.

###### Definition 5.9.

Let $D_1, \ldots, D_n$ be a set of persistence diagrams. We call this set arithmetically independent if it satisfies the following assumptions.

1. Each $D_i$ is generic.

2. The sets $C(D_1), \ldots, C(D_n)$ are pairwise disjoint.

3. Let $C$ be the set of all critical points in $C(D_1) \cup \cdots \cup C(D_n)$. All of the three-term arithmetic progressions in $C$ are either interval triples or intersection triples of some $D_i$.

### 5.5. Reconstruction of persistence diagrams from an average landscape

We are now in a position to state and prove our reconstruction result.

###### Theorem 5.10.

Let $\bar{\lambda}$ be the average landscape of the persistence diagrams $D_1, \ldots, D_n$. If $D_1, \ldots, D_n$ are connected and arithmetically independent then one can reconstruct $D_1, \ldots, D_n$ from $\bar{\lambda}$.

###### Proof.

Let $C$ be the set of all critical points in the average landscape $\bar{\lambda}$. Let $A$ be the subset of critical points that are the first term in a three-term arithmetic progression in $C$. Let $B$ be the subset of critical points that are the third term in a three-term arithmetic progression in $C$.

By assumption $A$ and $B$ are disjoint. Let $\Gamma$ be the bipartite graph whose set of vertices is the disjoint union of $A$ and $B$ and whose edges consist of $\{a, b\}$ where $a$ and $b$ are the first and third term of a three-term arithmetic progression in $C$.

By the assumption of arithmetic independence, vertices in $\Gamma$ are only connected by an edge if they are critical points of the same persistence diagram. By the assumption of connectedness, all of the critical points of a persistence landscape of one of the persistence diagrams are connected in $\Gamma$. Thus, the connected components of $\Gamma$ are exactly the bipartite graphs $\Gamma(D_i)$.

Using Proposition 5.5, we can reconstruct each persistence diagram from the corresponding bipartite graph. ∎
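The steps of this proof can be sketched as a short procedure that takes the set of critical points of an average landscape and returns the component diagrams. It is only an illustration under the hypotheses of the theorem (connected, arithmetically independent diagrams); on other inputs the output is not meaningful. Membership of midpoints is tested exactly, so the example uses values whose halves are exactly representable. The function name and the example data are hypothetical.

```python
def reconstruct_from_critical_points(critical):
    """Recover diagrams from the critical points of their average landscape."""
    pts = set(critical)
    # Edge (x, z): x and z are the first and third terms of a progression in pts.
    edges = [(x, z) for x in pts for z in pts if x < z and (x + z) / 2 in pts]
    adjacency = {}
    for x, z in edges:
        adjacency.setdefault(x, set()).add(z)
        adjacency.setdefault(z, set()).add(x)
    firsts = {x for x, _ in edges}
    diagrams, seen = [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Collect one connected component of the bipartite graph.
        component, stack = {start}, [start]
        while stack:
            for w in adjacency[stack.pop()]:
                if w not in component:
                    component.add(w)
                    stack.append(w)
        seen |= component
        # Pair each first term with its largest neighbour, as in Proposition 5.5.
        diagrams.append(sorted((a, max(adjacency[a])) for a in component if a in firsts))
    return diagrams


# Critical points of the hypothetical diagrams {(0, 41), (30, 97)} and {(200, 307)}.
critical = [0, 20.5, 30, 35.5, 41, 63.5, 97, 200, 253.5, 307]
print(reconstruct_from_critical_points(critical))
```

Extracting the critical points from a sampled average landscape is a separate step that this sketch does not attempt.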

### 5.6. Persistence landscapes are characteristic for empirical measures

We can restate Theorem 5.10 using the language of characteristic kernels (Section 2.3).

###### Theorem 5.11.

The persistence landscape kernel is characteristic for empirical measures on connected and arithmetically independent persistence diagrams.

### 5.7. Genericity of arithmetically independent persistence diagrams

We end this section by showing that connected and arithmetically independent persistence diagrams are generic in a particular sense.

###### Lemma 5.12.

Let be a persistence diagram. Let . Then there exists a connected persistence diagram with .

###### Proof.

Let and . Choose such that . Let . Then is connected and . Thus is connected and . ∎

###### Lemma 5.13.

Let . Let . Then there is a generic persistence diagram with . Furthermore, if is connected then so is .

###### Proof.

The proof is by induction on . If then the statement is trivial. Assume that is a generic persistence diagram and . Since there are only finitely many numbers to avoid, we can choose and such that is a generic persistence diagram. Note that . Since , if is connected then so is . ∎

###### Proposition 5.14.

Let be a generic persistence diagram. Then there is an such that for all with and , is generic and .

###### Proof.

Let be the set of all coordinates of points in . Let . Let . Let be a persistence diagram with and . Then for all there is a with . So and . By the triangle inequality, the coordinates of points in are distinct.

By the construction of , there is a canonical bijection of the intervals in the barcodes of and . Note that by the definition of , this implies that the nonempty intersections of pairs of intervals in the bar code of have length at least . Since , a pair of intervals in the bar code of intersect if and only if the corresponding pair of intervals in intersect. ∎

###### Corollary 5.15.

Let be a generic and connected persistence diagram. Then there is an such that for all persistence diagrams with and , is generic and connected.

Now consider a sequence of persistence diagrams . Recall that we consider this to be a point in the product space of persistence diagrams (Section 2.1) with associated product metric (2.2) and product topology.

###### Theorem 5.16.

Connected and arithmetically independent persistence diagrams are generic in the following sense.

1. They are dense. That is, given persistence diagrams and an there exist connected and arithmetically independent persistence diagrams with for all .

2. If we restrict to persistence diagrams with the same cardinality then they are open. That is, given connected and arithmetically independent persistence diagrams , there is some such that any persistence diagrams with and for all , are connected and arithmetically independent.

###### Proof.

(1) The proof is by induction on . If then the statement is trivially true. Assume that we have connected and arithmetically independent persistence diagrams with for . By Lemmas 5.12 and 5.13 there exists a generic and connected persistence diagram with . We finish the proof by induction on . If then we are done. Assume that is arithmetically independent. By Corollary 5.15, there exists an such that for all persistence diagrams with and , is generic and connected. Let . Since there are only finitely many numbers to avoid, we can choose and such that is connected and arithmetically independent. Note that .

(2) Let be connected persistence diagrams that are arithmetically independent. Denote this sequence of persistence diagrams by . Using Corollary 5.15 we can choose an such that for any persistence diagrams with and for all , each is connected.

Let be the set of all critical points of the average landscape of . There are only finitely many points such that is part of a three term arithmetic progression in . Let be the set of all such numbers.

Let . Let . Consider persistence diagrams with and for all . Let denote this sequence of persistence diagrams.

The assumptions imply that for each point in one of the persistence diagrams in there is a corresponding point in the corresponding persistence diagram in , and . That is, and . Thus we have the induced bijection between and with corresponding points and satisfying . Notice that since is generic, so is . Also, since the sets are disjoint, so are the sets . Furthermore, the assumptions imply that we have an induced correspondence between and with corresponding points and satisfying . By the triangle inequality for , , . It follows that is arithmetically independent. Let . ∎

## 6. Metric comparison of persistence landscapes and persistence diagrams

In this section we show that the landscape distance can be much smaller than the corresponding bottleneck distance.

Given a persistence diagram $D$, let $\lambda(D)$ denote the corresponding persistence landscape. In [14, Theorem 12] it was shown that $\|\lambda(D) - \lambda(D')\|_\infty \le d_B(D,D')$.

Here we will show the following.

###### Proposition 6.1.

Let $c > 0$. Then there is a pair of persistence diagrams $D_1, D_2$ such that $\|\lambda(D_1) - \lambda(D_2)\|_\infty < c \, d_B(D_1, D_2)$.

###### Proof.

Consider

$$D_1 = \{\pm(-3n-1+2i,\ 3n-1+2i)\}_{i=1}^{n}, \quad \text{and} \quad D_2 = \{\pm(-3n+2i,\ 3n+2i)\}_{i=1}^{n-1} \cup \{(-3n,3n),\ (-n,n)\}.$$

See Figure 1 where . Then , but . ∎

### Acknowledgments

The author would like to acknowledge the support of the Army Research Office [Award W911NF1810307], National Science Foundation [DMS - 1764406] and the Simons Foundation [Grant number 594594]. He would also like to thank Pawel Dlotko, Michael Kerber, and Oliver Vipond for helpful conversations, Leo Betthauser, Nikola Milicevic, and Alex Wagner for proofreading an earlier draft, and the Mathematisches Forschungsinstitut Oberwolfach (MFO) where some of this work was started.

## References

• [1] Henry Adams, Tegan Emerson, Michael Kirby, Rachel Neville, Chris Peterson, Patrick Shipman, Sofya Chepushtanova, Eric Hanson, Francis Motta, and Lori Ziegelmeier. Persistence images: A stable vector representation of persistent homology. Journal of Machine Learning Research, 18(8):1–35, 2017.
• [2] Aaron Adcock, Erik Carlsson, and Gunnar Carlsson. The ring of algebraic functions on persistence bar codes. Homology Homotopy Appl., 18(1):381–402, 2016.
• [3] Robert J. Adler, Sarit Agami, and Pratyush Pranav. Modeling and replicating statistical topology and evidence for CMB nonhomogeneity. Proc. Natl. Acad. Sci. USA, 114(45):11878–11883, 2017.
• [4] Robert J. Adler, Omer Bobrowski, Matthew S. Borman, Eliran Subag, and Shmuel Weinberger. Persistent homology for random fields and complexes. In Borrowing strength: theory powering applications—a Festschrift for Lawrence D. Brown, volume 6 of Inst. Math. Stat. Collect., pages 124–143. Inst. Math. Statist., Beachwood, OH, 2010.
• [5] Rushil Anirudh, Vinay Venkataraman, Karthikeyan Natesan Ramamurthy, and Pavan Turaga. A Riemannian framework for statistical analysis of topological persistence diagrams. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, June 2016.
• [6] Paul Bendich, J. S. Marron, Ezra Miller, Alex Pieloch, and Sean Skwerer. Persistent homology analysis of brain artery trees. Ann. Appl. Stat., 10(1):198–218, 2016.
• [7] Bernadette J. Stolz, Tegan Emerson, Satu Nahkuri, Mason A. Porter, and Heather A. Harrington. Topological data analysis of task-based fMRI data from experiments on schizophrenia. arXiv:1809.08504 [q-bio.QM], 2018.
• [8] Leo Betthauser, Peter Bubenik, and Parker Edwards. Persistence landscapes are graded persistence diagrams. in preparation.
• [9] Christophe Biscio and Jesper Møller. The accumulated persistence function, a new useful functional summary statistic for topological data analysis, with a view to brain artery trees and spatial point process applications. arXiv:1611.00630 [math.ST], 2016.
• [10] Andrew J. Blumberg, Itamar Gal, Michael A. Mandell, and Matthew Pancia. Robust statistics, hypothesis testing, and confidence intervals for persistent homology on metric measure spaces. Found. Comput. Math., 14(4):745–789, 2014.
• [11] Béla Bollobás. Random graphs, volume 73 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, second edition, 2001.
• [12] Thomas Bonis, Maks Ovsjanikov, Steve Oudot, and Frederic Chazal. Persistence-based pooling for shape pose recognition. In 6th International Workshop on Computational Topology in Image Context (CTIC 2016), 2016.
• [13] Jose Bouza. tda-tools.
• [14] Peter Bubenik. Statistical topological data analysis using persistence landscapes. Journal of Machine Learning Research, 16:77–102, 2015.
• [15] Peter Bubenik, Gunnar Carlsson, Peter T. Kim, and Zhi-Ming Luo. Statistical topology via Morse theory persistence and nonparametric estimation. In Algebraic Methods in Statistics and Probability II, volume 516 of Contemp. Math., pages 75–92. Amer. Math. Soc., Providence, RI, 2010.
• [16] Peter Bubenik and Pawel Dlotko. A persistence landscapes toolbox for topological statistics. Journal of Symbolic Computation, 78:91–114, 2017.
• [17] Peter Bubenik and Peter T. Kim. A statistical approach to persistent homology. Homology, Homotopy Appl., 9(2):337–362, 2007.
• [18] Peter Bubenik and Jonathan A. Scott. Categorification of persistent homology. Discrete Comput. Geom., 51(3):600–627, 2014.
• [19] Peter Bubenik and Tane Vergili. Topological spaces of persistence modules and their properties. Journal of Applied and Computational Topology, page 26pp., (accepted).
• [20] Mathieu Carrière, Marco Cuturi, and Steve Oudot. Sliced Wasserstein Kernel for Persistence Diagrams. arXiv:1706.03358 [cs.CG], 2017.
• [21] Mathieu Carrière, Steve Y. Oudot, and Maks Ovsjanikov. Stable topological signatures for points on 3D shapes. Computer Graphics Forum, 34(5):1–12, 2015.
• [22] Frédéric Chazal, David Cohen-Steiner, Marc Glisse, Leonidas J. Guibas, and Steve Y. Oudot. Proximity of persistence modules and their diagrams. In Proceedings of the 25th annual symposium on Computational geometry, SCG ’09, pages 237–246, New York, NY, USA, 2009. ACM.
• [23] Frédéric Chazal, Brittany Terese Fasy, Fabrizio Lecci, Bertrand Michel, Alessandro Rinaldo, and Larry Wasserman. Subsampling methods for persistent homology. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, volume 37. JMLR: W&CP, 2015.
• [24] Frédéric Chazal, Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, Aarti Singh, and Larry Wasserman. On the bootstrap for persistence diagrams and landscapes. Modeling and Analysis of Information Systems, 20(6):96–105, 2014.
• [25] Frédéric Chazal, Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, and Larry Wasserman. Stochastic convergence of persistence landscapes and silhouettes. J. Comput. Geom., 6(2):140–161, 2015.
• [26] Yen-Chi Chen, Daren Wang, Alessandro Rinaldo, and Larry Wasserman. Statistical analysis of persistence intensity functions. arXiv:1510.02502 [stat.ME], 2015.
• [27] Ilya Chevyrev, Vidit Nanda, and Harald Oberhauser. Persistence paths and signature features in topological data analysis. arXiv:1806.00381 [stat.ML], 2018.
• [28] D. R. Chittajallu, N. Siekierski, S. Lee, S. Gerber, J. Beezley, D. Manthey, D.