A Family of Density-Scaled Filtered Complexes

12/06/2021
by   Abigail Hickok, et al.

We develop novel methods for using persistent homology to infer the homology of an unknown Riemannian manifold (M, g) from a point cloud sampled from an arbitrary smooth probability density function. Standard distance-based filtered complexes, such as the Čech complex, often have trouble distinguishing noise from features that are simply small. We address this problem by defining a family of "density-scaled filtered complexes" that includes a density-scaled Čech complex and a density-scaled Vietoris–Rips complex. We show that the density-scaled Čech complex is homotopy-equivalent to M for filtration values in an interval whose starting point converges to 0 in probability as the number of points N →∞ and whose ending point approaches infinity as N →∞. By contrast, the standard Čech complex may only be homotopy-equivalent to M for a very small range of filtration values. The density-scaled filtered complexes also have the property that they are invariant under conformal transformations, such as scaling. We implement a filtered complex DVR that approximates the density-scaled Vietoris–Rips complex, and we empirically test the performance of our implementation. As examples, we use DVR to identify clusters that have different densities, and we apply DVR to a time-delay embedding of the Lorenz dynamical system. Our implementation is stable (under conditions that are almost surely satisfied) and designed to handle outliers in the point cloud that do not lie on M.


1 Introduction

Data in Euclidean space often lie on (or near) a lower-dimensional submanifold $M$. For example, images with many pixels are high-dimensional, but image libraries are often locally parameterized by many fewer dimensions isomap . In chemistry, the conformation space of a molecule may be a manifold or a union of manifolds cyclo . In topological data analysis (TDA), one considers the following question: given a finite sample of points (a point cloud) $X$ that lies on or near $M$, what can one infer about the topology (i.e., “global structure”) of $M$? TDA has been used to study the global structure of data sets in a variety of fields (see, e.g., cortical ; topaz ; materials ). Researchers have also made significant progress towards using the geometric properties of the manifold for dimensionality reduction and data visualization isomap ; laplacian_eigmap ; hessian_eigenmap ; LLE .

We focus on inferring the homology of $M$. Homology is a quantitative way of characterizing the topology of $M$. For example, the rank of the $0$-dimensional homology $H_0(M)$ is the number of connected components, and the rank of $H_k(M)$ for $k \geq 1$ is the number of $k$-dimensional holes in $M$. If $M$ is compact and orientable, then the dimension of $M$ is equal to the largest $k$ such that $H_k(M)$ is nontrivial. For example, if $M$ is the $2$-torus $T^2$, then there is one connected component, there are two $1$-dimensional holes, and there is one $2$-dimensional hole. Although homology does not uniquely identify a manifold, it provides useful information about a manifold’s global structure, and the homology of a manifold can be used to distinguish it from other manifolds that have different homology.

Methods from persistent homology (PH) can be used to infer the homology of $M$ from a point cloud $X$ that is sampled from $M$. To approximate the manifold, we construct a filtered complex, a combinatorial description of a topological space (see Definition 1). One of the classical approaches to building a filtered complex is the Čech complex $\check{C}(X)$. At each point $x \in X$, one places a ball $B(x, r)$ of radius $r$, where $r$ is the filtration level. A $k$-simplex with vertices $x_0, \ldots, x_k$ is added to $\check{C}_r(X)$ if the intersection $\bigcap_{i=0}^{k} B(x_i, r)$ is nonempty. The Nerve Theorem guarantees that $\check{C}_r(X)$ is homotopy-equivalent to $\bigcup_{x \in X} B(x, r)$. The PH of $\check{C}(X)$, which we denote by $PH(\check{C}(X))$, records how the homology of $\check{C}_r(X)$ changes as $r$ increases. As $r$ grows, new homology classes (which represent $k$-dimensional holes) are “born” and old homology classes “die”.

Conventional wisdom holds that the homology classes with the longest lifetimes are true topological features of and that the homology classes with the shortest lifetimes are noise. However, one can easily observe that this is not always true, even in simple examples such as Figure 1, in which the point cloud is sampled from the disjoint union of two circles of different sizes. The smaller circle represents a homology class that has a much shorter lifetime than the homology class for the larger circle, but both homology classes are true topological features. We visualize this in Figure 1, in which the balls of the Čech complex fill in the smaller circle much earlier than they fill in the larger circle.

Figure 1 (panels (a) to (f)): The point cloud $X$ consists of points that are sampled from the disjoint union of two circles of different radii. Each point is drawn from one of the two circles with a fixed probability and is then placed uniformly at random on the chosen circle. For increasing $r$, the panels display the balls of radius $r$ in the Čech complex at filtration level $r$. At $r = 0$, we have the point cloud itself. The smaller circle is filled in at the next displayed filtration level, but the larger circle is not filled in until a later filtration level.

Following the conventional wisdom, the homology class for the smaller circle might be recorded spuriously as noise. Problems with the conventional wisdom have been noted in many other papers, such as roadmap ; pers_images ; bendich2016 ; stolz2017 ; Bubenik2020 ; feng2021 .

In general, standard distance-based filtered complexes (such as the Čech complex) depend largely on “topological feature sizes,” by which we mean the following concept, introduced in feature_size . The medial axis of a submanifold $M$ in $\mathbb{R}^m$ is the closure of

$$\{\, y \in \mathbb{R}^m \mid \text{there exist } p \neq q \text{ in } M \text{ such that } \|y - p\| = \|y - q\| = d(y, M) \,\}.$$

The local feature size at $p \in M$, denoted by $\mathrm{LFS}(p)$, is the distance from $p$ to the medial axis. The condition number of $M$ is equal to $1/\tau$, where $\tau = \inf_{p \in M} \mathrm{LFS}(p)$. For example, if $M$ is an $n$-sphere in $\mathbb{R}^{n+1}$, then the medial axis is the center of the sphere and the local feature size at any point on the sphere is the radius of the sphere. Niyogi et al. showed that $\check{C}_r(X)$ is homotopy-equivalent to $M$ when $r$ is sufficiently small relative to $\tau$ and $X$ is sufficiently dense in $M$ weinberger . However, whenever $\tau$ is small, the Čech complex may only be homotopy-equivalent to $M$ for a very small range of filtration values $r$, even as the number of points sampled from the manifold approaches infinity.

Standard distance-based filtered complexes may perform especially poorly when $M$ contains features of different sizes, even if the smallest features have “high resolution” in the point cloud (i.e., the density of points is inversely proportional to the local feature size). For example, consider again the point cloud in Figure 1a, sampled from the disjoint union of two circles $C_1$ and $C_2$ of different radii. (Each point is drawn from $C_1$ or $C_2$ with a fixed probability and is then placed uniformly at random on the chosen circle.) The product of the probability density function and the local feature size is a constant function; in that sense, the two circles have equally high resolution. However, the corresponding homology classes do not have equally high persistence in the PH of standard filtered complexes.
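The "equal resolution" claim can be checked with a short calculation. The following is a sketch under stated assumptions: the symbols $R_1, R_2$ (radii) and $p_1, p_2$ (sampling probabilities, with $p_1 + p_2 = 1$) are introduced here for illustration, and the circles are assumed to be far enough apart that the local feature size on each circle equals its radius.

```latex
% Density with respect to arc length on circle C_i (uniform on C_i, chosen with probability p_i):
f|_{C_i} = \frac{p_i}{2\pi R_i}, \qquad \mathrm{LFS}|_{C_i} = R_i
\quad\Longrightarrow\quad
f \cdot \mathrm{LFS}\,\big|_{C_i} = \frac{p_i}{2\pi}.
% The product is the same constant on both circles exactly when
p_1 = p_2 = \tfrac{1}{2}.
% In the standard Cech complex, a densely sampled planar circle of radius R_i
% is filled in near r = R_i, so the two lifetimes differ by roughly R_2 / R_1.
```

Under these assumptions the density is inversely proportional to the local feature size, yet the standard Čech lifetimes of the two loops still differ by about the ratio of the radii.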

The dependence on topological feature size is because persistent homology is not a topological invariant. The topology of a manifold is invariant under homeomorphism, but standard distance-based filtered complexes (such as the Čech complex) are not invariant under homeomorphism. More precisely, suppose $\phi: M \to M'$ is a homeomorphism of manifolds and $X$ is a point cloud in $M$. The manifolds $M$ and $M'$ are homeomorphic, but $\check{C}(X)$ and $\check{C}(\phi(X))$ are not necessarily isomorphic (see Definition 9). Indeed, the bottleneck distance between the persistence diagrams (see Section 2.2) for $PH(\check{C}(X))$ and $PH(\check{C}(\phi(X)))$ can be arbitrarily large. (For example, consider the scaling homeomorphism $\phi: \mathbb{R}^m \to \mathbb{R}^m$ defined by $\phi(x) = cx$ for some $c > 0$. For any point cloud $X$ with more than one point, the bottleneck distance between the persistence diagrams for $PH(\check{C}(X))$ and $PH(\check{C}(\phi(X)))$ approaches infinity as $c \to \infty$.) Therefore, the standard Čech complex depends on geometric properties such as size. Standard distance-based filtered complexes are closer to geometric tools than topological tools.
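The effect of global scaling is easy to see in dimension 0. The sketch below is illustrative only (it is not taken from the paper's code): it computes the death times of 0-dimensional classes with a union-find pass over sorted pairwise distances, under the assumed convention that two points become connected at filtration level $r = d/2$, where $d$ is their distance. The helper names `h0_death_times` and `scale` are hypothetical.

```python
import math
from itertools import combinations

def h0_death_times(points):
    """Death times of 0-dimensional classes in the Cech/Vietoris-Rips 1-skeleton.

    Assumed convention: balls have radius r, so two points at distance d
    become connected at r = d / 2.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:              # this edge merges two components
            parent[ri] = rj
            deaths.append(length / 2)
    return deaths                 # n - 1 finite deaths; one class never dies

def scale(points, c):
    """Apply the scaling homeomorphism x -> c * x to every point."""
    return [tuple(c * x for x in p) for p in points]

cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
base = h0_death_times(cloud)
scaled = h0_death_times(scale(cloud, 10.0))
print(base)
print(scaled)
```

Every finite death time is multiplied by the scaling constant, so the persistence diagram moves arbitrarily far as the constant grows, even though the scaled point cloud has the same topology.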

1.1 Contributions

We work in a probabilistic setting. We suppose that $(M, g)$ is an $n$-dimensional Riemannian manifold and that the point cloud $X$ consists of $N$ points sampled from a smooth probability density function $f: M \to (0, \infty)$. It is important that $f$ is nonzero everywhere because we cannot observe regions of the manifold where $f$ equals zero. The Riemannian metric $g$ is necessary because it turns the manifold into a metric space and induces a volume form $dV$. We define the probability measure to be $P[A] = \int_A f \, dV$, where $A \subseteq M$ is a Borel set Rman_stats . We note that all manifolds can be endowed with a Riemannian metric (see Section 2.3), so the requirement of a Riemannian metric is not a restriction on the types of manifolds we can study.

We construct a family of “density-scaled filtered complexes” by modifying the metric $g$ such that we effectively shrink the distances between points in sparse regions of the manifold and enlarge the distances between points in dense regions of the manifold. To do this, we define a conformally equivalent metric $\tilde{g}$ that rescales $g$ by a power of the density $f$ and by a scaling factor that depends on $N$; we define this scaling factor in Section 3.1. Our scaling factor plays an important role in the convergence property that we prove in Section 4 and discuss below. The metric $\tilde{g}$ is defined such that the points in $X$ are uniformly distributed with respect to the volume form $d\tilde{V}$ in $(M, \tilde{g})$ and such that the balls grow at a slower rate when $N$ is larger. We can then apply any existing distance-based filtered complex (such as the Čech complex) in the density-scaled Riemannian manifold $(M, \tilde{g})$.

We show that our density-scaled filtered complexes have two important properties that other filtered complexes do not have:

  1. Convergence: As $N \to \infty$, the interval of filtration values for which the density-scaled Čech complex is homotopy-equivalent to the manifold grows to $(0, \infty)$ in probability, no matter the condition number of $M$ or any other geometric property of $M$. (We make this statement precise in Theorem 4.1.) This means that in the PH of the density-scaled Čech complex, one can interpret the homology classes with the smallest birth times and longest lifetimes as the most important features.

  2. Conformal invariance: We show that our density-scaled filtered complexes are invariant under conformal transformations (Theorem 5.1). This means that, in contrast to standard complexes, our density-scaled complexes are closer to topological tools and do not depend as much on local feature sizes.

These properties improve our ability to infer the homology of $M$ from a point cloud and make it easier to compare the PH of point clouds sampled from different manifolds of possibly different scales.

We implement a filtered complex $\widehat{\mathrm{DVR}}$ that approximates the density-scaled Vietoris–Rips complex. We do this by estimating the density $f$ via kernel-density estimation and estimating Riemannian distances in a similar way as the widely-used Isomap algorithm isomap . The implementation requires knowledge of the intrinsic dimension $n$ of the manifold, which can be estimated using methods such as local principal component analysis local_PCA ; intrinsic_dim , the conical dimension estimator conical_dimestimate , the ball expansion rate ball_dimestimate , or the doubling dimension doubling_dimestimate . We prove that our implementation is stable (Theorem 7.4): under suitable conditions that are almost surely satisfied, small perturbations of the input point cloud $X$ result only in small changes to the persistence diagram of $\widehat{\mathrm{DVR}}$. Consequently, it is still reasonable to use $\widehat{\mathrm{DVR}}$ even when $X$ does not lie exactly on the manifold or when there is a small amount of noise in the data. The implementation is designed to handle outliers in the data; in Section 6.2 we discuss how this is done, and in Section 8.3 we test the empirical performance of $\widehat{\mathrm{DVR}}$ on a point cloud with outliers. As applications, we use $\widehat{\mathrm{DVR}}$ to count the number of clusters in a point cloud whose clusters have different densities (Section 8.4) and the number of equilibrium points in the Lorenz dynamical system from a time-delay embedding (Section 8.5).
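To make the idea of "Isomap-style distances with a density correction" concrete, here is a minimal sketch. It is not the authors' algorithm from Section 6 or their published code. It assumes that density estimates are already available (for example, from a kernel-density estimator), that each Euclidean edge of a symmetrized $k$-nearest-neighbor graph is stretched by a factor $(\rho \hat{f})^{1/n}$ averaged over its two endpoints, and that graph shortest paths approximate density-scaled Riemannian distances. The function name and the parameters `density`, `rho`, `n`, and `k` are hypothetical.

```python
import heapq
import math

def density_scaled_graph_distances(points, density, rho, n, k):
    """All-pairs shortest paths on a symmetrized k-NN graph with rescaled edges.

    Assumption of this sketch: an edge between points i and j gets length
    0.5 * (s_i + s_j) * ||x_i - x_j||, where s_i = (rho * density[i]) ** (1 / n).
    """
    num = len(points)
    stretch = [(rho * density[i]) ** (1.0 / n) for i in range(num)]

    # Build the symmetrized k-nearest-neighbor graph.
    graph = [dict() for _ in range(num)]
    for i in range(num):
        nearest = sorted(
            (j for j in range(num) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )[:k]
        for j in nearest:
            w = 0.5 * (stretch[i] + stretch[j]) * math.dist(points[i], points[j])
            graph[i][j] = w
            graph[j][i] = w

    # Dijkstra's algorithm from every source vertex.
    dist = [[math.inf] * num for _ in range(num)]
    for s in range(num):
        dist[s][s] = 0.0
        heap = [(0.0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[s][u]:
                continue  # stale heap entry
            for v, w in graph[u].items():
                if d + w < dist[s][v]:
                    dist[s][v] = d + w
                    heapq.heappush(heap, (d + w, v))
    return dist

line = [(0.0,), (1.0,), (2.0,), (3.0,)]
plain = density_scaled_graph_distances(line, [1.0] * 4, rho=1.0, n=1, k=1)
stretched = density_scaled_graph_distances(line, [4.0] * 4, rho=1.0, n=2, k=1)
print(plain[0][3], stretched[0][3])
```

The resulting distance matrix could then be passed to any Vietoris–Rips implementation that accepts precomputed distances. Pairs in different graph components keep distance `math.inf`, which matches the convention that points in different connected components are infinitely far apart.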

1.2 Related Work

Perhaps the most common TDA-based approach to nonuniform data is the $k$-nearest neighbor (KNN) filtration (see Appendix 9.1.1). This is related to the density-scaled filtrations by the fact that if $y_k$ is the $k$th nearest neighbor of $x$, then the distance $\|x - y_k\|$, suitably normalized, converges in probability to a value that is proportional to $f(x)^{-1/n}$ as $N \to \infty$, for a choice of $k$ that depends on $N$. (See knn_density for a precise statement.) However, the KNN filtration encounters problems when there are regions of the manifold that are close in Euclidean distance but far in Riemannian distance, especially if those regions differ in density. We discuss one example in Section 8.4; several other examples of KNN failures are given in continuous_knn . In continuous_knn , Berry and Sauer constructed a modification of the $k$-nearest neighbors graph (the continuous $k$-nearest neighbors graph) whose unnormalized graph Laplacian converges to the Laplace–Beltrami operator of a slightly different density-scaled Riemannian manifold. (Their density-scaled metric is a different conformal rescaling of the original metric $g$.) The authors of continuous_knn proved that the connected components of their graph are consistent with the components of the manifold. They left as a conjecture the hypothesis that their graph is topologically consistent (i.e., that the $k$-dimensional homology of the clique complex of their graph converges to the $k$-dimensional homology of the manifold for $k \geq 1$).
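The link between nearest-neighbor distances and density can be illustrated with the classical $k$-nearest-neighbor density estimate $\hat{f}(x) = k / (N \omega_n r_k^n)$, where $r_k$ is the distance from $x$ to its $k$th nearest neighbor and $\omega_n$ is the volume of the Euclidean unit $n$-ball. Solving for $r_k$ shows that it is proportional to $\hat{f}(x)^{-1/n}$. This is a sketch of that standard estimator, not the precise statement in knn_density; the function names are hypothetical.

```python
import math

def unit_ball_volume(n):
    """Volume of the Euclidean unit n-ball."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def knn_density_estimate(points, x, k, n):
    """Classical k-NN density estimate k / (N * omega_n * r_k ** n).

    r_k is the distance from x to its k-th nearest neighbor in `points`
    (the query point itself is excluded if it appears in the sample).
    """
    dists = sorted(math.dist(x, p) for p in points if p != x)
    r_k = dists[k - 1]
    return k / (len(points) * unit_ball_volume(n) * r_k ** n)

# Evenly spaced sample on [0, 1], which mimics a uniform density of 1.
grid = [(i / 100,) for i in range(101)]
estimate = knn_density_estimate(grid, (0.5,), k=10, n=1)
print(estimate)
```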

A qualitatively different family of density-scaled metrics was considered in fermat_tda . For parameter , the density-scaled metric in fermat_tda is . The Riemannian distance induced by the density-scaled metric of fermat_tda is called the Fermat distance fermat . The Fermat distance effectively enlarges the distances between points in sparse regions of the manifold and shrinks the distances between points in dense regions of the manifold; by contrast, the density-scaled metric in the present paper does the opposite.

The density-scaled complexes in the present paper are also reminiscent of weighted complexes weightedPH . (See Appendix 9.1.2 for a review of weighted complexes.) In a weighted Čech complex, the radius of a ball is a function of the filtration parameter and the point at which the ball is centered. (The radius function need not depend on density; more typically, the weight is determined by some intrinsic property of the point. For example, in weightedPH , a point cloud that represented the positions of image pixels had weights that were given by pixel intensity.) Weighted Vietoris–Rips complexes are defined analogously. One can define a “density-weighted” radius function

(1)

from which one can define a density-weighted Čech complex and a density-weighted Vietoris–Rips complex. The main advantage of our density-scaled complexes over the density-weighted complexes is that our complexes are more robust with respect to noise. Specifically, if is an outlier in a low-density region, then the ball grows quickly in radius and may engulf balls in high-density regions. This problem can occur even if all the data points lie exactly on the manifold . If is a high-density region, then balls centered at points grow quickly in radius and may engulf points in low-density regions of . In Sections 8.3 and 8.4, we calculate examples and discuss these problems in more detail.

Other density-based filtrations, such as the distance-to-measure (DTM) sublevel filtration dtm_tda and the density sublevel filtration kde_sublevel , are primarily designed for the purpose of noise filtering. Such methods assume that the regions of highest density are the true features of the manifold. For example, consider the point cloud of Figure 1a again. In these other density-based filtrations, it is the smaller circle whose corresponding homology class has a much longer lifetime in the persistent homology. In our density-scaled filtration, the two circles have equal lifetimes in the persistent homology, which reflects the fact that they have equally high “resolution” in the point cloud.

1.3 Organization

The rest of the paper is organized as follows. In Section 2, we review background from TDA and Riemannian geometry. In Section 3, we introduce our family of density-scaled filtered complexes, including definitions for a density-scaled Čech complex ($D\check{C}$) and a density-scaled Vietoris–Rips complex (DVR). We discuss convergence properties in Section 4 and invariance properties in Section 5. In Section 6, we discuss our algorithm for the implementation of a filtered complex $\widehat{\mathrm{DVR}}$ that approximates the density-scaled Vietoris–Rips complex. We prove the stability of our density-scaled complexes (including a stability theorem for $\widehat{\mathrm{DVR}}$) in Section 7. In Section 8, we compute examples and compare $\widehat{\mathrm{DVR}}$ to other filtered complexes. Finally, in Section 9, we conclude and discuss some avenues for future research. The code used in this paper is available at https://bitbucket.org/ahickok/dvr/src/main/.

2 Background

2.1 Filtered Complexes

A comprehensive introduction to filtered complexes and TDA can be found in edel_book ; eat . Here we review the standard methods for building a filtered complex. Throughout this section, let $(Y, d)$ denote a metric space and let $X = \{x_i\}$ denote a point cloud in $Y$. For any index set $J$, let $\sigma_J$ denote the simplex with vertices $x_j$ for all $j \in J$.

Definition 1

A filtered complex is a collection $\{K_r\}_{r \in \mathbb{R}}$ of simplicial complexes such that $K_r \subseteq K_s$ for all $r \leq s$. We refer to $r$ as the filtration level.

Definition 2

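The Čech complex $\check{C}(X, d)$ is the filtered complex such that the set of simplices in $\check{C}(X, d)$ at filtration level $r$ is

$$\check{C}_r(X, d) = \Big\{ \sigma_J \;\Big|\; \bigcap_{j \in J} B(x_j, r) \neq \emptyset \Big\}.$$

Equivalently, $\check{C}_r(X, d)$ is the nerve of $\{B(x_i, r)\}_i$, where $B(x_i, r)$ denotes the ball of radius $r$ centered at $x_i$ (and $\sigma_J$ is the simplex with vertices $x_j$, $j \in J$).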

The Nerve Theorem provides theoretical guarantees for the Čech complex Borsuk .

Theorem 2.1 (Nerve Theorem)

If $\bigcap_{j \in J} B(x_j, r)$ is either contractible or empty for all index sets $J$, then $\check{C}_r(X, d)$ is homotopy-equivalent to $\bigcup_i B(x_i, r)$.

In Euclidean space, all balls are convex (hence their intersections are contractible), and thus the Čech complex at filtration level $r$ is homotopy-equivalent to $\bigcup_i B(x_i, r)$. In an arbitrary metric space, however, balls are not always convex. In a Riemannian manifold, $B(x, r)$ is contractible only when $r$ is sufficiently small.

Computing the Čech complex is computationally intensive. In practice, researchers often compute the Vietoris–Rips complex instead, which requires only pairwise distances between the points.

Definition 3

The Vietoris–Rips complex $\mathrm{VR}(X, d)$ is the filtered complex such that the set of simplices in $\mathrm{VR}(X, d)$ at filtration level $r$ is

$$\mathrm{VR}_r(X, d) = \{ \sigma_J \mid d(x_i, x_j) \leq 2r \text{ for all } i, j \in J \}.$$

The Vietoris–Rips complex and the Čech complex share the same 1-skeleton. When the metric space is Euclidean space, the Vietoris–Rips complex and the Čech complex are related by the Vietoris–Rips lemma edel_book , which says that

$$\check{C}_r(X, d) \subseteq \mathrm{VR}_r(X, d) \subseteq \check{C}_{\sqrt{2}\, r}(X, d)$$

for all filtration values $r$. In addition to the Čech and Vietoris–Rips complexes, there are many other methods for constructing a filtered complex from a point cloud. We review other relevant filtered complexes in Appendix 9.1.
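Because the Vietoris–Rips complex is determined entirely by pairwise distances, it can be built with a few lines of code. The sketch below assumes the convention that an edge between two points appears once their distance is at most $2r$ (so that the 1-skeleton matches that of a Čech complex built from balls of radius $r$). The function name `vietoris_rips` and the `max_dim` cutoff are hypothetical; practical software such as GUDHI uses much more efficient constructions.

```python
import math
from itertools import combinations

def vietoris_rips(points, r, max_dim=2):
    """Simplices of VR_r, as index tuples, up to dimension max_dim.

    Assumed convention: vertices i and j span an edge when d(x_i, x_j) <= 2r.
    """
    n = len(points)
    close = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if math.dist(points[i], points[j]) <= 2 * r
    }
    simplices = [(i,) for i in range(n)]
    for dim in range(1, max_dim + 1):
        for verts in combinations(range(n), dim + 1):
            # A simplex is present when all of its edges are present.
            if all(pair in close for pair in combinations(verts, 2)):
                simplices.append(verts)
    return simplices

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
loop = vietoris_rips(square, r=0.5)     # four sides, no diagonals
filled = vietoris_rips(square, r=0.75)  # diagonals and all triangles present
print(len(loop), len(filled))
```

At $r = 0.5$ the four sides of the unit square form a loop (a 1-dimensional hole); once $2r$ reaches the diagonal length $\sqrt{2}$, the triangles appear and the hole is filled.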

2.2 Persistence Modules

In this section, we define persistence modules, persistent homology, and persistence diagrams. We assume the reader is familiar with homology. (A good introduction to homology and algebraic topology is hatcher .) References for the rest of this subsection can be found in fundthm ; pers_modules .

A persistence module over $\mathbb{R}$ is a collection $\{V_r\}_{r \in \mathbb{R}}$ of vector spaces with linear maps $\phi_r^s: V_r \to V_s$ for $r \leq s$ that satisfy the composition law $\phi_s^t \circ \phi_r^s = \phi_r^t$ for all $r \leq s \leq t$. If $K$ is a filtered complex, the persistent homology of $K$ over a field $F$ is the persistence module $\{H_*(K_r; F)\}_{r \in \mathbb{R}}$, which we denote by $PH(K; F)$. For all $r \leq s$, the inclusion $K_r \hookrightarrow K_s$ induces a linear map $H_*(K_r; F) \to H_*(K_s; F)$. We sometimes drop the field $F$ from our notation when a fixed field is chosen. (All calculations in Section 8 are done with $\mathbb{Z}/11\mathbb{Z}$, the default field used by the GUDHI software package.) As $r$ increases, new homology classes are born and old homology classes die.

The Fundamental Theorem of Persistent Homology, stated below, shows that we can decompose the persistence module in a way that yields a nice set of generators. If $K$ has a finite number of simplices for all $r$ (this condition holds for the Čech complex and the Vietoris–Rips complex), then there is a sequence $r_0 < r_1 < r_2 < \cdots$ such that $K_r = K_{r_i}$ for all $r \in [r_i, r_{i+1})$. The direct sum $\bigoplus_i H_*(K_{r_i}; F)$ has the structure of a graded module over the graded ring $F[t]$. The action of $t$ on a homogeneous element $v \in H_*(K_{r_i}; F)$ is $t \cdot v = \phi_{r_i}^{r_{i+1}}(v)$.

Theorem 2.2 (Fundamental Theorem of Persistent Homology fundthm )

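The graded $F[t]$-module $\bigoplus_i H_*(K_{r_i}; F)$ is isomorphic to

$$\Big( \bigoplus_{i} \Sigma^{a_i} F[t] \Big) \oplus \Big( \bigoplus_{j} \Sigma^{b_j} F[t] / (t^{c_j}) \Big) \tag{2}$$

for some integers $a_i$, $b_j$, $c_j$, where $\Sigma^m$ denotes an $m$-shift upward in grading for any integer $m$.

A $\Sigma^{a_i} F[t]$ summand corresponds to a homology class that is born at filtration level $r_{a_i}$ and never dies. A $\Sigma^{b_j} F[t]/(t^{c_j})$ summand corresponds to a homology class that is born at filtration level $r_{b_j}$ and dies at filtration level $r_{b_j + c_j}$. The information in a persistence module can be summarized by a persistence diagram, which is a multiset of points in the extended plane $\overline{\mathbb{R}}^2$. Given a decomposition in the form of Equation 2, the persistence diagram includes the points $(r_{a_i}, \infty)$ for all $i$, the points $(r_{b_j}, r_{b_j + c_j})$ for all $j$, and all points on the diagonal. The points on the diagonal are included for technical reasons; one can think of them as homology classes that die instantaneously. We denote the persistence diagram of a persistence module $V$ by $\mathrm{dgm}(V)$. The bottleneck distance between two diagrams $D_1$, $D_2$ is defined to be

$$d_B(D_1, D_2) = \inf_{\eta} \sup_{x \in D_1} \| x - \eta(x) \|_\infty,$$

where the infimum is taken over all bijections $\eta: D_1 \to D_2$.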
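For very small diagrams, the bottleneck distance can be computed by brute force, which makes the role of the diagonal explicit. The sketch below handles only finite off-diagonal points, given as `(birth, death)` pairs, and uses the standard fact that the $L^\infty$ distance from a point to the diagonal is half its lifetime. The function name is hypothetical, and the search over all bijections is exponential, so this is for illustration rather than practical use.

```python
from itertools import permutations

def bottleneck_distance(diagram_a, diagram_b):
    """Brute-force bottleneck distance between two small persistence diagrams.

    Each diagram is a list of finite (birth, death) points. Every point may be
    matched either to a point of the other diagram or to the diagonal.
    """
    def linf(p, q):
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    def to_diagonal(p):
        return (p[1] - p[0]) / 2

    # Pad each diagram with "diagonal slots" (None), one per point of the other.
    a = list(diagram_a) + [None] * len(diagram_b)
    b = list(diagram_b) + [None] * len(diagram_a)

    def cost(p, q):
        if p is None and q is None:
            return 0.0
        if p is None:
            return to_diagonal(q)
        if q is None:
            return to_diagonal(p)
        return linf(p, q)

    best = float("inf")
    for perm in permutations(range(len(b))):
        worst = max((cost(a[i], b[j]) for i, j in enumerate(perm)), default=0.0)
        best = min(best, worst)
    return best

print(bottleneck_distance([(0, 4)], [(0, 5)]))
print(bottleneck_distance([(0, 4)], []))
print(bottleneck_distance([(0, 1)], [(10, 11)]))
```

In the last call, matching the two short-lived points to each other would cost 10, so the optimal matching sends both to the diagonal instead.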

2.3 Riemannian Geometry

We briefly review the necessary background from Riemannian geometry. For further reading, we recommend a textbook such as petersen . A Riemannian manifold $(M, g)$ is a smooth manifold $M$ with a Riemannian metric $g$ that defines a smoothly-varying inner product on each tangent space $T_pM$. More precisely, $g$ is a 2-tensor field on $M$; to each $p \in M$, the Riemannian metric assigns a bilinear map $g_p: T_pM \times T_pM \to \mathbb{R}$ on the tangent space $T_pM$. A Riemannian metric allows one to define the length of a vector $v \in T_pM$ to be $\|v\| = \sqrt{g_p(v, v)}$. The length of a continuously differentiable path $\gamma: [a, b] \to M$ is defined to be $L(\gamma) = \int_a^b \|\gamma'(t)\| \, dt$.

A Riemannian manifold is a metric space. The distance between two points $x$, $y$ in the same connected component of $M$ is

$$d(x, y) = \inf \{ L(\gamma) \mid \gamma \text{ is a continuously differentiable path from } x \text{ to } y \}.$$

If $(M, g)$ is complete, then the infimum is achieved by a geodesic, a curve that locally minimizes length. If $x$ and $y$ are in different connected components, then their distance is infinite.

To see that all manifolds can be given a Riemannian metric, recall that all manifolds can be embedded into Euclidean space. Let $\iota: M \to \mathbb{R}^m$ be an embedding. The canonical Euclidean metric $g_E$ pulls back to a Riemannian metric $\iota^* g_E$ on $M$. We call $\iota^* g_E$ the Euclidean-induced Riemannian metric. On each tangent space $T_pM$, the metric is the restriction of $g_E$ to $T_pM$. A Riemannian metric induces a volume form $dV$, the unique $n$-form on $M$ that equals $1$ on all positively oriented orthonormal bases. In local coordinates $(x^1, \ldots, x^n)$, the expression for the volume form is

$$dV = \sqrt{|\det g|} \; dx^1 \wedge \cdots \wedge dx^n.$$

With a volume form and a smooth probability density function $f$, one can define a probability measure on the manifold. A good reference for probability and statistics on Riemannian manifolds is Rman_stats . The volume form induces a Riemannian measure $\mu$ on $M$. The measure of a Borel set $A$ is $\mu(A) = \int_A dV$, and the volume of $M$ is $\mathrm{vol}(M) = \mu(M)$. The probability measure is defined to be

$$P[A] = \int_A f \, dV$$

for Borel sets $A \subseteq M$.

Two Riemannian metrics $g_1$, $g_2$ on $M$ are conformally equivalent if there is a positive function $\lambda: M \to (0, \infty)$ such that

$$g_2 = \lambda g_1.$$

A conformal transformation is a diffeomorphism $F: (M_1, g_1) \to (M_2, g_2)$ such that $g_2$ pulls back to $\lambda g_1$ (i.e., $F^* g_2 = \lambda g_1$) for some positive function $\lambda$. Conformal transformations preserve angles; one can think of a conformal transformation as a transformation that “locally scales” the manifold. For example, if $M$ is a submanifold of $\mathbb{R}^m$ and has the Euclidean-induced Riemannian metric, then any global scaling is a conformal transformation.

A special type of conformal transformation is an isometry. An isometry of Riemannian manifolds is a diffeomorphism $F: (M_1, g_1) \to (M_2, g_2)$ such that $g_2$ pulls back to $g_1$ (i.e., $F^* g_2 = g_1$). An isometry of Riemannian manifolds is an isometry of metric spaces in the usual sense (i.e., $d_2(F(x), F(y)) = d_1(x, y)$ for all $x$, $y$).
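As a worked check of the global-scaling example, the following sketch computes the pullback explicitly. The symbols $c$ (the scaling constant), $g_E$ (the Euclidean metric), and $F$ are notation assumed here for illustration.

```latex
% Global scaling F(x) = c x on R^m, with c > 0, has differential dF_x(v) = c v, so
(F^* g_E)_x(v, w) = g_E(c\,v,\; c\,w) = c^2\, g_E(v, w).
% Hence F is a conformal transformation with constant conformal factor \lambda \equiv c^2.
% Path lengths, and therefore distances, scale by c:
d\big(F(x), F(y)\big) = c \, d(x, y),
% so F is an isometry exactly when c = 1.
```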

3 Our Family of Density-Scaled Filtered Complexes

3.1 Our Density-Scaled Riemannian Manifold

Let $(M, g)$ be an $n$-dimensional Riemannian manifold from which we sample $N$ points according to a smooth probability density function $f: M \to (0, \infty)$. We begin by defining a conformally-equivalent Riemannian metric $\tilde{g}$ such that the points are uniformly distributed in $(M, \tilde{g})$.

Definition 4

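The density-scaled Riemannian metric is

$$\tilde{g} = (\rho_N f)^{2/n} \, g, \tag{3}$$

where $\rho_N$ is a strictly positive function of $N$ that satisfies

$$\rho_N \to \infty \quad \text{and} \quad \frac{\rho_N \Lambda_N}{N} \to 0 \tag{4}$$

as $N \to \infty$, and where $\Lambda_N$ is the threshold filling factor defined in Equation 8 in Section 4.

In this paper, we fix a specific choice of $\rho_N$ that satisfies Equation 4. However, the convergence properties of Section 4 hold for any choice of $\rho_N$ that satisfies the conditions of Equation 4, and the invariance and stability results in Sections 5 and 7 hold for any choice of strictly positive function $\rho_N$.

The uniform probability measure on $(M, \tilde{g})$ is $\tilde{P}[A] = \frac{1}{\mathrm{vol}(M, \tilde{g})} \int_A d\tilde{V}$ for all Borel sets $A$, where $d\tilde{V}$ is the volume form on $(M, \tilde{g})$ and $\mathrm{vol}(M, \tilde{g})$ is the volume of $(M, \tilde{g})$. Using local coordinates, we see that $d\tilde{V}$ satisfies

$$d\tilde{V} = \sqrt{|\det \tilde{g}|} \; dx^1 \wedge \cdots \wedge dx^n = \rho_N f \sqrt{|\det g|} \; dx^1 \wedge \cdots \wedge dx^n = \rho_N f \, dV.$$

Therefore $\tilde{P} = P$ because

$$\mathrm{vol}(M, \tilde{g}) = \int_M \rho_N f \, dV = \rho_N.$$

This means that sampling points from $(M, g)$ with probability density function $f$ is equivalent to sampling points uniformly at random from $(M, \tilde{g})$.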
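A one-dimensional example makes the construction tangible. The sketch below assumes a metric of the form $\tilde{g} = (\rho f)^{2/n} g$ with $n = 1$, so that the length element is $\rho f \, ds$ and the density-scaled distance between $x$ and $y$ is $\rho$ times the probability mass between them. The interval $[0, 1]$ (a manifold with boundary, used only for simplicity), the density $f(t) = 1/2 + t$, and the function names are all hypothetical choices for illustration.

```python
def density_scaled_distance_1d(f, x, y, rho, steps=10_000):
    """Approximate rho * integral of f between x and y with the trapezoid rule."""
    lo, hi = min(x, y), max(x, y)
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return rho * total * h

def f(t):
    # Hypothetical density on [0, 1]: strictly positive and integrates to 1.
    return 0.5 + t

def cdf(t):
    # Closed-form cumulative distribution function of f.
    return 0.5 * t + 0.5 * t * t

total_length = density_scaled_distance_1d(f, 0.0, 1.0, rho=7.0)
partial = density_scaled_distance_1d(f, 0.2, 0.6, rho=3.0)
print(total_length, partial)
```

The total density-scaled length equals $\rho$, which matches the volume computation above, and the map $x \mapsto \rho \cdot \mathrm{cdf}(x)$ sends samples from $f$ to uniform samples on an interval of length $\rho$.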

3.2 Our Definition of a Density-Scaled Filtered Complex

Definition 5

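Let $(M, g)$ be a Riemannian manifold, and let $X$ be a point cloud that consists of $N$ points sampled from a smooth probability density function $f$. The density-scaled Čech complex is the filtered complex

$$D\check{C}(M, g, f, X) := \check{C}(X, d_{\tilde{g}}),$$

where $d_{\tilde{g}}$ is the Riemannian distance function in $(M, \tilde{g})$ and $\tilde{g}$ is defined as in Equation 3. Equivalently, the set of simplices in $D\check{C}(M, g, f, X)$ at filtration level $r$ is

$$\Big\{ \sigma_J \;\Big|\; \bigcap_{j \in J} B_{\tilde{g}}(x_j, r) \neq \emptyset \Big\},$$

where $\sigma_J$ is the simplex with vertices $x_j$, $j \in J$, and $B_{\tilde{g}}(x, r)$ is the ball of radius $r$ centered at $x$ with respect to $d_{\tilde{g}}$.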

Definition 6

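Let $(M, g)$ be a Riemannian manifold, and let $X$ be a point cloud that consists of $N$ points sampled from a smooth probability density function $f$. The density-scaled Vietoris–Rips complex is the filtered complex

$$\mathrm{DVR}(M, g, f, X) := \mathrm{VR}(X, d_{\tilde{g}}),$$

where $d_{\tilde{g}}$ is the Riemannian distance function in $(M, \tilde{g})$ and $\tilde{g}$ is defined as in Equation 3. Equivalently, the set of simplices in $\mathrm{DVR}(M, g, f, X)$ at filtration level $r$ is

$$\{ \sigma_J \mid d_{\tilde{g}}(x_i, x_j) \leq 2r \text{ for all } i, j \in J \}.$$

More generally, one can define a density-scaled version of any distance-based filtered complex by applying the filtered complex to the point cloud $X$ in the metric space $(M, d_{\tilde{g}})$, where $d_{\tilde{g}}$ is the Riemannian distance function in the density-scaled manifold $(M, \tilde{g})$.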

Definition 7

Let $(M, g)$ be a Riemannian manifold, and let $X$ be a point cloud that consists of $N$ points sampled from a smooth probability density function $f$. If $\mathrm{FC}(X, d)$ is a distance-based filtered complex, where $(Y, d)$ denotes a metric space, then the density-scaled filtered complex is

$$\mathrm{DFC}(M, g, f, X) := \mathrm{FC}(X, d_{\tilde{g}}),$$

where $d_{\tilde{g}}$ is the Riemannian distance function in $(M, \tilde{g})$ and $\tilde{g}$ is defined as in Equation 3.

4 Convergence Properties of the Density-Scaled Čech Complex

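In Theorem 4.1 below, we show that the density-scaled Čech complex is homotopy-equivalent to $M$ for an interval of filtration values that grows arbitrarily large in probability as $N \to \infty$. We begin by reviewing the relevant concepts. The convexity radius of a Riemannian manifold $(M, g)$ is

$$r_{\mathrm{conv}}(M, g) = \inf_{x \in M} \sup \{ r \geq 0 \mid B_g(x, s) \text{ is geodesically convex for all } 0 < s < r \},$$

where $B_g(x, s)$ is the ball of radius $s$ centered at $x$ and where $d_g$ is the Riemannian distance function in $(M, g)$. If $r < r_{\mathrm{conv}}(M, g)$, the ball $B_g(x, r)$ is geodesically convex (hence contractible). Furthermore, the intersection of geodesically convex balls is geodesically convex (hence contractible or empty). Let $\tilde{g}_N$ denote the density-scaled Riemannian metric when there are $N$ points, and let $\tilde{r}_{\mathrm{conv},N}$ denote the convexity radius of $(M, \tilde{g}_N)$. The coverage radius of a point cloud $X$ in a Riemannian manifold $(M, g)$ is

$$r_{\mathrm{cov}}(X; M, g) = \inf \Big\{ r \geq 0 \;\Big|\; M \subseteq \bigcup_{x \in X} B_g(x, r) \Big\}.$$

Let $\tilde{r}_{\mathrm{cov},N}$ denote the coverage radius of a point cloud $X$ in $(M, \tilde{g}_N)$.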
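On a circle, the coverage radius has a closed form: it is half of the largest arc between consecutive sample points. The following sketch computes it for points given by their angles. The function name and the `radius` parameter are hypothetical, and the circle carries its usual arc-length metric rather than a density-scaled one.

```python
import math

def coverage_radius_on_circle(angles, radius=1.0):
    """Geodesic coverage radius of points on a circle.

    The point of the circle farthest from the sample sits in the middle of
    the largest gap, so the coverage radius is half the largest arc length.
    """
    a = sorted(t % (2 * math.pi) for t in angles)
    gaps = [later - earlier for earlier, later in zip(a, a[1:])]
    gaps.append(2 * math.pi - a[-1] + a[0])  # gap that wraps around
    return radius * max(gaps) / 2

even = coverage_radius_on_circle([0.0, math.pi / 2, math.pi, 3 * math.pi / 2])
clumped = coverage_radius_on_circle([0.0, 0.1])
print(even, clumped)
```

Evenly spaced points minimize the coverage radius, whereas clumped points leave a large uncovered arc. Balls of any radius larger than the coverage radius cover the whole circle, which is the condition used in Theorem 4.1.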

Theorem 4.1

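Let $(M, g)$ be a Riemannian manifold, and let $X$ be a point cloud that consists of $N$ points sampled from a smooth probability density function $f$. Let $\tilde{r}_{\mathrm{cov},N}$ and $\tilde{r}_{\mathrm{conv},N}$ denote the coverage radius of $X$ and the convexity radius, respectively, in the density-scaled manifold $(M, \tilde{g}_N)$. If $\tilde{r}_{\mathrm{cov},N} < r < \tilde{r}_{\mathrm{conv},N}$, then the density-scaled Čech complex at filtration level $r$ is homotopy-equivalent to $M$. If $M$ is compact, then $\tilde{r}_{\mathrm{conv},N} \to \infty$ as $N \to \infty$. If $M$ is compact and connected, then $\tilde{r}_{\mathrm{cov},N} \to 0$ in probability as $N \to \infty$.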

Proof

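If $r < \tilde{r}_{\mathrm{conv},N}$ (the convexity radius of the density-scaled manifold), then for all index sets $J$, the intersection $\bigcap_{j \in J} B_{\tilde{g}}(x_j, r)$ is convex, so it is either contractible or empty. If $r > \tilde{r}_{\mathrm{cov},N}$ (the coverage radius of $X$ in the density-scaled manifold), then $\bigcup_i B_{\tilde{g}}(x_i, r) = M$. By the Nerve Theorem, the density-scaled Čech complex at filtration level $r$ is homotopy-equivalent to $M$. The second statement of the theorem follows from Lemma 1 below, and the third statement of the theorem follows from Lemma 2 below.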

Lemma 1

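If $M$ is compact, then the convexity radius $\tilde{r}_{\mathrm{conv},N}$ of the density-scaled manifold $(M, \tilde{g}_N)$ satisfies $\tilde{r}_{\mathrm{conv},N} \to \infty$ as $N \to \infty$.

Proof

The convexity radius of a compact manifold is positive (see, e.g., Proposition 20 in convexity_radius ). Therefore, $r_{\mathrm{conv}}(M, f^{2/n} g) > 0$, so $\tilde{r}_{\mathrm{conv},N} = \rho_N^{1/n} \, r_{\mathrm{conv}}(M, f^{2/n} g) \to \infty$ because $\rho_N \to \infty$, where $\rho_N$ is the scaling factor of Definition 4.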

Now we turn to the coverage radius. The behavior of the coverage radius is controlled by the filling factor. On an $n$-dimensional Riemannian manifold $(M, g)$ from which $N$ balls of radius $r$ are chosen uniformly at random, the filling factor is

$$\Lambda = \frac{N \omega_n r^n}{\mathrm{vol}(M)}, \tag{5}$$

where is the volume of a Euclidean unit -ball. For small , the filling factor approximates the number of points inside a ball of radius . Let be the number of balls of radius required to cover , assuming the balls are chosen uniformly at random. Let be the volume of a Euclidean -ball of radius . Define

Theorem 4.2 (Theorem 1.1 in flatto_newman )

Let be a compact, connected Riemannian manifold with unit volume. There are constants and , which do not depend on , such that if , then

Corollary 1

Let be a compact, connected Riemannian manifold. Suppose is a point cloud that consists of points sampled uniformly at random from . Suppose is a sequence such that and .

  1. If is such that , then as .

  2. If is such that , then as .

Proof

Case 1:

In this case, the structure of our proof is similar to that of Corollary B.2 in vanishing_homology . First, we observe that the radius of the balls can be expressed by

If is sufficiently large and , then . Let .

  1. If , then

    (6)

    by Theorem 4.2. Because , we have

    We expand the first term as to get

    Because , we have

    Therefore, as . By Equation 6, .

  2. If , then

    (7)

    by Theorem 4.2. Similarly to above, we have

    so as . By Equation 7, .

Case 2:

Let be the Riemannian manifold that is normalized to have unit volume. Let denote the filling factor for and let be the coverage radius for the point cloud in . In , the Riemannian distance function is . Therefore, for any , we have

When the radius of the balls in is , the filling factor in is

Applying Case 1 to completes the proof.

This shows that on a compact, connected Riemannian manifold from which points are sampled uniformly at random, there is a threshold filling factor

(8)

above which the balls are likely to cover and below which the balls are unlikely to cover . There is a corresponding threshold radius . The threshold radius on is

(9)

By Equation 4, we have that as .
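The relationship between a filling factor and its corresponding radius can be sketched numerically. The code below assumes that the filling factor has the form $\Lambda = N \omega_n r^n / \mathrm{vol}(M)$, where $\omega_n$ is the volume of the Euclidean unit $n$-ball, and inverts it to obtain the radius for a given filling factor. The function names are hypothetical, and the specific threshold filling factor of Equation 8 is not reproduced here.

```python
import math

def unit_ball_volume(n):
    """Volume of the Euclidean unit n-ball."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def filling_factor(num_points, r, n, volume=1.0):
    """Filling factor N * omega_n * r^n / vol(M) (assumed form)."""
    return num_points * unit_ball_volume(n) * r ** n / volume

def radius_for_filling_factor(lam, num_points, n, volume=1.0):
    """Radius r at which the filling factor equals lam."""
    return (lam * volume / (num_points * unit_ball_volume(n))) ** (1.0 / n)

lam = filling_factor(100, 0.1, n=2)            # 100 disks of radius 0.1, unit area
r_back = radius_for_filling_factor(lam, 100, n=2)
print(lam, r_back)
```

For a fixed filling factor, the radius shrinks like $(\mathrm{vol}(M)/N)^{1/n}$. In the density-scaled manifold the volume itself depends on $N$, so whether the threshold radius tends to zero is governed by the growth condition on the scaling factor.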

Lemma 2

Let be a compact, connected Riemannian manifold, and let be a smooth probability density function from which points are sampled. Then in probability as . Moreover, in probability.

Proof

Let be a sequence such that and . Define the sequence of filling factors

and define to be

which is the radius that corresponds to a filling factor of on . Note that , where is defined as in Equation 9. Because , it must be true that .

Let . For sufficiently large , we have , so . Applying Corollary 1 proves the first statement of the lemma. For sufficiently large , we have

so

Applying Corollary 1 completes the proof.

5 Conformal Invariance

Let $(M_1, g_1)$ and $(M_2, g_2)$ be Riemannian manifolds, and let $F: M_1 \to M_2$ be a diffeomorphism. If $f_2: M_2 \to (0, \infty)$ is a smooth probability density function, then we can pull back $f_2$ to a probability density function $f_1: M_1 \to (0, \infty)$ as follows.

Definition 8 (Pullback of a Probability Density Function)

The pullback of $f_2$ under $F$ is the function $f_1$ such that $f_1 \, dV_1 = F^*(f_2 \, dV_2)$, where $dV_1$ and $dV_2$ are the volume forms of $(M_1, g_1)$ and $(M_2, g_2)$. The probability density function $f_1$ exists because the space of $n$-forms on an $n$-dimensional manifold is spanned by $dV_1$.

The pullback of a probability density function is defined such that sampling a point cloud $X$ from $f_1$ is equivalent to sampling a point cloud $Y$ from $f_2$ and setting $X = F^{-1}(Y)$.

Proposition 1

Suppose $y$ is sampled from $f_2$ and let $x = F^{-1}(y)$. Suppose $x'$ is sampled from $f_1$, where $f_1$ is the pullback of $f_2$ defined by Definition 8. Then $x$ and $x'$ are identically distributed.

Proof

If is a Borel set, then

Proposition 1 justifies a comparison of $\mathrm{DFC}(M_1, g_1, f_1, X)$ to $\mathrm{DFC}(M_2, g_2, f_2, F(X))$, where $F: M_1 \to M_2$ is a diffeomorphism and $f_1$ is the pullback of $f_2$ under $F$. Below, we define what we mean by an isomorphism of two filtered complexes and what we mean by invariance of a filtered complex.

Definition 9 (Isomorphism of Filtered Complexes)

Let $K$ and $K'$ be filtered complexes, and let $V_r$, $S_r$ be the sets of vertices and simplices, respectively, of $K_r$ (and likewise $V'_r$, $S'_r$ for $K'_r$). Let $V$ be the set of all vertices of $K$, and let $V'$ be the set of all vertices of $K'$. We say that $K$ and $K'$ are isomorphic if there is a bijective map $T: V \to V'$ such that $T$ induces bijections $V_r \to V'_r$ and $S_r \to S'_r$ for all $r$.

Definition 10 (Invariance)

Let $(M_1, g_1)$ and $(M_2, g_2)$ be Riemannian manifolds, and let $F: M_1 \to M_2$ be a diffeomorphism. A density-scaled complex $\mathrm{DFC}$ is invariant under $F$ if $\mathrm{DFC}(M_1, g_1, f_1, X)$ is isomorphic to $\mathrm{DFC}(M_2, g_2, f_2, F(X))$ for all smooth probability density functions $f_2$ and point clouds $X$ sampled from $f_1$, where $f_1$ is the pullback of $f_2$ defined by Definition 8.

We restrict ourselves to a suitable class of distance-based filtered complexes that are invariant under global isometry. This class includes the Čech complex, the Vietoris–Rips complex, and many other standard distance-based filtered complexes.

Definition 11 (Invariance Under Global Isometry)

Let $(Y_1, d_1)$ and $(Y_2, d_2)$ be metric spaces. A distance-based filtered complex $\mathrm{FC}$ is invariant under global isometry if $\mathrm{FC}(X, d_1)$ is isomorphic to $\mathrm{FC}(T(X), d_2)$ for all global isometries $T: Y_1 \to Y_2$ and all point clouds $X$ in $Y_1$.

Theorem 5.1 below shows in particular that the density-scaled Čech complex $D\check{C}$ and the density-scaled Vietoris–Rips complex DVR are invariant under all conformal transformations. As a corollary, this implies that they are invariant under global scaling (Corollary 2). Additionally, they are invariant under diffeomorphisms of $1$-dimensional manifolds (Corollary 3).

Theorem 5.1

Suppose that is a distance-based filtered complex that is invariant under global isometry, and let be the density-scaled filtered complex. Then