1 Introduction: persistence is an isometry invariant
Topological Data Analysis (TDA) was pioneered by Claudia Landi (initially under the name size theory) and by Herbert Edelsbrunner et al. Important papers by Gunnar Carlsson, Robert Ghrist and Shmuel Weinberger were followed by substantial developments by Fred Chazal and many others. Persistent homology is a key tool of TDA and is invariant under isometry (any transformation that preserves inter-point distances).
The famous stability theorem states that, under bounded noise, the bottleneck distance between the persistence diagrams of a point set and of its perturbation is bounded above in terms of the magnitude of the perturbation. Hence a small perturbation of a point set changes its persistent homology by at most a small amount.
However, there is no lower bound, which means that a perturbation of a point set can result in the corresponding persistent homology remaining unchanged. This is an issue for applications that require invariants to reliably distinguish point sets up to isometry or similar equivalence relations such as rigid motion or uniform scaling. A uniform scaling also scales persistence, but any non-uniform scaling or a more general continuous deformation of data changes persistence rather arbitrarily. Hence any persistence-based approach analyses data only up to isometry, which is an important equivalence due to the rigidity of many real-life structures, see Fig. 1.
The above facts motivate a comparison of persistent homology with other isometry invariants of point sets. One key application is for rigid periodic crystals whose isometry classification was recently initiated by Herbert Edelsbrunner et al. A periodic crystal is naturally modeled by a periodic set of points representing atomic centres.
This paper describes how a point sequence of arbitrary size can be added to any finite point set whilst leaving the one-dimensional persistent homology unchanged, and goes on to define a large continuous family of point sets that have trivial persistence. We focus on one-dimensional persistent homology of point sets in any dimension, computed using filtrations of simplicial complexes including Vietoris-Rips, Čech and Delaunay complexes.
Our main Theorem 4.3 and Corollary 4.4 build large open spaces of point sets that all have identical 1D persistence. With similar aims, Curry et al. described spaces of Morse functions on the interval and the sphere that have identical persistence. The experiments use the Vietoris-Rips filtration whose persistence is implemented by Ripser. Higher dimensional persistent homology will be included in future updates.
Section 4 introduces definitions and proves auxiliary lemmas needed for our main Theorem 4.3, which describes how, given a finite point set, we can add an arbitrarily large point set without affecting the one-dimensional persistent homology. Section 5 summarises large-scale experiments that reveal interesting information on the prevalence, or more likely lack, of significant persistent features occurring in randomly generated point sets.
2 Three classes of edges important for 1D persistence
This section introduces three classes of edges (short, medium and long) that will help build point sets with identical 1D persistence. Since persistent homology can be defined for any filtration of simplicial complexes on an abstract finite set $A$, the most general settings are recalled in Definition 2.1. Definition 2.2 describes the more explicit filtrations of Vietoris-Rips, Čech and Delaunay complexes on a finite set $A$ in any metric space (or just in $\mathbb{R}^N$ for Delaunay complexes).
Definition 2.1 (Filtration of complexes $\{C(A;\alpha)\}$).
Let $A$ be any abstract finite set.
(a) A (simplicial) complex $C$ on $A$ is a finite collection of subsets $\sigma\subseteq A$ (called simplices) such that all subsets of $\sigma$ and any intersection of simplices are also simplices of $C$.
(b) The dimension of a simplex consisting of $k+1$ points is $k$. We assume that all points of $A$ are 0-dimensional simplices, sometimes called vertices of $C$. A 1-dimensional simplex (or edge) $e$ between points $p,q\in A$ is the unordered pair denoted as $[p,q]$.
(c) An (ascending) filtration $\{C(A;\alpha)\}$ is any family of simplicial complexes on the vertex set $A$, parameterised by a scale $\alpha\geq 0$ such that $C(A;\alpha')\subseteq C(A;\alpha)$ for all $\alpha'\leq\alpha$.
Let $M$ be any metric space with a distance $d$ satisfying all metric axioms. For any points $p,q\in M$, the edge $e=[p,q]$ has length $|e|=d(p,q)$. An example of a metric space is $M=\mathbb{R}^N$ with the Euclidean metric. If $p,q\in\mathbb{R}^N$, the edge $[p,q]$ can be geometrically interpreted as the straight-line segment connecting the points $p,q$.
Definition 2.2 introduces the simplicial complexes $\mathrm{VR}(A;\alpha)$ and $\mathrm{\check{C}ech}(A;\alpha)$ on any finite set $A$ inside an ambient metric space $M$, although $A=M$ is possible. For a point $p\in M$ and $\alpha\geq 0$, let $\bar{B}(p;\alpha)\subseteq M$ denote the closed ball with centre $p$ and radius $\alpha$.
A Delaunay complex $\mathrm{Del}(A;\alpha)$ will be defined only for a finite set $A\subset\mathbb{R}^N$ because of extra complications arising if a point set lives in a more general metric space $M$.
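Definition 2.2 (Complexes $\mathrm{VR}(A;\alpha)$, $\mathrm{\check{C}ech}(A;\alpha)$, $\mathrm{Del}(A;\alpha)$).
Let $A\subseteq M$ be any finite set of points. Fix a scale $\alpha\geq 0$. Each complex below has vertex set $A$.
(a) The Vietoris-Rips complex $\mathrm{VR}(A;\alpha)$ has all simplices on points $p_1,\dots,p_k\in A$ whose pairwise distances are all at most $2\alpha$, so $d(p_i,p_j)\leq 2\alpha$ for all distinct $i,j\in\{1,\dots,k\}$.
(b) The Čech complex $\mathrm{\check{C}ech}(A;\alpha)$ has all simplices on points $p_1,\dots,p_k\in A$ such that the full intersection $\bigcap_{i=1}^{k}\bar{B}(p_i;\alpha)$ is not empty.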
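(c) For any finite set of points $A\subset\mathbb{R}^N$, the convex hull of $A$ is the intersection of all closed half-spaces of $\mathbb{R}^N$ containing $A$. Each point $p\in A$ has the Voronoi domain $V(p)=\{q\in\mathbb{R}^N : |q-p|\leq|q-p'| \text{ for all } p'\in A\}$.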
The Delaunay complex has all simplices on points such that the intersection is not empty . Alternatively, a simplex on points is a Delaunay simplex if: (a) the smallest -dimensional sphere passing through has a radius at most ; (b) there is also an -dimensional sphere passing through that does not enclose any points of .
In a degenerate case, the smallest -dimensional sphere above can contain more than points of . If is enlarged to the convex hull of all points , then becomes a polyhedral Delaunay mosaic . For simplicity, we choose any triangulation of into Delaunay simplices. Then a Delaunay complex is a subset of a Delaunay triangulation of the convex hull of , which is unique in general position.
The complexes of the three types above will be called geometric complexes for brevity.
Both complexes and are abstract and so are not embedded in , even if . Though is embedded into , its construction is fast enough only in dimensions . For high dimensions or any metric space , the simplest complex to build and store is . Indeed, is a flag complex determined by its 1-dimensional skeleton so that any simplex of is built on a complete subgraph whose vertices are pairwise connected by edges in .
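The flag-complex description above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: points are Euclidean coordinate tuples, an edge enters at its half-length (so two points are joined when their distance is at most twice the scale), and the names `vietoris_rips` and `max_dim` are hypothetical, not taken from the paper or from Ripser.

```python
from itertools import combinations
from math import dist


def vietoris_rips(points, alpha, max_dim=2):
    """Return simplices (tuples of point indices) of a Vietoris-Rips complex.

    Assumed convention: an edge [p, q] is present when d(p, q) <= 2 * alpha,
    i.e. it enters the filtration at its half-length.
    """
    n = len(points)
    # 1-dimensional skeleton: which pairs of points are joined at this scale.
    adjacent = [
        [i != j and dist(points[i], points[j]) <= 2 * alpha for j in range(n)]
        for i in range(n)
    ]
    simplices = [(i,) for i in range(n)]  # all points are vertices
    # Flag complex: a subset is a simplex iff all its pairs are edges.
    for size in range(2, max_dim + 2):
        for subset in combinations(range(n), size):
            if all(adjacent[i][j] for i, j in combinations(subset, 2)):
                simplices.append(subset)
    return simplices


# Four vertices of the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
sides_only = vietoris_rips(square, alpha=0.5)   # sides present, diagonals absent
everything = vietoris_rips(square, alpha=0.75)  # all edges and all triangles
```

The brute-force enumeration of subsets is exponential in `max_dim` and is meant only to make the definition concrete; practical software stores the 1-dimensional skeleton and expands it lazily.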
The key idea of TDA is to view any finite set $A$ through lenses of a variable scale $\alpha$. When $\alpha$ is increasing from the initial value 0, the points of $A$ become blurred to balls of radius $\alpha$ and may start forming topological shapes that ‘persist’ over long intervals of $\alpha$. More formally, for any fixed $\alpha$, the union $\bigcup_{p\in A}\bar{B}(p;\alpha)$ of closed balls is homotopy equivalent to the Čech complex $\mathrm{\check{C}ech}(A;\alpha)$ and also to the Delaunay complex $\mathrm{Del}(A;\alpha)$ by the Nerve Lemma [17, Corollary 4G.3].
For any geometric complex $C(A;\alpha)$ from Definition 2.2, all connected components of $C(A;\alpha)$ are in a 1-1 correspondence with all connected components of the union $\bigcup_{p\in A}\bar{B}(p;\alpha)$. Any edge $e$ enters $C(A;\alpha)$ when $\alpha$ equals the edge’s half-length $|e|/2$.
Definition 2.3 (Short, medium, long edges in a filtration).
Let be any filtration of complexes on an abstract finite vertex set , see Definition 2.1. Let an edge between points enter the simplicial complex at scale .
(a) Consider the 1-dimensional graph with vertex set and all edges from except the edge . If the endpoints of are in different connected components of , then the edge is called short in the filtration .
(b) The edge is called long in if has a vertex such that the 2-simplex is contained in and both edges are in for some .
(c) If is neither short nor long, then the edge is called medium in .
Definition 2.3(b) implies that any long edge enters with a 2-simplex at the same scale and the boundary of this 2-simplex is homologically trivial in due to the other two edges that entered the filtration at a smaller scale .
Lemma 2.4 (Long edges in and ).
Let be a finite set in a metric space.
(a) In the Vietoris-Rips filtration , an edge is long if and only if the set has a point such that is a strictly longest edge in the triangle .
(b) In the Čech filtration , an edge is long if and only if includes a triangle such that the edge is strictly longest in and the triple intersection is not empty for .
For both filtrations, an edge enters when . By Definition 2.3(b), a long edge enters together with a 2-simplex . Since the other two edges entered the filtration at a smaller scale, the edge is longest in . For the Čech filtration, the triple intersection should be non-empty to guarantee that includes the 2-simplex , see Definition 2.2(b). ∎
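For any 3-point set $A$, let the edges of $A$ have lengths $a\leq b\leq c$. By Definition 2.3, if $a\leq b<c$, then in the Vietoris-Rips filtration the edge of length $c$ is long whilst the edges of lengths $a,b$ are short. If $a<b=c$, then the edge of length $a$ is short but both edges of lengths $b,c$ are medium, not long. If $a=b=c$, then all three edges are medium.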
Let be any geometric complex from Definition 2.2 on a finite set . If consists of four vertices of the unit square, all square sides are medium whilst both diagonals are long. If consists of four vertices of a rectangle that is not a square, the two shorter sides are short, the longer sides are medium and both diagonals are long.
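The square and rectangle examples can be checked mechanically for the Vietoris-Rips filtration. The sketch below is an illustration, not the authors' code: it assumes Euclidean points, tests shortness directly from Definition 2.3(a) with a union-find structure, and tests longness through the criterion of Lemma 2.4(a) (a third point making the edge strictly longest in a triangle). The name `classify_edges` and the tolerance `EPS` are hypothetical.

```python
from itertools import combinations
from math import dist

EPS = 1e-12  # tolerance for comparing edge lengths (an illustrative choice)


def classify_edges(points):
    """Label every edge 'short', 'medium' or 'long' in the Vietoris-Rips filtration."""
    n = len(points)
    length = {(i, j): dist(points[i], points[j]) for i, j in combinations(range(n), 2)}

    def d(i, j):
        return length[(min(i, j), max(i, j))]

    def connected_without(edge):
        # Union-find over all other edges that are not longer than `edge`.
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for other, value in length.items():
            if other != edge and value <= length[edge] + EPS:
                parent[find(other[0])] = find(other[1])
        return find(edge[0]) == find(edge[1])

    labels = {}
    for edge, e in length.items():
        i, j = edge
        if not connected_without(edge):
            labels[edge] = "short"  # Definition 2.3(a)
        elif any(d(i, k) < e - EPS and d(j, k) < e - EPS
                 for k in range(n) if k not in edge):
            labels[edge] = "long"  # strictly longest edge of some triangle
        else:
            labels[edge] = "medium"
    return labels


square = classify_edges([(0, 0), (1, 0), (1, 1), (0, 1)])
rectangle = classify_edges([(0, 0), (2, 0), (2, 1), (0, 1)])
```

For the square, the four sides come out as medium and both diagonals as long; for the 2-by-1 rectangle, the sides of length 1 are short, the sides of length 2 are medium and both diagonals are long, in agreement with the example above.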
Proposition 2.6 (Three classes of edges).
For any finite set and a filtration from Definition 2.2, all edges are split into three disjoint classes: short, medium, long.
Lemma 2.7 (Circumdisk of a triangle).
Let two triangles , lie on the same side of a common edge . If , for example, if is acute and is non-acute, then the open circumdisk of contains , see Fig. 3 (left).
Let the infinite ray from via meet the circumcircle of at a point . We have equal angles whose sine is , where is the radius of . Since , the point is inside the edge , hence enclosed by . ∎
Lemma 2.8 (Long edges in ).
Let and be a finite set. An edge in the Delaunay complex is long by Definition 2.3(b) if and only if
(a) includes a triangle whose angle at is non-acute or, equivalently,
(b) the set has a point whose angle in the triangle is non-acute.
(a) By Definition 2.3(b) the edge is long if the Delaunay complex includes a triangle whose edge is strictly longest (hence the opposite angle at is strictly largest) and for . Since the intersection is the mid-point of the edge , the triple intersection above is non-empty if and only if . Equivalently, the circumcentre of lies non-strictly outside the triangle or the angle at in is non-acute.
(b) Due to part (a), it suffices to prove that if we have any triangle with a non-acute angle at , we can find such a triangle within . Assume the contrary that all triangles in containing have only acute angles opposite to .
For , the edge can have one or two Delaunay triangles whose edge is on the boundary or inside the convex hull of , see the first two pictures of Fig. 3. In both cases by Lemma 2.7, the above point with a non-acute angle opposite to should be inside the circumdisk of one of these triangles having an acute angle at the vertex opposite to . Then the triangle cannot be in by Definition 2.2(c).
For , consider all -dimensional Delaunay simplices containing the edge . The above point lies (non-strictly) between a pair of successive -dimensional subspaces spanned by two such simplices with common edge . Let be the circumball of the -dimensional simplex with faces .
Choose a 1-parameter family of 2-dimensional planes , , rotating around from to so that and are Delaunay triangles, while one intermediate plane contains with a non-acute angle opposite to , see Fig. 3 (right). By the assumption, both have acute angles opposite to . The circumdisk of each has radius , where is the radius of and is the distance from the centre of to . Then varies from to over , possibly with a maximum corresponding to the plane passing through , so for .
By the sine theorem in each , the angle opposite to has . Since both and have acute angles, the lower bound implies that is acute for all . For the intermediate plane containing the vertex , by Lemma 2.7 the circumdisk of should include the point because is non-acute and is acute.
We get a contradiction with Definition 2.2(c) because the open circumball of the Delaunay simplex includes an extra point . ∎
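The criterion of Lemma 2.8(b) reduces to a sign test on a dot product: the angle at a point r in the triangle on p, q, r is non-acute exactly when the vectors from r to p and from r to q have a non-positive dot product. The following Python sketch illustrates only this angle test; it does not check that the edge belongs to the Delaunay complex, and the function name `has_non_acute_witness` is hypothetical.

```python
def has_non_acute_witness(points, i, j):
    """Return True if some third point sees the edge [points[i], points[j]]
    under a non-acute angle (at least 90 degrees)."""
    p, q = points[i], points[j]
    for k, r in enumerate(points):
        if k in (i, j):
            continue
        # Dot product of the vectors r -> p and r -> q.
        dot = sum((a - c) * (b - c) for a, b, c in zip(p, q, r))
        if dot <= 0:  # angle at r is right or obtuse
            return True
    return False


triangle = [(0.0, 0.0), (4.0, 0.0), (2.0, 1.0)]
base_is_long = has_non_acute_witness(triangle, 0, 1)  # obtuse angle at (2, 1)
side_is_long = has_non_acute_witness(triangle, 0, 2)  # acute angle at (4, 0)
```

Here the base of the obtuse triangle passes the test, while the shorter side does not.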
3 Trivial persistence and tails without medium edges
As usual in TDA, we consider homology groups with coefficients in a field, say $\mathbb{Z}_2$.
Proposition 3.1 (No medium edges imply trivial $H_1$).
For any filtration on a finite abstract set from Definition 2.1, when a scale is increasing, a new homology cycle in can be created only due to a medium edge in . Hence, if has no medium edges, then is trivial for .
When building the given complex , if we add a short edge , by Definition 2.3(a), the two previously disconnected components of containing the endpoints of become connected. Hence no 1-dimensional cycle in is created.
By Definition 2.3(b) any long edge enters strictly after two edges , and at the same time as the 2-simplex . Any closed cycle including the new edge is homologically equivalent to the cycle with replaced by the 2-chain . If has other edges of the same length as , the endpoints of are connected by the complementary path , so each cannot be short by Definition 2.3(a), . If any edge , is long by Definition 2.3(b), then can be similarly replaced by a 2-chain of earlier edges.
In all cases, the cycle is homologically equivalent to a cycle in for a smaller . So a long edge cannot create a new class in . Since only medium edges lead to non-trivial cycles, if has no medium edges, then is trivial. ∎
Definition 3.2 (Tail of points).
For a fixed filtration on a finite abstract set from Definition 2.1, a tail is any ordered sequence , where is the vertex of , any edge between successive points is short, and any edge between non-successive points is long for any .
Proposition 3.3 (Tails have trivial $H_1$).
If vectors are not explicitly specified, all edges and straight lines are unoriented. We measure the angle between unoriented straight lines as their minimum angle within.
Definition 3.4 (Angular deviation from a ray in ).
In , a ray is any half-infinite line going from a point called the vertex of . For any sequence of ordered points in , the angular deviation of relative to the ray is the maximum angle over all distinct points .
Lemma 3.5 (Tails in ).
Let be a ray with vertex and be any sequence of points with angular deviation .
(a) For any with , the angle is non-acute. The edge between the non-successive points is long in any filtration in Definition 2.2.
(b) Any edge between successive points , , is short in .
Hence has no medium edges in the filtration and is a tail by Definition 3.2.
(a) The condition implies that all points of are ordered by their distance from the vertex to their orthogonal projections to the line through .
Apply a parallel shift to the points so that . In the triangle , the angle is non-acute, hence strictly largest, due to . The edge is long in any filtration by Definition 2.3(b) and (for the Delaunay filtration) Lemma 2.8(b). In particular, the edge is longer than both edges and .
(b) The points remain in disjoint components of after adding all other edges of length . Indeed, we proved above that any other edge connecting points for is longer than the edge between successive points. ∎
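Lemma 3.5 can be sanity-checked numerically for the Vietoris-Rips filtration. The sketch below is an illustration under assumptions: the sample points lie near the positive x-axis with angular deviation of about 0.1 radians (a value chosen here, not the bound from the lemma), shortness follows Definition 2.3(a), longness follows Lemma 2.4(a), and the names `angular_deviation`, `is_short`, `is_long` and `is_tail` are hypothetical.

```python
from itertools import combinations
from math import acos, dist, hypot


def angular_deviation(points, direction):
    """Maximum angle between the ray direction and the unoriented line
    through any two distinct points."""
    worst = 0.0
    for p, q in combinations(points, 2):
        u = [b - a for a, b in zip(p, q)]
        cosine = abs(sum(x * y for x, y in zip(u, direction)))
        cosine /= hypot(*u) * hypot(*direction)
        worst = max(worst, acos(min(1.0, cosine)))
    return worst


def is_long(points, i, j):
    """Some third point makes [i, j] the strictly longest edge of a triangle."""
    e = dist(points[i], points[j])
    return any(dist(points[i], r) < e and dist(points[j], r) < e
               for k, r in enumerate(points) if k not in (i, j))


def is_short(points, i, j):
    """Endpoints are disconnected in the graph of all other edges of length <= |e|."""
    e = dist(points[i], points[j])
    reached, stack = {i}, [i]
    while stack:  # depth-first search from i, never using the edge [i, j]
        a = stack.pop()
        for b in range(len(points)):
            if b not in reached and {a, b} != {i, j} and dist(points[a], points[b]) <= e:
                reached.add(b)
                stack.append(b)
    return j not in reached


def is_tail(points):
    """Successive edges short and non-successive edges long (Definition 3.2)."""
    return all(is_short(points, i, j) if j == i + 1 else is_long(points, i, j)
               for i, j in combinations(range(len(points)), 2))


tail = [(0.0, 0.0), (1.0, 0.05), (2.5, -0.05), (4.0, 0.1), (6.0, 0.0)]
deviation = angular_deviation(tail, (1.0, 0.0))
tail_ok = is_tail(tail)
```

For these five points every edge is short or long, so there are no medium edges and Proposition 3.1 gives trivial 1D homology at every scale. The four vertices of a unit square, taken in cyclic order, fail the same check because their sides are medium.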
Definition 3.6 (Angular thickness ).
Let be a ray with vertex and be any finite sequence of points. The angular thickness of with respect to is the maximum angle over .
4 Persistence for long wedges and with added tails
Definition 4.1 (Long wedges).
For any filtration of simplicial complexes from Definition 2.1, the 1D persistence diagram of this filtration is denoted by .
Theorem 4.2 (Persistence of long wedges).
For any filtration of a long wedge from Definition 4.1, the 1D homology group of the filtration at a given is the direct sum: . Hence the 1D persistence diagram is the union of the 1D persistence diagrams for .
The inclusions induce the homomorphism of the 1D homology groups whose bijectivity follows below. Any long edge in a complex can be replaced by a chain of two edges in for some due to a 2-simplex included into by Definition 2.3(b). Continue applying these replacements until any cycle of edges in becomes homologous to a sum of cycles in , . ∎
Theorem 4.3 (A long wedge with a tail).
Let be any finite set, be a point on the boundary of the convex hull of , and be a ray with vertex so that . Let be any tail with vertex for a filtration , see Definition 3.2. If , then ,
For any points and , we get the non-acute angle
If are in the same half-plane bounded by the line through , the first inequality above becomes equality. Otherwise, .
Corollary 4.4 (Trivial 1D persistence).
If a set in a metric space has , then any long wedge with a tail also has .
5 Experiments, discussion and future work
The experiments in this section increase the understanding of how regularly persistent homology reveals persistent features separated from noise. The experiments depend on two parameters, the size of a point set, and the dimension that the point set lies in. For each in the ranges chosen, we generate 1000 point sets of points uniformly sampled in a unit -dimensional cube.
Figure 5 shows histograms of persistence (death − birth) of the one-dimensional features for nine configurations of the parameters (point set sizes and dimensions). Each histogram highlights that the overwhelming majority of one-dimensional persistent features are skewed towards a low persistence, namely less than 10% of the unit cube size. Geometrically, the corresponding dots (birth,death) would be close to the diagonal in a persistence diagram.
Recall that highly persistent features (birth,death) are naturally separated from others with lower persistence death − birth by the widest diagonal gap in the persistence diagram, see [19]. If we order all pairs (birth,death) by their persistence, the widest gap is the largest difference between successive persistence values. This widest gap can separate several pairs (birth,death) from the rest, not necessarily just a single feature. However, the first widest gap is significant only if it can be easily distinguished from the second widest gap.
So the significance of persistence can be measured as the ratio of the first widest gap over the second widest gap. This ratio, which is invariant under uniform scaling of the given data, is called the gap ratio. Figure 6 shows the median gap ratio calculated over 1000 random point clouds in a unit cube for a range of dimensions and point set sizes.
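A minimal Python sketch of the gap ratio is given below. It rests on one assumption that the text does not spell out: the gap between the diagonal (persistence 0) and the least persistent pair is counted as a gap, so that a diagram with two pairs already has two gaps. The function name `gap_ratio` is hypothetical.

```python
def gap_ratio(pairs):
    """Ratio of the widest gap to the second widest gap between successive
    persistence values of (birth, death) pairs.

    Assumption: the gap from the diagonal (persistence 0) to the smallest
    persistence is included among the gaps.
    """
    persistences = sorted(death - birth for birth, death in pairs)
    # Differences between successive values, starting from the diagonal.
    gaps = [b - a for a, b in zip([0.0] + persistences, persistences)]
    gaps.sort(reverse=True)
    if len(gaps) < 2:
        raise ValueError("at least two (birth, death) pairs are needed")
    if gaps[1] == 0:
        return float("inf")
    return gaps[0] / gaps[1]


diagram = [(0.0, 0.125), (0.5, 0.75), (0.0, 1.0)]
ratio = gap_ratio(diagram)
# Uniform scaling of the data scales all births and deaths by the same factor.
scaled = gap_ratio([(2 * b, 2 * d) for b, d in diagram])
```

In this example the persistence values are 0.125, 0.25 and 1.0, so the gaps are 0.125, 0.125 and 0.75, and the ratio is unchanged when all births and deaths are doubled.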
Figure 6 implies that for higher dimensions N, the median gap ratio quickly decreases to within the range [1,2] as the number of points is increasing. Hence, when a persistence diagram contains at least two pairs (birth,death) above the diagonal, it is becoming harder to separate highly persistent features from noisy artefacts close to the diagonal.
The future updates will include similar experiments for filtrations of Čech complexes and Delaunay complexes. In conclusion, our main Theorem 4.3 describes how we can add an arbitrarily large point set to an existing point set without affecting the one-dimensional persistent homology, whilst Corollary 4.4 states how we can form a large continuous family of sets with trivial 1D persistence, implying that the bottleneck distance between persistence diagrams has no lower bound. We plan further experiments to check how well the bottleneck distance separates point clouds from their perturbations.
Other continuous isometry invariants [22, 21] of finite and periodic point sets are complete in general position, hence distinguish almost all sets in $\mathbb{R}^N$. All counter-examples [18] to the completeness of past invariants were distinguished in [21, appendix C]. These latest invariants are based on the $k$-nearest neighbour search, a classical problem in Computer Science, which has near-linear time algorithms in the number of points [13, 14].
This research was supported by the £3.5M EPSRC grant ‘Application-driven Topological Data Analysis’ (2018-2023, EP/R018472/1), the £10M Leverhulme Research Centre for Functional Materials Design (2016-2026) and the last author’s Royal Academy of Engineering Fellowship ‘Data Science for Next Generation Engineering of Solid Crystalline Materials’ (2021-2023, IF2122/186).
[1] (2021) Ripser: efficient computation of Vietoris-Rips persistence barcodes. Journal of Applied and Computational Topology 5 (3), pp. 391–423.
[2] (2022) Continuous and discrete radius functions on Voronoi tessellations and Delaunay mosaics. Discrete & Computational Geometry, pp. 1–32.
[3] (2018) An obstruction to Delaunay triangulations in Riemannian manifolds. Discrete & Computational Geometry 59 (1), pp. 226–237.
[4] (2009) Topology and data. Bulletin of the American Mathematical Society 46 (2), pp. 255–308.
[5] (2020) Moduli spaces of Morse functions for persistence. Journal of Applied and Computational Topology 4 (3), pp. 353–385.
[6] (2016) The structure and stability of persistence modules. Springer.
[7] (2005) Stability of persistence diagrams. Discrete & Computational Geometry 37, pp. 263–271.
[8] (2018) The fiber of the persistence map for functions on the interval. Journal of Applied and Computational Topology 2 (3), pp. 301–321.
[9] (1934) Sur la sphère vide. Izv. Akad. Nauk USSR 7, pp. 793–800.
[10] (2008) Persistent homology - a survey. Discrete & Computational Geometry 453.
[11] (2021) The density fingerprint of a periodic point set. In Proceedings of SoCG, pp. 32:1–32:16.
[12] (2000) Topological persistence and simplification. In Proceedings 41st Annual Symposium on Foundations of Computer Science, pp. 454–463.
[13] (2021) A new compressed cover tree guarantees a near linear parameterized complexity for all $k$-nearest neighbors search in metric spaces. arXiv:2111.15478.
[14] (2022) Paired compressed cover trees guarantee a near linear parametrized complexity for all $k$-nearest neighbors search in an arbitrary metric space. arXiv:2201.06553.
[15] Size theory as a topological tool for computer vision. Pattern Recognition and Image Analysis 9 (4), pp. 596–603.
[16] (2008) Barcodes: the persistent topology of data. Bulletin of the American Mathematical Society 45 (1), pp. 61–75.
[17] (2001) Algebraic topology. Cambridge University Press.
[18] (2020) Incompleteness of atomic structure representations. Phys. Rev. Lett. 125, pp. 166001.
[19] (2021) Skeletonisation algorithms with theoretical guarantees for unorganised point clouds with high levels of noise. Pattern Recognition 115, pp. 107902.
[20] (2011) What is… persistent homology?. Notices of the AMS 58 (1), pp. 36–39.
[21] (2021) Pointwise distance distributions of periodic sets. arXiv:2108.04798 (early draft).
[22] (2022) Average minimum distances of periodic point sets. MATCH Communications in Mathematical and in Computer Chemistry 87, pp. 529–559.