Topological Data Analysis (TDA) is an emerging field that considers the “shape” of data, and is gaining traction in a variety of applications [giusti2015clique, lee2017quantifying, rizvi2017single, lawson2019persistent, tymochko2019using, wang2019statistical]. In particular, TDA uses homological features of shape, such as connected components, loops, and higher-dimensional voids, to extract information from data. These homological features can be described using a popular descriptor known as the persistence diagram (PD), which offers insight into the geometry and topology of the original data. In this paper, we consider the inverse problem of generating a set of diagrams that can be used to reconstruct the original data. (Note: we actually require the augmented persistence diagrams, which include all computed points, including the ones on the diagonal; see apd.) This inverse problem in TDA has had many applications in the field of shape comparison and recognition, and the approach has recently attracted considerable attention [crawford2016functional, turner2014persistent, hofer2017constructing, belton2018learning, ghrist2018euler, curry2018directions, belton2019reconstructing, betthauser2018topological, fasy2018challenges, oudot2018inverse]. Yet a deterministic approach for computing the set of diagrams that can reconstruct a simplicial complex in $\mathbb{R}^d$ has remained an open problem.
1.1 Existing Reconstruction Methods Using PDs
The inverse problem of recovering the underlying data from a set of PDs was first explored in 2014, when Turner et al. showed that the Persistent Homology Transform (PHT) and Euler Curve Transform (ECT) are injective from the space of simplicial complexes in $\mathbb{R}^2$ and $\mathbb{R}^3$ into the space of persistence diagrams and Euler characteristic curves (ECCs), respectively. The PHT and ECT are functions that map each direction vector on the unit sphere to the PD or ECC generated by the height filtration in that direction. In this section, we highlight other research in PHT- and ECT-based reconstruction and place our work in this context.
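To make the height filtration concrete, the following Python sketch computes an ECC for one direction. It is only an illustration under stated assumptions: the function name `euler_characteristic_curve`, the helper `dot`, and the hollow-triangle example are hypothetical and do not come from the cited works. A simplex is assumed to enter the filtration at the height of its highest vertex, as in a lower-star filtration.

```python
def dot(u, v):
    """Inner product of two coordinate tuples."""
    return sum(a * b for a, b in zip(u, v))


def euler_characteristic_curve(vertices, simplices, direction):
    """Return [(height, chi)] for the height filtration in `direction`.

    Each simplex is a tuple of vertex indices and enters at the height of
    its highest vertex; chi is the alternating count of the simplices
    present at or below each height.
    """
    entries = sorted(
        (max(dot(vertices[i], direction) for i in s), len(s) - 1)
        for s in simplices
    )
    curve, chi = [], 0
    for height, dim in entries:
        chi += (-1) ** dim
        if curve and curve[-1][0] == height:
            curve[-1] = (height, chi)  # same height: update in place
        else:
            curve.append((height, chi))
    return curve


# Hypothetical example: a hollow triangle (three vertices, three edges).
vertices = [(0, 0), (1, 0), (0, 1)]
simplices = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
ecc = euler_characteristic_curve(vertices, simplices, (1, 2))
```

In this example the curve drops to zero at the last height, where the third vertex and the two remaining edges close the loop.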
Turner et al.’s result was the first to show that an (uncountably infinite) set of PDs or ECCs could be used to represent simplicial complexes. These techniques have been utilized in practice by several research groups for a diverse range of applications [crawford2016functional, turner2014persistent, hofer2017constructing]. However, the uncountably infinite nature of the result limited its applications. As such, Belton et al., Ghrist et al., and Curry et al. all observed that there exists a finite representation, using topological descriptors, for various types of simplicial complexes [belton2018learning, ghrist2018euler, curry2018directions]. Motivated by these results, Belton et al. introduced an algorithm for reconstructing embedded plane graphs using only (augmented) PDs [belton2018learning]. Independently, [curry2018directions] proved that the PHT and ECT have finite representations under assumptions on the curvature of the underlying shape. The proof uses the observation that directions from a finite subset of strata on the sphere, from which particular simplices are “observable,” suffice. Ghrist et al. [ghrist2018euler] made similar observations about the curvature of the shape inducing a stratification of the sphere. In his dissertation, Betthauser showed many similar properties for the ECT on cubical complexes [betthauser2018topological]. For general shapes, [fasy2018challenges] identified particular arrangements of vertices that are reconstructible using the methods of [belton2018learning], but for which the same algorithm cannot be directly extended to use ECCs. For a more exhaustive overview of the related literature, we refer the reader to [oudot2018inverse]. However, a well-defined algorithm for reconstructing simplicial complexes in $\mathbb{R}^d$ using a finite number of PDs or ECCs does not yet exist; such an algorithm for (augmented) PDs is our main contribution.
1.2 Our Contribution
In the current paper, we investigate the question: How can we reconstruct embedded simplicial complexes of arbitrary dimension using a finite number of directional (augmented) persistence diagrams? We answer this question by giving an algorithm for reconstruction (simComp in full-alg). This is the first algorithm for reconstructing an unknown simplicial complex in $\mathbb{R}^d$, where the simplicial complex is not a graph. The heart of this algorithm is a predicate (simpPred of simpPred) that tests whether or not a set of vertices forms a $k$-simplex in our unknown simplicial complex. In the case where the complex is a graph, or if we are only interested in reconstructing the one-skeleton of the complex, we introduce a concept called an edge interval that, for a given vertex, helps us to binary search through the remaining vertices to determine which ones are adjacent. This construction allows us to improve upon the best known solution for plane graphs [belton2018learning], and we discuss the trade-offs for an alternative approach for reconstructing vertices of graphs embedded in $\mathbb{R}^d$ [belton2019reconstructing].
2 Background Definitions
In this section, we give an overview of necessary background information, following the notation established in [belton2018learning, belton2019reconstructing]. For a more complete discussion on foundational computational topology, we refer the reader to [edelsbrunner2010computational, chazal2016structure].
2.1 Lower-star Filtrations and Persistence
Our reconstruction method is based on the foundational topological data analysis framework of simplicial complexes, (augmented) persistence diagrams, and filtrations. Here, we introduce these topics and related definitions.
In what follows, we consider a simplicial complex in $\mathbb{R}^d$; for each dimension, we track both the set of simplices of that dimension and their number. In particular, the zero-simplices are the vertices and the one-simplices are the edges. We also track the degree of each vertex. A $k$-simplex is uniquely identified by its $k+1$ vertices.
The first assumption makes working with simplices in axis-parallel directions easier, since no two vertices will lie at the same height in any axis-parallel direction. Let $e_1, \ldots, e_d$ be the standard unit basis vectors in $\mathbb{R}^d$. That is, $e_i$ has the value one as its $i$th coordinate and the value zero as all its other coordinates. The assumption that no two vertices are equidistant from the origin allows us to use a parabolic lifting map that preserves the previous property (see oracle).
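The role of the second assumption can be checked on a small example. The sketch below assumes that the lifting map is the standard paraboloid lift, which appends the squared distance from the origin as a new last coordinate; the text does not spell out the map, so this choice, along with the names `lift` and `distinct_axis_heights`, is an assumption made for illustration.

```python
def lift(point):
    """Append the squared distance from the origin as a new last coordinate.

    Assumption: this is the standard paraboloid lift; the exact map used
    by the oracle is not stated in the text.
    """
    return tuple(point) + (sum(c * c for c in point),)


def distinct_axis_heights(points):
    """True if no two points share a coordinate along any axis."""
    return all(len(set(coords)) == len(points) for coords in zip(*points))


# Hypothetical points: distinct coordinates on each axis, distinct norms.
points = [(1, 0), (0, 2), (3, 1)]
lifted = [lift(p) for p in points]
still_general = distinct_axis_heights(lifted)
```

Because no two points are equidistant from the origin, the new coordinates are pairwise distinct, so the lifted points still have distinct heights in every axis-parallel direction.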
Note that any set of vertices satisfying general has the property that defines a -dimensional affine subspace of , denoted . Given a simplex , we may use the notation to mean .
Note that points in a PD with equal birth and death values are often computed (see, e.g., [edelsbrunner2010computational, Ch. VII]), but not included in the output since they correspond to homology features that are born and die at the same time. However, such diagonal points encode additional geometric information leveraged in this paper.
For the remainder of the paper, we may say “diagrams” as shorthand for “directional augmented persistence diagrams.” Next, we make an observation relating birth-death pairs in diagrams to the simplices of the complex; see, e.g., [edelsbrunner2010computational, pp. – of §] for more details.
In particular, this lemma implies there is a bijection between simplices of the complex and (computed) birth-death events in a diagram, a necessary step in the justification of many claims.
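The bijection between simplices and birth-death events can be observed on a small example. The following Python sketch computes an augmented diagram with the standard column reduction over $\mathbb{Z}/2$ applied to a lower-star filtration. It is a minimal illustration, not the implementation used in this paper; the function name `augmented_diagram`, the helper `dot`, and the filled-triangle example are hypothetical.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two coordinate tuples."""
    return sum(a * b for a, b in zip(u, v))


def augmented_diagram(vertices, simplices, direction):
    """Return sorted (dim, birth, death) triples, diagonal points included.

    Simplices are sorted tuples of vertex indices and the input must be
    closed under taking faces. Unpaired simplices get death = infinity.
    """
    def height(s):
        return max(dot(vertices[i], direction) for i in s)

    # Faces precede cofaces: sort by height, then by dimension.
    order = sorted(simplices, key=lambda s: (height(s), len(s), s))
    index = {s: j for j, s in enumerate(order)}
    reduced = {}  # lowest row index -> reduced column owning that pivot
    paired, points = set(), []
    for j, s in enumerate(order):
        col = set()
        if len(s) > 1:
            col = {index[f] for f in combinations(s, len(s) - 1)}
        while col and max(col) in reduced:
            col ^= reduced[max(col)]  # add the earlier column mod 2
        if col:
            low = max(col)
            reduced[low] = col
            paired.update((low, j))
            points.append((len(order[low]) - 1, height(order[low]), height(s)))
    for j, s in enumerate(order):
        if j not in paired:
            points.append((len(s) - 1, height(s), float("inf")))
    return sorted(points)


# Hypothetical example: a filled triangle with all seven of its faces.
vertices = [(0, 0), (1, 0), (0, 1)]
simplices = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
diagram = augmented_diagram(vertices, simplices, (1, 2))
```

Here the diagram has three finite points, all on the diagonal, and one infinite point. Each finite point accounts for two simplices (a birth and a death) and the infinite point for one, which matches the seven simplices of the complex.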
Finally, we define a structure that will be used throughout the remainder of the paper to talk about lower-star filtrations in a clear way. This structure helps give a way of visualizing the problem and also gives geometric intuition for several of the proofs that follow.
We note that all filtration hyperplanes for a given direction are parallel to each other and perpendicular to that direction. Each hyperplane defines all potential locations for a vertex. Since the births in the zero-dimensional diagram are in one-to-one correspondence with the vertices of the simplicial complex by addSimp, a single diagram suffices to construct the filtration hyperplanes for its direction. Given filtration hyperplanes for carefully chosen directions, we show a correspondence between intersections of these hyperplanes and vertices in the complex.
2.2 Framework for Oracle
3 Predicates and Constructions
In this section, we develop the constructions and a predicate needed for reconstructing simplicial complexes. The predicate, computed in simpPred, determines whether or not a set of zero-simplices is a $k$-simplex of the underlying simplicial complex. We first describe necessary machinery in inkdeg.
The key piece of machinery we develop for determining whether a simplex exists is the $k$-indegree of a simplex, which is the count of $k$-dimensional cofaces of a simplex below in a particular direction. In order to compute $k$-indegrees, we develop a method for choosing directions that “isolate” a face of a simplex. We note that two lemmas cited in sumSwaps (planeFilling and eps) are found in omitted-lemmas, and the proof of sumSwaps is found in faceiso-proof. The method for isolating simplices in sumSwaps is described in simpSwap.
Next, we develop a predicate that uses these isolated simplices to test for $k$-simplices. Intuitively, we identify one-simplices by checking all pairs of zero-simplices with this predicate, then identify all two-simplices by checking all triples of one-simplices, and so on, using a “bow tie” technique like the approach found in [belton2018learning]. For this predicate to generalize for reconstruction of higher-dimensional simplices, we provide the following definition:
Since the chosen direction is perpendicular to the simplex, all zero-simplices of the simplex are at the same height in that direction. However, as shown in kindegree, not all $k$-simplices at this height contribute to the $k$-indegree of the simplex.
Note that kindegree is an example of a case where only one three-simplex contributes to the three-indegree of the two-simplex in question.
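The distinction between simplices at a given height and simplices that contribute to the indegree can be illustrated on a known complex. The sketch below is a forward computation only, not the diagram-based procedure of inkdeg. It assumes that “below” means that every vertex of the coface has height at most that of the simplex, and that only proper cofaces are counted; the name `k_indegree` and the example coordinates are hypothetical.

```python
def dot(u, v):
    """Inner product of two coordinate tuples."""
    return sum(a * b for a, b in zip(u, v))


def height(vertices, simplex, direction):
    """Height of a simplex: the height of its highest vertex."""
    return max(dot(vertices[i], direction) for i in simplex)


def k_indegree(vertices, simplices, sigma, k, direction):
    """Count k-simplices that properly contain sigma and do not rise above it.

    Assumption: `direction` is perpendicular to sigma, so all vertices of
    sigma share one height.
    """
    top = height(vertices, sigma, direction)
    return sum(
        1
        for tau in simplices
        if len(tau) == k + 1
        and set(sigma) < set(tau)
        and height(vertices, tau, direction) <= top
    )


# Hypothetical example: an edge on the x-axis with three nearby triangles.
vertices = [(0, 0), (1, 0), (0, -1), (0, 1), (-1, -1)]
triangles = [(0, 1, 2), (0, 1, 3), (0, 2, 4)]
edge, up = (0, 1), (0, 1)

indegree = k_indegree(vertices, triangles, edge, 2, up)
at_height = sum(1 for t in triangles if height(vertices, t, up) == 0)
```

Two triangles appear at the height of the edge, but only one of them is a coface of the edge, so the count of simplices at that height overestimates the two-indegree.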
Since uniqueiso shows that a single diagram is not sufficient to determine the $k$-indegree, we use an inclusion-exclusion style argument to compute the $k$-indegree in inkdeg. Note that the first time this algorithm is called, no entries have been computed yet. We prove the correctness of this algorithm in the following theorem:
The next lemma provides the runtime of inkdeg; the proof is in indegTime-proof.
3.2 Simplex Predicate
Using the $k$-indegree, we are able to isolate and determine the presence of $k$-simplices between two hyperplanes centered at a simplex. This idea is a generalization of the “bow tie” technique used for identifying edges in [belton2018learning]. The generalization of a bow tie is a double-cone shaped region, which we call a wedge, that contains exactly one vertex; see wedge.
In simpPred, we use the difference in the indegree between the two filtration hyperplanes defining a wedge to test for the presence of a $k$-simplex.
4 Reconstruction Algorithm for Simplicial Complexes in $\mathbb{R}^d$
In the following sections, we describe a method for reconstructing simplicial complexes in $\mathbb{R}^d$. Our method first finds the locations of the zero-simplices (vertices), then the one-simplices (edges), and finally all higher-dimensional simplices (full-alg).
4.1 Vertex Reconstruction
The main idea behind our vertex reconstruction approach is to generate a set of hyperplanes on which vertices must lie, then to solve for their intersection points. Given one point in $\mathbb{R}^d$, we can use the $d$ orthogonal directions. However, for $n$ points in $\mathbb{R}^d$, we have $n^d$ possible locations. Choosing our directions wisely, we can ensure a consistent ordering of vertices between our directions, resulting in a sub-exponential algorithm. While we note that [belton2019reconstructing] also offers a method for reconstructing vertices in $\mathbb{R}^d$, the algorithm provided in this work offers a trade-off between the number of diagrams computed and the time complexity. The details, algorithms, and proofs are included in omitted-vertex and we state our main theorem here.
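The candidate blow-up described above can be seen in a brute-force sketch. The code below is not the algorithm of this paper: it enumerates every intersection of the axis-parallel filtration hyperplanes and keeps those that also lie on a filtration hyperplane of one extra direction. The function name `candidate_vertices`, the extra direction, and the example heights are hypothetical, and the filter is only reliable when the extra direction is chosen so that no spurious intersection happens to match.

```python
from itertools import product


def dot(u, v):
    """Inner product of two coordinate tuples."""
    return sum(a * b for a, b in zip(u, v))


def candidate_vertices(axis_heights, extra_direction, extra_heights):
    """Filter axis-parallel hyperplane intersections with one extra direction.

    axis_heights[i] lists the vertex heights seen in the i-th axis-parallel
    direction. Every combination is a candidate location, so the number of
    candidates is the product of the list lengths.
    """
    observed = set(extra_heights)
    return [
        p for p in product(*axis_heights)
        if dot(p, extra_direction) in observed
    ]


# Hypothetical example: three vertices in the plane, integer coordinates.
true_vertices = [(0, 0), (1, 2), (3, 1)]
axis_heights = [sorted({v[i] for v in true_vertices}) for i in range(2)]
direction = (1, 10)
extra_heights = [dot(v, direction) for v in true_vertices]

recovered = candidate_vertices(axis_heights, direction, extra_heights)
```

In this example nine candidates are tested to recover three vertices, which illustrates why an approach that avoids enumerating all intersections is desirable.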
4.2 Edge Reconstruction
In this section, we describe a method for reconstructing the one-simplices given the coordinates of the zero-simplices. Our approach improves both the time complexity and the diagram complexity of [belton2019reconstructing]. We use a hyperplane sweep in a fixed direction, where events occur at vertices; thus, we may use the word “above (below)” as shorthand for “above (below) with respect to the sweep direction.” We discover the edges incident to each vertex by determining regions where potential edges lie and logarithmically searching this space using information about edges already discovered below the current vertex.
The hyperplane sweep may be easier to visualize as a line sweep in $\mathbb{R}^2$. Many of our descriptions utilize this tool, so we begin by formally defining a projection of $\mathbb{R}^d$ to $\mathbb{R}^2$: def-project-rtwo
All operations for reconstruction are taking place in . However, the general position ensures that , so discussing edge reconstruction in is reasonable.
To keep track of regions containing edges incident to a vertex, we introduce an edge interval object, which contains an ordered list representing vertices radially sorted about that vertex and a count representing the number of edges from that vertex to vertices in the list. In order to search the edge interval in logarithmic time, we define split-wedge and use it to converge on edge intervals containing a single vertex with a count of one.
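The search over an edge interval can be sketched as a bisection. In the Python sketch below, the class `EdgeInterval`, the function `find_neighbors`, and the callback `count_edges` are hypothetical names; `count_edges` is a stand-in for the diagram-based count that split-wedge would supply, and here it is simulated from a known neighbor set purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class EdgeInterval:
    verts: List[int]  # candidate vertices, radially sorted about the vertex v
    count: int        # number of edges from v to vertices in `verts`


def find_neighbors(
    interval: EdgeInterval,
    count_edges: Callable[[Sequence[int]], int],
) -> List[int]:
    """Bisect until each surviving interval is one vertex with count one.

    `count_edges(sub)` is a hypothetical oracle returning the number of
    edges from v into the sub-list `sub`.
    """
    found, stack = [], [interval]
    while stack:
        iv = stack.pop()
        if iv.count == 0:
            continue  # no edges here, discard the whole interval
        if len(iv.verts) == 1:
            found.append(iv.verts[0])
            continue
        mid = len(iv.verts) // 2
        left = iv.verts[:mid]
        left_count = count_edges(left)
        # The count for the right half follows by subtraction.
        stack.append(EdgeInterval(iv.verts[mid:], iv.count - left_count))
        stack.append(EdgeInterval(left, left_count))
    return sorted(found)


# Hypothetical example: six candidates, two of which are true neighbors.
true_neighbors = {5, 7}
calls = []


def oracle(sub):
    calls.append(list(sub))
    return len(true_neighbors & set(sub))


neighbors = find_neighbors(EdgeInterval([3, 7, 1, 8, 5, 2], 2), oracle)
```

Intervals with a count of zero are discarded without further queries, which is what keeps the number of oracle calls small when a vertex has few incident edges.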
lem-split-wedge The proof of split-wedge is in split-wedge-proof. edgeAlg shows an example of the execution of split-wedge. Now, we use split-wedge to efficiently identify intervals that contain edges.
The proof of up-edges is provided in up-edges-proof. See edgeAlg for an example of part of the execution of up-edges. We now give an efficient algorithm for edge reconstruction.
alg-find-edges thm-edge-rec The proof of edgerec can be found in edgerec-proof.
4.3 Putting It All Together: Simplicial Complex Reconstruction
Combining the results from the previous subsections, we arrive at an algorithm to fully reconstruct an embedded simplicial complex. We include the proof of main-res in main-res-proof.
Furthermore, we derive additional corollaries, improving the diagram complexity for reconstructing embedded graphs over approaches found in [belton2019reconstructing].
Perhaps even more surprising is that we are able to reconstruct plane graphs with a number of diagrams that is less than exponential in the ambient dimension.
4.4 Reconstructing Two-manifolds
We provide a deterministic algorithm for computing the complete reconstruction of a simplicial complex embedded in arbitrary finite dimension. This algorithm also improves on the results of [belton2018learning, belton2019reconstructing] for the case of plane and embedded graphs.
In ongoing work, we hope to improve running time and reduce the required number of persistence diagrams. We also hope to overcome the challenges of reconstructing codimension zero simplices that required us to include a parabolic lifting map in our oracle. The work presented here is closely related to reconstruction of simplicial complexes using the Euler characteristic transform (ECT). This transform is generated from Euler characteristic curves (ECCs) generated by direction vectors [fasy2018challenges, turner2014persistent, ghrist2018euler, curry2018directions]. In [curry2018directions], a bound on the number of ECCs needed for reconstruction of simplicial complexes is given, assuming a lower bound on the curvature of the underlying simplicial complex. In [fasy2018challenges], we identify challenges in reconstructing plane graphs with degree two vertices using a finite number of ECCs when using methods similar to the methods presented in our current paper. We are investigating whether the methods of this paper can be extended to using the ECT in general simplicial complex reconstruction.
This material is based upon work supported by the National Science Foundation under the following grants: ABI 1661530 (DM, LW), CCF 1618605 (BTF, SM), DBI 1661530 (BTF, DLM), and DMS 1664858 (BTF, AS). All authors thank the members of the CompTaG club at Montana State University for their thoughtful discussions and feedback on this work.
Appendix A Computation of Orthogonal Direction and Associated Full Set
In simpPred (simpPred:choosedir) of simpPred, we need to choose an initial direction that is orthogonal to the affine subspace spanned by the candidate simplex but not orthogonal to any other subspace spanned by zero-simplices of the complex. In this appendix, we give the details of how to compute this initial direction.
Appendix B Vertex Reconstruction
The vertex reconstruction algorithm, summarized in vertex, starts by choosing an initial direction. This choice can be arbitrary, so we choose the last cardinal direction, $e_d$. Next, for each coordinate position $i$, we call coord to find the $i$th coordinates of all vertices, using only two specifically chosen diagrams for those coordinates.
Now, we show the correctness of coord.
We are now ready to present vertex for reconstructing all of the vertices of .
We recall vertexrec and prove the correctness here:
Appendix C Omitted Lemmas
In order to perform the Face Isolation operation described in sumSwaps and simpSwap, we need to add additional points to the affine space defined by the input set. We describe this plane filling operation with planeFilling.
To prove that the ordering of vertices in simpSwap remains consistent, we introduce eps to assist in the proof of the properties in sumSwaps.
Appendix D Omitted Proofs
D.1 Proof of sumSwaps
We recall sumSwaps, which proves the correctness of simpSwap: * proof-sum-swaps
D.2 Proof of indegTime
We recall indegTime, which shows the runtime and diagram complexity of inkdeg: * proof-indeg-time
D.3 Proof of split-wedge
We recall split-wedge, which shows the correctness and runtime of split-wedge for splitting edge intervals: * proof-split-wedge
D.4 Proof of up-edges
We recall up-edges, which shows the correctness and runtime of up-edges for finding all edges above a vertex: * proof-up-edges
D.5 Proof of edgerec
We recall edgerec, which proves the correctness of edge-recon: * proof-edge-rec
D.6 Proof of main-res
We recall main-res, which proves the correctness of simComp: * proof-full-alg