Topological data analysis (TDA) is a promising tool in fields as varied as materials science, transcriptomics, and neuroscience [8, 11, 14]. Although TDA has been quite successful in the analysis of point cloud data, its purview extends to any data that can be encoded as a topological space. Topological spaces can be described in terms of their homology, e.g., connected components and “holes.” Simplicial complexes, in particular, are the most common representation of topological spaces. In this work, we focus our attention on a subset of simplicial complexes, namely, plane graphs embedded in $\mathbb{R}^2$, with applications to shape reconstruction.
In this paper, we explore the question: Can we reconstruct embedded simplicial complexes from a finite number of directional persistence diagrams? Our work is motivated by [15], which proves that one can reconstruct simplicial complexes from an uncountably infinite number of diagrams. Here, we take the first step towards providing a polynomial-time reconstruction for simplicial complexes. In particular, the main contributions of this paper are a bound on the number of persistence diagrams required to reconstruct a plane graph and a polynomial-time algorithm for reconstructing the graph.
2 Related Work
The problem of manifold and stratified space learning is an active research area in computational mathematics. For example, Zheng et al. study the 3D reconstruction of plant roots from multiple 2D images [16]. Their method uses persistent homology to ensure the resulting 3D root model is connected.
Map construction algorithms reconstruct street maps as an embedded graph from a set of input trajectories. Three common approaches are Point Clustering, Incremental Track Insertion, and Intersection Linking. Ge, Safa, Belkin, and Wang develop a point clustering algorithm using Reeb graphs to extract the skeleton graph of a road from point-cloud data [6]. The original embedding can be reconstructed using a principal curve algorithm [10]. Karagiorgou and Pfoser give an incremental track insertion algorithm to reconstruct a road network from vehicle trajectory GPS data [9]. Ahmed et al. provide an incremental track insertion algorithm to reconstruct road networks from point cloud data [2]. The reconstruction is done incrementally, using a variant of the Fréchet distance to add curves to the current basis. Ahmed, Karagiorgou, Pfoser, and Wenk describe all these methods in [1]. Finally, Dey, Wang, and Wang use persistent homology to reconstruct embedded graphs. This research has also been applied to input trajectory data [4]. Dey et al. use persistence to guide the Morse cancellation of critical simplices. In contrast, the work presented here uses persistence to generate the diagrams that encode the underlying graph.
Our work extends previous work on the persistent homology transform (PHT) [15]. As detailed in Section 3, persistent homology summarizes the homological changes for a filtered topological space. When applied to a simplicial complex embedded in $\mathbb{R}^d$, we can compute a different filtration for every direction in $\mathbb{S}^{d-1}$; this family of persistence diagrams is referred to as the persistent homology transform (PHT). The map from a simplicial complex to PHT is injective [15]. Hence, knowing the PHT of a simplicial complex uniquely identifies that complex. The proof presented in [15] relies on the continuity of persistence diagrams as the direction of filtration varies continuously.
Our paper bounds the number of directions by presenting an algorithm for reconstructing the simplicial complex when we are able to obtain persistence diagrams for a given set of directions. Concurrently with our investigation, others have also observed that the number of directions can be bounded using the Radon transform; see [7, 3]. In the current paper, we seek to reconstruct graphs from their respective persistence diagrams using a geometric approach. We bound the number of directional persistence diagrams because computing the PHT, as presented in [15], requires the computation of filtrations from an infinite number of possible directions. Our work provides a theoretical guarantee of correctness for a finite subset of directions by providing the reconstruction algorithm.
3 Preliminaries
We begin by summarizing the necessary background information, but refer the reader to [5] for a more comprehensive overview of computational topology.
Simplices and Simplicial Complexes
Intuitively, a $k$-simplex is a $k$-dimensional generalization of a triangle, i.e., a zero-simplex is a vertex, a one-simplex is an edge connecting two vertices, a two-simplex is a triangle, etc. In this paper, we focus on a subset of simplicial complexes embedded in $\mathbb{R}^2$ consisting of only vertices and edges. Specifically, we study plane graphs with straight-line embeddings (referred to simply as plane graphs throughout this paper). Furthermore, we assume that the embedded vertices are in general position, meaning that no three vertices are collinear and no two vertices share an $x$- or $y$-coordinate.
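Let $K$ be a plane graph and denote by $\mathbb{S}^1$ the unit sphere in $\mathbb{R}^2$. Consider $s \in \mathbb{S}^1$; we define the lower star filtration with respect to direction $s$ in two steps. First, we let $h_s \colon K \to \mathbb{R}$ be defined for a simplex $\sigma \subseteq K$ by $h_s(\sigma) = \max_{v \in \sigma} v \cdot s$, where $x \cdot y$ is the inner (dot) product and measures height in the direction of $y$, if $y$ is a unit vector. Intuitively, the height of $\sigma$ from $s$ is the maximum height of all vertices in $\sigma$. Then, for each $t \in \mathbb{R}$, the subcomplex $K_t := h_s^{-1}((-\infty, t])$ is composed of all simplices that lie entirely below or at the height $t$, with respect to the direction $s$. Notice $K_r \subseteq K_t$ for all $r \leq t$, and $K_r = K_t$ if no vertex has height in the interval $(r, t]$. The sequence of all such subcomplexes, indexed by $\mathbb{R}$, is the height filtration with respect to $s$, notated as $\{K_t\}_{t \in \mathbb{R}}$. Often, we simplify notation and define $F_s(K) := \{K_t\}_{t \in \mathbb{R}}$.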
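The height function and its sublevel subcomplexes can be sketched in a few lines. The following is a minimal illustration, assuming a plane graph stored as a list of vertex coordinates plus a list of index pairs for edges; the function names are hypothetical and not part of the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Edge = Tuple[int, int]  # pair of indices into the vertex list


def height(v: Point, s: Point) -> float:
    """Height of point v in direction s, i.e., the dot product v . s."""
    return v[0] * s[0] + v[1] * s[1]


def simplex_height(vertices: Sequence[Point], simplex: Sequence[int], s: Point) -> float:
    """Height of a simplex: the maximum height over its vertices."""
    return max(height(vertices[i], s) for i in simplex)


def sublevel_complex(
    vertices: Sequence[Point], edges: Sequence[Edge], s: Point, t: float
) -> Tuple[List[int], List[Edge]]:
    """All vertices and edges lying entirely at or below height t in direction s."""
    kept_vertices = [i for i, v in enumerate(vertices) if height(v, s) <= t]
    kept_edges = [e for e in edges if simplex_height(vertices, e, s) <= t]
    return kept_vertices, kept_edges
```

Sweeping the threshold upward through the vertex heights yields the nested sequence of subcomplexes that makes up the height filtration.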
The persistence diagram is a summary of the homology groups as the height parameter ranges from $-\infty$ to $\infty$; in particular, the persistence diagram is a set of birth-death pairs $(b, d)$. Each pair represents an interval $[b, d)$ corresponding to a homology generator. For example, a birth event may occur when the height filtration discovers a new vertex, representing a new component, and the corresponding death represents the vertex joining another connected component. By definition [5], all points in the diagonal are also included with infinite multiplicity. However, in this paper, we consider only those points on the diagonal that are explicitly computed in the persistence algorithm found in [5], which correspond to features with the same birth and death time. For a direction $s \in \mathbb{S}^1$, let the directional persistence diagram $\mathcal{D}_i(F_s(K))$ be the set of birth-death pairs for the $i$-th homology group from the height filtration $F_s(K)$. As with the height filtration, we simplify notation and define $\mathcal{D}_i(s) := \mathcal{D}_i(F_s(K))$ when the complex is clear from context. We conclude this section with a remark relating birth-death pairs in persistence diagrams to the simplices in $K$; a full discussion of this remark is found in [5].
[Adding a Simplex] Let $K$ be a simplicial complex and $\sigma$ a $k$-simplex whose faces are all in $K$. Let $\beta_i$ refer to the $i$-th Betti number, i.e., the rank of the $i$-th homology group. Then, the addition of $\sigma$ to $K$ will either increase $\beta_k$ by one or decrease $\beta_{k-1}$ by one.
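For graphs, this remark can be checked directly: adding an edge either merges two connected components (the zeroth Betti number drops by one) or closes a cycle (the first Betti number rises by one). The sketch below tracks both numbers with a union-find structure; the function name is hypothetical, and it assumes all vertices are present before any edge is added.

```python
from typing import List, Sequence, Tuple


def betti_after_each_edge(
    num_vertices: int, edges: Sequence[Tuple[int, int]]
) -> List[Tuple[int, int]]:
    """Return (beta_0, beta_1) after each edge is added, in the given order."""
    parent = list(range(num_vertices))

    def find(x: int) -> int:
        # Find the component representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    beta0, beta1 = num_vertices, 0
    history = []
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # the edge joins two components
            beta0 -= 1
        else:
            beta1 += 1  # the edge closes a one-cycle
        history.append((beta0, beta1))
    return history
```

In every step exactly one of the two numbers changes, and it changes by exactly one.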
Thus, we can form a bijection between simplices of $K$ and birth-death events in a persistence diagram. This observation is the crux of the proofs of the Vertex Reconstruction theorem in Section 4 and the Indegree from Diagram lemma in Section 5.
4 Vertex Reconstruction
In this section, we present an algorithm for recovering the locations of vertices of a simplicial complex $K$ using three directional persistence diagrams. Intuitively, for each direction, we identify the lines on which the vertices of $K$ must lie. We show that by choosing the three directions such that they satisfy a simple property, we can identify all vertex locations by searching for points in the plane where three lines intersect. We call these lines filtration lines:
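[Filtration Lines] Given a direction vector $s \in \mathbb{S}^1$ and a height $h \in \mathbb{R}$, the filtration line at height $h$ is the line, denoted $\ell(s, h)$, through point $h * s$ and perpendicular to direction $s$, where $*$ denotes scalar multiplication. Given a finite set of vertices $V \subset \mathbb{R}^2$, the filtration lines of $V$ are the set of lines
$$\mathbb{L}(s, V) = \{ \ell(s, h_s(v)) \}_{v \in V}.$$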
Notice that all lines in $\mathbb{L}(s, V)$ are parallel. Intuitively, if $v$ is a vertex in a simplicial complex $K$, then the line $\ell(s, h_s(v))$ occurs at the height where the filtration includes $v$ for the first time. If the height $h_s(v)$ is known but the complex $K$ is not, the line $\ell(s, h_s(v))$ defines all potential locations for $v$. By the Adding a Simplex remark, the births in the zero-dimensional persistence diagram are in one-to-one correspondence with the vertices of the simplicial complex $K$. Thus, we can construct $\mathbb{L}(s, V)$ from a single directional diagram in $O(n)$ time. Given filtration lines for three carefully chosen directions, we next show a correspondence between intersections of three filtration lines and vertices in $K$.
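As a small illustration of this construction, the sketch below reads the filtration-line heights off the births of a zero-dimensional diagram and intersects two filtration lines from different directions. It assumes a diagram is given as a list of (birth, death) pairs and represents a filtration line by its direction and height; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def filtration_line_heights(diagram0: Sequence[Tuple[float, float]]) -> List[float]:
    """Heights of the filtration lines: one per distinct birth in the diagram."""
    return sorted({birth for birth, _death in diagram0})


def intersect_lines(s1: Point, h1: float, s2: Point, h2: float) -> Point:
    """Intersection of the lines {p : p . s1 = h1} and {p : p . s2 = h2}."""
    det = s1[0] * s2[1] - s1[1] * s2[0]
    if abs(det) < 1e-12:
        raise ValueError("directions are (nearly) parallel")
    # Cramer's rule for the 2x2 linear system.
    x = (h1 * s2[1] - h2 * s1[1]) / det
    y = (s1[0] * h2 - s2[0] * h1) / det
    return (x, y)
```

With directions (1, 0) and (0, 1), the intersection of the lines at heights a and b is simply the point (a, b), which is one candidate vertex location.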
In what follows, given a direction and a point , define as a way to simplify notation.
[Vertex Existence Lemma] Let be a simplicial complex with vertex set of size . Let be linearly independent and further suppose that and each contain lines. Let be the collection of vertices at the intersections of lines in . Let such that for all , . Then, the following two statements hold true:
(2) For all , .
First, we prove Part (1).
() Let . Then, and for . Hence, , as desired.
() Assume, for the sake of contradiction, that and , yet . Since and , some other vertex must have height . Since , we know for . And, by () applied to , we know . Since , both and are in and on the line , but , which is a contradiction.
Next, we prove Part (2) of the lemma. Assume, for contradiction, that there exists such that . As , a vertex exists such that and lies on . However, , which is a contradiction. ∎
In the previous lemma, we needed to find a third direction with specific properties. If we use horizontal and vertical lines for our first two directions, then we can use the geometry of the boxes formed from these lines to pick the third direction. More specifically, we look at the box with the largest width and smallest height and pick the third direction so that if one of the corresponding lines intersects the bottom left corner of the box then it will also intersect the box somewhere along the right edge. In vert, the third direction was computed using this procedure with the second box from the left in the top row. Next, we give a more precise description of the vertex localization procedure.
[Vertex Localization] Let and be horizontal and vertical lines, respectively. Let (and ) be the largest (and smallest) distance between two lines of (and , respectively). Let be the smallest axis-aligned bounding box containing the intersections of lines in . For , let . Any line parallel to can intersect at most one line of in .
Note that, by definition, is a vector in the direction that is at a slightly smaller angle than the diagonal of the box of size by . Assume, by contradiction, that a line parallel to may intersect two lines of within . Specifically, let and let be a line parallel to such that the points for are the two such intersection points within . Notice since the lines of are horizontal and by the definition of , we observe that . Let , and observe . Since the slope of is , we have , which is a contradiction. ∎
We conclude this section with an algorithm to determine the coordinates of the vertices of the original graph in , using only three height filtrations.
[Vertex Reconstruction] Let $K$ be a plane graph. We can compute the coordinates of all $n$ vertices of $K$ in $O(n \log n)$ time from three directional persistence diagrams.
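Let $s_1 = (1, 0)$ and $s_2 = (0, 1)$, which are linearly independent. We compute the filtration lines $\mathbb{L}_i := \mathbb{L}(s_i, V)$ for $i = 1, 2$ in $O(n)$ time by the Adding a Simplex remark. By our general position assumption, no two vertices of $K$ share an $x$- or $y$-coordinate. Thus, the sets $\mathbb{L}_1$ and $\mathbb{L}_2$ each contain $n$ distinct lines. Let $A$ be the set of intersection points of the lines in $\mathbb{L}_1$ and $\mathbb{L}_2$. The next step is to identify a direction $s_3$ such that each line in $\mathbb{L}_3 := \mathbb{L}(s_3, V)$ intersects with only one point in $A$, so that we can use the Vertex Existence Lemma.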
Let (and ) be the greatest (and least) distance between two adjacent lines in (and , respectively). Let be the smallest axis-aligned bounding box containing , and let . By buffer, any line parallel to will intersect at most one line of in . Thus, we choose that is perpendicular to . By the second part of vertExist, we now have that each line in intersects . Thus, there are intersections between and in , each of which also intersects with .
The previous paragraph leads us to a simple algorithm for finding the third direction and identifying all the triple intersections. In the analysis below, steps that do not mention a number of diagrams use no diagrams. First, we construct the filtration lines $\mathbb{L}_1$ and $\mathbb{L}_2$ for the first two directions in $O(n)$ time using two directional persistence diagrams. Second, we sort the lines of $\mathbb{L}_1$ and $\mathbb{L}_2$ by their $x$- and $y$-intercepts, respectively, in $O(n \log n)$ time. Third, we find the third direction $s_3$ by computing the greatest and least distances from our sorted lines in $O(n)$ time. Fourth, we construct the filtration lines $\mathbb{L}_3$ for $s_3$ in $O(n)$ time using one directional persistence diagram. Fifth, we sort the lines in $\mathbb{L}_3$ by their intersection with the leftmost line of $\mathbb{L}_1$ in $O(n \log n)$ time. Finally, we compute the coordinates of the $n$ vertices by intersecting the $i$-th line of $\mathbb{L}_3$ with the $i$-th line of $\mathbb{L}_2$ in $O(n)$ time. (Observe that this last step works since the vertices correspond to the intersections inside the bounding box, as described above.)
Therefore, we use three directional diagrams (two in the first step and one in the fourth step) and $O(n \log n)$ time (sorting of lines in the second and fifth steps) to reconstruct the vertices. ∎
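The six steps of the proof can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the callable `births` is a hypothetical oracle returning the birth heights of the zero-dimensional diagram for a direction, and the third direction is taken perpendicular to $(w, h/2)$, where $w$ is the width of the bounding box and $h$ is the smallest gap between adjacent horizontal lines. That specific vector is an assumption of this sketch, chosen so that a line of the third family rises by less than $h$ across the box.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def make_birth_oracle(vertices: Sequence[Point]) -> Callable[[Point], List[float]]:
    """Stand-in oracle (assumption): births in direction s are the vertex heights."""
    def births(s: Point) -> List[float]:
        return [v[0] * s[0] + v[1] * s[1] for v in vertices]
    return births


def reconstruct_vertices(births: Callable[[Point], List[float]]) -> List[Point]:
    """Recover vertex coordinates from three calls to the birth oracle."""
    xs = sorted(births((1.0, 0.0)))  # vertical filtration lines x = c
    ys = sorted(births((0.0, 1.0)))  # horizontal filtration lines y = c
    if len(xs) == 1:
        return [(xs[0], ys[0])]
    w = xs[-1] - xs[0]                            # width of the bounding box
    h = min(b - a for a, b in zip(ys, ys[1:]))    # smallest horizontal-line gap
    norm = math.hypot(w, h / 2.0)
    s3 = (-(h / 2.0) / norm, w / norm)            # unit vector perpendicular to (w, h/2)
    cs = sorted(births(s3))                       # third family, sorted by height
    # The i-th line of the third family meets the i-th horizontal line at a vertex:
    # solve x * s3[0] + y * s3[1] = c for x.
    return [((c - y * s3[1]) / s3[0], y) for c, y in zip(cs, ys)]
```

Sorting the third family by height is equivalent here to sorting by its intersection with the leftmost vertical line, since the second coordinate of `s3` is positive. Only the two sorts exceed linear time.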
5 Edge Reconstruction
Given the vertices constructed in Section 4, we describe how to reconstruct the edges in a plane graph using persistence diagrams. The key to determining whether an edge exists is counting the degree of a vertex, restricted to edges “below” the vertex with respect to a given direction. We begin this section by defining the necessary terms, and then explicitly describe our algorithm for constructing edges.
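[Indegree of Vertex] Let $K$ be a plane graph with vertex set $V$ and edge set $E$. Then, for every vertex $v \in V$ and every direction $s \in \mathbb{S}^1$, we define:
$$\operatorname{Indeg}(v, s) = |\{ (v, v') \in E \mid s \cdot v' \leq s \cdot v \}|.$$
In other words, the indegree of $v$ is the number of edges incident to $v$ that lie below $v$, with respect to direction $s$; see indegree.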
Next, we prove that given a direction, we can determine the indegree of a vertex:
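[Indegree from Diagram] Let $K$ be a plane graph with vertex set $V$. Let $s \in \mathbb{S}^1$ be such that no two vertices are at the same height with respect to $s$, i.e., $s \cdot v \neq s \cdot v'$ for all distinct $v, v' \in V$. Let $\mathcal{D}_0(s)$ and $\mathcal{D}_1(s)$ be the zero- and one-dimensional persistence diagrams resulting from the height filtration $F_s(K)$ on $K$. Then, for all $v \in V$,
$$\operatorname{Indeg}(v, s) = |\{ (x, y) \in \mathcal{D}_0(s) \mid y = v \cdot s \}| + |\{ (x, y) \in \mathcal{D}_1(s) \mid x = v \cdot s \}|.$$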
Let $v, v' \in V$ such that $s \cdot v' < s \cdot v$, i.e., the vertex $v'$ is lower in direction $s$ than $v$. Then, by the Adding a Simplex remark, if $(v', v) \in E$, it must be one of the following in the filtration of $K$ defined by $s$: (1) an edge that joins two disconnected components; or (2) an edge that creates a one-cycle. Since edges are added to a filtration at the height of the higher vertex, we see (1) as a death in $\mathcal{D}_0(s)$ and (2) as a birth in $\mathcal{D}_1(s)$, both at height $s \cdot v$. In addition, each finite death in $\mathcal{D}_0(s)$ and every birth in $\mathcal{D}_1(s)$ at time $s \cdot v$ must correspond to an edge, i.e., edges are the only simplices that can cause these events. Then, the sets of edges of types (1) and (2) are $\{ (x, y) \in \mathcal{D}_0(s) \mid y = v \cdot s \}$ and $\{ (x, y) \in \mathcal{D}_1(s) \mid x = v \cdot s \}$, respectively. The size of the union of these two multi-sets is equal to the number of edges starting at a vertex lower than $v$ in direction $s$ and ending at $v$, as required. ∎
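To make the counting argument concrete, the sketch below computes the zero- and one-dimensional diagrams of a height filtration on a graph with a union-find structure and the elder rule, keeping the on-diagonal points, and then reads the indegree of a vertex off those diagrams. The function names are hypothetical, and this simple procedure stands in for the general persistence algorithm of [5]; it assumes no two vertices share a height in the chosen direction.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Edge = Tuple[int, int]
Pair = Tuple[float, float]
INF = float("inf")


def height_diagrams(
    vertices: Sequence[Point], edges: Sequence[Edge], s: Point
) -> Tuple[List[Pair], List[Pair]]:
    """Zero- and one-dimensional diagrams of the height filtration in direction s."""
    h = [v[0] * s[0] + v[1] * s[1] for v in vertices]
    parent = list(range(len(vertices)))
    birth = list(h)  # birth height of the component rooted at each vertex

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    d0: List[Pair] = []
    d1: List[Pair] = []
    # Each edge enters the filtration at the height of its higher endpoint.
    for u, v in sorted(edges, key=lambda e: max(h[e[0]], h[e[1]])):
        t = max(h[u], h[v])
        ru, rv = find(u), find(v)
        if ru == rv:
            d1.append((t, INF))  # the edge creates a one-cycle
        else:
            # Elder rule: the younger component dies when two components merge.
            young, old = (ru, rv) if birth[ru] > birth[rv] else (rv, ru)
            d0.append((birth[young], t))
            parent[young] = old
    roots = {find(i) for i in range(len(vertices))}
    d0.extend((birth[r], INF) for r in roots)  # components that never die
    return d0, d1


def indegree_from_diagrams(
    v: Point, s: Point, d0: Sequence[Pair], d1: Sequence[Pair], tol: float = 1e-9
) -> int:
    """Deaths in the zero-dim diagram plus births in the one-dim diagram at height v . s."""
    t = v[0] * s[0] + v[1] * s[1]
    deaths = sum(1 for _b, d in d0 if abs(d - t) <= tol)
    births = sum(1 for b, _d in d1 if abs(b - t) <= tol)
    return deaths + births
```

For a triangle filtered upward, the top vertex has indegree two: one of its edges kills a component (an on-diagonal point) and the other creates the cycle.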
In order to decide whether an edge exists between two vertices $v$ and $v'$, we look at the indegree of $v$ as seen by two close directions such that $v'$ is the only vertex in what we call a bow tie at $v$:
[Bow Tie] Let $v \in V$, and choose $s_1, s_2 \in \mathbb{S}^1$. Then, a bow tie at $v$ is the symmetric difference between the half planes below $v$ in directions $s_1$ and $s_2$. The width of the bow tie is half of the angle between $s_1$ and $s_2$.
Because no three vertices in our plane graph are collinear, for each pair of vertices $v, v' \in V$, we can always find a bow tie centered at $v$ that contains the vertex $v'$ and no other vertex in $V$; see bowtie.
We use bow tie regions that contain only one vertex other than the center to decide whether an edge exists between the center $v$ and that vertex $v'$ in our plane graph; this is made precise in the Edge Existence lemma.
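[Edge Existence] Let $K$ be a plane graph with vertex set $V$ and edge set $E$. Let $v, v' \in V$. Let $s_1, s_2 \in \mathbb{S}^1$ such that the bow tie $B$ at $v$ defined by $s_1$ and $s_2$ satisfies: $B \cap V = \{v'\}$. Then,
$$|\operatorname{Indeg}(v, s_1) - \operatorname{Indeg}(v, s_2)| = 1 \iff (v, v') \in E.$$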
Since edges in are straight lines, any edge incident to will either fall in the bow tie region or will be on the same side (above or below) of both lines. Let be the set of edges incident on and below both lines; that is Furthermore, suppose we split the bowtie into the two infinite cones. Let be the set of edges in one cone and be the set of edges in the other cone. We note that is equal to one if there is an edge with or and zero otherwise. Then, by definition of indegree,
which holds if and only if . Then , as required. ∎
Next, we prove that we can find the embedding of the edges in the original graph using directional persistence diagrams.
[Edge Reconstruction] Let $K$ be a plane graph, with vertex set $V$ and edge set $E$. If $V$ is known, then we can compute $E$ using $O(n^2)$ directional persistence diagrams.
We prove this theorem constructively. Intuitively, we construct a bow tie for each potential edge and use the Edge Existence lemma to determine whether the edge exists. Our algorithm has three steps: Step 1 is to determine a global bow tie width, Step 2 is to construct suitable bow ties for each pair of vertices in $V$, and Step 3 is to compute indegrees. See Appendix A for a worked example of the reconstruction.
Step 1: Determine bow tie width. For each vertex , we consider the cyclic ordering of the points in around . We define to be the minimum angle between all adjacent pairs of lines through ; see edgebow, where the angles between adjacent lines are denoted . Finally, we choose less than . By Lemmas and of , we compute the cyclic orderings for all vertices in in time. Since computing each is time once we have the cyclic ordering, the runtime for this step is .
Step 2: Construct bow ties. For each pair of vertices $v, v' \in V$ such that $v \neq v'$, let $s$ be a unit vector perpendicular to vector $v' - v$, and let $s_1, s_2$ be the two unit vectors that form angles $\pm\theta$ with $s$. Let $B$ be the bow tie at $v$ defined by $s_1$ and $s_2$. Note that by the construction, $B$ contains exactly one point from $V$, namely $v'$.
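Step 3: Compute indegrees. Using the directions $s_1$ and $s_2$ of the bow tie $B$ in the Indegree from Diagram lemma, compute $\operatorname{Indeg}(v, s_1)$ and $\operatorname{Indeg}(v, s_2)$. Then, using the Edge Existence lemma, we determine whether $(v, v')$ exists by checking if $|\operatorname{Indeg}(v, s_1) - \operatorname{Indeg}(v, s_2)| = 1$. If it does, the edge exists; if not, the edge does not.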
Repeating for all vertex pairs requires $O(n^2)$ diagrams and discovers the edges of $K$. ∎
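The three steps can be sketched as below. This is a minimal illustration under assumptions: `indegree` is a hypothetical oracle that returns the indegree of a vertex in a direction (in the paper's setting it would be read off two persistence diagrams; here a stand-in computes it from a known graph so the sketch can be run), and the bow tie half-width is taken to be half of the smallest angle between adjacent lines through any vertex, which is one valid choice of a width below that minimum.

```python
import math
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Edge = Tuple[int, int]


def min_adjacent_angle(points: Sequence[Point]) -> float:
    """Step 1: smallest angle between adjacent lines through any vertex."""
    best = math.pi
    for i, p in enumerate(points):
        # Undirected line angles in [0, pi) from p to every other vertex.
        angles = sorted(
            math.atan2(q[1] - p[1], q[0] - p[0]) % math.pi
            for j, q in enumerate(points) if j != i
        )
        if len(angles) < 2:
            continue
        gaps = [b - a for a, b in zip(angles, angles[1:])]
        gaps.append(angles[0] + math.pi - angles[-1])  # wrap-around gap
        best = min(best, min(gaps))
    return best


def rotate(s: Point, angle: float) -> Point:
    c, d = math.cos(angle), math.sin(angle)
    return (c * s[0] - d * s[1], d * s[0] + c * s[1])


def reconstruct_edges(
    points: Sequence[Point], indegree: Callable[[int, Point], int]
) -> List[Edge]:
    theta = min_adjacent_angle(points) / 2.0
    edges = []
    for i, j in combinations(range(len(points)), 2):
        dx, dy = points[j][0] - points[i][0], points[j][1] - points[i][1]
        norm = math.hypot(dx, dy)
        s = (-dy / norm, dx / norm)                      # Step 2: perpendicular to v' - v
        s1, s2 = rotate(s, theta), rotate(s, -theta)     # bow tie directions
        if abs(indegree(i, s1) - indegree(i, s2)) == 1:  # Step 3: edge existence test
            edges.append((i, j))
    return edges


def make_indegree_oracle(
    points: Sequence[Point], edges: Sequence[Edge]
) -> Callable[[int, Point], int]:
    """Stand-in oracle (assumption): indegree computed from the true graph."""
    nbrs = {i: set() for i in range(len(points))}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)

    def indegree(i: int, s: Point) -> int:
        hi = points[i][0] * s[0] + points[i][1] * s[1]
        return sum(1 for j in nbrs[i] if points[j][0] * s[0] + points[j][1] * s[1] <= hi)

    return indegree
```

Each vertex pair queries the oracle in two directions, so the number of directions used grows quadratically in the number of vertices.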
The Vertex Reconstruction and Edge Reconstruction theorems lead to our primary result. We can find the embedding of the vertices by the Vertex Reconstruction theorem using three directional persistence diagrams. Furthermore, we can discover edges with $O(n^2)$ directional persistence diagrams by the Edge Reconstruction theorem. Thus, we can reconstruct all edges and vertices of a one-dimensional simplicial complex:
[Plane Graph Reconstruction] Let $K$ be a plane graph with vertex set $V$ and edge set $E$. The vertices, edges, and exact embedding of $K$ can be determined using $O(n^2)$ persistence diagrams from different directions.
In this paper, we provide an algorithm to reconstruct a plane graph with $n$ vertices embedded in $\mathbb{R}^2$. Our method uses persistence diagrams by first determining vertex locations using only three directions, and, second, determining edge existence based on height filtrations and vertex degrees. Moreover, if we have an oracle that can return a diagram given a direction in time, then constructing the vertices takes and reconstructing the edges takes time.
This approach extends to several avenues for future work. First, we plan to generalize these reconstruction results to higher dimensional simplicial complexes. We can show that the vertices of a simplicial complex in $\mathbb{R}^d$ can be reconstructed in time using the complete arrangement of hyperplanes and directional persistence diagrams. We conjecture that this bound can be improved to using the same observation that allows us to do the final step of the vertex reconstruction in linear time. We have a partial proof in this direction, and can likewise extend the bow tie idea to higher dimensions, but the number of directions grows quite quickly. Second, we conjecture that we can reconstruct these plane graphs with a sub-quadratic number of height filtrations by utilizing more information from each height filtration. Third, we suspect a similar approach can be used to infer other graph metrics, such as classifying vertices into connected components. Intuitively, determining such metrics should require fewer persistence diagrams than required for a complete reconstruction. Finally, we plan to provide an implementation for reconstruction that integrates with existing TDA software.
This material is based upon work supported by the National Science Foundation under the following grants: CCF 1618605 (BTF, SM), DBI 1661530 (BTF, DLM, LW), DGE 1649608 (RLB), and DMS 1664858 (RLB, BTF, AS, JS). Additionally, RM thanks the Undergraduate Scholars Program. All authors thank the CompTaG club at Montana State University and the reviewers for their thoughtful feedback on this work.
References
[1] Mahmuda Ahmed, Sophia Karagiorgou, Dieter Pfoser, and Carola Wenk. Map construction algorithms. In Map Construction Algorithms, pages 1–14. Springer, 2015.
[2] Mahmuda Ahmed and Carola Wenk. Constructing street networks from GPS trajectories. In European Symposium on Algorithms, pages 60–71. Springer, 2012.
[3] Justin Curry, Sayan Mukherjee, and Katharine Turner. How many directions determine a shape and other sufficiency results for two topological transforms. arXiv:1805.09782, 2018.
[4] Tamal K. Dey, Jiayuan Wang, and Yusu Wang. Graph reconstruction by discrete Morse theory. arXiv:1803.05093, 2018.
[5] Herbert Edelsbrunner and John Harer. Computational Topology: An Introduction. American Mathematical Society, 2010.
[6] Xiaoyin Ge, Issam I Safa, Mikhail Belkin, and Yusu Wang. Data skeletonization via Reeb graphs. In Advances in Neural Information Processing Systems, pages 837–845, 2011.
[7] Robert Ghrist, Rachel Levanger, and Huy Mai. Persistent homology and Euler integral transforms. arXiv:1804.04740, 2018.
[8] Chad Giusti, Eva Pastalkova, Carina Curto, and Vladimir Itskov. Clique topology reveals intrinsic geometric structure in neural correlations. Proceedings of the National Academy of Sciences, 112(44):13455–13460, 2015.
[9] Sophia Karagiorgou and Dieter Pfoser. On vehicle tracking data-based road network generation. In SIGSPATIAL ’12: Proceedings of the 20th International Conference on Advances in Geographic Information Systems, pages 89–98. ACM, 2012.
[10] Balázs Kégl, Adam Krzyzak, Tamás Linder, and Kenneth Zeger. Learning and design of principal curves. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(3):281–297, 2000.
[11] Yongjin Lee, Senja D. Barthel, Paweł Dłotko, S. Mohamad Moosavi, Kathryn Hess, and Berend Smit. Quantifying similarity of pore-geometry in nanoporous materials. Nature Communications, 8:15396, 2017.
[12] David L. Millman and Vishal Verma. A slow algorithm for computing the Gabriel graph with double precision. In CCCG ’11: Proceedings of the 23rd Annual Canadian Conference on Computational Geometry, 2011.
[13] Monica Nicolau, Arnold J. Levine, and Gunnar Carlsson. Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. Proceedings of the National Academy of Sciences, 108(17):7265–7270, 2011.
[14] Abbas H. Rizvi, Pablo G. Camara, Elena K. Kandror, Thomas J. Roberts, Ira Schieren, Tom Maniatis, and Raul Rabadan. Single-cell topological RNA-seq analysis reveals insights into cellular differentiation and development. Nature Biotechnology, 35(6):551, 2017.
[15] Katharine Turner, Sayan Mukherjee, and Doug M. Boyer. Persistent homology transform for modeling shapes and surfaces. Information and Inference: A Journal of the IMA, 3(4):310–344, 2014.
[16] Ying Zheng, Steve Gu, Herbert Edelsbrunner, Carlo Tomasi, and Philip Benfey. Detailed reconstruction of 3D plant root shape. In Proceedings of the IEEE International Conference on Computer Vision, pages 2026–2033, 2011.
Appendix A Example of Reconstructing a Plane Graph
We give an example of reconstructing a plane graph. Consider the complex given in exampleVertex.
First, we find vertex locations using the algorithm described in vRec. We need to choose pairwise linearly independent vectorsand such that only three-way intersections in exist; note that in this example, . Using the persistence diagrams from height filtrations in directions and , we construct the set of lines . This results in possible locations for the vertices at the intersections in . We show these filtration lines and intersections in Vertx-second_dir. Next, we compute the third direction using the algorithm outlined in intComp. To do this, we need to find the greatest horizontal distance between two vertical lines, and the least vertical distance between two horizontal lines, . Then, we use these to choose a direction perpendicular to (e.g., ). Then, the four three-way intersections in identify all Cartesian coordinates of the original complex. We show filtration lines from all three directions in Vertex-third_dir.
Next, we reconstruct all edges as described in eRec. In order to do so, we first find the we will use to construct bow ties. To do this, we examine each vertex in turn, finding , the minimum angle between adjacent pairs of lines through and . Ordering by increasing -coordinate, we find to be approximately , and radians, respectively. Then, we take to be less than the minimum of these, i.e. .
Now, for each of the pairs of vertices , we construct a bow tie and then use this bow tie to determine whether an edge exists between the two vertices. We go through two examples: one for a pair of vertices that does have an edge between, and one for a pair that does not. First, consider the pair and . To construct their bow tie, we first find the unit vector perpendicular to the vector that points from to , which is . Now, we find such that they make angles with . We choose and . Now, by Indegree, we can use the persistence diagrams from these two directions to compute and . We observe that contains exactly one birth-death pair such that and has one birth-death pair such that . Thus, . On the other hand, contains exactly one birth-death pair such that , but contains no birth-death pair such that . So . Now, since , we know that , by edgeExist.
For the second example, consider the pair of vertices and . Again, we construct their bow tie by finding a unit vector perpendicular to the vector pointing from to . We choose this . Then, the and which form angle radians (e.g ) with are and . Again by Indegree, we examine the zero- and one-dimensional persistence diagrams from these two directions to compute the indegree from each direction for vertex . In , we have one pair which dies at , but in , no pair is born at . So . We see the exact same for , which means that . Since edgeExist tells us that we have an edge between and only if the absolute value of the difference of indegrees is one, we know that there is no edge between vertices and .
In order to reconstruct all edges, we perform the same computations for all pairs of vertices.