# Reconstructing Embedded Graphs from Persistence Diagrams

The persistence diagram (PD) is an increasingly popular topological descriptor. By encoding the size and prominence of topological features at varying scales, the PD provides important geometric and topological information about a space. Recent work has shown that particular sets of PDs can differentiate between different shapes. This trait is desirable because it provides a method of representing complex shapes using finite sets of descriptors. The problem of choosing such a set of representative PDs and then using them to uniquely determine the shape is referred to as reconstruction. In this paper, we present an algorithm for reconstructing embedded graphs in R^d (plane graphs in R^2) with n vertices from n^2 - n + d + 1 directional PDs. Lastly, we empirically validate the correctness and time-complexity of our algorithm in R^2 on randomly generated plane graphs using our implementation, and explain the numerical limitations of implementing our algorithm.

## 1 Introduction

Topological data analysis (TDA) provides a set of promising tools to help analyze data in fields as varied as materials science, transcriptomics, and neuroscience [giusti2015clique, lee2017quantifying, rizvi2017single]. The wide applicability is due to the fact that many forms of data can be modeled as graphs or simplicial complexes, two widely-studied types of topological spaces. Topological spaces are described in terms of their invariants, such as the homotopy type or homology classes. Persistent homology considers the evolution of the homology groups in a filtered topological space.

#### Motivation

The problem of manifold and stratified space learning is an active research area in computational mathematics. For example, Chambers et al. use persistent homology in a stratified space setting [chambers2018heuristics], describing an algorithm to identify shapes that simplify a noisier shape, and then confirm that the given simplification still satisfies the desired topological properties. Zheng et al. also address a problem in space learning, and study the 3D reconstruction of plant roots from multiple 2D images [rootreconstruction], using persistent homology to ensure the resulting 3D root model is connected. Another reconstruction problem involves reconstructing road networks. Three common approaches to solving these problems involve Point Clustering, Incremental Track Insertion, and Intersection Linking [maps]. Ge, Safa, Belkin, and Wang develop a point clustering algorithm using Reeb graphs to extract the skeleton graph of a road from point-cloud data [ge2011data]. The original embedding can be reconstructed using a principal curve algorithm [kegl2000learning]. Karagiorgou and Pfoser give an algorithm to reconstruct a road network from vehicle trajectory GPS data by identifying intersections with clustering, then using vehicle trajectories to connect them [ili]. Ahmed et al. provide an incremental track insertion algorithm to reconstruct road networks from point cloud data [iti]. The reconstruction is done incrementally, using a variant of the Fréchet distance to partially match input trajectories to the reconstructed graph. Ahmed, Karagiorgou, Pfoser, and Wenk describe all these methods in [maps]. Finally, Dey, Wang, and Wang use persistent homology to reconstruct embedded graphs. This research has also been applied to input trajectory data [dey2018graph]. Dey et al. use persistence to guide the Morse cancellation of critical simplices. 
We see from these applications the necessity for reconstruction algorithms, and in particular the necessity for reconstruction algorithms of graphs since much of the research involving reconstruction of road networks involves reconstructing graphs.

We explore the reconstruction of graphs from a widely used topological descriptor of data, persistence diagrams. The problem of reconstruction for simplicial complexes has received significant recent attention [turner2014persistent, curry2018directions, ghrist2018persistent]. Our work is motivated by [turner2014persistent], which proves that one can reconstruct simplicial complexes in R^2 and R^3 from an uncountably infinite number of persistence diagrams. In this work, we use a version of persistence diagrams that includes information that is not normally considered; thus, for clarity, we refer to these descriptors as augmented persistence diagrams (APDs). Our approach to reconstruction differs from those listed above because we provide the first deterministic algorithms using APDs generated from a specific set of directional height filtrations to reconstruct the original graph.

#### Our Contribution

In this work, we focus on graphs embedded in R^d (plane graphs in R^2) and use directional APDs to reconstruct a graph. In particular, our main contributions are an upper bound on the number of APDs required for reconstructing embedded graphs in R^d, a polynomial-time algorithm for reconstructing plane graphs, and the first deterministic reconstruction algorithm for embedded graphs in arbitrary dimension.

The current paper is an extension of conference proceedings from CCCG 2018 [belton2018learning]. We extend the proceedings paper in the following ways: (1) we revise proofs for clarity; (2) we extend our algorithms for graph reconstruction to R^d; (3) we expand our literature review to include a discussion of recent results; (4) we publicly release code for our algorithm (the code is available in a git repo hosted on GitHub: https://github.com/compTAG/reconstruction); and (5) we provide an experimental section to demonstrate the implementation.

## 2 Preliminaries

We begin by summarizing the necessary background information, but refer the reader to [edelsbrunner2010computational] for a more comprehensive overview of computational topology.

#### Plane Graphs

Our main object of study is plane graphs with straight-line embeddings (referred to simply as plane graphs throughout this paper). A plane graph is a set of vertices and a set of straight line connections between pairs of vertices called edges (denoted by and respectively), such that no two edges in the embedding cross. We will frequently denote as the number of vertices in . Throughout this paper, we make assumptions about the positioning of vertices in graphs.

[General Position] Let be a graph embedded in with vertices . We assume that for all where and , for any . Furthermore, we assume that no vertices are coplanar.

#### Height Filtration

Let be a plane graph. Consider a direction in the unit sphere in ; we define the lower-star filtration with respect to direction in two steps. First, let be defined for a simplex by where is the inner (dot) product and measures height of in the direction of , since is a unit vector. Thus, the height is the maximum height of all vertices in . Then, for each , the subcomplex is composed of all simplices that lie entirely below or at height , with respect to direction . Notice for all and if no vertex has height in the interval . The sequence of all such subcomplexes, indexed by , is the height filtration with respect to , denoted . Notice that the complex changes a finite number of times in this filtration. We note that the lower-star filtration is the discrete analog to a lower-levelset filtration, where the complex is intersected with a rising closed half-plane.
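The two steps above can be illustrated with a minimal Python sketch. The function names (`height`, `simplex_height`, `sublevel_complex`) and the triangle example are hypothetical and chosen only for illustration; edges are stored as pairs of vertex indices, and the direction is assumed to be a unit vector.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Edge = Tuple[int, int]


def height(direction: Point, point: Point) -> float:
    """Height of a point in a unit direction: the dot product."""
    return direction[0] * point[0] + direction[1] * point[1]


def simplex_height(direction: Point, vertices: List[Point], simplex) -> float:
    """Height of a simplex: the maximum height over its vertices."""
    return max(height(direction, vertices[i]) for i in simplex)


def sublevel_complex(direction: Point, vertices: List[Point],
                     edges: List[Edge], t: float):
    """Vertices and edges that lie entirely at or below height t."""
    kept_vertices = [i for i in range(len(vertices))
                     if height(direction, vertices[i]) <= t]
    kept_edges = [e for e in edges
                  if simplex_height(direction, vertices, e) <= t]
    return kept_vertices, kept_edges


# Hypothetical example: a triangle filtered in the direction (0, 1).
vertices = [(0.0, 0.0), (2.0, 1.0), (1.0, 3.0)]
edges = [(0, 1), (1, 2), (0, 2)]
up = (0.0, 1.0)
# The complex only changes at vertex heights, so sampling those suffices.
filtration = [sublevel_complex(up, vertices, edges, t) for t in (0.0, 1.0, 3.0)]
```

Sampling the filtration only at the vertex heights reflects the observation that the complex changes a finite number of times.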

#### Axis Aligned Directions

When considering a standard unit basis vector of  in dimension , we will write to denote the direction.

#### Augmented Persistence Diagrams

The (directional) persistence diagram for a simplicial complex is a summary of the homology groups  as the height parameter ranges from to ; in particular, the persistence diagram is a set of birth-death pairs , each with a corresponding dimension . Each pair represents an interval  corresponding to a generator of the -th homology group. In the specific application of plane graphs, a birth event may occur either when the height filtration discovers a new vertex (), representing a new component, or when a one-cycle appears (). Zero-dimensional deaths correspond to connected components merging. One-dimensional deaths occur when two cycles merge together, which includes the case where a cycle is filled in. By definition [edelsbrunner2010computational], all points in the diagonal  are also included with infinite multiplicity. However, in what follows, we use the multi-set of diagonal points that are explicitly computed, as in the algorithm given in Chapter of [edelsbrunner2010computational]. We refer to persistence diagrams containing only these explicitly computed diagonal points, rather than all points on the diagonal, as augmented persistence diagrams (APDs). We denote the space of all APDs by .

For a direction , let the directional augmented persistence diagram be the set of birth-death pairs for the -th homology group from the height filtration . As with the height filtration, we simplify notation and define when the complex is clear from context. If is not specified, we write to indicate the set of birth-death pairs for homology groups in all dimensions from the height filtration . We denote to be the -th Betti number, i.e., the rank of the -th homology group. In general, the complexity of computing a persistence diagram is matrix multiplication time, with respect to the number of simplices in the filtration; that is, the complexity is , where corresponds to the smallest known exponent for matrix multiplication time. In some cases (e.g., for computing or when is a height filtration in ), the computation time is , where is the inverse Ackermann function.

In what follows, for the unknown complex , we assume that we have an oracle that can take a direction  and a value , and return the diagram . We define such that is the time complexity for to return this diagram. Notice that , where denotes the number of off-diagonal points in .

Next, we state a lemma relating birth-death pairs in APDs to the simplices in . We omit the proof, but refer the reader to [edelsbrunner2010computational, pp.  of §] for more details.

[Adding a Simplex] Let be a simplicial complex and be a -simplex such that . Then, the addition of to will either increase  by one or decrease  by one.

Thus, we form a bijection between simplices of and birth-death events in an APD. If is any graph, then the maximum number of edges in is , and so . In the case when is a plane graph, due to the planarity of . Furthermore, an APD will have at least points from the vertices in corresponding to births in the zero-dimensional diagram. These observations give us the following corollary on the size of APDs for graphs.

[Size of Augmented Persistence Diagrams] Let be an embedded graph and be the number of vertices in . Then an augmented persistence diagram will have birth-death pairs. In the case when is a plane graph, the augmented persistence diagram will have birth-death pairs.
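To make the correspondence between simplices and birth-death events concrete, the following Python sketch computes the zero- and one-dimensional APDs of a graph under a height filtration. It uses a standard union-find with the elder rule, which is one common way to compute zero-dimensional persistence and is not necessarily how the released implementation works; the function name `augmented_diagrams` and the example are hypothetical. Since a graph has no two-simplices, every one-cycle is assigned an infinite death time here.

```python
import math
from typing import List, Sequence, Tuple


def augmented_diagrams(vertices: Sequence[Sequence[float]],
                       edges: Sequence[Tuple[int, int]],
                       direction: Sequence[float]):
    """Return (dgm0, dgm1) for the height filtration of a graph,
    keeping the explicitly computed diagonal points."""
    heights = [sum(s * x for s, x in zip(direction, v)) for v in vertices]
    parent = list(range(len(vertices)))

    def find(i: int) -> int:
        # The root of a component is always its lowest vertex.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    dgm0: List[Tuple[float, float]] = []
    dgm1: List[Tuple[float, float]] = []
    # An edge enters the filtration at the height of its higher endpoint.
    for u, v in sorted(edges, key=lambda e: max(heights[e[0]], heights[e[1]])):
        t = max(heights[u], heights[v])
        ru, rv = find(u), find(v)
        if ru == rv:
            dgm1.append((t, math.inf))  # the edge closes a one-cycle
        else:
            # Elder rule: the component born later dies at height t.
            young, old = (ru, rv) if heights[ru] > heights[rv] else (rv, ru)
            dgm0.append((heights[young], t))
            parent[young] = old
    # Components that are never merged persist forever.
    roots = {find(i) for i in range(len(vertices))}
    dgm0.extend((heights[r], math.inf) for r in roots)
    return sorted(dgm0), sorted(dgm1)


# Hypothetical example: a triangle filtered in the direction (0, 1).
tri_vertices = [(0.0, 0.0), (2.0, 1.0), (1.0, 3.0)]
tri_edges = [(0, 1), (1, 2), (0, 2)]
dgm0, dgm1 = augmented_diagrams(tri_vertices, tri_edges, (0.0, 1.0))
```

In this example the zero-dimensional diagram has one point per vertex, two of them on the diagonal, and the single one-dimensional point records the edge that closes the triangle.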

## 3 Related Work

Let be a simplicial complex embedded in , for some . Turner et al. introduced the persistent homology transform , defined by ; see [turner2014persistent]. Intuitively, the considers the persistent homology of a simplicial complex using filtrations induced from every direction in . The Euler characteristic transform (ECT) defined by is a similar function that maps each direction in to an Euler characteristic curve that tracks the Euler characteristic of each subcomplex induced by a height filtration. Turner et al. show that both of these functions are injective when the vertices of are in general position and . Recently, variations of these functions have attracted interest in other research domains and researchers are realizing the potential of persistent homology as an effective data descriptor. For example, in [crawford2019predicting] the smooth Euler characteristic transform (SECT) is introduced as a method of determining clinical outcomes using MRIs from patients with glioblastoma multiforme (GBM). Furthermore, a recent survey by Oudot and Solomon explores the current state of inverse problems in topological persistence as a potential tool for producing explainable data descriptors [outdot2018inverse]. The authors suggest new research directions and potential applications in the field of machine learning using approaches such as the PHT and ECT. These experiments and extensions offer insight into new applications and future work for topological summary statistics and demonstrate effectiveness in new research domains, suggesting further exploration of these tools.

In order to leverage the injectivity for shape comparison, two approaches can be taken: (1) provide an algorithm that reconstructs the shape from a subset of the directions; (2) show that has a finite representation. In the current paper, we take the first approach and show that not only is injective for graphs embedded in Euclidean space, but we can select a finite set of directions that will allow us to reconstruct the original complex from the directional augmented persistence diagrams from directions in . In particular, we prove that a quadratic number of directions (with respect to the number of vertices) is sufficient to reconstruct a graph, given an oracle that can compute directional APDs. One method to tackle the second approach is by observing that diagrams only provide new information when a transposition in the ordering of the filtration occurs. Furthermore, changes in the filtration bound changes in the distance between persistence diagrams [cohen2007stability]. For example, consider and a finite geometric simplicial complex . Since transpositions can only happen when two vertices are swapped in the filtration, the set of directions in for which two vertices occur at the same height in the height filtration is finite. In other words, there are two (most likely unequal) ‘hemispheres’ of for which the vertices occur at different heights. For each pair of vertices, we have two different hemispheres that maintain the ordering of these vertices relative to one another in the filtration. We can consider these hemispheres for all pairs of vertices and consider the regions for which the ordering is stable. For each region , the persistence diagram continuously varies and no transpositions (or ‘knees’) are witnessed [cohen2006vines]. This alternate approach was investigated independently by [curry2018directions].
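The stratification of directions described above can be sketched for the planar case. The Python snippet below is an illustrative sketch, assuming vertices in R^2 and directions parameterized by an angle on the unit circle; the names `critical_angles` and `height_order` are hypothetical. Two vertices u and v share a height in direction s exactly when s is perpendicular to u - v, which gives two antipodal critical directions per pair.

```python
import math
from itertools import combinations
from typing import List, Tuple

Point = Tuple[float, float]


def critical_angles(vertices: List[Point]) -> List[float]:
    """Sorted angles in [0, 2*pi) of the unit directions in which
    some pair of vertices has equal height."""
    angles = []
    for (ux, uy), (vx, vy) in combinations(vertices, 2):
        dx, dy = ux - vx, uy - vy
        # (-dy, dx) is perpendicular to u - v; so is its negation.
        theta = math.atan2(dx, -dy)
        angles.append(theta % (2 * math.pi))
        angles.append((theta + math.pi) % (2 * math.pi))
    return sorted(angles)


def height_order(vertices: List[Point], theta: float) -> List[int]:
    """Vertex indices sorted by height in the direction at angle theta."""
    s = (math.cos(theta), math.sin(theta))
    return sorted(range(len(vertices)),
                  key=lambda i: s[0] * vertices[i][0] + s[1] * vertices[i][1])


# Hypothetical example with three vertices.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
angles = critical_angles(vertices)
```

Between two consecutive critical angles the vertex ordering returned by `height_order` does not change, so one sampled direction per arc captures every ordering of the filtration.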

In an independent investigation [ghrist2018persistent], Ghrist, Levanger, and Mai further explore the persistent homology transform (and other invertible transforms), providing an alternate proof of injectivity. The authors show that can be converted into a Radon integral transform, which is shown to be invertible. However, in contrast to our work, the main result of [ghrist2018persistent] is a proof of theoretical injectivity and does not describe any reconstruction methods that could be implemented in practice.

In an arXiv preprint [curry2018directions], Curry et al. proved a finite bound on the number of directions necessary to reconstruct an unknown geometric simplicial complex in from Euler characteristic curves, i.e., using a finite subset of curves from . First, the embedding of the vertices of is determined using an assumed lower bound on the local geometry around any given vertex. The vertices are discovered by generating topological summaries (i.e., Euler characteristic curves or persistence diagrams) by choosing directions in based on to ensure that the embedding of each vertex is identified. Then, the -dimensional sphere is stratified using hyperplanes that intersect pairs of vertices. Directions from each stratum are sampled to generate a set of Euler characteristic curves. Theorem of [curry2018directions] shows that this set of Euler characteristic curves uniquely identifies the embedding of with a finite number of Euler characteristic curves (or persistence diagrams). However, these bounds are dependent on assumptions about the curvature of the complex and are exponential in the dimension.

It should be noted that persistence diagrams encode at least the same information encoded by Euler characteristic curves, so this finite bound also applies to persistence diagrams. However, extending methods that use persistence diagrams for reconstruction to methods that use Euler characteristic curves for reconstruction introduces its own set of challenges, which are explored in [fasy2018challenges]. In particular, certain arrangements of vertices are identified that are reconstructible by PDs generated by a finite set of directions that are not reconstructible by the ECCs generated from the same set of directions. Our work differs from these approaches by providing an explicit reconstruction algorithm, with complexity analysis and implementation.

## 4 Vertex Reconstruction

Next, we present an algorithm for recovering the locations of vertices of an embedded graph. We begin with a plane graph , where we are able to use three directional augmented persistence diagrams. We then extend this method for any embedded graph in , using directional augmented persistence diagrams.

### 4.1 Vertex Reconstruction for Plane Graphs

Intuitively, for each direction, we identify the lines on which the vertices of must lie. We show how to choose specific directions so that we can identify all vertex locations by searching for points in the plane where three lines intersect. We call these lines filtration lines:

[Filtration Hyperplanes and Filtration Lines] Given a direction and a height , the filtration hyperplane at height is the -dimensional hyperplane, denoted , through point  and perpendicular to direction , where denotes scalar multiplication. Given a finite set of vertices , the filtration hyperplanes of  are the set of hyperplanes

$$\pLines{\dir}{V} := \left\{ \pLine{\dir}{\hFiltFun{\dir}(v)} \right\}_{v \in V}.$$

In the special case when , we refer to filtration hyperplanes as filtration lines. Notice that all hyperplanes in are parallel, and that . Intuitively, if where is the vertex set of some plane graph , then the line occurs at the height where the filtration  includes  for the first time. If the height is known but the complex is not, then we know that must be contained on the line . By addSimp, the births in the zero-dimensional augmented persistence diagram are in one-to-one correspondence with the vertices of the plane graph . Thus, we can construct  from a directional diagram in  time by iterating through the points in the zero-dimensional augmented persistence diagram. Using filtration lines, we show a correspondence between intersections of three sets of filtration lines and the vertices in . In what follows, given a direction and a point , define as a way to simplify notation.

[Vertex Existence] Let  be a plane graph and let . Let be linearly independent and further suppose that and each contain  lines. Let be the intersection points between lines in and in . Let such that each has a unique height in direction . Then, for all , , i.e., each shares an intersection point with filtration lines from and .

###### Proof.

Assume, for contradiction, that there exists  such that . Since , there is some vertex such that  and lies on . However, is in , contradicting the hypothesis that . ∎

If we generate vertical lines, , and horizontal lines, , for our first two directions, then only a finite number of directions in have been eliminated for the choice of . In the next lemma, we choose a specific third direction by considering a bounding region defined by the largest distance between any two lines in and smallest distance between any two consecutive lines in . Then, we pick the third direction so that if one of the corresponding lines intersects the bottom left corner of this region then it will also intersect the region along its right edge. In vert, the third direction was computed using this procedure with the region having a width that is the length between the left most and right most vertical lines, and height that is the length between the top two horizontal lines. Next, we give a more precise description of the vertex localization procedure.

[Vertex Localization] Let and be  horizontal and vertical lines, respectively. Let  (and ) be the largest (and smallest) distance between two lines of (and , respectively). Let be the smallest axis-aligned bounding region containing the intersections of lines in . Let , i.e., a unit vector oriented towards the point . Any line parallel to can intersect at most one line of in .

###### Proof.

Note that, by definition, is a vector in the direction that is at a slightly smaller angle than the diagonal of the region with width and height . Assume, by contradiction, that a line parallel to can intersect two lines of within . Specifically, let and let  be a line parallel to such that the points for  are the two such intersection points within . Since the lines of are horizontal and by the definition of , we observe that . Let , and observe . Since the slope of is , we have , which is a contradiction. ∎

We conclude the discussion of plane graph reconstruction with an algorithm to determine the coordinates of the vertices of the original graph in , using only three height filtrations.

[Vertex Reconstruction] Let  be a plane graph. We can compute the coordinates of all vertices of  using three directional augmented persistence diagrams in time, where is the time complexity of computing a single directional augmented persistence diagram for .

###### Proof.

We proceed with a constructive proof that is presented as an algorithm in vert_recon. Let be an oracle that takes a direction and returns the zero-dimensional directional APD for the unknown plane graph in direction in time.

We start with requesting two directional augmented persistence diagrams from the oracle, and . Note that, by our general position assumption, no two vertices of share an x- or y-coordinate. By PDpoints, the sets and (which we do not explicitly construct) each contain distinct lines. Let be the resulting set of heights of lines in , in increasing order. Likewise, let be the ordered set of heights of lines in , also in increasing order. We explicitly construct and sort these two sets and : the birth times in the persistence diagrams correspond to heights of the filtration lines in .

Let be the set of intersection points between the lines in and in . Exactly of these points correspond to vertices of . The next step is to identify a third direction such that each line in  intersects with only one point in , which we will use in order to distinguish which intersection points correspond to vertices in .

Let and let  be the minimum of . In words, is the difference between the maximum and minimum heights of lines in and is the minimum height difference between consecutive lines in ; see vert. Note that we can compute in  time from and in time from . Let be the smallest axis-aligned bounding region containing the intersection points , and let be a unit vector perpendicular to the vector . We request the set from our oracle . As before, the heights of the lines in are the birth times of points in . We save this set of heights as in time, and sort in time.

Finally, by vertex-localization, any line in  intersects no more than one line of  within . Furthermore, by vertExist, each line in intersects . Thus, there are exactly intersection points of with the set , locating the vertices in . We compute these intersections by intersecting the -th line of  with the -th line of in time.

In total, this algorithm, summarized in vert_recon, uses three directional diagrams, two requested from the oracle in Line 1 and one requested in Line 8. These two lines take time each, Lines 1 and 8 take time each, and the for loop in Lines 11 through 15 takes time. All other lines are linear or constant, with respect to . Thus, the total time complexity is . ∎
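The constructive proof can be sketched in Python as follows. This is an illustrative sketch rather than the released implementation: the oracle is modeled as a hypothetical callable `birth_heights` that returns the zero-dimensional birth heights for a direction, and the third direction is assumed to be the unit vector perpendicular to (w, h/2), one choice consistent with a line at a slightly smaller angle than the diagonal of the bounding region. Under that assumption the third-direction lines appear in the same order as the horizontal lines, so the i-th line of one set is intersected with the i-th line of the other.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def reconstruct_vertices(
        birth_heights: Callable[[Point], List[float]]) -> List[Point]:
    """Recover vertex coordinates of a plane graph from three directions."""
    xs = sorted(birth_heights((1.0, 0.0)))  # heights of the vertical lines
    ys = sorted(birth_heights((0.0, 1.0)))  # heights of the horizontal lines
    if len(ys) < 2:
        return [(xs[0], ys[0])]
    w = xs[-1] - xs[0]                          # width of the bounding region
    h = min(b - a for a, b in zip(ys, ys[1:]))  # smallest horizontal gap
    norm = math.hypot(h / 2.0, w)
    # Assumed third direction: unit vector perpendicular to (w, h / 2).
    s = (-h / (2.0 * norm), w / norm)
    cs = sorted(birth_heights(s))
    points = []
    for c, y in zip(cs, ys):
        # Solve s[0] * x + s[1] * y = c for x on the horizontal line at y.
        x = (c - s[1] * y) / s[0]
        points.append((x, y))
    return points


# Hypothetical stand-in for the oracle: the births of the zero-dimensional
# APD are the vertex heights in the requested direction.
example = [(0.0, 0.0), (2.0, 1.0), (1.0, 3.0)]
recovered = reconstruct_vertices(
    lambda s: [s[0] * x + s[1] * y for x, y in example])
```

The recovered x-coordinates are computed in floating point, so in practice they would be compared against the known vertical line heights with a tolerance.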

### 4.2 Vertex Reconstruction in R^d

The vertex reconstruction algorithm of the previous subsection generalizes to higher dimensions. In , a filtration line becomes a filtration hyperplane, a -dimensional hyperplane that goes through one of the vertices in the vertex set (and is perpendicular to a given direction). Similar to filtration lines, filtration hyperplanes generated by a fixed direction are parallel and are in a 1-1 correspondence with the vertices (for almost all directions).

[Generalized Vertex Existence] Let  be a straight-line embedded graph in . Let be linearly independent directions in and further suppose that contain  filtration hyperplanes for each . Choosing one hyperplane in each set , the intersection of these hyperplanes is a point. Let  denote the such intersection points. Let such that for any , and contains filtration hyperplanes. Then, for all , , i.e., each shares an intersection point with filtration lines from .

###### Proof.

Assume, for contradiction, that there exists  such that . Since , there exists a vertex such that  and lies on . However, is in , contradicting the hypothesis . ∎

Just as in the case of plane graphs, we can now describe a method for locating all vertices. The following lemma is a higher-dimensional analogue of vertex-localization.

[Generalized Vertex Localization] Let be a straight-line embedded graph in , with . Choosing one hyperplane in each set  for , the intersection of these hyperplanes is a point. Let denote the such intersection points. Then, we can find a direction such that each hyperplane in intersects at most one of the points in  in  time.

###### Proof.

Let  be the ordered set of heights of lines in . Let be the largest distance between any two hyperplanes in , and let . Let be the smallest height difference between any two (adjacent) hyperplanes in , and let . We describe next how to choose a direction perpendicular to the hyperplane that intersects the origin and each . Let our hyperplane be defined by the rows in

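$$H = \begin{bmatrix} 0 & 0 & \dots & 0 & 0 \\ w & 0 & \dots & 0 & \frac{h}{d-1} \\ 0 & w & \dots & 0 & \frac{h}{d-1} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \dots & w & \frac{h}{d-1} \end{bmatrix}$$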

Then, we choose a vector orthogonal to the hyperplane by solving . We note that there are two solutions, but without loss of generality, we choose to solve the equation.

We now show that satisfies the claim that each hyperplane in intersects at most one of the points in  in time. Let and be points in with , and assume, for contradiction, that they lie on the same hyperplane in . Then, . By the definition of dot product, we have the following equation:

$$\frac{d-1}{h}(p_d - q_d) - \sum_{i=1}^{d-1} \frac{1}{w}(p_i - q_i) = 0$$

Since and are positive numbers, we can rearrange this equality to obtain:

$$|p_d - q_d| = \frac{h}{w(d-1)} \left| \sum_{i=1}^{d-1} (p_i - q_i) \right|. \tag{1}$$

Recall that . Therefore, we know that . Applying equality (1) to this inequality, we obtain:

$$2h \le \frac{h}{w(d-1)} \left| \sum_{i=1}^{d-1} (p_i - q_i) \right| \le \frac{h}{w(d-1)} (d-1) \max_{1 \le i \le d-1} |p_i - q_i| \le \frac{h}{w} \max_{1 \le i \le d-1} w_i$$

Thus, we have , which is a contradiction to .

We analyze the complexity of computing and . For each direction , we first sort the heights of , which takes time. Then, to compute  from the sorted set is constant time (as it is the maximum value minus the minimum value of the heights), and computing is time. Computing and from the sets and is time. Thus, the bottleneck is sorting in each direction, which makes the total runtime . ∎
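The choice of direction in this proof can be sketched in Python. Two assumptions are made here because the definitions are only partly legible: w is taken to be the largest spread of heights among the first d-1 axis directions, and h is taken to be half the smallest gap between adjacent heights in the d-th direction. The function name `separating_direction` and the example are hypothetical. The returned vector is proportional to (-1/w, ..., -1/w, (d-1)/h), which is orthogonal to every row of H.

```python
import math
from itertools import product
from typing import List, Sequence, Tuple


def separating_direction(
        axis_heights: Sequence[Sequence[float]]
) -> Tuple[List[float], float, float]:
    """Return (s, w, h), where s is a unit vector orthogonal to the rows of H."""
    d = len(axis_heights)
    ordered = [sorted(hs) for hs in axis_heights]
    # Assumption: w is the largest spread among the first d-1 axes.
    w = max(hs[-1] - hs[0] for hs in ordered[:-1])
    # Assumption: h is half the smallest adjacent gap along axis d.
    last = ordered[-1]
    h = min(b - a for a, b in zip(last, last[1:])) / 2.0
    raw = [-1.0 / w] * (d - 1) + [(d - 1) / h]
    norm = math.sqrt(sum(c * c for c in raw))
    return [c / norm for c in raw], w, h


# Hypothetical example in R^3: three vertices and the 27 candidate points
# obtained by choosing one filtration hyperplane per axis direction.
vertices = [(0.0, 0.0, 0.0), (1.0, 2.0, 1.0), (3.0, 1.0, 4.0)]
axis_heights = [[v[i] for v in vertices] for i in range(3)]
s, w, h = separating_direction(axis_heights)
candidates = list(product(*axis_heights))
```

As in the proof, the separation argument applies to pairs of candidate points whose d-th coordinates differ; the sketch makes no claim about pairs that share a d-th coordinate.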

Equipped with the above method for finding a suitable (d+1)-st direction to locate vertices in higher dimensions, we conclude this section with a theorem describing the algorithm to compute the coordinates of the vertices of the original embedded graph.

[Higher-dimensional Vertex Reconstruction] Let be a straight-line embedded graph in for . We can compute the coordinates of all vertices of using directional augmented persistence diagrams in time, where is the time complexity of computing a persistence diagram.

###### Proof.

We proceed with a constructive proof, generalizing the constructive proof from intComp and vert_recon. Let be an oracle that takes a direction and returns the in time.

For , we use this oracle to obtain . Note that, by gp (General Position) and PDpoints, for each of these directions, we have exactly distinct filtration hyperplanes, in one-to-one correspondence with the vertices. Note that, for a given direction , we store the filtration hyperplanes as a list of the vertex heights. Choosing one hyperplane in each direction yields  pairwise orthogonal hyperplanes; their intersection is a point in  and this point is a potential vertex location. In total, we have  potential vertex locations, of which only  are actual vertices. We denote this set of  potential vertex locations by . By PDpoints,  has at least points. Thus, computing these lists of vertex heights takes time per dimension to account for computing and listing the points of the APD.

Let be chosen as in genVertLoc in time . By genVertLoc, each line  intersects at most one point in  for each . Furthermore, by genVertExist, each hyperplane in  intersects . Thus, there are exactly  distinct intersections between  and , in one-to-one correspondence with the  vertices.

Then, to identify vertex locations in , we employ the following brute force algorithm. We check each element  for intersections with any hyperplane . Since  and , we have  checks that we must perform, with each check taking time. Thus, the total time complexity of calculating from the  sets of filtration hyperplanes is  and no additional augmented persistence diagrams are computed.

In total, this algorithm uses directional diagrams. The time complexity of constructing the  sets of filtration hyperplanes is , and an additional  time to compute the actual vertex locations. Thus, the total time complexity is . ∎

## 5 Edge Reconstruction

Given the vertices constructed in vRec, we describe how to reconstruct the edges in an embedded graph using  augmented persistence diagrams. The key to determining whether an edge exists or not is counting the degree of a vertex for edges in the half plane “below” the vertex with respect to a given direction. We begin with a method for reconstructing plane graphs, and then extend our method to embedded graphs in .

### 5.1 Edge Reconstruction for Plane Graphs

We first define necessary terms, and then describe our algorithm for constructing edges.

[Indegree of Vertex] Let be a straight-line embedded graph in with vertex set . Then, for every vertex and every direction , we define:

$$\operatorname{indeg}(v, s) = \left|\{(v, v') \in E \mid s \cdot v' \le s \cdot v\}\right|.$$

Thus, the indegree of is the number of edges incident to that lie below , with respect to direction ; see indegree.

Given a directional augmented persistence diagram, we prove that we can determine the indegree of a vertex with respect to that direction:

[Indegree from Diagram] Let be a straight-line embedded graph in . Let  such that no two vertices have the same height with respect to (and thus ). Let  and  be the zero- and one-dimensional augmented persistence diagrams resulting from the height filtration . Then, for all ,

$$\operatorname{indeg}(v, s) = \left|\{(x, y) \in \mathcal{D}_0(s) \mid y = v \cdot s\}\right| + \left|\{(x, y) \in \mathcal{D}_1(s) \mid x = v \cdot s\}\right|.$$

Furthermore, if and then can be computed in time. If , then can be computed in time.

###### Proof.

Let such that , i.e., the vertex  is lower in direction  than . Let . Then, by addSimp, we have two cases to consider when is added to :

Case 1: joins two disconnected components. If connects two previously disconnected components, then is associated with a death in at height . Moreover, since all deaths in are associated with adding an edge, we know that the set of all edges that fall into this case with as the top endpoint is .

Case 2: creates a one-cycle. In this case, is associated with a birth in at height . Thus, we have that  is the set of edges that fall into this case with as the top endpoint.

The union is the set of all edges ending at  with respect to , hence . Furthermore, by PDpoints, if , then is a plane graph and each of and has points, so we count and sum the points joining two disconnected components or creating one-cycles at height in time. If , then each of and has points, so we count and sum the points joining two disconnected components or creating one-cycles at height in time. ∎
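To make the counting argument concrete, the following sketch computes augmented diagrams of a height filtration with a union-find structure and then reads the indegree off them. It is an illustrative assumption about how such diagrams could be computed, not the authors' implementation; all function names and the example graph are hypothetical, and the direction is assumed to give every vertex a distinct height.

```python
import math

def height(s, p):
    return sum(a * b for a, b in zip(s, p))

def augmented_diagrams(vertices, edges, s):
    """Zero- and one-dimensional augmented diagrams of the height filtration.

    Every edge contributes a point, including zero-persistence points.
    """
    h = [height(s, p) for p in vertices]
    parent = list(range(len(vertices)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    dgm0, dgm1 = [], []
    # An edge enters the filtration at the height of its higher endpoint.
    for u, w in sorted(edges, key=lambda e: max(h[e[0]], h[e[1]])):
        t = max(h[u], h[w])
        ru, rw = find(u), find(w)
        if ru == rw:
            dgm1.append((t, math.inf))          # edge closes a one-cycle
        else:
            # Elder rule: the component born later dies; roots are minima.
            young, old = (ru, rw) if h[ru] > h[rw] else (rw, ru)
            dgm0.append((h[young], t))
            parent[young] = old
    roots = {find(i) for i in range(len(vertices))}
    dgm0.extend((h[r], math.inf) for r in roots)  # components that never die
    return dgm0, dgm1

def indegree_from_diagrams(v, s, dgm0, dgm1, tol=1e-9):
    """Deaths in dgm0 plus births in dgm1 at the height of v."""
    t = height(s, v)
    deaths = sum(1 for _, y in dgm0 if abs(y - t) <= tol)
    births = sum(1 for x, _ in dgm1 if abs(x - t) <= tol)
    return deaths + births

# Hypothetical example: a triangle with one pendant edge.
vertices = [(0, 0), (2, 0), (1, 2), (3, 3)]
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
s = (0.1, 1.0)  # assumed direction with distinct vertex heights
dgm0, dgm1 = augmented_diagrams(vertices, edges, s)
indegrees = [indegree_from_diagrams(p, s, dgm0, dgm1) for p in vertices]
```

In this example the vertex (1, 2) has two lower neighbors: one of its edges merges components (a death in the zero-dimensional diagram) and the other closes the triangle (a birth in the one-dimensional diagram), so the two counts sum to its indegree.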

In order to decide whether an edge exists between two vertices, we look at the degree of as seen by two close directions such that is the only vertex in what we call a wedge at .

[Wedge] Let  , and choose . Then, a wedge at is the symmetric difference between the half planes below in directions and . In the special case when , we refer to the wedge as a bow tie.

Because we assume that no three vertices in our plane graph are collinear, for each pair of vertices , we can always find a bow tie centered at that contains the vertex  and no other vertex in ; see edgeExist. We use bow tie regions to determine if there exists an edge between and . In the next lemma, we show how to decide if the edge exists in our plane graph.

[Edge Existence] Let be a straight-line embedded graph in . Let . Let such that the wedge  at defined by and satisfies: . Then,

$$\left|\operatorname{indeg}(v, s_1) - \operatorname{indeg}(v, s_2)\right| = 1 \iff (v, v') \in E.$$

###### Proof.

Since edges in  are straight lines, any edge incident to  will either fall in the wedge region  or will be on the same side (above or below) of both hyperplanes. Let be the set of edges that are incident to and below both hyperplanes; that is, Furthermore, suppose we split the wedge into the two infinite cones. Let be the set of edges in one cone and  be the set of edges in the other cone. We note that is equal to one if there is an edge with or and zero otherwise. Then, by definition of indegree,

$$\left|\operatorname{indeg}(v, s_1) - \operatorname{indeg}(v, s_2)\right| = \big|\,|A| + |B_1| - |A| - |B_2|\,\big| = \big|\,|B_1| - |B_2|\,\big| = |B \cap E|,$$

which equals one iff . Then , as required. ∎
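The edge-existence criterion can be checked numerically on a small example. In the sketch below the indegree is computed directly from its definition, using the known edge set, purely to illustrate the lemma; in the reconstruction setting it would instead be read from diagrams. The function names, the example graph, and the half-angle 0.05 (assumed small enough that each bow tie contains no third vertex) are hypothetical.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def indegree(v, s, vertices, edges):
    """Number of edges at vertex index v whose other endpoint is not above v."""
    hv = dot(s, vertices[v])
    count = 0
    for a, b in edges:
        if v == a or v == b:
            other = b if v == a else a
            if dot(s, vertices[other]) <= hv:
                count += 1
    return count

def bow_tie_directions(p, q, theta):
    """Unit vectors at angles +theta and -theta from a perpendicular to q - p."""
    # The perpendicular to (dx, dy) is (-dy, dx); take its polar angle.
    base = math.atan2(q[0] - p[0], -(q[1] - p[1]))
    s1 = (math.cos(base + theta), math.sin(base + theta))
    s2 = (math.cos(base - theta), math.sin(base - theta))
    return s1, s2

def edge_exists(v, w, vertices, edges, theta):
    """Edge test: the indegrees of v in the two directions differ by one."""
    s1, s2 = bow_tie_directions(vertices[v], vertices[w], theta)
    d1 = indegree(v, s1, vertices, edges)
    d2 = indegree(v, s2, vertices, edges)
    return abs(d1 - d2) == 1

# Hypothetical example: a triangle with one pendant edge.
vertices = [(0, 0), (2, 0), (1, 2), (3, 3)]
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
theta = 0.05  # assumed smaller than every angle between lines through a vertex
found = {frozenset((v, w))
         for v in range(4) for w in range(4)
         if v != w and edge_exists(v, w, vertices, edges, theta)}
```

Every neighbor other than the tested vertex lies on the same side of both hyperplanes, so its contribution cancels in the difference; only the isolated vertex can change the count.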

Next, we prove that we can find the embedding of the edges in the original graph using directional augmented persistence diagrams. See Example for an example of walking through the reconstruction.

[Edge Reconstruction] Let be a plane graph. If  is known, then we can compute  using  directional augmented persistence diagrams in time, where  is the time complexity of computing a single diagram.

###### Proof.

We prove this theorem constructively, and summarize the construction in edge_recon. In the algorithm, we first preprocess in order to find a global bow tie half-angle. Then we iterate through each pair of vertices and test to see if the edge exists. This edge test is done in two steps: first create a bow tie that isolates the potential edge, then apply edgeExist to determine if the edge exists or not by comparing the indegrees of with respect to the two directions defining the bow tie.

Preprocessing (Lines 1–7 of edge_recon). We initialize a set of edges to be the empty set in Line 1. Next, we compute an angle that is sufficiently small to be used to construct bow ties for every edge. For each vertex , we consider the cyclic ordering of the points in  around ; let denote this ordered list of vertices. By Lemmas  and  of [verma2011slow], we compute  for all  in  total time. (Footnote: the naïve approach would be to sort about each vertex independently, which would take time, but the results of [verma2011slow] improve this to . The lemmas in [verma2011slow] use big-O notation, but the presented algorithm is actually asymptotically tight.) Once we have these cyclic orderings, we compute all  lines through and  and can compute a cyclic ordering of all such lines through in per vertex. (The step of obtaining the cyclic ordering of lines given the cyclic ordering of vertices is similar to the merge step of merge sort.) Given two adjacent lines through , consider the angle between these lines; see the angles labeled in edgebow. For each vertex , the minimum of such angles, denoted , is computed in Line 5 in time. Finally, we define in Line 7. The value will be used to compute bow ties in the edge test. This preprocessing takes  time and requires no augmented persistence diagrams.

Edge Test (Lines 9–17). Let be an oracle that takes a direction in and returns the zeroth- and first-dimensional directional APDs for the unknown plane graph  in direction in time . Let such that . We now provide the two steps necessary to test if using only two diagrams.

The first step is to construct bow ties (Lines 9–12 of edge_recon). Let  be a unit vector perpendicular to vector , and let be the two unit vectors that form angles  with . Note that  and  are at the same height in direction , but different heights in direction and (and, in fact, their order changes between directions and ). We consult the oracle to obtain the APDs and . Note that we need the zeroth- and first-dimensional diagrams only, and these are the only non-trivial diagrams for a graph. Let be the bow tie between  and . Note that, by construction,  contains exactly one point from , namely . This first step of the edge test takes time and uses two augmented persistence diagrams.

The second step of the edge test is to compute indegrees of in order to determine if there exists an edge between  and (Lines 13–17 of edge_recon). By Indegree, we compute the indegrees and from  and , respectively, in time; see Lines 13 and 14. Then, using edgeExist, we determine whether the edge  is in by checking if . If this inequality holds, the edge exists; if not, the edge does not; see Lines 15–17. Because the bottleneck of the edge test is the indegree computation, the second step takes  time. We do not compute additional diagrams in this step.

We apply the edge test to all distinct pairs in . Over all pairs, the edge test uses  persistence diagrams and takes time. Observing that the time to compute a diagram is  and using PDpoints, we find that is . As a result, we can simplify to . Thus, the runtime of edge_recon is ( for preprocessing and for the edge tests). ∎
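The preprocessing step of the proof above can be sketched with the naïve approach, which sorts around each vertex independently and therefore costs O(n^2 log n) overall rather than the bound obtained via [verma2011slow]. The function names are hypothetical, and halving the global minimum angle is an assumption made here for illustration; the exact definition used in Line 7 of edge_recon is not reproduced.

```python
import math

def min_line_angle(vertices):
    """Smallest angle between two lines through a common vertex.

    Naive version: sort line directions around each vertex separately.
    Assumes no three vertices are collinear.
    """
    best = math.pi
    for i, p in enumerate(vertices):
        # Direction of each line through p, as an angle in [0, pi).
        angles = sorted(
            math.atan2(q[1] - p[1], q[0] - p[0]) % math.pi
            for j, q in enumerate(vertices) if j != i
        )
        if len(angles) < 2:
            continue
        gaps = [b - a for a, b in zip(angles, angles[1:])]
        gaps.append(angles[0] + math.pi - angles[-1])  # wrap-around gap
        best = min(best, min(gaps))
    return best

def bow_tie_half_angle(vertices):
    # Assumption: half of the global minimum keeps every bow tie free of
    # third vertices; the paper's exact choice may differ.
    return min_line_angle(vertices) / 2

# Hypothetical example vertex set.
vertices = [(0, 0), (2, 0), (1, 2), (3, 3)]
theta = bow_tie_half_angle(vertices)
```

Lines are treated as undirected, so angles are reduced modulo pi before sorting; this is why the gap between the last and first sorted angle wraps around by pi rather than 2*pi.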

Putting together intComp and edgeEmbed leads us to our primary result:

[Plane Graph Reconstruction] Let  be a plane graph with vertices embedded in . vert_recon and edge_recon calculate the vertex locations and edges using  different directional augmented persistence diagrams in time, where  is the time complexity of computing a single diagram.

###### Proof.

By intComp, vert_recon reconstructs the vertices using three APDs in time. By edgeEmbed, edge_recon reconstructs the edges with directional augmented persistence diagrams in time. Thus, we can reconstruct all vertex locations and edges of using augmented persistence diagrams in time. ∎
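As a sanity check on the diagram count for plane graphs, assuming the edge test is applied once per unordered pair of vertices with two diagrams each, the totals stated in the two proofs combine as:

```latex
\underbrace{3}_{\text{vertex reconstruction}}
\;+\;
\underbrace{2\binom{n}{2}}_{\text{edge tests}}
\;=\; 3 + n(n-1)
\;=\; n^2 - n + 3 ,
```

which agrees with the count n^2 - n + d + 1 from the abstract when d = 2.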

### 5.2 Edge Reconstruction in R^d

We can also reconstruct edges of graphs embedded in higher dimensions. We can form a higher-dimensional version of the bow tie, referred to as a wedge, which is the symmetric difference of two half-spaces, each bounded by a (d-1)-dimensional hyperplane.

[Higher-dimensional Edge Reconstruction] Let be a straight-line embedded graph in for some . If  is known, then we can compute  using directional augmented persistence diagrams in  time, where  is the time complexity of computing a single diagram.

###### Proof.

Let be the subspace spanned by and . Let be defined by ; in other words,  is the projection to . Let . We follow Lines 1–7 of edge_recon to compute an angle for the vertex set .

Let . Using Lines 9–11 of edge_recon, we define and such that the lines and , which are perpendicular to and  and pass through , define a bow tie at that isolates the edge . This bow tie extends to a wedge in by replacing the lines with hyperplanes that intersect orthogonally; specifically, the line in corresponds to the (d-1)-dimensional hyperplane in . Let and be directions in that define this wedge. Notice that is the only vertex in this wedge; for this reason, we say that the wedge isolates the edge .

We then compute the indegrees of with respect to and , just as we did in the two-dimensional case in Lines 13 and 14 of edge_recon. However, since the dimension may be greater than two, Indegree states that this step takes time for each indegree computation. We perform indegree checks for pairs of vertices. Finally, by edgeExist, we test for an edge by determining whether the difference between the indegrees is one, using the same technique as Lines 15–16 of edge_recon.

The runtime for computing the indegree for all