In the last decade, estimating topological and geometric features of an unknown underlying space from a finite sample has received increasing attention in computational topology and geometry. For example, in [9] the authors provide a reconstruction guarantee for the topology of an embedded smooth n-manifold from a finite cover by balls of sufficiently small radius around a dense enough finite sample. Random sampling is also considered in [9], and estimates for the probability of reconstructing the manifold from a sample are obtained. These estimates imply that the probability of correct reconstruction tends to 1 with increasing sample size; thus the manifold can be recovered almost surely as the sample size increases to infinity. Curve and surface reconstruction algorithms are discussed in [5].
In practice, not all manifolds are smoothly embedded, nor are all spaces of interest topological manifolds. In [4], the authors show that the homology groups of a compact subset of Euclidean space can be obtained by considering only the relevant homological features in the nerve of the balls of a suitable radius around a sample that is close to the compact set in the Hausdorff metric. An upper bound for the required estimate is expressed in terms of the weak feature size of the compact set (defined below). The result of [4] works for non-manifolds, but it is limited to spaces that have a positive weak feature size.
1.1 Background and related work
First, we outline a general approach to estimating the topology and geometry of a compact Euclidean set from the union of balls centered at a finite set of points densely sampled from the underlying space. Let the sample satisfy some density constraints, and let the radius of the balls depend on the sampling density. Our goal is then to estimate the topology and geometry of the underlying space from the union of Euclidean balls of that radius around the sample points. Roughly speaking, one can expect to capture the topological features of the underlying space if the radius is chosen proportional to the size of those features and proportional to the Hausdorff distance between the sample and the underlying space. If the radius is too small or too large, then the union of balls may fail to capture the topological features of the underlying space.
This suggests the following generic scheme:
The underlying space should be well-behaved enough to admit an appropriate “feature size”, which keeps the radius small enough to capture even the smallest topological feature of the space.
Having fixed the feature size, one chooses a sample that approximates the underlying space very closely in the Hausdorff distance.
One then expects to estimate the topology and geometry of the underlying space from the union of balls around the sample.
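The two quantities that drive this scheme, the Hausdorff distance between the sample and the space and membership in the union of balls, can be sketched as follows. This is a minimal illustration rather than the authors' code: the underlying space is replaced by a fine discretisation of the unit circle, and the names `hausdorff_distance`, `in_union_of_balls`, `space`, and `sample` are assumptions made for the example.

```python
import math

def hausdorff_distance(A, B):
    """Hausdorff distance between two finite, non-empty point sets."""
    def directed(P, Q):
        # Largest distance from a point of P to its nearest point in Q.
        return max(min(math.dist(p, q) for q in Q) for p in P)
    return max(directed(A, B), directed(B, A))

def in_union_of_balls(x, sample, r):
    """True if x lies in the union of closed Euclidean r-balls around the sample."""
    return any(math.dist(x, s) <= r for s in sample)

# Assumption: a 360-point discretisation of the unit circle stands in for the space.
space = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
         for k in range(360)]
sample = space[::30]  # 12 evenly spaced sample points

eps = hausdorff_distance(space, sample)
covered = all(in_union_of_balls(x, sample, eps + 1e-12) for x in space)
```

With a radius at least the Hausdorff distance, every point of the discretised space lies in the union of balls; whether that union also has the right topology is exactly what the feature size is meant to control.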
In [9], the authors show that for a smooth manifold embedded in Euclidean space, the feature size can be chosen to be the maximal radius of the embedded normal disk bundle of the manifold. In thm:smale, the authors show that a threshold on the sampling parameter can be chosen, in terms of this feature size, so that the homology of the manifold can be computed from the union of balls around a sufficiently dense sample.
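[Deformation Retraction [9]] Let a manifold with a positive injectivity radius be given, and let a finite collection of points be dense in the manifold, where the density parameter is below a threshold determined by the injectivity radius. Then, the union of balls around these points deformation retracts onto the manifold. Consequently, the homology of the union equals the homology of the manifold.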
We also mention the reconstruction results of [4] for compact subsets of Euclidean space. If the compact set admits a positive weak feature size (wfs) and the sample is sufficiently dense relative to it, then the sampled points recover the correct topology of the set.
Let and be two compact sets of such that and . Then,
where denotes the -th homology group and is the inclusion of in .
1.2 Summary of results
The current paper is motivated by the following questions:
thm:smale provides a reconstruction result when . Does the result fail to hold when ?
The feature size is defined for smooth manifolds. How can we define an appropriate feature size when the space is not a smooth manifold, e.g., an embedded simplicial complex?
We answer the first question positively in thm:1d, addressing the case of smooth curves in . thm:1d shows that is sufficient for the reconstruction. The second and third questions are considered in the setting where is a metric graph (later denoted by ) embedded in . In this setting, unlike in the manifold case, it is generally not possible to choose a threshold for the sampling parameter so that has the same homotopy type as , even if is an arbitrary dense sample, since will contain unnecessary “small” features (noise) that are not present in . In order to address this issue, we propose a different notion of a feature size that we call geodesic feature size (denoted by ). This new definition of feature size allows us to threshold the sampling parameter , in the case of a metric graph , and leads to the reconstruction algorithm shown in alg:graph. In particular, we obtain a simplicial complex in , which is -close to (in the sense of the Hausdorff distance) and which deformation retracts onto .
2 Reconstruction Results
2.1 Smooth Curve Reconstruction
[Smooth Curve Reconstruction] Let be a smooth curve in without boundary and let be the injectivity radius. Let and let be a finite subset of such that . Then, the medial axis of is homeomorphic to .
From the result of [8], which shows that any bounded open subset of Euclidean space is homotopy equivalent to its medial axis, we conclude that and are homotopy equivalent.
The deformation retraction constructed in [9] collapses along the normal lines . The collapse is not well-defined if the intersection of with a normal line has more than one connected component. The condition (for a sample that is -dense in ) guarantees that such intersections do not occur, so the deformation retraction is well-defined.
Let satisfy the assumptions of thm:1d. For brevity of exposition, we assume that has one path-connected component. Then, we know that is, in fact, the image of an injective, smooth map with . Let us denote the -tubular neighborhood of by .
Without loss of generality, assume that the sample points of are enumerated with increasing preimages: for all . Then, these samples introduce a partition of the manifold, where and . Let be the piecewise linear curve obtained by connecting ’s in the respective order and let .
Since , each ball intersects the tubular neighborhood at exactly two points, say at and ; see fig:medialAxis. Let denote the normal passing through . Notice that these line segments do not intersect. In fact, these normal lines partition the tubular neighborhood into regions, where the -th region, denoted , is the one containing .
We will show that is homeomorphic to . Observe that is also the medial axis of . Restricting our attention to , we define a homeomorphism between and , and then glue these local maps into a global one; by the pasting lemma, the glued map is continuous because the local maps agree on each .
We define a homeomorphism for each in the following way. If we draw a perpendicular at any point on , we show that intersects at exactly one point and define . As a consequence, is a continuous graph on , hence a homeomorphism.
Suppose, to the contrary, that there exists a point on whose normal intersects in at least two points, say and . We arrive at a contradiction by showing that there is a point on such that the normal at is parallel to .
Without loss of generality, we assume that cuts the manifold at both and . Note that and are points on the manifold and tangents and are not parallel to . By continuity of the tangents of , we conclude that there exists a point on such that is parallel to . Consequently, the normal at is parallel to .
Now, we arrive at a contradiction in either of the following cases; see thm:1d for an illustration.
Case 1: If , then the -radius normal, , at intersects either or . This contradicts the fact that is the injectivity radius.
Case 2: If , then the -radius normal, , at lies completely in the interior of , which is a contradiction because the boundary of each -radius normal lies on the boundary of the tubular neighborhood of the manifold.
Therefore, the function is a well-defined, invertible, continuous map on a compact domain, hence a homeomorphism.
Since the ’s agree on the boundary of each , we glue them to obtain a global homeomorphism . This completes the proof.
2.2 Metric Graph Reconstruction
A weighted graph is said to be a metric graph if the edge-weights are all positive. Then, we can interpret the weights as lengths, and thus each point on an edge has a well-defined distance to the endpoints of that edge. We define the length of a given continuous path in the graph to be the total length of all edges and partial edges in the path. Then, the distance function is defined to be the length of the shortest path connecting two points; in other words, it is the geodesic distance in the graph. Metric graphs were first introduced in [7], and have recently been studied in [1, 6].
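As an illustration of the geodesic distance just described, the following sketch computes shortest-path lengths between vertices of a metric graph with Dijkstra's algorithm. The function name `geodesic_distances` and the edge-list input format are assumptions made for this example; the source does not prescribe a particular algorithm or data structure.

```python
import heapq

def geodesic_distances(edges, source):
    """Shortest-path (geodesic) distances from `source` to every reachable vertex.

    edges: iterable of (u, v, length) triples with positive lengths.
    """
    adj = {}
    for u, v, length in edges:
        if length <= 0:
            raise ValueError("edge lengths of a metric graph must be positive")
        adj.setdefault(u, []).append((v, length))
        adj.setdefault(v, []).append((u, length))

    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, length in adj.get(u, []):
            nd = d + length
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical triangle: going a -> b -> c (length 2) beats the direct edge a -> c (length 3).
triangle = [("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 3.0)]
d = geodesic_distances(triangle, "a")
```

Distances to or from a point in the interior of an edge can be handled with the same routine by subdividing that edge at the point, so that the point becomes a vertex and the two partial edges carry the corresponding partial lengths.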
Below we list our assumptions about the underlying graph that we aim to reconstruct.
is an embedded metric graph with straight line edges. is the vertex set and is the edge set.
The length of the smallest edge of is .
[Nerve of a Cover] Suppose is a cover of a topological space . We take to be the vertex set and form an abstract simplicial complex in the following way: if a -way intersection is non-empty, then .
is then called the nerve of the cover and is denoted by .
[Nerve Lemma [3]]
Suppose the cover is a “good” covering of the space, i.e., every element of the cover is contractible, along with all non-empty finite intersections of elements of the cover. Then the nerve of such a good covering has the same homotopy type as the space.
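The nerve construction can be made concrete for a finite cover of a finite set. The sketch below is illustrative only: the function name `nerve`, the dictionary representation of the cover, and the `max_dim` cutoff are assumptions made for the example. A simplex is added whenever the corresponding cover elements have a common point.

```python
from itertools import combinations

def nerve(cover, max_dim=2):
    """Simplices of the nerve of a finite cover, up to dimension max_dim.

    cover: dict mapping a label to the set of points in that cover element.
    Returns a list of tuples of labels, one tuple per simplex.
    """
    labels = sorted(cover)
    simplices = []
    for size in range(1, max_dim + 2):  # a k-simplex has k + 1 vertices
        for combo in combinations(labels, size):
            common = set.intersection(*(set(cover[name]) for name in combo))
            if common:
                simplices.append(combo)
    return simplices

# Hypothetical cover of six points arranged around a cycle by three overlapping arcs.
arcs = {"a": {0, 1, 2}, "b": {2, 3, 4}, "c": {4, 5, 0}}
complex_ = nerve(arcs)
```

In this example each pair of arcs overlaps but the three arcs have no common point, so the nerve is a hollow triangle: three vertices, three edges, and no 2-simplex.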
We now propose our feature size, which we call the Geodesic Feature Size ().
[Geodesic Feature Size] Let be an embedded metric graph. We define the Geodesic Feature Size () of to be the supremum of all having the following property: for any , , if then , where is the length of the smallest edge of .
To motivate the above definition of , we take a finite sample from . Let be a cover of and let be its nerve, where is the Euclidean -ball centered at . An edge , between two vertices and in , is called transverse if and belong to two different edges of . If and is transverse, then the geodesic distance and the geodesic on is unique. This implies that there is at most one vertex of lying on this geodesic. We call this geodesic the geodesic shadow of the edge . The threshold for forces any transverse edge to be within around that vertex , where , the maximum is taken over all acute angles between any pair of edges of . The idea behind this definition of comes from our goal of estimating the diameter of non-trivial 1-cycles of that are not present in . These noisy 1-cycles in are formed by some of the transverse edges. We also mention here without proof that is positive for the type of metric graphs we are considering. In fact, we can show that , where is the length of the shortest edge in .
We now state our main reconstruction theorem for embedded metric graphs. This theorem proves the correctness of alg:graph for computing the 1-dimensional Betti number of .
Let and be a finite sample from such that . Then, , where is the inclusion map from , denotes the first homology group in coefficients and is as defined above.
Let and . As , it follows that . An application of the nerve lemma implies that there is an injective homomorphism from to . In other words, contains all the non-trivial 1-cycles of . Similarly, we can show that there also exists an injective homomorphism from to . We then consider the induced homomorphism , where . Finally, we show that . Therefore, .
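For reference, the quantity targeted by the theorem, the first Betti number of a graph, can be computed directly when the graph itself is known, using the standard identity b1 = |E| - |V| + (number of connected components). The sketch below is not alg:graph and does not work from a sample; the function name and input format are assumptions made for the example.

```python
def first_betti_number(vertices, edges):
    """First Betti number of a finite graph: |E| - |V| + number of components."""
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(parent)
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return len(edges) - len(parent) + components

# Hypothetical example: a square with one diagonal has two independent cycles.
square_vertices = [0, 1, 2, 3]
square_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
b1 = first_betti_number(square_vertices, square_edges)
```

A value computed this way on a known graph gives a ground truth against which a reconstruction from samples can be compared.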
We believe that it should be possible to find a simplicial complex with the same homotopy type as . We formulate this stronger result as follows:
Let and be a finite sample from such that each edge of can be covered by the union of -balls centered at the sample points on the same edge. Then the Vietoris-Rips complex , computed on w.r.t. the geodesic metric on the 1-skeleton of at a scale of , has the same homotopy type as .
The idea of collapsing the “small” 1-cycles in alg:graph motivates us to add a full simplex around each vertex whenever a subset of has a diameter smaller than the estimated scale. That is precisely what the Vietoris-Rips complex does on a finite metric space.
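A minimal sketch of the Vietoris-Rips construction on a finite metric space follows. It assumes the convention that a simplex is included when all pairwise distances among its vertices are at most the scale (some authors use twice the scale instead). The function name `vietoris_rips`, the distance-matrix input, and the `max_dim` cutoff are assumptions made for the example.

```python
from itertools import combinations

def vietoris_rips(dist, scale, max_dim=2):
    """Simplices of the Vietoris-Rips complex up to dimension max_dim.

    dist: square matrix (list of lists) of pairwise distances.
    A set of points spans a simplex when its diameter is at most `scale`.
    """
    n = len(dist)
    simplices = [(i,) for i in range(n)]
    for size in range(2, max_dim + 2):
        for combo in combinations(range(n), size):
            if all(dist[i][j] <= scale for i, j in combinations(combo, 2)):
                simplices.append(combo)
    return simplices

# Hypothetical metric space: four points at positions 0, 1, 2, 3 on a line.
line = [[abs(i - j) for j in range(4)] for i in range(4)]
coarse = vietoris_rips(line, scale=2)
fine = vietoris_rips(line, scale=1)
```

At scale 1 only consecutive points are joined, while at scale 2 the triples (0, 1, 2) and (1, 2, 3) are filled in. This is the mechanism by which cycles of small diameter get collapsed.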
To further extend our result, we also consider a probabilistic reconstruction, as considered by the authors of [9]. Given a desired probability of correct reconstruction, one can find the smallest sample size that guarantees it. Also, we can extend our definition of to metric graphs and obtain similar reconstruction results. Lastly, we also consider the reconstruction question when samples are drawn not exactly from the underlying space, but from a close vicinity of it.
The first, third, and fourth authors would like to acknowledge the generous support of the National Science Foundation under grants CCF-1618469 and CCF-1618605.
- [1] Aanjaneya, M., Chazal, F., Chen, D., Glisse, M., Guibas, L., and Morozov, D. Metric graph reconstruction from noisy data. International Journal of Computational Geometry & Applications 22, 04 (2012), 305–325.
- [2] Ahmed, M., Karagiorgou, S., Pfoser, D., and Wenk, C. Map Construction Algorithms. Springer, 2015.
- [3] Alexandroff, P. Über den allgemeinen Dimensionsbegriff und seine Beziehungen zur elementaren geometrischen Anschauung. Mathematische Annalen 98, 1 (March 1928), 617–635.
- [4] Chazal, F., and Lieutier, A. Stability and computation of topological invariants of solids in R^n. Discrete & Computational Geometry 37, 4 (2007), 601–617.
- [5] Dey, T. K. Curve and Surface Reconstruction: Algorithms with Mathematical Analysis (Cambridge Monographs on Applied and Computational Mathematics). Cambridge University Press, New York, NY, USA, 2006.
- [6] Gasparovic, E., Gommel, M., Purvine, E., Sazdanovic, R., Wang, B., Wang, Y., and Ziegelmeier, L. A complete characterization of the 1-dimensional intrinsic Čech persistence diagrams for metric graphs. Research in Computational Topology. To appear; arXiv preprint arXiv:1702.07379.
- [7] Kuchment, P. Quantum graphs: I. Some basic structures. Waves in Random Media 14, 1 (2004), S107–128.
- [8] Lieutier, A. Any open bounded subset of R^n has the same homotopy type as its medial axis. Computer-Aided Design 36, 11 (2004), 1029–1046.
- [9] Niyogi, P., Smale, S., and Weinberger, S. Finding the homology of submanifolds with high confidence from random samples. Discrete & Computational Geometry 39, 1-3 (2008), 419–441.