The last two decades have witnessed great advances in methods using topology to analyze data, in a process called Topological Data Analysis (TDA). Their popularity is due in large part to their robustness and applicability to a variety of domains . The Reeb graph , which encodes the evolution of the connectivity of the level sets induced by a scalar function defined on a data domain, was originally proposed as a data structure to encode the geometric skeleton of 3D objects, but recently it has been re-purposed as a tool in TDA. Beside their usefulness in handling large data , Reeb graphs and their non-looping relative, contour trees , have been successfully used in feature detection , data reduction and simplification [7, 22], image processing , shape understanding , visualization of isosurfaces  and many other applications.
One challenge with Reeb graphs is that the graph may be too large or complex to directly visualize, therefore requiring further abstraction. Persistent homology , parameterizes topological structures by their life-time, providing a topological description called the persistence diagram. The notion of persistence can be applied to any act of birth that is paired with an act of death. Since the Reeb graph encodes the birth and the death of the connected components of the level sets of a scalar function, the notion of persistence can be applied to critical points in the Reeb graph . The advantage of this approach is simplicity and scalability—a large Reeb graph can be reduced to a much easier to interpret scatterplot. Fig. 1 shows an example, where a mesh with a scalar function (Fig. 1(a)) is converted into a Reeb graph (Fig. 1(b)). After that, the critical points are paired, and the persistence diagram displays the data (Fig. 1(f) and 1(g)). This final step can be challenging, particularly when considering essential critical points—those critical points associated with cycles in the Reeb graph—that each require an expensive search. While many prior works [8, 20, 14, 15, 25, 11, 10, 18, 9] have provided efficient algorithms for the calculation of Reeb graph structures, to our knowledge, none have provided a detailed description of an algorithm for pairing critical points.
In this paper, we describe and test 2 algorithms to compute persistence diagrams from Reeb graphs. The first is a multipass approach that pairs non-essential (non-loop) critical points using branch decomposition  on join and split trees, then pairing essential critical points also using join trees. This leads to our second approach, an algorithm for pairing both non-essential and essential critical points in a single-pass.
2 Reeb Graphs and Persistence Diagrams
2.1 Reeb graph
Let be a triangulable topological space, and let be a continuous function defined on it. The Reeb graph, , can be thought of as a topological summary of the space using the information encoded by the scalar function . More precisely, the Reeb graph encodes the changes that occur to connected components of the level sets of as goes from negative infinity to positive infinity. Fig. 1(a) and 1(b) show an example of a Reeb graph defined on a surface. For the simplicity of our description, we plot the Reeb graph with the vertical coordinate of a point as the function value .
can be used to classify points on the Reeb graph as follows. Letbe a point in . The up-degree of is the number of branches (1-cells) incident to that have higher values of than . The down-degree of is defined similarly. A point on is said to be regular if its up-degree and down-degree are equal to one. Otherwise it is a critical point. A critical point on the Reeb graph is also a node of the Reeb graph. A critical point is called a minimum if its down-degree is equal to . Symmetrically, a critical point is said a maximum if its up-degree is equal to . Finally, a critical point is said to be a down-fork/up-fork if its down-degree/up-degree is larger than .
2.2 Persistent Homology
The notion of persistent homology was originally introduced by Edelsbrunner et al. . Here we present the theoretical setting for the computation of the persistence diagram associated with a scalar function defined on a triangulated topological space. Consider the -dimensional homology class of a space, where are components, are tunnels/cycles,
are voids, etc. Persistent homology evaluates a sequence of vector spaces:, where , recording the birth and death events. In particular, the -th ordinary persistence diagram of , denoted as , is a multiset of pairs corresponding to the birth and death values of some -dimensional homology class.
Since the homology may not be trivial in general, any nontrivial homology class of , referred to as an essential homology class, will never die during the sequence. These events are associated with the cyclic portions of the Reeb graph. We refer to the multiset of points encoding the birth and death time of th homology classes created in the ordinary part and destroyed in the relative part of the sequence as the -th extended persistence diagram of , denoted by . In particular, for each point in there is an essential homology class in that is born in and dies at . Observe that for the extended persistence diagram the birth time for an essential homology class in is larger than or equal to death time for the relative homology class in that kills it.
2.3 Persistence Diagram of Reeb Graph
Of interest to us are the persistence diagram and extended persistence diagram . Pairing critical points can be computed independently of the Reeb graph. However, it is more efficiently computed by considering the Reeb graph . We give an intuitive explanation here and refer the reader to  for more details.
First, we distinguish between 2 types of forks in the Reeb graph, namely the ordinary (non-essential) forks and the essential forks. Let be a Reeb graph and let be a down-fork such that . We say that the down-fork is an ordinary fork if the lower branches of are contained in disjoint connected components and of . The down-fork is said to be essential if it is not ordinary. The ordinary and essential up-forks are defined similarly.
Ordinary Down-Forks of a Reeb Graph
We first consider pairing down-forks using sublevel set filtration. We track changes that occur in as increases. A connected component of is created when passes through a minimum of . Let be a connected component of . We say that a local minimum of creates if is the global minimum of . Every ordinary down-fork is paired with a local minimum to form one point in the persistence diagram as follows. Let be an ordinary down-fork with and let and be the connected components of . Let and be the creators of and . Without loss of generality we assume that . The homology class that is created at and dies at gives rise to a point in the ordinary persistence diagram . Note, a pair occurs when the minimum is a branch in the Reeb graph, hence we name it a branching feature.
Ordinary Up-Forks of a Reeb Graph
Ordinary up-forks are paired similarly using superlevel set filtration, pairing each up-fork with a local maximum to form points in the persistence diagram, , with the following variations. For an ordinary up-fork, , with , connected components and now come from . Assuming that , the homology class that is created at dies at and gives rise to a point in .
Cycle Features of a Reeb Graph
Let be an essential down-fork. We call the down-fork a creator of a -cycle in the sublevel set . As shown in , will be paired with an essential up-fork to form an essential pair , and a point in the extended persistence diagram . The essential up-fork is determined as follows. Let be the set of all cycles born at , each corresponding to a cycle in . Let be an element of with largest minimum value of among these cycles born at . The point is the point that the function achieves this minimum on the cycle .
3 Conditioning the Graph
Our approach is restricted to Reeb graphs where all point are either a minimum, maximum, up-fork with up-degree 2, or down-fork with down-degree 2. Fortunately, graphs that do not abide by these requirements can be conditioned to fit them. We define the degree of a node as the up-degree and down-degree.
There are 4 node conditions to be corrected: 1:1 nodes—Nodes with both 1 up- and 1 down-degree are regular. Therefore, they only need to be removed from the graph. This is done by removing the regular point and reconnecting the nodes above and below, as seen in Fig. 2(a). 0:2 (and 2:0) nodes—Nodes with 0 up-degree and 2 down-degree (or vice versa) are degenerate maximum (minimum) nodes, in that they are both down-fork (up-fork) and local maximum (minimum). As shown in Fig. 2(b), this condition is corrected by added a new node for the local maximum higher value, where is a small number. This type of degenerate node rarely occurs in Reeb graphs, but it frequently occurs in approximations of a Reeb graph, such as Mapper . 2:2 nodes—Nodes with both 2 up- and 2 down-degree are degenerate double forks, both down-fork and up-fork. Fig. 2(c) shows how double forks can be corrected by splitting into 2 separate forks, one up- and one down-fork, distance apart. 1:N>2 (and N>2:1) nodes—Nodes with down-degree (or up-degree) 3 or higher, are difficult forks to pair. These forks correspond to complex saddles in , such as monkey saddles. A single critical point pairing to these forks just reduces the degree of down-fork by 1, requiring complicated tracking of pairs. To simplify this, as seen in Fig. 2(d), these forks can be split into 2 forks apart. The upper down-fork retains 1 of the original down edges. The new down-fork connects with the old and takes the remaining down-edges. For even higher-order forks, the operation can be repeated on the lower down-fork.
Beyond these requirements, the Reeb graph is assumed a single connected component. If not, each connected component can simply be extracted and processed individually. Finally, all nodes on the Reeb graph are assume to have unique function values. If not, some processing order is arbitrary, and 0-persistence features may result.
4 Multipass Approach
The persistence diagram can be obtained by pairing the non-essential fork nodes of the Reeb graph. The extended persistence diagram can be obtained by pairing of essential fork nodes. We demonstrate these 2 steps using Fig. 1 as an example.
4.1 Non-Essential Fork Pairing
Identifying the non-essential forks can be reduced to calculating join and split trees on the Reeb graph (see Fig. 1(c) and 1(d)), in our case, using Carr’s et al.’s approach . Next, a stack-based algorithm, based upon branch decomposition , pairs critical points. The algorithm operates as a depth first search that seeks out simply connected forks (i.e., forks connected to 2 leaves) and recursively pairs and collapses the tree.
The algorithm processes the tree using a stack that is initially seeded with the root of the tree. At each iteration, 1 of 3 operation types occurs, as seen in Fig. 2(e). Operation Type 1 occurs when the top of the stack is a fork. In this case, the children of the fork are pushed onto the stack. Operation Type 2 occurs when the top of the stack is a leaf, but the next node is a fork. In this case, the leaf and fork have their orders swapped. Finally, operation Type 3 has 2 variants that occur when 2 leaf nodes sit atop the stack. In both variants, one leaf is paired with the fork, and the other leaf is pushed back onto the stack. The pairing occurs with the leaf that has a value closer to the value of the fork. The stack is processed until only a single leaf node remains, the global minimum/maximum of the join/split trees, respectively. The algorithm operates identically on both join and split trees. Finally, the unpaired global minimum and maximum are paired.
Fig. 3(a) shows an example for the join tree in Fig. 1(d). Initially the root is placed on the stack. A Type 1 operation pushes the children, and , onto the stack. Next, a Type 2 operation reorders the top of the stack. , a down-fork, in now atop the stack, pushing its 2 children, and , onto the stack. Another Type 1 pushes ’s children, and onto the stack. In the next 3 steps, a series of Type 3 operations occur. First and are paired, followed by and , and finally and . At the end, , the global minimum, is the only point remaining on the stack. The assigned pairs, /, /, and /, appear in the in Fig. 1(f), along with the split tree pairing, /, and the global min/max pairing, /.
4.2 Essential Forks Pairing
The remaining unpaired forks are essential forks, as seen in Fig. 1(e). We developed an algorithm from the high-level description of  to pair them. The essential fork pairing algorithm can be treated as join tree problem, processing forks one at a time. For a given up-fork, , the node can be split into two temporary nodes, and . A join tree can be computed by sweeping the superlevel set. At each step of the sweep, the connected components are calculated. The pairing for a selected essential up-fork occurs at the down-fork that merges and into a single connected component.
Fig. 4 shows the sweeping process for the up-fork . Initially (Fig. 4(a)), is split into and , which are each part of separate connected components, denoted by color (Fig. 4(b)). As the join tree is swept past (Fig. 4(c)), a new connected component is formed. In Fig. 4(d), is added to the connected component of . As the join tree is swept past (Fig. 4(e)), the and connected components join. The process continues until Fig. 4(h), where 3 connected components exist. The purple and yellow components join at (Fig. 4(i)). Finally at (Fig. 4(j)), both and are part of the same connected component. This indicates that pairs with . Fig. 4(k-m) shows the superlevel sets and associated join trees for the up-forks , , and . The pairing partner /, /, and / can all be seen in the in Fig. 1(g).
5 Single-Pass Algorithm: Propagate and Pair
In the previous section, we showed that the critical point pairing problem could be broken down into a series of merge tree computations. For non-essential forks this was in the form of join and split trees, which are merge trees of the superlevel sets and sublevel sets, respectively. For essential saddles, it came in the form of a special join tree calculation for each essential up-fork. A natural question is whether these merge tree calculations can be combined into a single-pass operation, which is precisely what follows.
5.1 Basic Propagate and Pair
The Propagate and Pair algorithm operates by sweeping the Reeb graph from lowest to highest value. At each point, a list of unpaired points from the sublevel set is maintained. When a point is processed in the sweep, 2 possible operations occur on these lists: propagate and/or pair.
The job of propagate is to push labels from unpaired nodes further up the unprocessed Reeb graph. 4 cases exist.
For local maxima nothing needs to propagate.
For down-forks all unpaired labels are propagated upwards. In the example of Fig. 5(c), the critical points and are paired, thus only is propagated to .
For up-forks all unpaired labels are propagated upwards. Additional labels for the current up-fork are created and tagged with the specific branch of the fork that created them (in the examples with subscripts and ). This tag is critical for closing essential cycles. In the example of Fig. 5(d), the labels and are propagated to , and labels and are propagated to .
The pairing operation searches the list of labels to determine an appropriate pairing partner from the sublevel set. The pairing operation only occurs for local maxima and down-forks.
For local maxima the labels list is searched for the unpaired up-fork with the largest value. Those critical points are then paired. In Fig. 5(o), for local maximum , the list is searched and is determined to be the closest unpaired up-fork.
For down-forks two possible cases exist, essential or non-essential, which can be differentiated by searching the available labels. First, the list is searched for the largest up-fork with both legs. Both legs indicate that the current down-fork closes a cycle with the associated up-fork. In the example, Fig. 5(m), the list of is searched and labels and found. If no such up-fork exists, then the down-fork is non-essential. In this case, the highest valued local minimum is selected from the list. In the example of Fig. 5(c), no essential up-forks are found for , and the largest local minimum, is selected instead.
5.2 Virtual Edges for Propagate and Pair
The basic propagate and pair will fail in certain cases, such as in Fig. 6(a). The failure arises from the assumption that the superlevel set is the only thing needed to propagate labels. In this case, label information needs to be communicated between and , which are connected by the node in the sublevel set. To resolve this communication issue, virtual edges are used. Virtual edges have 4 associated operations.
Virtual Edge Creation
Propagating labels across virtual edges is similar to standard propagation with one additional condition. A label can only be propagated if its value is less than that of the up-fork that generated the virtual edge. In other words, for a given label and a virtual edge , is only propagated if . Looking at the example in Fig. 6(d), for the virtual edge , only is propagated because . For the virtual edge : , , and are all propagated, since they all have values smaller than .
Virtual Edge Merging
When processing down-forks, all incoming virtual edges need to be pairwise merged. Fig. 6(k) shows an example. When processing down-fork , the virtual edges and are merged into a new virtual edge . For the purpose of label propagation, the virtual edge uses its minimum saddle, in this case .
Virtual Edge Propagation
Finally, virtual edges themselves need to be propagated. For up-forks, all virtual edges are propagated up to both neighboring nodes. In the case of down-forks, all virtual edges are similarly propagated, as we see in Fig. 6(e). During the virtual edge propagation phase, redundant virtual edges can also be culled. For example, the virtual edge is a superset of . Therefore, can be discarded. The necessity of the virtual edge process can also be seen in Fig. 5. In Figures 5(i)-5(l), the pairing of with D is only possible because of the virtual edge created by in Fig. 5(i).
|Data||Figure||Mesh||Reeb Graph Nodes||Cycles||Multipass||Single-pass|
|Vertices||Faces||Initial||Cond.||Time (ms)||Time (ms)|
|scivis_contest||3(b)||194k (avg)||—||117 (avg)||178.2 (avg)||81.3 (avg)||3.82 (total)||4.18 (total)|
We have implemented the described algorithms using Java. Performance reported in Table 1 was calculated on a 2017 MacBook Pro, 3.1 Ghz i5 CPU, 8 GB RAM.
We investigate the performance of the algorithms using the following:
Synthetic split trees, join trees, and Reeb graphs generated by a Python script. Given a positive integer , where , the script creates a fork consisting of a node with valency and 3 nodes with valency linked to the -valence node. At each iteration , another fork is generated, and 1 or 2 of its valency nodes are glued to the nodes in with valency 1. Constraining the gluing to a single node at each iteration results in a split tree.
Reeb graphs calculated on publicly available meshes in Figures 8 and meshes provided by AIM@SHAPE Shape Repository. Reeb graphs were extracted using our own Reeb graph implementation in C++.
Time-series of 120 Mapper graphs taken from the 2016 SciVis Contest111https://www.uni-kl.de/sciviscontest/, a large time-varying multi-run particle simulation, in Fig. 3(b). Our evaluation took one realization, smoothing length 0.44, run 50, and calculated the Mapper graphs for all 120 time-steps using the variable concentration. Our video, available at https://youtu.be/AcJX4GdzBZY, shows the entire sequence. The Mapper graphs were generated using a Python script that follows the standard Mapper algorithm .
Overview of Results
The performance for the algorithms can be seen in Table 1. These values were obtained by running the test 1000 times and storing the average compute time. The persistence diagrams of both the single-pass and multipass algorithms were compared in order to verify correctness. For most cases, the single-pass approach outperformed the multipass approach. The exceptions being the random split tree, random graph, and SciVis contest data, each of which we will discuss.
Random Split Tree vs. Join Tree
We compared the exact same tree structures as split trees and join trees by negating the function value of the input tree. The performance observed in Table 1 and Fig. 7(a) shows that the join tree performs significantly better than the split tree. The explanation for this is quite simple. The join tree consists of exclusively down-forks, while the split tree consists of exclusively up-forks. Since only up-forks generate virtual edges, the split tree created and processed many virtual edges, while the join tree has none. In fact, split trees represent one worst case by generating many unneeded virtual edges. From a practical standpoint, the algorithm can avoid situations like this by switching sweep directions (i.e. top-to-bottom), when the number of up-forks is significantly larger than the number of down-forks.
We next investigate the performance of randomly generated Reeb graphs, shown in Table 1 and Fig. 7(b). These Reeb graphs consist predominantly of cycles. This represents another type of worst case, since the many up-forks generate virtual edges, which are then merged into even more virtual edges at the down-forks. To verify this, we ran an experiment, as seen in Fig. 7(c), that randomly cuts cycles in the starting Reeb graph random_graph_5000 containing 2400 cycles. The break even was about 900 cycles (about essential and non-essential forks).
SciVis Contest Data
The SciVis contest data was “cycle heavy” as can be seen in the persistence diagram of Fig. 3(b). Given the random graph analysis, it is unsurprising that the performance of the single-pass approach was lower than the multipass approach.
7 Discussion & Conclusion
Pairing critical points is a key part of the TDA pipeline—the Reeb graphs capture complex structure, but direct representation is impractical. Critical point pairing enables a compact visual representation in the form of a persistence diagram. The value of representing a dataset with the persistence diagram is the simplicity and efficiency. Persistence diagrams avoid the occlusions problems of normal 3D datasets (e.g., the internal structure of Fig. 3(b)), and they avoid the potential confusion of direct representation of the Reeb graph (e.g., the Reeb graph of Fig. 8(h)). In addition, they provide sharp visual cue for time-varying data (see our video).
Our results showed that although the single-pass algorithm tended to outperform the multipass algorithm, there was no clear winner. We point out some advantages and disadvantages for each. The multipass algorithm has an advantage in simplicity of implementation. Once the merge tree and branch decomposition are implemented, the only necessity is repeated calls to those algorithms. This approach also has a potential advantage for (limited) parallelism. First, processing join and split trees in parallel, then all essential up-forks. The single-pass algorithm showed a slight edge in performance, particularly for data with a balance between essential and non-essential forks. The other significant advantage of the approach is that it is in fact a single-pass approach, only visiting critical points once. This is useful for streaming or time-varying data, where the critical points arrive in order, but analysis cannot wait for the entire data to arrive.
This project was supported in part by National Science Foundation (IIS-1513616 and IIS-1845204). Mesh data are provided by AIM@SHAPE Repository.
-  Agarwal, P.K., Edelsbrunner, H., Harer, J., Wang, Y.: Extreme elevation on a 2-manifold. Discrete & Computational Geometry 36(4), 553–572 (2006)
-  Attene, M., Biasotti, S., Spagnuolo, M.: Shape understanding by contour-driven retiling. The Visual Computer 19(2), 127–138 (2003)
-  Bajaj, C.L., Pascucci, V., Schikore, D.R.: The contour spectrum. In: Proceedings of the 8th IEEE Visualization. pp. 167–ff (1997)
-  Bauer, U., Ge, X., Wang, Y.: Measuring distance between reeb graphs. In: Symposium on Computational Geometry. p. 464 (2014)
-  Boyell, R.L., Ruston, H.: Hybrid techniques for real-time radar simulation. In: Proceedings of the November 12-14, 1963, fall joint computer conference. pp. 445–458 (1963)
-  Carr, H., Snoeyink, J., Axen, U.: Computing contour trees in all dimensions. Computational Geometry: Theory and Applications 24(2), 75–94 (2003)
-  Carr, H., Snoeyink, J., van de Panne, M.: Simplifying flexible isosurfaces using local geometric measures. Proceedings 15th IEEE Visualization pp. 497–504 (2004)
-  Cole-McLaughlin, K., Edelsbrunner, H., Harer, J., Natarajan, V., Pascucci, V.: Loops in reeb graphs of 2-manifolds. In: Symposium on Computational Geometry. pp. 344–350 (2003)
-  Doraiswamy, H., Natarajan, V.: Efficient output-sensitive construction of reeb graphs. In: International Symposium on Algorithms and Computation. pp. 556–567. Springer (2008)
-  Doraiswamy, H., Natarajan, V.: Efficient algorithms for computing reeb graphs. Computational Geometry 42(6), 606–616 (2009)
-  Doraiswamy, H., Natarajan, V.: Computing reeb graphs as a union of contour trees. IEEE Transactions on Visualization and Computer Graphics 19(2), 249–262 (2013)
-  Edelsbrunner, H., Harer, J., Mascarenhas, A., Pascucci, V.: Time-varying reeb graphs for continuous space-time data. In: Symposium on Computational Geometry. pp. 366–372 (2004)
-  Edelsbrunner, H., Letscher, D., Zomorodian, A.: Topological persistence and simplification. In: Symposium on Foundations of Computer Science. pp. 454–463 (2000)
-  Harvey, W., Wang, Y., Wenger, R.: A randomized O (m log m) time algorithm for computing Reeb graphs of arbitrary simplicial complexes. Symp. on Comp. Geo. pp. 267–276 (2010)
Hilaga, M., Shinagawa, Y.: Topology matching for fully automatic similarity estimation of 3D shapes. SIGGRAPH pp. 203–212 (2001)
-  Kweon, I.S., Kanade, T.: Extracting topographic terrain features from elevation maps. CVGIP: image understanding 59(2), 171–182 (1994)
-  Munch, E.: A user’s guide to topological data analysis. J. Learn. Analytics 4(2), 47–61 (2017)
-  Parsa, S.: A deterministic time algorithm for the Reeb graph. In: ACM Sympos. Comput. Geom. (SoCG). pp. 269–276 (2012)
-  Pascucci, V., Cole-McLaughlin, K., Scorzelli, G.: Multi-resolution computation and presentation of contour trees. In: IASTED Conf. Vis., Img., and Img. Proc. pp. 452–290 (2004)
-  Pascucci, V., Scorzelli, G., Bremer, P.T., Mascarenhas, A.: Robust on-line computation of Reeb graphs: Simplicity and speed. ACM Transactions on Graphics 26(3), 58.1–58.9 (2007)
-  Reeb, G.: Sur les points singuliers dune forme de pfaff completement intgrable ou dune fonction numrique. CR Acad. Sci. Paris 222, 847–849 (1946)
-  Rosen, P., et al.: Using contour trees in the analysis and visualization of radio astronomy data cubes. In: TopoInVis (2019)
Singh, G., Mémoli, F., Carlsson, G.E.: Topological methods for the analysis of high dimensional data sets and 3d object recognition. In: Eurographics SPBG. pp. 91–100 (2007)
-  Takahashi, S., Takeshima, Y., Fujishiro, I.: Topological volume skeletonization and its application to transfer function design. Graphical Models 66(1), 24–49 (2004)
-  Tierny, J., Vandeborre, J.P., Daoudi, M.: Partial 3D shape retrieval by Reeb pattern unfolding. Computer Graphics Forum 28(1), 41–55 (2009)