Topological persistence for a filtration or a piecewise linear function on a simplicial complex is known to be computable in time  where is the number of simplices in and is the exponent of matrix multiplication. The question regarding the lower bound on its computation was largely open until Edelsbrunner and Parsa  showed that computing the rank of the first homology group of a simplicial complex linearly embedded in is as hard as the rank computation of a sparse - matrix. The current upper bound for matrix rank computation is super-quadratic  and lowering it is a well-recognized hard problem. Consequently, computing the dimension of the homology groups and hence the topological persistence for functions on general complexes in better than super-quadratic time is difficult, if not impossible. But, what about the special cases that are still interesting? The complexes embedded in three dimensions which arise in plenty of applications present such cases.
It is easy to see that the Betti numbers , the rank of the th homology group defined over a finite field for a simplicial complex linearly embedded in can be computed in time. For this, compute with a walk over the boundaries of the voids, compute as the number of components of , and then compute from the Euler characteristics of obtained as the alternating sum of the numbers of simplices of each dimension. Unfortunately, computation of other topological properties such as persistence and homology generators (basis) for such a complex is not known to be any easier than that of matrix multiplication ( time). In the special case when is a graph or a surface, the persistence for a PL function or a filtration on can be computed in time [1, 10]. In this paper, we show that when is more general, that is, a simplicial complex linearly embedded in , the persistence of a height function on it can be computed in time. This special type of persistence which we term as the height persistence is not as general as the standard persistence. Nonetheless, it provides an avenue to compute a set of basis cycles in time where is the total size of the output. Also, the height persistence provides a window to the topological features of the domain , the need for which arises in various applications.
To arrive at our result, we first observe a connection between the standard sublevel-set persistence [11, 17] and the level-set zigzag persistence  from the recent work in [3, 4, 6]. Then, with a sweep-plane algorithm that treats the level sets as planar graphs embedded in a plane, we compute a barcode graph in time. A barcode is extracted from this graph using a slight but important modification of an algorithm in . The barcode extracted from this graph provides a part of the height persistence. We show that the missing piece can be recovered from the Reeb graph which can be computed again in time . We make other observations that allow us to extract the actual basis cycles from both pieces in time as claimed.
A zigzag diagram of topological spaces is a sequence
where each is a topological space and each bidirectional arrow ‘’ is either a forward or a backward continuous map. Applying the homology functor with coefficient in a field
, we obtain a sequence of vector spaces connected by forward or backward linear maps, also called a zigzag module:
When all vector spaces in are finite dimensional, the Gabriel’s theorem in quiver theory  says that is a direct sum of a finite number of interval modules which are of the form
where for and otherwise with the maps and being identities. The decomposition provides a barcode (set of interval modules) for topological persistence when the topological spaces originate as sublevel or level sets of a real-valued function defined on a space . As shown in , classical persistence [11, 17], its extended version , and the more general zigzag persistence  arise as a consequence of choosing variants of the module in 2.1 that are derived from .
2.1 Standard persistence
Standard persistence [11, 17] is defined by considering the sublevel sets of , that is, is for some . These values are taken as the critical values of so that the barcode captures the evolution of the homology classes of the sub-level sets across the critical values of , which are defined below precisely.
For an interval , let denote the interval set. Following [3, 6], we assume that is tame. It means that it has finitely many homological critical values so that for each open interval , is homeomorphic to a product space , with . This homeomorphism should extend to a continuous function , with being the closure of and each interval set should have finitely generated homology groups.
It turns out that the description of the interval modules assumes one more subtle aspect when it comes to describing the standard persistence and zigzag persistence in general. Specifically, the interval modules can be open or closed at their end points. To elucidate this, consider a set of values of interleaving with its critical values:
Assuming and , one can write the sub-level sets as . For standard persistence, we consider the sublevel set diagram and its corresponding homology module for dimension :
The summand interval modules, or the so called bars, for this case has the form . This means that a -dimensional homology class is born at the critical value and it dies at the value . The right end point of is an artifact of our choice of the intermediate value . Because of our assumption that is tame, homology classes cannot die in any open interval between the critical values. In fact, they remain alive in the interval and may die entering the critical value . To accommodate this fact, we convert each bar of the standard persistence to a bar that is open on the right end point.
One can see that there are two types of bars in the standard persistence, one of the type , , which is bounded (finite) on the right, and the other of the type which is unbounded (infinite) on the right. The unbounded bars represent the essential homology classes since . The work of [3, 4, 6] implies that both types of bars of the standard persistence can be recovered from those of the level set zigzag persistence as described next. This observation leads to an efficient algorithm for computing the standard persistence in .
2.2 Level set zigzag
In level set zigzag persistence, we track the changes in the homology classes in the level sets instead of the sub-level sets. We need maps connecting individual level sets, which is achieved by including the level sets into the adjacent interval sets. For this purpose we use the notation for the interval set between the two non-critical level sets. We have a zigzag sequence of interval and level sets connected by inclusions producing a level set zigzag diagram:
Applying the homology functor with coefficients in a field , we obtain the zigzag persistence module for any dimension
The zigzag persistence of is given by the summand interval modules of . Each interval module is of the type where and can be or for some . Just as in the sub-level set persistence, we identify the end points of the interval modules with the critical values [r] that were used to define the level set zigzag in the first place. In keeping with the understanding that even the level set homology classes do not change in the open interval sets, we convert an endpoint to an adjacent critical value and make the interval module open at that critical value. Precisely we modify the interval modules as (i) , (ii) (iii) (ii) . The intervals in (i)-(iv) are referred as closed-closed, closed-open, open-closed, and open-open bars respectively. The figure above shows the two bar codes, one for and another for for a height function on a torus. The rightmost picture shows the barcode graph of which we explain later.
is a bar for iff it is so for ,
is a bar for iff either is a closed-closed bar for for some , or is an open-open bar for for some .
We know . The first summand given by the finite intervals is isomorphic to a similar summand in the level set zigzag module ; see (Table 1, Type I). The second summand is isomorphic to , which by a result in  is isomorphic to where the open-open interval modules in generate and the closed-closed interval modules in generate . Then, the claimed result follows again from (Table 1, Type III and IV). ∎
Overview and main results. Let be a simplicial complex consisting of simplices that are linearly embedded in . Let denote the geometric realization arising out of this embedding. First, assume that is a pure -complex, that is, its highest dimensional simplices are triangles and all vertices and edges are faces of at least one triangle. The algorithm for the case when it has tetrahedra and possibly edges and vertices that are not faces of triangles follows straightforwardly from the case when is pure, and is remarked upon at the end. Another assumption we make for our algorithm is that the coefficient field of the homology groups is .
A function is called a height function if there is an affine transformation of the coordinate frame so that for all points with -coordinate being . Without loss of generality, assume that is indeed the -coordinate function and is proper, that is, its values on the vertices are distinct. The standard topological persistence of on is called the height persistence which we aim to compute. Theorem 1 says that we can compute the barcode of the height persistence by computing the same for the level set zigzag persistence using the same height function. Precisely, we first compute the barcode for from which we obtain a partial set of bars for and the complete set of bars for . This is achieved by maintaining a level set data structure and tracking a set of primary cycles in them as we sweep through along increasing . At the same time, we build a barcode graph that registers the birth, death, split, and merge of the primary cycles. We show that this can be done in time. The bars of are extracted from this graph again in time by adapting an algorithm of  to our case after a slight but important modification. According to Theorem 1, the closed-open and closed-closed bars of constitute a partial set of bars for . The open-open bars of , on the other hand, constitute a complete list of bars for the second homology module because the other summands for are trivial.
The rest of the bars of which are the open-open bars of (Theorem 1) are shown to be captured by the Reeb graph of on which can be computed in time . We show that the basis cycles for the first and second homology groups can be computed as part of the level set persistence and Reeb graph computations.
Let be a simplicial complex embedded in with simplices. Let be a height function defined on it. One can compute the barcode for for , in time where is the number of simplices in . Furthermore, a set of basis cycles for , , can be computed in time where is the total size of the output cycles.
Similar statement holds for standard persistence.
Let be a simplicial complex embedded in with simplices. Let be a height function defined on it. One can compute the barcode for for , in time where is the number of simplices in .
3 Level set data structure
Let be the set of vertices of ordered by increasing -values, that is, for . Consider sweeping in the increasing order of -values. A level set , , viewed as a graph embedded in the plane , does not change its adjacency structure in any open interval . This structure, however, may change as the level set sweeps through a vertex of . Consequently, for every vertex , it suffices to track the changes when the level set jumps from the intermediate level to the level and then to the intermediate level where , and . All three level sets , , and are plane graphs embedded linearly in the planes , , and respectively. Let denote any such generic level set graph at a level , where the vertex set is the restrictions of the level set to the edges of and the edge set is the restriction of the level set to the triangles of . To avoid confusions, we will say complex edges and complex triangles to refer to the edges and triangles of respectively.
Level set graph and homology basis. We need to track a set of cycles representing a homology basis of to that of and then to that of as we sweep through the vertex . Consider any such generic level set graph representing . Primary and secondary cycles. [r] The embedding of in the plane produces a partition of into -dimensional faces, -dimensional edges, and -dimensional vertices. The faces are the connected components of . Let denote the collection of all -faces in this partition. A face has boundary cycle consisting of possibly multiple components, each being a cycle. We orient by orienting its boundary and denote it with . The orientation is such that has the face on its right. In Figure 3, the face has two boundaries, one around the outer curve (shown solid) and another around the inner circle (shown dotted). The unique face in that is unbounded plays a special role and is denoted .
For a bounded face , there is a unique oriented cycle that bounds a bounded face of on its right. By definition, the unbounded face has no such . In the figure above, is the solid curve around outer boundary.
Because of the uniqueness of the cycles , we give them the special name of primary cycles. All other cycles are secondary. In Figure 3, the primary cycles are rendered solid and the secondary ones are rendered dotted. Recall that the elements of the first homology group are classes of cycles denoted for a cycle . It turns out that the classes of unoriented primary cycles form a basis for and thus tracking the primary cycles across the levels become the key to computing the level set zigzag persistence.
The classes of unoriented cycles form a basis of .
We observe the following facts:
The classes of unoriented primary cycles form a sub-basis of .
where denotes the reduced zero-dimensional homology group.
The faces in form a basis of .
For the first fact, observe that the set of such cycles are independent meaning that there is no unoriented primary cycle that can be written as the sum of other unoriented primary cycles. If it were true, let . Then, the boundary of is empty. But, that is impossible unless . Since , we have .
The second fact follows from Alexander duality because is embedded in the plane . The third fact follows from the definition of reduced homology groups.
Consider a map that sends each face to its unoriented primary cycle . This map is bijective due to Observation 3.1. Therefore, by the first and third facts, is isomorphic to the summand of generated by the classes of unoriented primary cycles. Indeed, this summand is itself since is isomorphic to by the second fact. ∎
The following Proposition complements Proposition 4. We do not use it, but remark about its connection to Reeb graphs at the end.
The -classes of unoriented secondary cycles form a basis of .
Representing level set graphs. Proposition 4 implies that we can maintain a basis of by maintaining the primary cycles alone. However, for realizing the zigzag maps that connect across the level sets (Eqn. 2.2), we need a different basis involving both primary and secondary cycles. For each bounded face , let be the boundary cycle for the face which is the -addition of the primary cycle with the secondary ones in . The next assertion follows from Proposition 4 immediately.
The classes form a basis of .
The importance of the boundary cycles in realizing the zigzag maps needed for the persistence module in Eqn. 2.2 is due to the following observation.
For every and for every boundary cycle in the intermediate level , there are sum of boundary cycles and at the critical levels and respectively with so that the inclusions of , and into the interval space induce linear maps at the homology levels given by , .
By Proposition 6 and the above observation, the zigzag maps of the persistence module in Eqn. 2.2 can be tracked if we track the boundary cycles for each face. However, this requires additional bookkeeping for maintaining the primary and secondary cycles of a face together. Instead, we maintain each individual primary and secondary cycle independently being oblivious to their correspondence to a particular face though this information is maintained implicitly. Due to Proposition 4, it becomes sufficient to register the changes in the primary cycles for tracking the boundary cycles.
The primary and secondary cycles change as we sweep over vertices. Figure 1 illustrates some of these changes. A secondary cycle may split into two cycles one of which is primary and the other is not ( in Fig.), it may split into two secondary cycles ( in Fig.), or two primary cycles may merge (, in Fig.). Therefore, we need to maintain all oriented cycles in , and keep track of the primary ones among them.
We consider a directed version of where each edge is converted into two directed edges in that are oriented oppositely. The graph is represented with a set of oriented cycles that bound the faces in on right. These cycles are represented with a sequence of directed edges. Connection rules. [r] A vertex in either lies on a vertex , or in the interior of a complex edge in which case we denote it as the vertex . Any edge in is an intersection of the level set with a complex triangle , which we also denote as an edge . Let be any edge adjoining a vertex . We have two directed copies and of in . Assume that is directed away from and is directed toward .
We follow a connection rule for deciding the connections among the directed edges around to construct the cycles in as follows. Let and be a pair of directed edges, where the head of is the tail of . The directed path locally separates the plane around the meeting point of and . The region to the right of is called its right wedge, and the region to the left is called its left wedge. We have three cases for deciding the connections:
has only one edge ( in Figure 1): connect to .
has exactly two edges and ( in Figure 1): connect to , and connect to .
has three or more edges ( in Figure 1): consider a circular order of all edges adjoining . Let be this circularly ordered edges around . For any consecutive pairs of edges , , determine if the right wedge of contains the edge . If so, connect to . If not, connect to .
The choice of our orientations and connections leads to the following observation:
Let be any pair of directed edges in . They are consecutive directed edges on the oriented boundary of a face if and only if connects to by the connection rule around some vertex .
The observation above relates the directed cycles in with a local connection rule. We exploit this fact to update the cycles locally in our algorithm.
Cycle trees. The directed cycles in are represented with balanced trees that help implementing certain operations on them efficiently. We explain this data structure now.
A directed edge where or is represented with a node that has three fields; points to the complex triangle , and point to the complex edges and respectively where is directed from to . A cycle of directed edges is represented with a balanced tree , namely a 2-3 tree  where the directed edges of constitute the leaf nodes of with the constraint that the leaves of any subtree of represent a path (directed) in . The leaves of are joined with a linked list in the order they appear on the directed cycle . A pointer in a leaf node implements this link list. The node also maintains another pointer to access the previous node on the linked list in time. However, it is important to keep in mind that it is the pointers that provide the orientation of the cycle . Furthermore, the last node in both linked lists connected by and pointers respectively is assumed to connect to the first one. This creates the necessary circularity without actually making the list circular. We denote the linked list of leaves of a tree as . The 2-3 trees built on top of the paths support the following operations.
find(): returns the root of the tree belongs to.
split(): splits a tree into two trees and where is the sublist of that contains all elements in before , and is the sublist that contains all elements in after and including .
join(): takes two trees and and produces a single tree with as the concatenation of and in this order.
permute(): makes the first node in the cycle represented with . It is implemented by calling split() that produces and , and then returning join(,).
insert(): inserts the element after in where find().
delete(): deletes from where find().
All of the above operations maintain the trees well balanced allowing traversal of a path from a leaf to the root in time where is the total number of elements in the lists of the trees involved. This in turn allows each of these operations to be carried out in time. Using these basic operations, we implement two key operations, splitting and merging of cycles.
splitcycle(): This splits a directed cycle into two. A cycle may get first pinched and then splits into more cycles as we sweep through a vertex. This operation is designed to implement this event. Given a tree , it returns two trees and where represents the path from to in the directed cycle given by , and represents the path from to in the same cycle. See Figure 2, bottom row.
It is implemented as follows: Let permute(). Call split() which returns two trees and as required.
mergecycle(): This merges the two cycles that and belong to. The new cycle has after and after . This is implemented as follows: Let find() and find(). Let permute() and permute(). Then, return join().
4 Updating level sets
Now we describe how we update the graph to and then to . As we sweep through , only the cycles in these graphs containing a vertex on a complex edge with as an endpoint may change combinatorially. We only update the cycles for combinatorial changes to make sure that the combinatorics of the level set graphs are maintained correctly though their geometry is updated only when needed to infer the correct adjacencies. This allows us to inspect only simplices where is the number of simplices adjoining in . Summing over all vertices, this provides an bound which gets multiplied with the complexity for the tree operations that we perform for each such simplex. Also, local circular sorting of edges around each vertex and complex edges connected to it accounts for time in total.
Primary cycle detection. The cycles in that change combinatorially may experience splitting, merging, edge contraction, edge expansion, or a combination of such events. Specifically, during splitting and merging, new cycles are generated which need to be characterized as primary or not. Figure 2 illustrates two cases of a secondary cycle splitting. Two similar cases arise for the primary cycle splitting. For merging also we have four cases mirroring the splitting case. It turns out that we can determine if the new cycles are primary or not by the orientations of the edges around the ‘pinching’ vertex if we know the type (primary or not) of the original cycles. We explain this for the case of splitting.
Let be a cycle in which splits at . Let and be any two non-consecutive directed edges in that meet at in . Assume that we know that is secondary. The case when is primary is similar. We need to distinguish the case when one of the two new cycles nests inside the other. This can be checked in time by determining if the right wedge of contains or not. If not, both new cycles remain secondary. Otherwise, we have a nesting, and exactly one of the two new cycles becomes primary. We can determine again which of the two becomes primary in time. For this consider a ray with tail at and entering the left wedge of . If this ray enters the left wedge of , we declare the new cycle containing and to be secondary and the other cycle containing and to be primary. If the ray enters the right wedge, we flip the assignment for the type of the two new cycles.
With these local checks, we design the two routines below that decide the type of the new cycle(s) in both the splitting and merging cases assuming that we know if the input cycle(s) are primary or not.
splitPrim(,,): This routine assumes that indicates if the cycle to be split which contains and is primary (true or false), and returns a pair of booleans where is true if and only if the new cycle containing is primary.
mrgPrim(,,,): This routine assumes that the input boolean variables indicates if the cycle containing is primary, and returns a boolean variable which is true if and only if the new merged cycle is primary.
Now we describe the actual updates of the graphs when the sweep goes through a vertex . For convenience, we designate a complex triangle as top, middle, or bottom if it has as the lowest, middle, or highest vertex respectively w.r.t. the height . Similarly, a complex edge is called top, or bottom if it has as the lowest or highest vertex respectively. As we continue with the sweep, we keep on recording the birth, death, splitting and merging of primary cycles by creating a barcode graph. Current primary cycles are represented by current edges in the barcdoe graph whose one endpoint is already determined, but the other one is yet to be determined. The nodes in the barcode graph are created when a primary cycle is born, dies, splits, or merges with another cycle. It is important to note that the nodes of the barcode graph are created only at the intermediate levels . Each tree maintains a pointer that points to a current edge in the barcode graph if its cycle is primary. Otherwise, this pointer is assumed to be a null pointer. Additionally, we assume that there is a boolean field which is set true if and only if represents a primary cycle. The barcode graph at level is denoted . As we move from level to the next level , we keep updating this barcode graph by recording the birth, death, splitting and merging of primary cycles and still denote it as till we finish processing level at which point we denote it as .
Updating to . The combinatorics of change only by the edges where is a bottom or middle triangle. If the edge has both vertices on bottom complex edges, then is contracted to in . Otherwise, the edge remains in , but its adjacency at the vertex which becomes in changes. Also, in both cases classes in may die or be born. We perform the combinatorial changes and detect the birth and deaths of homology classes as follows:
Contracting edges: When we contract edges, a cycle may simply contract and nothing else happens. But, we may also detect that a primary cycle of three edges is collapsed to two directed edges corresponding to a single undirected edge. This indicates a death of a class in which occurs entering the level but not exactly at . So, we operate as follows.
Let be any bottom complex triangle for , and let and be two directed edges associated with . Let find() and find(). For , we call delete(). If and has two leaves, we terminate the current edge pointed by with a closed node at level in and remove completely. The closed node indicates that the cycle dying entering the level is still alive at the level .
Cycle updates: The edges of a cycle in can come together at to create new cycles. After the edge contractions, the only edges that we need to update for possible combinatorial changes correspond to middle complex triangles. Let be such a triangle and let and be its edges that are top and bottom edges for respectively. For each directed edge with , we update or if originally we had or respectively.
Next, we update the cycles that may combinatorially change due to splitting or merging at , and also record new births as a result. We consider every directed edge so that the triangle is a middle triangle and determine a circular order of their undirected versions around . For every such directed edge , we determine its pair directed edge using the connection rule that we described before. Observe that plane embedding of the level set graph is used here. Actually, the lack of such canonical ordering of edges around a vertex for level set graphs becomes the roadblock for extending this algorithm to persistence of functions that are not heights. Let find() and find(). We have two cases: the splitting case when (see and in Figure 3) and the merging case when .
Splitting Case, : If , the cycle containing and and represented by does not change and we do nothing. Otherwise, the cycle splits into two new cycles whose type needs to be determined. So, we call splitPrim(,,) which returns a pair of boolean values indicating if the two new cycles are primary or not. We split to create the representations of the two new cycles. But, this operation destroys whose type (primary or not) and barcode pointer are needed for assigning the same for the two new trees. So, we save and first, and call splitcycle(,,) which returns two trees and representing the two cycles. Geometric constraints allow only the following two cases:
A new primary cycle is born at the level .
This is an open-ended birth at the level because the cycle exists at the level but not at the level .
If , we set
Then, we set where is a current edge created with an open end at level in .
Case(ii): : A new primary cycle is born at the level due to a split of the cycle represented by the saved pointer . We set , for , and call splitbar(,) which splits at level and returns two current edge pointers and . We set and .
Merging Case, : two cycles and represented by and respectively merge to become one. As before, we first store aside the type of and and associated current edge pointers by setting and for . Next, we call mergecycle(,) which merges the two cycles containing and and returns a tree representing this new cycle, say . A call to mrgPrim(,,,) returns a boolean variable which is true if and only if is primary. Again, we have only the following two cases.
In this case no primary cycle dies, but the new
cycle remains primary. So, no
current edge is terminated and the current edge associated
to the primary cycle among and is continued
by . If , we set ,
Case(ii):: No primary cycle dies and the new cycle is also not primary. We set and .
Updating to . To update to , we need to create directed edges corresponding to top triangles, that is, the complex triangles with as the bottom vertex. These new edges
change the combinatorics of in four ways: they may (i) expand the existing cycles without creating or destroying any primary class, (ii) create a new cycle giving birth to a new class, (iii) split a cycle pinched at (it turns out that no new class is born in this case), (iv) merge two cycles meeting at ; in this case, a primary cycle dies. Two cycles containing directed edges and in in Figure 3 get merged into one cycle in . The details of the merge is shown in Figure 4. It is preceded by an insertion of a sequence of edges that connect and . Similarly, another sequence connects and .
Expanding cycles: We iterate over all directed edges corresponding to the middle triangles. Let be any such directed edge in . The directed edge belongs to a unique cycle in the directed graph . Starting from , we aim to create the missing edges in . For this, we create a routine nextLink() that takes a directed edge and creates all missing directed edges in that lie between and the next directed edge with being a middle triangle.
nextLink(): Consider the complex edge and the circular order of directed edges of around . In time we determine the adjacent directed edge using the connection rule described before. If does not exist already, we create a directed edge for and insert it into the tree containing by calling insert(,). Replacing the role of with , we continue. If is a middle triangle, we stop and return .
To complete updating containing the directed edge , we call nextlink() which returns, say . Let find() and find(). We have two cases:
Splitting Case, : If