If You Must Choose Among Your Children, Pick the Right One

Given a simplicial complex K and an injective function f from the vertices of K to ℝ, we consider algorithms that extend f to a discrete Morse function on K. We show that an algorithm of King, Knudson and Mramor can be described on the directed Hasse diagram of K. Our description has a faster runtime for high dimensional data with no increase in space.


1 Introduction

Milnor’s classical Morse theory provides tools for investigating the topology of smooth manifolds [milnor63]. In [forman98], Forman showed that many of the tools for continuous functions can be applied in the discrete setting. Inferences about the topology of a CW complex can be made from the number of critical cells in a Morse function on the complex.

Given a Morse function one can interpret the function in many ways. Switching interpretations is often revealing. In this paper, we think of a discrete Morse function in three different ways. Algebraically, a Morse function is a function from the faces of a complex to the real numbers, subject to certain inequalities. Topologically, a Morse function is a pairing of the faces such that the removal of any pair does not change the topology of the complex. Combinatorially, a Morse function is an acyclic matching in the Hasse diagram of the complex, where unmatched faces correspond to critical cells.

Discrete Morse theory can be combined with persistent homology to analyze data; see [king, uli-phd, bauer2012optimal, edelsbrunner02, wang18, vcomic2011dimension, edelsbrunner2003hierarchical]. When dealing with data, we have the additional constraint that vertices have function values assigned. For complexes without any preassigned function values, Joswig and Pfetsch showed that finding a Morse function with a minimum number of critical cells is NP-hard [joswig04]. Algorithms that find Morse functions with relatively few critical cells have been explored in [lewiner03, nanda2013morse, hersh05].

In this work, we consider the algorithm Extract, given in [king]. Extract takes as input a simplicial complex and an injective function from the vertices to the reals, and returns a discrete Morse function, giving topological information about the complex. We show that a subalgorithm of Extract, ExtractRaw, can be simplified by considering the directed Hasse diagram. This simplification leads to an improved runtime and no change in space. The paper is organized as follows. In Section 2, we provide the definitions that will be used in the paper. In Section 3, we describe Extract and analyze the runtime. Then, in Section 4, we give our reformulation and show that the runtime is improved from to where is the number of cells and is the dimension of .

2 Background

In this section, we provide definitions, notation, and primitive operations used throughout the paper. For a general overview of discrete Morse theory, see [nic, knudson2015morse]; both texts describe Extract, originally given in [king], which is the starting point for this work.

In what follows, we adapt the notation of Edelsbrunner and Harer [edelsbrunner2010computational] to the definitions of Forman [forman02]. Here, we work with simplicial complexes, but the results hold for CW complexes. Let be a simplicial complex with simplices. For , denote the -simplices of as , the number of simplices in as , and the dimension of the highest dimensional simplex of as .

Let denote the dimension of as . Let and be the zero-simplices of , then we say . If is disjoint from , then we can define the join of and to be the -simplex that consists of the union of the vertices in and , denoted . We write if is a proper face of .

Let and consider simplices , with and . Let be an injective function. Without loss of generality, assume that the zero-simplices of and are sorted by function value, that is, we have when , similarly for . We say that is lexicographically smaller than , denoted , if the vector is lexicographically smaller than .
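The comparison can be illustrated with a minimal sketch. It assumes that the vector of a simplex lists its vertex values from largest to smallest; that sort direction, and the names `lex_key`, `lex_smaller`, and `f0`, are illustrative assumptions rather than the paper's notation.

```python
from typing import Dict, FrozenSet, Tuple

Simplex = FrozenSet[str]


def lex_key(simplex: Simplex, f0: Dict[str, float]) -> Tuple[float, ...]:
    """Vector of vertex values, largest first (assumed sort direction)."""
    return tuple(sorted((f0[v] for v in simplex), reverse=True))


def lex_smaller(sigma: Simplex, tau: Simplex, f0: Dict[str, float]) -> bool:
    """True if sigma precedes tau when their value vectors are compared."""
    return lex_key(sigma, f0) < lex_key(tau, f0)


# Hypothetical injective vertex function on four vertices.
f0 = {"a": 0.0, "b": 1.0, "c": 2.5, "d": 4.0}
ab = frozenset({"a", "b"})
ac = frozenset({"a", "c"})
print(lex_key(ab, f0), lex_key(ac, f0), lex_smaller(ab, ac, f0))
```

Because the vertex function is injective, two distinct simplices of the same dimension never share a vector, so the comparison is a strict total order on each dimension.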

The star of in , denoted , is the set of all simplices of containing . The closed star of in , denoted , is the closure of . The link of in is denoted as . We define the lower link of , denoted , to be the maximal subcomplex of whose zero-simplices have function value less than ; the lower link can be computed in  time. See the link algorithm in the appendix for details on computing the lower link.
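For a single vertex, these definitions can be sketched directly with sets. The sketch below is a brute-force illustration, not the procedure whose running time is quoted above; the helper names (`closure`, `star`, `link`, `lower_link`) and the example complex are assumptions made for illustration.

```python
from itertools import combinations


def closure(simplices):
    """All nonempty faces of the given simplices."""
    out = set()
    for s in simplices:
        for k in range(1, len(s) + 1):
            out.update(frozenset(c) for c in combinations(s, k))
    return out


def star(K, v):
    """Simplices of K that contain the vertex v."""
    return {s for s in K if v in s}


def link(K, v):
    """Simplices in the closed star of v that do not contain v."""
    return {s for s in closure(star(K, v)) if v not in s}


def lower_link(K, v, f0):
    """Part of the link whose vertices all have value below f0[v]."""
    return {s for s in link(K, v) if all(f0[u] < f0[v] for u in s)}


# Hypothetical complex: a filled triangle abc with an extra edge cd.
K = closure([frozenset("abc"), frozenset("cd")])
f0 = {"a": 0, "b": 1, "c": 2, "d": 3}
print(sorted("".join(sorted(s)) for s in lower_link(K, "c", f0)))
```

In this example the link of `c` contains `a`, `b`, `ab`, and `d`; only `d` is dropped from the lower link because its value exceeds that of `c`.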

We provide the definition of a Morse function, modified from Forman [forman02]. See Appendix B for the equivalence of the definitions.

[Morse Function] A function is a discrete Morse function if, for every , the following two conditions hold:

An intuitive definition is given in [nic]: “the function generally increases as you increase the dimension of the simplices. But we allow at most one exception per simplex.” Let be a discrete Morse function. A simplex is critical if the following two conditions hold:

Simplices that are not critical are called regular.
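The "at most one exception per simplex" reading can be checked mechanically. The sketch below assumes the two conditions are Forman's standard ones: each simplex has at most one codimension one coface with a value that is not larger, and at most one codimension one face with a value that is not smaller. A simplex is then taken to be critical when it has no such exception in either direction. All function names are illustrative.

```python
def faces(s):
    """Codimension one faces of a simplex given as a frozenset."""
    return [s - {v} for v in s] if len(s) > 1 else []


def cofaces(K, s):
    """Codimension one cofaces of s inside the complex K."""
    return [t for t in K if len(t) == len(s) + 1 and s < t]


def exceptions(K, f, s):
    """Cofaces that fail to be larger and faces that fail to be smaller."""
    up = [t for t in cofaces(K, s) if f[t] <= f[s]]
    down = [n for n in faces(s) if f[n] >= f[s]]
    return up, down


def is_discrete_morse(K, f):
    """Assumed Forman conditions: at most one exception each way."""
    return all(len(u) <= 1 and len(d) <= 1
               for u, d in (exceptions(K, f, s) for s in K))


def critical_simplices(K, f):
    """Simplices with no exception in either direction."""
    return {s for s in K if exceptions(K, f, s) == ([], [])}


# Hypothetical example: a single edge ab with its two vertices.
a, b, ab = frozenset("a"), frozenset("b"), frozenset("ab")
K = [a, b, ab]
f = {a: 0.0, b: 2.0, ab: 1.0}
print(is_discrete_morse(K, f), critical_simplices(K, f))
```

Here `b` and `ab` form the single exceptional pair, so only the vertex `a` is critical, which matches the fact that an edge is contractible.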

Given a discrete Morse function on a simplicial complex , we define the induced gradient vector field, or GVF for short, as . Note that is a codimension one face of ; see the Impossible Pairs lemma in Appendix B. We can gain some intuition for this definition by drawing arrows on the simplicial complex as follows. If is regular, a codimension one face of , and , then we draw an arrow from to . Constructing a GVF for a simplicial complex is as powerful as having a discrete Morse function, and is the goal of both ExtractRaw and our proposed ExtractRightChild.

Next, we define two functions that are helpful when constructing a GVF. The rightmost face of , denoted , is the codimension one face of with maximum lexicographic value. The leftmost coface of , denoted , is the codimension one coface of with minimum lexicographic value. We say is a left-right parent and we call a left-right child if .
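A small sketch may help fix these two functions. It assumes that simplices are compared by their vertex values listed largest first, and that a simplex is a left-right parent exactly when it is the leftmost coface of its own rightmost face. Both readings, and every name below, are assumptions made for illustration.

```python
from itertools import combinations


def lex_key(s, f0):
    """Vertex values of s, largest first (assumed ordering)."""
    return tuple(sorted((f0[v] for v in s), reverse=True))


def rightmost_face(s, f0):
    """Codimension one face of s with the largest key, or None for a vertex."""
    if len(s) == 1:
        return None
    return max((s - {v} for v in s), key=lambda t: lex_key(t, f0))


def leftmost_coface(K, t, f0):
    """Codimension one coface of t in K with the smallest key, if any."""
    cof = [s for s in K if len(s) == len(t) + 1 and t < s]
    return min(cof, key=lambda s: lex_key(s, f0)) if cof else None


def is_left_right_parent(K, s, f0):
    """Assumed test: s is the leftmost coface of its rightmost face."""
    r = rightmost_face(s, f0)
    return r is not None and leftmost_coface(K, r, f0) == s


# Hypothetical example: the filled triangle abc.
K = {frozenset(c) for k in (1, 2, 3) for c in combinations("abc", k)}
f0 = {"a": 0, "b": 1, "c": 2}
for s in sorted(K, key=len):
    print("".join(sorted(s)), is_left_right_parent(K, s, f0))
```

On this triangle the edges `ab` and `ac` and the triangle `abc` are left-right parents, while `bc` is not, because the leftmost coface of its rightmost face `c` is `ac`.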

In [forman02], Forman showed that each simplex in is exclusively a tail, head, or unmatched. Moreover, the unmatched simplices are critical. Thus, we can partition the simplices of into heads , tails , and critical simplices , and encode the GVF as a bijection . That is, we can represent the GVF for as the unique tuple . We will use this representation throughout our algorithms.

Note that a GVF is a particularly useful construction. It provides a way to reduce the size of a simplicial complex without changing the topology (by cancelling matched pairs), which is useful for preprocessing large simplicial complexes. See [nanda2013morse, vcomic2011dimension] for examples.

We define a consistent GVF as follows: [Consistent GVF] Let be a simplicial complex, and let  be injective. Then, we say that a gradient vector field is consistent with if, for all , there exists a discrete Morse function  such that

1. is the GVF corresponding to .

2. .

3. .

Let be a GVF. Then, for , a gradient path¹ is a sequence of simplices in :

$$\Gamma = \{\sigma_{-1}, \tau_0, \sigma_0, \tau_1, \sigma_1, \ldots, \tau_r, \sigma_r, \tau_{r+1}\}$$

beginning and ending with critical simplices and such that for , , , , and . We call a path nontrivial if .

¹ There is a slight discrepancy between the definition of Forman [forman02] and KKM [king]. In particular, Forman’s definition states the head and the tail of the path are simplices of the same dimension. On the other hand, KKM’s usage in the algorithm expects that the head and tail are different dimensions. Here, we state the definition implied by the usage in KKM.

3 A Discrete Morse Extension of f_0: K_0 → ℝ

In this section, we give a description of Extract, originally from [king]. This algorithm takes a simplicial complex , an injective function , and a threshold (used to ignore pairings with small persistence), and returns a GVF on that is consistent with .

Extract uses two subroutines. First, ExtractRaw is used to generate an initial GVF on consistent with . Let be this initial GVF. Then, for each dimension ( through ), the algorithm calls ExtractCancel, which augments an existing gradient path to remove simplices from in pairs. For more details, see Appendix A.1.

In the next section, we provide a simpler and faster algorithm to replace ExtractRaw, which dominates the runtime of Extract when (and in practice, when is very small). We conclude this section with properties of the output from ExtractRaw:

[Properties of ExtractRaw] Let be a simplicial complex, let be an injective function, and suppose  is the output of ExtractRaw. Let . Then, there exists a discrete Morse function  such that the following hold:

1. is a GVF consistent with .

2. Let . Then, if and only if is a left-right parent.

3. For all , .

4. The runtime of ExtractRaw is .

4 A Faster Algorithm for

The main contribution of this paper is ExtractRightChild, which we show is a simplified version of ExtractRaw that has the same output with an improved runtime. This section provides a description of the algorithm, and a proof of the equivalence with ExtractRaw.

4.1 Hasse Diagram Data Structure

We assume that KKM [king] represent in a standard Hasse diagram data structure , which can be encoded as an adjacency list representation for a graph. Each simplex is represented by a node in . We abuse notation and write as the corresponding node. Two simplices are connected by an edge from to if is a codimension one face of . For a node , we partition its edges into two sets, and as the edges in which is a face or coface, respectively.

For , we denote the nodes of corresponding to the -simplices of as and we store each in its own set that can be accessed in time. Note that there is no requirement about the ordering of the edges or the nodes in each . See cat for an example of the data structure.
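The adjacency-list layout described above can be sketched as follows. This is a minimal illustration, assuming the complex is given by its top-dimensional simplices; the `Node` class, the `up`/`down` field names, and `build_hasse` are hypothetical names, not the representation used in [king].

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Node:
    simplex: frozenset
    down: list = field(default_factory=list)  # codimension one faces
    up: list = field(default_factory=list)    # codimension one cofaces


def build_hasse(top_simplices):
    """Return (nodes keyed by simplex, nodes grouped by dimension)."""
    nodes = {}
    for top in top_simplices:
        for k in range(1, len(top) + 1):
            for c in combinations(sorted(top), k):
                s = frozenset(c)
                if s not in nodes:
                    nodes[s] = Node(s)
    # One edge per (face, coface) pair of adjacent dimensions.
    for s, node in nodes.items():
        if len(s) > 1:
            for v in s:
                face = nodes[s - {v}]
                node.down.append(face)
                face.up.append(node)
    # Each dimension gets its own list, with no ordering requirement.
    levels = {}
    for s, node in nodes.items():
        levels.setdefault(len(s) - 1, []).append(node)
    return nodes, levels


nodes, levels = build_hasse([("a", "b", "c")])
print({dim: len(level) for dim, level in levels.items()})
```

Storing each dimension in its own list lets a caller walk the diagram level by level, and the two edge lists make both face and coface lookups a direct traversal.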

For our algorithm, we decorate each node of with additional data. For clarity, we denote the decorated data structure as . Next, we describe the additional data stored in each node and how to initialize the data. Consider and define . Each node stores , the rightmost child and leftmost parent .

Next, we describe how to initialize the data and summarize with the following lemma.

[Hasse decoration] Let be a simplicial complex with simplices and . The decorated Hasse diagram uses additional space, and we can decorate the Hasse diagram in time.

We begin by analyzing the space complexity. For each node, we store a constant amount of additional data. Thus, the decorated Hasse diagram uses additional space.

Next, we analyze the time complexity. To decorate for each node , we must compute , , and . Let . We proceed in three steps.

First we compute . In general, computing takes time, since there may be no more than vertices which compose any . Let and be distinct codimension one faces of . Observe that . Thus, if we know the function values for , we can compute and store all function values of all nodes in in time.

Second we compute by brute force. We iterate over all edges in to find its largest face under lexicographic ordering. Since a -simplex has down edges, computing for takes time. As , and there are nodes, we can then compute for all nodes in time.

Third, we compute , also by brute force. We iterate over all edges in to find its smallest lexicographical coface. While we cannot bound as easily as , we do know that when computing we can charge each edge in the Hasse diagram for one comparison. Observe that when computing , we can similarly charge each comparison to an edge. Then, from computing , we know the total number of comparisons is . Thus, the total number of comparisons for computing is also .

As each step takes time, decorating takes time.
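The three decoration steps can be sketched on a dictionary-based diagram. The sketch assumes the value of a simplex is the maximum of its vertex values (computed here from two of its codimension one faces, as in the first step) and that simplices are compared by vertex values listed largest first. It compares full key tuples for simplicity, so it makes no attempt to match the time bound of the lemma; all names are illustrative.

```python
from itertools import combinations


def decorate(top_simplices, f0):
    """Return (value, rightmost child, leftmost parent) per simplex."""
    simplices = set()
    for top in top_simplices:
        for k in range(1, len(top) + 1):
            simplices.update(frozenset(c) for c in combinations(top, k))
    down = {s: ([s - {v} for v in s] if len(s) > 1 else []) for s in simplices}
    up = {s: [] for s in simplices}
    for s in simplices:
        for t in down[s]:
            up[t].append(s)

    def key(s):
        return tuple(sorted((f0[v] for v in s), reverse=True))

    value, rchild, lparent = {}, {}, {}
    # Step 1: values, lowest dimension first, from two distinct faces.
    for s in sorted(simplices, key=len):
        if len(s) == 1:
            (v,) = s
            value[s] = f0[v]
        else:
            t1, t2 = down[s][:2]
            value[s] = max(value[t1], value[t2])
    # Step 2: rightmost child by scanning the down edges.
    for s in simplices:
        rchild[s] = max(down[s], key=key) if down[s] else None
    # Step 3: leftmost parent by scanning the up edges.
    for s in simplices:
        lparent[s] = min(up[s], key=key) if up[s] else None
    return value, rchild, lparent


value, rchild, lparent = decorate([("a", "b", "c")], {"a": 0, "b": 1, "c": 2})
print(value[frozenset("abc")], sorted(rchild[frozenset("abc")]))
```

Step 1 works because two distinct codimension one faces of a simplex together contain all of its vertices, so the larger of their two values is the maximum over the whole simplex.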

4.2 Algorithm Description

Next, we describe the main algorithm. Given a simplicial complex (represented as a Hasse diagram), and an injective function , computes a GVF consistent with .

ExtractRightChild has three main steps. First, we create a decorated Hasse diagram. Second, we process each level of the Hasse diagram from top to bottom. For each unassigned simplex, we check for a left-right parent node, and use the results to build up a GVF. Third, we process unassigned zero-simplices. See hasses for an example.
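The three steps can be sketched end to end as below. This is an illustrative reading of the description, not the paper's pseudocode: it assumes simplices are ordered by vertex values listed largest first, that a simplex is a left-right parent when it is the leftmost parent of its own rightmost child, and that the pairing is stored from tail to head. The function name and the returned dictionary layout are hypothetical.

```python
from itertools import combinations


def extract_right_child(top_simplices, f0):
    """Sketch: pair each left-right parent with its rightmost child."""
    K = set()
    for top in top_simplices:
        for k in range(1, len(top) + 1):
            K.update(frozenset(c) for c in combinations(top, k))

    def key(s):
        return tuple(sorted((f0[v] for v in s), reverse=True))

    # Step 1: decorate with rightmost children and leftmost parents.
    cofaces = {s: [] for s in K}
    for s in K:
        if len(s) > 1:
            for v in s:
                cofaces[s - {v}].append(s)
    rchild = {s: max((s - {v} for v in s), key=key) for s in K if len(s) > 1}
    lparent = {s: min(c, key=key) for s, c in cofaces.items() if c}

    heads, tails, critical, pairing = set(), set(), set(), {}
    assigned = set()
    # Step 2: process levels from the top dimension down to the edges.
    for size in range(max(len(s) for s in K), 1, -1):
        for s in [t for t in K if len(t) == size]:
            if s in assigned:
                continue  # already a tail of a higher-dimensional head
            child = rchild[s]
            if lparent.get(child) == s:
                heads.add(s)
                tails.add(child)
                pairing[child] = s  # tail -> head (assumed direction)
                assigned.update((s, child))
            else:
                critical.add(s)
                assigned.add(s)
    # Step 3: vertices that were never claimed as tails are critical.
    critical.update(s for s in K if len(s) == 1 and s not in assigned)
    return {"heads": heads, "tails": tails,
            "critical": critical, "pairing": pairing}


gvf = extract_right_child([("a", "b", "c")], {"a": 0, "b": 1, "c": 2})
print(sorted("".join(sorted(s)) for s in gvf["critical"]))
```

On the filled triangle only the minimum vertex is critical. On the hollow triangle (three edges, no 2-simplex) the sketch leaves one vertex and one edge critical, matching the homology of a circle.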

4.3 Analysis of

For the remainder of this section, we prove that ExtractRightChild is equivalent to and faster than ExtractRaw. For the following lemmas, let be a simplicial complex, let be an injective function, and let  be the output of ExtractRightChild.

First, we show that is a partition of . [Partition] The sets , , and partition .

By downderchild and ltorerchild,  iterates over all with once. Each is either assigned or unassigned. If is unassigned, there are two options; may be a left-right parent, or it may not be. If is a left-right parent, addherchild ensures that is put into . Otherwise, addcerchild ensures that is put into . If is assigned, then was assigned to in addherchild. Thus, every with must be assigned to exactly one of or . Then, every is again either assigned or unassigned. If assigned, . If unassigned, is added to in leftoverserchild. Thus, every is assigned one of or , making and partition .

We will show that satisfies erawgvf, erawifftails, and erawcomposition of eraw. Later in this section, we show that any GVF with these properties is unique.

First, we show erawcomposition and one direction of erawifftails.

[Child Heads are Parents] Let . Then, is a left-right parent and .

Recall that is the second output of , given in LABEL:erchild.LABEL:. As addherchild is the only step in which simplices are added to  and is within an statement that checks if is a left-right parent, must be a left-right parent. Also within the statement, defroerchild adds to , which means that .

Now we show the reverse direction of erawifftails.

[Child Parents are Heads] Let . If is a left-right parent, then .

Recall that in order for to be a left-right parent, we must have . Now, we consider two cases. For the first case, suppose . Then is added to in addcerchild when must already be assigned to . So, and is not a left-right parent.

For the second case, suppose . Then is added to in addherchild where for some with . Notice that is a face of and . Then, and is not a left-right parent.

Thus, if is a left-right parent, then .

To see satisfies erawgvf we have the following lemma:

[Consistency] The tuple is a gradient vector field consistent with .

Let and . Let . We define

$$\delta := \min\left\{\varepsilon,\ \min_{v, w \in K_0,\ v \neq w} |f(v) - f(w)|\right\}.$$

We define recursively as follows: for all vertices , define . Now, assume that  is defined on the -simplices, for some . For each , we initially assign , then we update:

$$f(\sigma) = f(\sigma) + \begin{cases} -\lrmatchoffset & \text{if } \sigma \text{ is a left-right parent;} \\ \phantom{-}\lrmatchoffset & \text{otherwise,} \end{cases} \tag{1}$$

where  is the index of  in the lexicographic ordering of all simplices. We make one final update:

$$f(\sigma) = f(\sigma) + \begin{cases} \lrchildoffset & \text{if } \sigma \text{ is a left-right child;} \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$

We need to show that  and satisfy the three properties in consistent.

First, we show consistentgvf holds for as defined above (that  is the GVF corresponding to ). Let  be the GVF corresponding to . Since , , partitions by partition, it suffices to show that is a bijection and . The only time that simplices are added to or happens directly alongside when pairs are added to in lines 10 and 11, forcing that  must be a match.

Let . Let . By wefindtails, is a left-right parent and , which means that is a left-right pair. We follow the computation of . Since is a left-right pair, is the rightmost face of , which means is initialized to . Since is a left-right parent, is updated by (1) to . Since  is not a left-right child, nothing changes in (2). Thus, . Next, let such that and . We follow the computation of . Since  is the only face of that is a left-right child, for any other , (2) adds zero to the definition of . Recalling that (2) adds to the definition of , we find that , and

$$f(\sigma) = f(\tau) - \lrmatchoffset \geq f(\tau') + \lrchildoffsetlowerdim - \lrmatchoffset \geq f(\tau').$$

Because may be any arbitrary left-right parent, we can guarantee that the above inequality is valid for any when related to any other faces of . Thus, is discrete Morse, since it is impossible for to violate the inequality given in morsedefn.

Since and is a discrete Morse function, we obtain . Each of these statements is biconditional, so we have shown that .

consistentrestr () holds trivially.

Finally, we show consistenteps holds (that ). By construction,

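$$\left| f(\sigma) - \max_{v \in \sigma} f_0(v) \right| \leq \left( \sum_{i=1}^{d} 2^{-i} \right) \delta = \left( 1 - 2^{-d} \right) \delta < \varepsilon.$$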

Properties erawgvf, erawifftails, and erawcomposition are quite restrictive. In fact, they uniquely determine a GVF, as we now show.

[Unique GVF] Let  be a simplicial complex and let be an injective function. There is exactly one gradient vector field, , with the following three properties:

1. is consistent with .

2. For all , if and only if is a left-right parent.

3. For all , .

Let and be as defined in the theorem statement. Let be defined for each simplex  by . Let  and be two GVFs that satisfy uniquegvf, uniqueifftails, and uniquedefine.

Let . By the forward direction of uniqueifftails, we know that is a left-right parent. By the backward direction of uniqueifftails, we know that . Thus, we have shown that . Repeating this argument by swapping the roles of and gives us .

Since and because uniquedefine holds, we have shown that  is paired with in both matchings, and specifically  . Since and are bijections by uniquegvf, we also know that:

Thus, and .

Finally, we conclude:

which means that and are the same GVF. Thus, we conclude that the gradient vector field satisfying uniquegvf, uniqueifftails, and uniquedefine is unique.

Since  and  both satisfy the hypothesis of unique, the outputs of the algorithms must be the same.

[Algorithm Equivalence] Let  be a simplicial complex and let be an injective function. Then (, ) and (, ) yield identical outputs.

By eraw and kingfindmatch, the output of  satisfies the properties in unique. By assignment, wefindmatch, and wefindtails, the output of  satisfies the properties of unique. Then, by unique,  and  are equivalent.

When we consider the runtime and space usage of , we find the following:

[New Runtime] Given a simplicial complex (represented as a Hasse diagram), and an injective function , computes a GVF consistent with in time and uses space.

First, line decorateerchild decorates the Hasse diagram. By decoration, the decoration takes time and space. downddowndenderchild, process each node of the decorated Hasse diagram. Each iteration of the loop is in time and space because all required data was computed while decorating. As there are nodes to process, downddowndenderchild takes time and uses space. Finally, we iterate over the zero-simplices in time.

The bottleneck in both the space and time usage of the algorithm is decorating the Hasse diagram. Therefore, the algorithm takes time and space.

5 Discussion

In this paper, we identified properties of the Extract and ExtractRaw algorithms [king]. We used these properties to simplify ExtractRaw to the equivalent algorithm ExtractRightChild. Our simplification improves the runtime from to .

There are several possible extensions of this work. The problem of finding tight bounds on the runtime of is interesting and open. We plan to implement our approach on high-dimensional data sets, and to further improve the runtime. We intend to explore a cancellation algorithm that performs the same task as ExtractCancel, eliminating critical pairs with small persistence. Our conjectured cancellation algorithm iterates over critical simplices and applies .

Constructing Morse functions that do not require preassigned function values on the vertices is a related area of active research. The problem of finding a Morse function with a minimum number of critical simplices is NP-hard [joswig04]. In [bauer-rathod-18], Bauer and Rathod show that for a simplicial complex of dimension with simplices, it is NP-hard to approximate a Morse matching with a minimum number of critical simplices within a factor of , for any . The question is open for 2-dimensional simplicial complexes.

Acknowledgements

This material is based upon work supported by the National Science Foundation under the following grants: CCF 1618605 & DMS 1854336 (BTF) and DBI 1661530 (DLM). Additionally, BH thanks the Montana State Undergraduate Scholars Program. All authors thank Nick Scoville for introducing us to KKM [king] and for his thoughtful discussions.

References

To put our result in context, we now provide a glimpse into the inner workings of , and reveal the underlying properties of  which give it an identical output to . We also provide a formal runtime analysis of  to verify that  provides an improved time complexity.

A.1 Subroutines for

In this section, we recall the algorithms proposed by KKM [king]. Note that we made some slight modifications to the presentation of KKM’s initial description to improve readability. The modifications do not affect the asymptotic time or space used by the algorithm, although they do remove some redundant computation.

In particular, we modified the inputs to explicitly pass around a GVF so that the inputs of each algorithm are clear. We simplified notation and inlined the subroutine . From the previous modifications, we observed that the algorithm recomputes a gradient path that is currently in scope and so we simply unpack the path on unpackGammaecancel.

ExtractRaw computes the lower link of each vertex in a , and assigns if . If , its lower link is recursively passed to ExtractRaw, and this recursion continues until an empty lower link is reached. When the lower link is not empty, ExtractRaw assigns and the smallest function valued vertex in is combined with and added to , carrying with this assignment a mapping from to . As the recursion continues, higher dimensional simplices in the lower star of are able to be assigned to both and based on combinations consistent with the assignments of the vertices and the original mappings of . Higher dimensional critical cells are assigned similarly by combining the current vertex and each previously computed from the last recursion, until all simplices have been assigned.

Then, because ExtractRaw may have extraneous critical cells, ExtractCancel works to reduce the number of critical cells by locating “redundant” gradient paths to a critical simplex and reversing them after the first pass by ExtractRaw, refining the output of ExtractRaw.

Let , be a critical simplex. Let denote the set of all nontrivial gradient paths starting at and ending in .

A.2 Analysis of

In this appendix, we provide the analysis of Extract from Section 3. In what follows, let be a simplicial complex and let be an injective function.

[Raw Heads are Parents] Let be the output of . Every simplex in is a left-right parent. Furthermore, for all , .

Let . We show that is a left-right parent by induction on the dimension of . When is an edge, and for some vertex . In addh1eraw is defined as where so that is smallest. So, is a left-right parent. Furthermore, .

Suppose every is a left-right parent when and consider . If is a simplex, is defined in addh2eraw, when a vertex is selected in selectveraw. We extend the GVF on the to include the lower star of . We have where . Since and are in the we have for . Then and .

By the induction hypothesis is a left-right parent. If is not a left-right parent, we can remove from and and contradict that .

Furthermore, . This proves the claim.

[Raw Parents are Heads] Let be the output of . Let . If is a left-right parent, then .

We show that if , then is not a left-right parent. First, suppose . We use induction on to show is not a left-right parent. For the base case, is a vertex and cannot be a left-right parent.

Suppose is not a left-right parent when and consider . Then is added to in addteraw when a vertex is selected in selectveraw. As in kingfindmatch, write for some .

By the induction hypothesis is not a left-right parent, thus , and there exists a vertex such that where . We have .

Now, suppose . There are two places where elements are added to : addvceraw and highceraw. In addvceraw, is a vertex and cannot be a left-right parent.

In highceraw is defined as for some where so that is smallest. Now . We have shown that if is a left-right parent, then .

We summarize the properties of  in the following theorem.

*

erawgvf is proven in Theorem 3.1 of  [king]. By kingfindmatch and kingfindtails, we conclude erawifftails. Also by kingfindmatch we can guarantee erawcomposition.

To show erawtimebound, we observe that the worst-case runtime for a single execution of callexacteraw happens when the lower link of is of size . Computing the optimal pairings that  returns is at least as hard as computing the homology of , which is of the time complexity of matrix multiplication. By [raz2002complexity], we know that the runtime of  is lower-bounded by .

Appendix B Equivalence of Definitions

We gave the following definition of discrete Morse function:

*

However, [nic] proves the following:

[Regular Characterization] (Problem 2.23 in [nic].) A simplex is regular if and only if either of the following holds:

1. There exists such that .

2. There exists such that .

In fact, the two properties above cannot both be true, as the following lemma shows. [Exclusion] Let be a discrete Morse function and a regular simplex. Then conditions and of regsim cannot both be true. Hence, exactly one of the conditions holds whenever is regular.

Regular pairs cannot differ by more than one dimension.

[Impossible Pairs] (Also stated as a problem in [nic].) Let be a discrete Morse function. It is impossible to have a pair of simplices in with such that

The previous lemmas give the following characterization of a regular simplex of a discrete Morse function:

[Check if Regular] A simplex is regular if and only if is paired with either a face or a coface with codimension 1 but not both.

Now, we can say that the definitions of critical are the same as well.


We now give an algorithm that computes the link of a vertex in a simplicial complex. Like our main algorithm, the link algorithm can be visualized using the Hasse diagram. For a given vertex , assign all simplices that contain the color blue. Then assign all faces of the blue simplices the color red if they are not already blue. The lower link of is all simplices that are red and not blue and have value less than the value of . See links and the link algorithm. Notice that a simplex is red if it is contained in a simplex that contains but does not itself contain , which is the definition of the link.

The lower link of each vertex can be computed in time, because we iterate over everything in the Hasse diagram at most twice.
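The coloring procedure can be sketched as two traversals of the Hasse diagram: one upward from the vertex to color simplices blue, and one downward from the blue simplices to color the remaining faces red. The sketch assumes the value of a simplex is the largest value among its vertices; the function name and the dictionary-based diagram are illustrative.

```python
from itertools import combinations


def lower_link_by_coloring(top_simplices, f0, v):
    """Lower link of vertex v via the blue/red coloring described above."""
    K = set()
    for top in top_simplices:
        for k in range(1, len(top) + 1):
            K.update(frozenset(c) for c in combinations(top, k))
    up = {s: [] for s in K}
    for s in K:
        if len(s) > 1:
            for u in s:
                up[s - {u}].append(s)

    # Pass 1: blue = every simplex reachable upward from the vertex v.
    blue, stack = set(), [frozenset({v})]
    while stack:
        s = stack.pop()
        if s not in blue:
            blue.add(s)
            stack.extend(up[s])

    # Pass 2: red = faces of blue simplices that are not blue themselves.
    red, stack = set(), list(blue)
    while stack:
        s = stack.pop()
        if len(s) == 1:
            continue
        for u in s:
            face = s - {u}
            if face not in blue and face not in red:
                red.add(face)
                stack.append(face)

    # Keep the red simplices whose value (assumed: max vertex value) is lower.
    return {s for s in red if max(f0[u] for u in s) < f0[v]}


f0 = {"a": 0, "b": 1, "c": 2, "d": 3}
result = lower_link_by_coloring([("a", "b", "c"), ("c", "d")], f0, "c")
print(sorted("".join(sorted(s)) for s in result))
```

Every face of a red simplex is also a face of some blue simplex and cannot contain the vertex, so continuing the downward walk from red simplices colors exactly the link.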

We count the number of times the lower link of a vertex is computed in RecursiveLowerLink when the input contains a -dimensional simplex.

Assuming that computing the lower link is linear in the number of simplices, since there are lower link computations and each take time, we have the following:

[Recursive Lower Link Bound] The number of lower link computations in  is

We use strong induction on . Let be the number of lower link computations of a simplex in . When , we have a single vertex and we compute one lower link.

Suppose the vertices have been sorted by function value. Then, when is selected in rllvrecursivell, the lower link is empty; when is selected, the lower link is a single vertex; when is selected, the lower link is an edge. This pattern continues: when is selected, the lower link is a simplex. When we iterate over all vertices in , we call RecursiveLowerLink on a simplex of every dimension from to . Since there are vertices in , we have an additional lower link computations.

This gives the following recurrence relation:

$$L_d = L_{d-1} + L_{d-2} + \dots + L_1 + L_0 + (d+1)$$

with $L_0 = 1$. Strong induction shows that the number of lower link computations is
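The recurrence is easy to evaluate numerically. The sketch below assumes the base case is one lower link computation for a single vertex, as described in the induction above; the function name is illustrative. For the small values checked, the counts agree with 2^(d+1) − 1, which is offered here as an observation from the recurrence rather than as the bound stated in the lemma.

```python
def lower_link_count(d: int) -> int:
    """Evaluate L_d = L_{d-1} + ... + L_0 + (d + 1), with L_0 = 1 assumed."""
    counts = []
    for k in range(d + 1):
        # sum(counts) is L_{k-1} + ... + L_0; it is 0 when k == 0.
        counts.append(sum(counts) + (k + 1))
    return counts[d]


for d in range(6):
    print(d, lower_link_count(d), 2 ** (d + 1) - 1)
```

Subtracting consecutive instances of the recurrence gives L_d = 2 L_{d-1} + 1, which is why the count doubles with each added dimension.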

ExtractRaw proceeds recursively on the lower link of each vertex. This leads to many unnecessary computations of the lower link of vertices. llcalleraw and callexacteraw of ExtractRaw recursively compute the lower link of each vertex in . We include this subalgorithm, called RecursiveLowerLink, as a separate algorithm.