Computing Minimal Presentations and Betti Numbers of 2-Parameter Persistent Homology

Motivated by applications to topological data analysis, we give an efficient algorithm for computing a (minimal) presentation of a bigraded K[x,y]-module M, where K is a field. The algorithm assumes that M is given implicitly: It takes as input a short chain complex of free bipersistence modules F^2 ∂^2 F^1 ∂^1 F^0 such that M∂^1/im∂^2. The algorithm runs in time O(∑_i |F^i|^3) and requires O(∑_i |F^i|^2) storage, where |F^i| denotes the size of a basis of F^i. Given the presentation, the bigraded Betti numbers of the module are readily computed. We also present a different but related algorithm, based on Koszul homology, which computes the bigraded Betti numbers without computing a presentation, with these same complexity bounds. These algorithms have been implemented in RIVET, a software tool for the visualization and analysis of two-parameter persistent homology. In preliminary experiments on topological data analysis problems, our approach outperforms the standard computational commutative algebra packages Singular and Macaulay2 by a wide margin.

There are no comments yet.

Authors

• 6 publications
• 9 publications
• Computing Minimal Presentations and Bigraded Betti Numbers of 2-Parameter Persistent Homology

Motivated by applications to topological data analysis, we give an effic...
02/15/2019 ∙ by Michael Lesnick, et al. ∙ 0

• Fast Minimal Presentations of Bi-graded Persistence Modules

Multi-parameter persistent homology is a recent branch of topological da...
10/29/2020 ∙ by Michael Kerber, et al. ∙ 0

• Generalized Persistence Algorithm for Decomposing Multi-parameter Persistence Modules

The classical persistence algorithm virtually computes the unique decomp...
04/07/2019 ∙ by Tamal K. Dey, et al. ∙ 0

• Anti-commutative Dual Complex Numbers and 2D Rigid Transformation

We introduce a new presentation of the two dimensional rigid transformat...
01/08/2016 ∙ by Genki Matsuda, et al. ∙ 0

• Topological Data Analysis of COVID-19 Virus Spike Proteins

Topological data analysis, including persistent homology, has undergone ...
05/01/2021 ∙ by Moo K. Chung, et al. ∙ 0

• A higher homotopic extension of persistent (co)homology

12/05/2014 ∙ by Estanislao Herscovich, et al. ∙ 0

• On the law of the iterated logarithm and strong invariance principles in computational geometry

We study the law of the iterated logarithm (Khinchin (1933), Kolmogorov ...
02/22/2020 ∙ by Johannes Krebs, et al. ∙ 0

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

1.1 Persistence Modules, Minimal Presentations, and Betti Numbers

Let be a field. For , a (-parameter) persistence module is defined to be a -module equipped with a -grading. A

-grading is a vector space decomposition

such that for all and , where denotes the standard basis vector. We refer to 2-parameter persistence modules as bipersistence modules. Persistence modules are standard objects of study in commutative algebra [24, 36].

In topological data analysis (TDA) [12, 39, 21, 48, 9], -parameter persistence modules arise as invariants of data, in the context of multi-parameter persistent homology. To explain, let us define a (-parameter) filtration to be a collection of simplicial complexes such that for all and . A number of well-known constructions in TDA associate a filtration to data, with the aim of topologically encoding coarse-scale, global geometric information about the data. One then studies the data by studying the topology of the filtration. In particular, applying the homology functor with coefficients in yields a -parameter persistence module, which serves an algebraic descriptor of the data. The case has received the most attention, but it is sometimes especially natural to consider -parameter filtrations for , and this currently a very active area of research. The case is of particular interest, in part because 2-parameter filtrations arise in the study of point cloud data with noise or non-uniform density [14, 17, 43, 34, 35, 11, 19, 22].

The isomorphism type of a finitely generated -parameter persistence module is specified by a collection of pairs with , called a barcode. When , the representation theory of -parameter persistence modules is known to be wild, and there is no simple invariant which completely encodes the isomorphism type of a module [14]. Nevertheless, for the purposes of TDA, one can consider incomplete invariants of a persistence module as surrogates for the barcode, and a number of ideas for this have been proposed; for example, see [14, 15, 34, 47, 31].

Although no simple, complete invariant of a -parameter persistence module is available, one can specify the isomorphism type of a finitely generated persistence module via a minimal presentation. Concretely, this is a matrix with field coefficients, with each row and each column labeled by an element of

. Because minimal presentations are not unique, they cannot be directly used in the way barcodes are used in TDA, e.g., as input to machine learning algorithms or statistical tests. However, they serve as a useful computational intermediate.

The (multi-graded) Betti numbers are standard invariants of a persistence module, and play an important role in parts of algebraic geometry and commutative algebra [36, 24]. For a finitely generated -parameter persistence module and , the (multi-graded) Betti number of at grade , denoted , is the number of elements at grade in a basis for the module in a free resolution for ; see Section 2. The thus define a function . When , we refer to the as bigraded Betti numbers.

For persistence modules arising in TDA, the multi-graded Betti numbers offer interesting partial information about the coarse-scale geometry of the data [14]. The bigraded Betti numbers of a bipersistence module are readily visualized as a collection of colored dots in the plane [34]. The bigraded Betti numbers of a bipersistence module also have an another application to TDA: In recent work, the authors have introduced a software tool called RIVET for the interactive visualization of bipersistence modules, designed with needs of TDA in mind [34, 46]. The tool provides a visualization of the bigraded Betti numbers of a bipersistence module , as well as visualization of two other invariants —the Hilbert function and so-called fibered barcode. The central feature of RIVET is a framework for interactive visualization of the fibered barcode. RIVET’s interactivity makes use of a novel data structure called the augmented arrangement of , which is a line arrangement in the plane, together with additional data of a barcode at each face of the arrangement. The definition of line arrangement is given in terms of and , and the algorithm for computing it involves the computation of and as a subroutine. Computing the is best done by first computing a minimal presentation for .

1.2 Our Contribution

Motivated by TDA applications, and in particular by RIVET’s data analysis pipeline, this paper considers the problems of computing a (minimal) presentation and the bigraded Betti numbers of a bipersistence module . We assume that is given implicitly: We take the input to be a chain complex of free bipersistence modules

 F2∂2−→F1∂1−→F0

such that . We provide algorithms for both problems requiring time and memory, where denotes the size of a basis of . (Here and throughout, when stating complexity bounds, we assume for simplicity that an elementary arithmetic operation in the field requires O(1) time, and that storing an element of requires memory. Virtually all TDA computations are done with a finite field, where these assumptions hold.)

We briefly explain how such chain complexes arise in TDA: Given a -parameter filtration , one has an associated a chain complex of -parameter persistence modules

 ⋯∂i+1−−→Ci(X)∂i−→Ci−1(X)∂i−1−−→⋯∂1−→C0(X)→0,

where is the usual simplicial chain vector space with coefficients in , and the internal maps in are inclusions. Let denote the homology module of this chain complex.

Often in TDA, is defined in a way that ensures that each is free. In this case case we say is 1-critical; otherwise, we say is multi-critical. If is multi-critical, a simple construction proposed by Chacholski, Scolamiero, and Vaccarino [16] takes as input the short chain complex

 Ci+1(X)∂i+1−−→Ci(X)→∂iCi−1(X)

and yields a chain complex of free -parameter persistence modules

 Af→Bg→C

with . In the case , this has been implemented in RIVET by Roy Zhao.

Questions of computational efficiency aside, the problems of computing a minimal presentation and bigraded Betti numbers of a bipersistence module can also be solved by standard computational commutative algebra techniques which work in much greater generality. These techniques can in fact compute resolutions and Betti numbers of graded modules, for any . But to the best of our knowledge, no prior work has focused on the particular problem of computing presentations and Betti numbers of bipersistence modules. Our goal here is to develop algorithms, implementations, and exposition that are optimized for this special case. While our approach has close connections to existing ones, the bigraded setting turns out to allow for specialized solutions which are particularly efficent.

The core computational engine underlying the algorithms of this paper is a matrix reduction algorithm, introduced here, which we call the bigraded reduction. This algorithm computes a basis for the kernel of a homomorphism of free bipersistence modules represented by an matrix in time and memory; as observed in [16], such kernels are always free. As we show, the bigraded reduction can also be used to do other basic “bigraded linear algebra” computations. We view the bigraded reduction as the fundamental algorithmic primitive for working with bipersistence modules. In this sense, we see its role as analogous to the roles of Gaussian elimination for vector spaces, the standard persistent homology algorithm for 1-parameter persistence modules, and Gröbner basis computation (e.g., Buchberger’s algorithm and its variants) for -modules. Indeed, the bigraded reduction is closely related to each of these algorithms, and in particular can be seen as an optimized Gröbner bases computation. However, our algorithms do not make explicit use of Gröbner bases, and they never store full Gröbner bases (except for free modules).

Problems arising in TDA typically are very large but very sparse; as such, our algorithms and implementation make essential use of sparse linear algebra, much as barcode computations in the 1-parameter setting usually do [49, 5].

Our algorithm for computing a minimal presentation first computes a presentation which satisfies a partial minimality property—what we call a semi-minimal presentation—and then minimizes this. The computation of the semi-minimal presentation amounts to two applications of the bigraded reduction, plus a bit of standard linear algebra. The problems of computing a presentation and computing the bigraded Betti numbers are closely related; in fact, as we explain in Section 4.3, once we have computed the semi-minimal presentation, we can obtain the bigraded Betti numbers with little additional work. Our approach to computing a minimal presentation extends readily to the computation of a minimal resolution, but to compute the Betti numbers from a presentation, we do not need to build a full resolution.

We also present an alternative, though related, algorithm for computing the bigraded Betti numbers which avoids explicit computation of a presentation, and instead relies on the well-known Koszul homology formulae for the Betti numbers. RIVET implements our algorithm for computing (minimal) presentations, as well as both approaches to computing bigraded Betti numbers. We discovered the approach based on Koszul homology first, and an implementation has been publicly available in RIVET since 2016. Code for minimal presentation computation, and for Betti number computation based on this, was added to RIVET in 2018. The newer approach is simpler and more efficient in practice; we regard it as an improvement on the older approach. Nevertheless, we also explain the older approach, for the following reasons: we hope that some readers will be interested in the Koszul homology perspective on Betti number computation; we want to document how earlier versions of RIVET computed Betti numbers; and third, we wish to report results from computational experiments comparing the two approaches to each other.

Computational experiments, reported in Section 6, indicate that on typical TDA input, both of our approaches to computing Betti numbers perform far better than the functions for computing Betti numbers in the standard computational commutative algebra software packages Macaulay2 and Singular, and well enough for practical use. Our implementation can currently handle chain complexes arising in TDA with tens of millions of generators on a desktop computer with 64GB RAM. We expect it to be able to handle significantly larger input in the future, as additional optimizations are implemented.

1.3 Related Work

In commutative algebra, the standard algorithms for Betti number computation construct a resolution using Gröbner basis techniques [33, 26, 42]. Several variants of the Gröbner basis approach to computing (minimal) resolutions and Betti numbers are implemented in popular computational algebra software packages such as Macaulay2, Magma, Singular, and CoCoA [25, 30, 10, 1]. One widely implemented algorithm is that of La Scala and Stillman [33]. Also of note is a recent refinement of Schreyer’s algorithm [26], implemented in Singluar, which significantly outperforms the algorithm of La Scala and Stillman on most examples considered by the authors. One notable advantage of our approach is that we avoid ever storing a full Gröbner basis of the image of a morphism of free bipersistence modules, whereas the usual Gröbner basis approaches require this; see Remark 3.5. As noted above, on the examples we consider in our computational experiments (Section 6), RIVET’s algorithms outperform those of Macaulay2 and Singular by a large margin.

Carlsson, Zomorodian, and Singh were the first to consider computational aspects of multi-parameter persistence modules in the TDA setting [13]. Their work considers the computation of Gröbner bases of images and kernels of morphisms of free -parameter persistence modules, using the classical Buchberger’s algorithm and Schreyer’s algorithm. The work of Chacholski, Scolamiero, and Vaccarino [16], mentioned above, also explores the computational aspects of multi-parameter persistent homology, with a focus on the case where the chain modules are not necessarily free.

Aiming in part to address some issues with the earlier work [13], the Ph.D. thesis of Jacek Skryzalin [44] revisits the problem of computing Gröbner bases of the kernels and images of homomorphisms of free -parameter persistence modules. Skryzalin presents an algorithm for this [44, Algorithm 5], inspired by the well-known F4 and F5 algorithms for computing Gröbner bases via sparse linear algebra [28, 27]. In the case of bipersistence modules, Skryzalin’s algorithm runs in time , assuming the morphism is input to the algorithm as an matrix. Our algorithm for kernel computation, discovered independently, also runs in time time and is similar to Skryzalin’s algorithm in the 2-parameter case, though our exposition is rather different and some details are different. Skryzalin does not consider the problems of computing Betti numbers or presentations.

Two papers, by Allili et al. and Scaramuccia et al., have introduced algorithms which use discrete Morse theory to simplify a multi-filtration without altering its topological structure [41, 2]. Fugacci and Kerber have recently developed a purely algebraic analogue of these: They give an algorithm that replaces a chain complex of persistence modules with a smaller one having the same quasi-isomorphism type [29]. This algorithm does not construct presentations or compute bigraded Betti numbers. However, it specializes to an algorithm for minimizing a presentation, and this is relevant to our work; see Remark 4.4.

Another line of related work concerns the computation of metrics between -parameter persistence modules [8, 6, 7, 20, 32]. This is one potentially significant application of minimal presentation computation in TDA.

Outline

Section 2 introduces basic definitions and standard results used throughout the paper. In particular, Section 2.7 introduces the matrix reduction used in standard persistent homology computations; this matrix reduction serves as a primitive upon which the main algorithms of this paper build. Section 3 presents the bigraded reduction and its application to computing the kernel of a morphism of free bipersistence modules. Section 4.1 applies the ideas of Section 3 to the problem of computing a semi-minimal presentation. Section 4.2 gives an algorithm for minimizing a semi-minimal presentation. Section 4.3 gives an algorithm for directly computing the bigraded Betti numbers from a semi-minimal presentation, without minimizing. Section 5 presents our alternative, Koszul homology based approach to computing bigraded Betti numbers. Section 6 reports the results of our computational experiments. Section 7 concludes the paper with a brief discussion of directions for future work.

Acknowledgements

Many thanks to: William Wang, for writing the scripts used to obtain the timing data presented in this paper; Dave Turner, for parallelizing RIVET’s code for minimizing a presentation; Bryn Keller, for extensive contributions to RIVET, including ones which made the parallelization possible, and also for pointing out several typos in an early version of this paper; Michael Kerber, for sharing the observation of Remark 4.4; Alex Tchernev, for sharing his insights about Gröbner bases in the 2-parameter setting, as described in Remark 3.5; Ulrich Bauer, for many enlightening discussions about 1-parameter persistence computation, which influenced this work; and Roy Zhao, for helpful conversations about the clearing optimization and persistence computation more generally. Work on this paper began while the authors were postdoctoral fellows at the Institute for Mathematics and its Applications; we thank the IMA and its staff for their support. This work was funded in part by NSF grant DMS-1606967. In addition, Lesnick was supported in part by NIH grant T32MH065214 and an award from the J. Insley Blair Pyne Fund.

2 Preliminaries

2.1 Notation and Terminology

In what follows, let be a -parameter persistence module. We say is homogeneous if for some . In this case, we write .

We regard as a partially ordered set by taking

 y=(y1,y2,…,yn)≤(z1,z2,…,zn)=z

if and only if for all . For , the action of the monomial on restricts to a linear map .

A morphism of -parameter persistence modules is a module homomorphism such that for all . Let denote the restriction of to .

Define

 hf(M):Zd→N,

the Hilbert function of , by . Given a morphism of persistence modules, let and . We call and the pointwise rank and pointwise nullity of , respectively.

2.2 Free Persistence Modules

For and , let denote the -parameter persistence module given by

 Qgz ={Kif g≤z,0otherwise, Qgy,z={IdKif g≤y,0otherwise.

We say a -parameter persistence module is free if there exists a multiset of elements in such that .

Many of the standard ideas of linear algebra adapt in a straightforward way to free persistence modules. For example, we define a basis for a free persistence module to be a minimal homogenous set of generators. Though as in linear algebra, bases are usually not unique, the number of elements at each grade in a basis for is an isomorphism invariant. In fact, this invariant is given by the bigraded Betti numbers of ; see Definition 2.1 below.

Suppose we are given an ordered basis for a finitely generated free persistence module . We denote the element of as . For , we can represent with respect to as a vector ; we take to be the unique vector such that if and

 v=∑i:gr(Bi)≤z[v]Bitz−gr(Bi)Bi. (2.1)

Thus, records the field coefficients in the linear combination of giving .

Along similar lines, for an ordered basis of a free persistence module , we represent a morphism via a matrix with coefficients in the field , with each row and column labelled by an element of , as follows:

• The column of is .

• The label of the column is ,

• The label row is .

It is easy to see that determines up to natural isomorphism.111Given morphisms of persistence modules and , a natural isomorphism is a pair of isomorphisms , such that the following diagram commutes:

Where no confusion is likely, we sometimes write simply as .

2.3 Resolutions, Presentations, and Bigraded Betti Numbers

An exact sequence of free persistence modules

 F∙:=⋯∂3−→F2∂2−→F1∂1−→F0

is called a resolution of if .

For a persistence module , let denote the submodule generated by the images of all linear maps with . We say the resolution is minimal if for each . It can be shown that if is finitely generated, then a minimal resolution of exists. It is unique up to isomorphism, and any resolution can be obtained (up to isomorphism) from by summing with resolutions of the form

 ⋯0→0→GIdG−−→G→0→0→⋯→0

where is free, and the two copies of are allowed to appear at any two consecutive indices. For a fuller discussion of minimal resolutions, we refer the reader to [23, Chapters 19 and 20] or [40, Chapter 1].

A presentation of a persistence module is a morphism of free persistence modules with . Thus, a presentation for is the data of the last morphism in a free resolution for . A presentation is said to be minimal if extends to a minimal resolution. Thus, minimal presentations are unique up to isomorphism and any minimal presentation can be obtained (up to isomorphism) by summing with maps of the form

 GIdG−−→GorG→0,

where is free.

It follows from Section 2.2 that we can represent the presentation with respect to bases and for and via the labelled matrix . By slight abuse of terminology, we also call this labeled matrix a presentation of .

Definition 2.1 (Betti Numbers).

Let be a minimal resolution of a finitely generated -parameter persistence module . For , define the function by

 βMi:=hf(Fi/IFi).

For , we call Betti number of at grade .

Using a multi-graded version of Nakayama’s lemma [40, Lemma 2.11], it can be checked that is the number of elements at grade in any basis for .

Remark 2.2.

Hilbert’s syzygy theorem tells us that in a minimal resolution of a finitely generated -parameter persistence module , for . Thus, is only of interest for .

The following formula relating the Hilbert function to the bigraded Betti numbers follows from Hilbert’s Syzygy theorem by an easy inductive argument; see for example [40, Theorem 16.2] for a proof of the analogous result in the case of -graded -modules.

Proposition 2.3.

For a finitely generated -parameter persistence module and ,

 dimMz=d∑i=0(−1)i∑y≤zβMi(y).

We define a graded matrix to be a matrix with entries in , with each column labeled by an element of , such that the column labels appear in increasing order. Similarly, we define a bigraded matrix to be a matrix with with entries in , with each column labeled by an element of , such that the column labels appear in colexicographical order. If is a (bi)graded matrix, we denote the label of the column by .

Given a bigraded matrix and , we let (respectively, ) denote the graded submatrix of consisting of the columns of with (respectively, ); here denotes the partial order on , not the colexicographical order. For a graded matrix and , we define and analogously.

2.5 Free Implicit Representations: The Input to Our Algorithms

As noted earlier, the algorithms of this paper take as input a bipersistence module given implicitly as a chain complex of free bipersistence modules

 F2∂2−→F1∂1−→F0

with . We now specify in more detail how we represent this input. It is clear from the discussion in Section 2.2 that with respect to bases , , and for , , and , we can represent the short chain complex above as a pair of matrices and , with the rows and columns of both matrices labelled by elements of . In fact, to encode up to isomorphism, it is enough to keep only the column labels of this pair of matrices: The row labels of are not needed, and the row labels of are the same as the column labels of .

In the case that and are both colexicographically ordered with respect to grade, the column-labeled matrices and are in fact bigraded matrices. We then call the pair of bigraded matrices a free implicit representation (FI-Rep) of . Our algorithms take as input an FI-Rep of .

In the degenerate case that is trivial, so that is an empty matrix, the FI-Rep is simply a presentation for .

2.6 Column-Sparse Representation of Matrices

For the complexity analysis of the algorithms of this paper, we assume that matrices are stored in a format allowing for

• constant time access to the non-zero element of largest index in each column,

• -time column addition, where is the number of rows in the matrix.

Moreover, for practical TDA computations, we need to work with sparse matrix data structures. To meet these requirements, it suffices to store matrices in a column sparse format, storing the non-zero entries of each column of the matrix as an ordered list. This is standard in persistence computation [49].

Remark 2.4.

In the context of computing persistent homology, Bauer, Kerber, Reininghaus, and Wagner have studied the practical efficiency of a number of sparse data structures for matrix columns, including linked lists, dynamically allocated arrays, lazy heaps, and (for coefficients) bit trees [5]. They have found that lazy heaps, which perform well when adding a column with very few non-zero entries to a column with many entries, are very effective in practice on TDA problems. Subsequent implementations of persistent homology computation by these authors use lazy heaps [4, 3]. Following this work, our implementations use lazy heaps as well. However, we note that in the worst case, column addition using lazy heaps takes time , whereas column addition using a list takes time .

2.7 The Graded Reduction and Kernel Computation in the 1-D Case.

The standard algorithm for computing persistent homology barcodes, introduced by Zomorodian and Carlsson in [49], is a simple matrix reduction algorithm similar to Gaussian elimination. It is based on column additions. In this paper, we will call this algorithm the graded reduction, or . A variant of can also be used to compute a basis for the kernel of a morphism of free 1-D persistence modules; this sort of kernel computation is commonly used in TDA to obtain a set of generators for persistent homology. The graded reduction serves as a starting point for our approach to computing bigraded Betti numbers.

We now describe the graded reduction and its use in kernel computation. We will not need to consider how this algorithm is used to compute barcodes, though this is simple; for an explanation, see [49] or [21].

We denote the column of a matrix by . For , define the pivot of by

 ρRj:={nullifR(∗,j)=0,max{i∣R(i,j)≠0}otherwise.

We say is reduced if whenever are the indices of non-zero columns in . Note that if is reduced, then is simply the number of non-zero columns of .

takes any matrix and performs left-to-right column additions to transform into a reduced matrix . An outline of the algorithm is given below as Algorithm 1.

Remark 2.5.

In implementing line 1 of Algorithm 1, we do not copy the input matrix into a new matrix ; rather, is a reference to . We introduce this reference purely as an expository convenience, to distinguish between the input matrix and the matrices obtained from by column additions. We use references similarly in the algorithms that follow.

To complete the specification of the algorithm , it remains to explain how we check the conditional of line 3 in Algorithm 1 and how we find when the conditional does hold. This can be done in constant time, provided we maintain an 1-D array of length , where records which column reduced so far, if any, has as its pivot. We call the pivot array. Our convention is that is indexed starting from 1, not 0. The full algorithm using the pivot array is given below as Algorithm 2.

Later algorithms in this paper use pivot arrays in a similar fashion; we will sometimes suppress the details.

It is easy to check that for an matrix, requires elementary operations in the worst case.

Remark 2.6.

Let be a morphism of free 1-D persistence modules and let , be ordered bases for , with in grade-increasing order. Applying to the graded matrix yields a reduced graded matrix from which we can read the pointwise ranks and nullities of : For any , is the number of nonzero columns of , and is the number of zero columns in . Relatedly, can be read off of : is the number of zero columns in .

Kernel Computation

Let , , and be as in Remark 2.6. Algorithm 3 gives an extension of which computes a basis for the free module , given as input. Each element of this basis is represented by the pair where is the vector in representing , as specified in Section 2.2. In this algorithm, we maintain a slave matrix , initially the identity matrix ; every time we do a column operation to , we do the same column operation to . When a column of is reduced to , the homogeneous element of at grade represented by the column of is added to .

It is an easy exercise in linear algebra to check that Algorithm 3 correctly computes a basis for .

Remark 2.7 (Clearing).

Suppose we have graded matrices and representing the morphisms in a short chain complex of free 1-parameter persistence modules

 F2∂2−→F1∂1−→F0

with respect to some choice of bases for the . Chen and Kerber have [18] observed that the reduction of can be used to expedite the reduction of . The key is to note that if and are reduced matrices obtained from and , respectively, by left-to-right column additions, and is non-zero with pivot , then . Hence, for each non-zero column of , we can immediately zero out one column of before running to compute . This shortcut is called the twist optimization, or alternatively, clearing. It has been observed that for typical persistent homology computations, this optimization can yield drastic speedups [18, 5, 3].

Ulrich Bauer has observed that the clearing optimization in fact extends to the computation of a basis for . The idea is simple: If is non-zero with pivot and the column of has label , we set and add the vector in represented by to . Bauer’s software Ripser exploits this idea to compute the barcodes of Vietoris-Rips filtrations in a very memory-efficient way [3].

3 Kernel Computation in the Bigraded Case

We next present our bigraded reduction algorithm for computing a basis for the kernel of a morphism of finitely generated free bipersistence modules; as mentioned in the introduction, is free. We take the input to be a bigraded matrix representing with respect to a choice of ordered bases and for and . Note that the condition that is bigraded implies that is colexicographically ordered with respect to grade.

First we will consider the slightly simpler problem of computing . Our algorithm for this extends to an algorithm for computing the kernel itself, in essentially the same way that (Algorithms 2 and 1) extends to an algorithm for computing a kernel of a morphism of 1-parameter persistence modules (Algorithm 3). We will represent as a list of pairs such that , and .

Reduction of Bigraded Submatrices

The bigraded reduction depends on the algorithm (Algorithm 4) given below. is a variant of which, given a bigraded matrix and with already reduced, puts in reduced form.

As with , we need to specify how we check the conditional and find in the while loop for (line 4). The way we do this depends on the context in which we call , and will be explained below.

Grids

Given , define by

 grid(Y):={(z1,z2)∣(z1,y)∈Yand(x,z2)∈Yforsomex,y}.

For a free module , let , where denotes the support of a function.

Computation of βkerγ0 via Bigraded Reduction

The algorithm (Algorithm 5) below computes ; see Figure 1. Note that the algorithm makes simultaneous use of both the lexicographical and colexicographical orders on : It assumes that is represented as a bigraded matrix, which means that the column labels are in colexicographical order; and it computes for each , in lexicographical order on . This interplay between the lexicographical and colexicographical orders is crucial to the success of our approach.

Pivot Arrays for KerBetti

Our specification of (Algorithm 4) above omitted some context-dependent details about the use of a pivot array to implement the while loop of the algorithm. We now specify how we handle those details in the context of the algorithm .

Let be the number of rows of . At the beginning of our call to , we initialize a 1-D array of size (indexed from 1, not from 0), with each entry set to . We then implement line 4 of as follows. Let .

• If , , or , then the column index of line 4 does not exist. If but either or , we set .

• Otherwise, does exist; we take .

Proposition 3.1.

In the context of , the implementation of described just above works correctly.

Proof.

makes one call to for each . We say that is correctly computed at if at the conclusion of the call to at index , we have for each non-zero column with .

It is easy to check that ’s call to at index works correctly when is of the form , and that is correctly computed at such . Now fix . By induction on , it is similarly easy to check that for each , ’s call to at index works correctly, and that is correctly computed at . ∎

Remark 3.2 (Computation of Pointwise Rank and Pointwise Nullity).

Given , one readily obtains and at each point of . The restriction of to completely determines , and the same is true for . Henceforth, when we speak of computing or , we will mean doing so at each point of .

Kernel Computation

The algorithm extends to an algorithm (Algorithm 7 below) for computing a basis for , in essentially the same way that extends to an algorithm for computing the kernel in the 1-D setting: We maintain a square slave matrix , initially the identity. Every time we do a column operation on , we do the same operation on . If gets reduced to 0 at index in the for loop, we add to the basis for the kernel.

To give the pseudocode, we need a variant of (Algorithm 4) that also performs column operations on a slave matrix. We call this variant (Algorithm 6). The calls to by use a pivot array in exactly the same way as the calls to by do.

We now verify the correctness of . The proofs of correctness of (Algorithm 5) and the variants described in Remark 3.2 above are very similar.

Proposition 3.3.

(Algorithm 7) correctly computes a basis for .

Proof.

To establish the correctness of our algorithm, the key observation is the following:

• For each , when we begin the iteration of the loop in line 4 of Algorithm 7, any column that was added to a column of at a previous step of the algorithm was also in .

For a column index of , let denote . To check , assume that the column of is a column of , i.e., that . Note that for any , if we add to during our call to , then we must have that , because the column labels are assumed to be colexicographically ordered. We also have that . Moreover, because the algorithm iterates through the indices in lexicographical order, if we call before , then . Thus , which establishes .

Given , it follows from elementary linear algebra that for each ,

 Bzkerf:={tz−gr(b)b∣b∈Bkerf, gr(b)≤z}

is a basis for . Thus indeed generates . Moreover, letting denote the maximum index in , it follows from the linear independence of that is in fact a minimal set of generators for , hence a basis. ∎

Proposition 3.4.

Given as input an matrix representing , runs in time and requires memory.

Proof.

The complexity analysis is similar to that of , as given in [49]. Since each column addition performed by either decreases the pivot of some column of or reduces the column to zero, performs at most column additions on . A single column addition on , together with the corresponding operation on the slave , requires time. Thus, the column additions performed by require time in total.

In addition, takes each