Effective persistent homology of digital images

10/06/2014 ∙ by Ana Romero, et al. ∙ Université Grenoble Alpes 0

In this paper, three Computational Topology methods (namely effective homology, persistent homology and discrete vector fields) are mixed together to produce algorithms for homological digital image processing. The algorithms have been implemented as extensions of the Kenzo system and have shown a good performance when applied on some actual images extracted from a public dataset.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

page 1

page 2

page 3

page 4

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Computational Topology and, in particular, homological algorithms are now well established methods to study digital imaging (see, for instance, [5]). Also in Fundamental Algebraic Topology, effective methods have been used to compute homology and homotopy groups of abstract spaces, and even more complicated invariants (see, for example, [22]). In this field, effective homology has provided tools to prove the computability of theoretically defined objects. In parallel, effective homology ideas have been implemented in the Kenzo system, allowing us to materialise the computability theorem into concrete results obtained by computer (see [3]).

When effective homology methods are applied to the case of persistent homology, a well-known tool in Computational Topology, we obtain (see [15]) a procedure that smoothly generalises the usual approach to persistence at least in three aspects:

  1. our algorithm can be applied to infinite spaces,

  2. we can work with integer coefficients (instead of working with coefficients over a field, as it is common in the literature, see [4, 23]), and

  3. we don’t loose the geometrical link with the initial space along the process of eliminating useless information (a process needed to compute the final groups); to be more precise, we can obtain the generators of the persistent homology groups expressed as cycles in the initial space.

In the context of 2D-digital images, the two first items are not directly applicable, since they are finite objects with torsion-free homology (and then coefficients over a field capture all the homological information). The third item could be important to establish a qualitative

analysis (based on the intrinsic geometrical meaning encoded in cycles) instead of the usual

quantitative analysis (based on the dimension of homology groups and the associated barcodes; see [5]).

The fundamental notion in effective homology is that of (algebraic) reduction. A reduction discards information which is useless from the homological point of view, but keeping a (functional) link with the original space (providing the information previously mentioned in item 3). The technique proposed in this paper to get reductions for digital images is that of Discrete Vector Fields (DVF in the sequel). We present a new algorithm to obtain a DVF from a digital image. Although the DVF is not always optimal (it is well-known that the computation of optimal DVFs is a hard problem, “hard” in the technical sense of complexity theory), it has performed remarkably in application cases.

Then, this algorithm is adapted to the filtered case, allowing us to compute reductions which are useful for determining persistent homology. The algorithm is not really adapted, but directly applied to some diagonal blocks of a matrix (where blocks are defined by filtration indexes). In that way, we get worst reduction levels than in the global case (when applying the algorithm to the whole matrix), but with the benefits of reaching quite directly the persistent homology information. Other positive features of our algorithm is that the different DVF components can be gathered together simply by concatenation (without paying attention to a technical constraint, that of admissability, that complicates the global algorithm), performing efficiently and opening the possibility of a parallel processing. Looking at our results in a global way, they generalise those of [13].

Our algorithms have been implemented as extensions of the Kenzo system [3], and applied to some digital images. After some application to artificial examples (constructed by us), we looked for public datasets of digital images. We found a repository of actual fingerprints, and applied our programs to some of them, obtaining a remarkable performance with respect to the number of elements discarded to get its persistent homology groups. It is important to stress that we choose this fingerprint database as benchmark for the efficiency of our programs. The possibility of using persistent homology techniques for the actual problem of fingerprint recognition is doubtful (we discuss about this at the end of the paper).

The organization of the paper is as follows. After this introduction, in Section 2 we present the main classical ideas of persistent homology and our new generalization to the case of integer coefficients. Then, in Sections 3 and 4 we introduce respectively the effective homology method and its relation with the combinatorial technique of discrete vector field. These techniques are applied in Sections 5 and 6 for the computation of homology groups and persistent homology of digital images. Section 7 presents the implementation of our algorithms and some examples of application. The paper ends with a section of conclusions and further work.

2 Persistent homology

2.1 Preliminaries

Let us begin by introducing some basic definitions and results about persistent homology. For details, see [5].

Definition 1

Let be a simplicial complex. A (finite) filtration of is a nested sequence of subcomplexes such that

For every we have an inclusion map on the canonically associated chain complexes and therefore we can consider the induced homomorphisms , for each dimension . The filtration produces then for each degree a sequence of homology groups connected by homomorphisms:

Definition 2

The -th persistent homology groups of , denoted by , are the images of the homomorphisms :

The group consists of the -th homology classes of that are still alive at . A class is said to be born at if . It is said to die entering if it merges with an older class as we go from to , that is, but . If is born at and dies entering , the difference is called the persistence index of , denoted . If is born at but never dies then .

2.2 Generalization to the integer case

The main references on persistent homology [5], [4] or [23] consider the case of coefficients over a field. In that situation each group is a vector space which is determined up to isomorphism by its dimension, denoted . This allows one to represent all persistent homology groups by means of a barcode diagram [5]. However, if we work with -coefficients one can face extension problems. In order to solve this difficulty, in [15] we introduced the following generalization of persistent homology with -coefficients which leads to a new (more general) definition of barcode.

First of all one can observe that the groups provide a filtration of :

We can then consider a double filtration of obtained by introducing the new groups , for , defined as

For each fixed and , the different groups define a filtration between and :

Each group contains all classes which are in and also the classes of which are born at and die at or before .

Elementary linear algebra proves the following quotient groups are canonically isomorphic and we denote by their common isomorphism class:

the notation being read the group of homological classes born at time i and dying at time k, in fact a group of equivalence classes modulo inferior homology groups.

Each group admits a canonical divisor presentation:

every -index dividing the next one.

For every , a bar is produced, a bar emphlabelled , to be installed in the n-dimensional barcode diagram between times i and k. This is clearly the canonical generalization to the universal integer case of the standard barcodes with coefficients in a field. Cancelling the “torsion” bars leaves the ordinary barcodes.

Let us emphasize that when working over a field, the group of -dimensional classes that are born at and die entering is uniquely determined (up to isomorphism) by its rank, denoted , which is given by the formula:

so that the groups can be determined if the groups are known. Conversely, in the field situation the information about the ranks of is sufficient to know the total groups and also the groups defined in our double filtration.

However, in the integer coefficient case the situation is not so favorable. Now the groups or are not sufficient to determine , because there could be several possibilities for the corresponding quotients111For example, the quotient of by could be or .. Similarly, from the groups it is not always possible to determine the persistent homology groups and because of extension problems. For this reason, the classical representation of persistent homology groups by means of a barcode diagram [5] is not sufficient. See [15] for more details about our definition of persistent homology and barcode diagram in the integer case.

3 Effective homology

3.1 Main definitions

The effective homology method, introduced in [20] and explained in depth in [18] and [19], is a technique which can be used to determine homology and homotopy groups of complicated spaces. We present now the main definitions and ideas of this method. All chain complexes considered in this section are chain complexes of free -modules.

Definition 3

A reduction between two chain complexes and is a triple where: (a) The components and are chain complex morphisms and ; (b) The component is a homotopy operator (a graded group homomorphism of degree +1); (c) The following relations are satisfied: (1) ; (2) ; (3)  ; (4) ; (5) .

Remark 1

These relations express that is the direct sum of and a contractible (acyclic) complex. This decomposition is simply , with and for all . In particular, this implies that the graded homology groups and are canonically isomorphic.

Definition 4

A (strong chain) equivalence between two complexes and  is a triple where is a chain complex and and are reductions from over and respectively:

Remark 2

An effective chain complex is essentially a free chain complex where each group is finitely generated, and there is an algorithm that returns a -base in each degree (for details, see [18]). The homology groups of an effective chain complex can easily be determined by means of some diagonalization algorithms on matrices (see [10]).

Definition 5

An object with effective homology is a triple where is an effective chain complex and is an equivalence between a free chain complex canonically associated to and , .

Remark 3

It is important to understand that in general the component of an object with effective homology is not made of the homology groups of ; this component is a free -chain complex of finite type, in general with a non-null differential, allowing to compute the homology groups of ; the justification is the equivalence .

The notion of object with effective homology makes it possible in this way to compute homology groups of complicated spaces by means of homology groups of effective complexes (which can easily be obtained using some elementary algorithms). The method is based on the following idea: given some topological spaces , a topological constructor produces a new topological space . If effective homology versions of the spaces are known, then an effective homology version of the space can also be built, and this version allows us to compute the homology groups of , even if it is not of finite type. A typical example of this kind of situation is the loop space constructor. Given a -reduced simplicial set with effective homology, it is possible to determine the effective homology of the loop space , which in particular allows one to compute the homology groups . Moreover, if is -reduced, this process may be iterated times, producing an effective homology version of , for

. Effective homology versions are also known for classifying spaces or total spaces of fibrations, see

[19] for more information.

The effective homology method is implemented in a system called Kenzo [3], a Lisp 16,000 lines program devoted to Symbolic Computation in Algebraic Topology, implemented by the third author of this paper and some coworkers. Kenzo works with rich and complicated algebraic structures (chain complexes, differential graded algebras, simplicial sets, simplicial groups, morphisms between these objects, reductions, etc.) and has obtained some results (for example homology groups of iterated loop spaces of a loop space modified by a cell attachment, components of complex Postnikov towers, homotopy groups of suspended classifying spaces, etc.) which had never been determined before. Kenzo has made it possible to detect an error in a theorem published in [12], where some theoretical reasonings are used to deduce that the fourth homotopy group of the suspended classifying space of the fourth alternating group , , is equal to ; Kenzo’s calculations have showed that the correct result (as later confirmed by the authors of [12]) is . See [16] for details on these calculations. Moreover, in [15] Kenzo has been used to deduce the correct relation between persistent homology and spectral sequences and detect an error in [5]: the so called “Spectral sequence theorem” [5, p. 171] includes a formula which is not correct (see [15] for details).

3.2 Relation with persistent homology

The effective homology method makes it possible to compute homology groups of complicated spaces, even if they are not of finite type, by means of the notion of reduction. What about persistent homology? In this subsection we show that, if a reduction between two filtered chain complexes satisfies some “natural” conditions, then we will be able to determine the persistent homology groups of the big chain complex by means of those of the small one.

Definition 6

A (finite) filtration of a chain complex is a family of sub-chain complexes such that

Definition 7

Given two filtered chain complexes and , a filtered chain complex morphism is a chain complex morphism which is compatible with the filtrations, that is to say,

Definition 8

Given two filtered chain complex morphisms and a chain homotopy , we say that has order if

Theorem 1

[15] Let be a chain complex with a filtration. Let us suppose that is an object with effective homology, such that there exists an equivalence with and , and such that filtrations are also defined on the chain complexes and . If the maps , , , and are morphisms of filtered chain complexes and both homotopies and have order , then the persistent homology groups of and are (explicitly) isomorphic for :

and the groups of and are (explicitly) isomorphic for :

The isomorphisms between the corresponding groups are deduced from the compositions and .

In particular, if both homotopies and have order (that is, they are compatible with the filtration on ), then all groups and of and are isomorphic.

Let us observe that if is an effective chain complex, then one can determine its persistent homology groups by means of elementary algorithms: each subcomplex has finite type, so that its homology groups are computable. Then the maps can be expressed by means of finite matrices and therefore we can compute the groups . Similarly, the groups and of can be computed by means of matrix diagonalization. Thanks to Theorem 1, we can also compute the persistent homology groups of the initial (big) chain complex by means of those of .

Once a strong chain equivalence is established between an initial chain complex and an effective chain complex , we can obtain the generators of the homology groups of expressed as cycles on . To this aim, we get the (representatives of) generators of the homology groups of as cycles in (it is a byproduct of the diagonalization process to determine Betti numbers and torsion coefficients), and then we apply on them the composition of the chain equivalence , getting the announced cycles over . In fact, the chain equivalence produces a complete solution of the homological problem for ; see the statement of this problem in [21]. If the chain complexes and are filtered and the equivalence satisfies the hypothesis of Theorem 1, the same process as before produces the generators of the persistent homology groups and . This opens the possibility of a qualitative study of persistent homology, going beyond the traditional quantitative

analysis (based, for instance, in barcodes). With our approach we can trace the born and death moments of particular cycles, and their contribution to the persistent homology groups.

The problem now is: given a filtered (big) chain complex , how can we obtain an equivalence where is effective? A useful way to obtain such an equivalence (or more concretely, a reduction ) consists in using discrete vector fields, a combinatorial tool introduced in the following section.

4 Effective homology and discrete vector fields

The notion of discrete vector field (DVF) is due to Robin Forman [6]; it is an essential component of the so-called discrete Morse theory. This notion is usually described and used in combinatorial topology, but a purely algebraic version can also be given as follows. See [17] for more details.

Definition 9

Let be a free chain complex with distinguished -basis . A discrete vector field on is a collection of pairs satisfying the conditions:

  • Every is some element of some , in which case . The degree depends on and in general is not constant.

  • Every component is a regular face of the corresponding (that is, the coefficient of in is or ).

  • Each generator (cell) of appears at most one time in .

It is not required all the cells of appear in the vector field . In particular the void vector field is allowed. In a sense the remaining cells are the most important. Moreover, we do not assume the distinguished bases are finite, the chain groups are not necessarily of finite type.

Definition 10

A cell which does not appear in the discrete vector field is called a critical cell.

In the case of a chain complex coming from a topological cellular complex, a DVF is a recipe to cancel “useless” cells in the underlying space, useless with respect to the homotopy type. For example and the circle have the same homotopy type, which is described by the following scheme:

0

1

2

0

01

12

02

12

The initial simplicial complex is made of three 0-cells 0, 1 and 2, and three 1-cells 01, 02 and 12. The drawn vector field is , and this DVF defines a homotopy equivalence between and the minimal triangulation of the circle as a simplicial set.

Definition 11

Given a discrete vector field , a -path of degree and length is a sequence satisfying:

  • Every pair is a component of and is an -cell.

  • For every , the component is a face of , non necessarily regular, but different from .

The -path is said starting from .

Definition 12

A discrete vector field is admissible if for every , a function is provided satisfying the following property: every -path starting from has a length bounded by .

The admissability property is necessary to exclude infinite paths and loops, and is used to ensure the homotopy type of the corresponding reduced chain complex (see Theorem 2) is the same as the initial one.

Discrete vector fields and effective homology can be related as follows. Let be a free chain complex provided with an admissible discrete vector field . Then a reduction can be constructed where the small chain complex is the critical complex; it is also a cellular complex but generated only by the critical cells of , those which do not appear in the vector field , with a differential appropriately defined (combining the initial differential of and the DVF). This result is due to Robin Forman [6, Section 8], and can be extended to complexes not necessarily of finite type. In [17] two different proofs of this result are given.

Theorem 2

[17][Vector-Field Reduction Theorem] Let be a free chain complex and be an admissible discrete vector field on . Then the vector field defines a canonical reduction where is the free -module generated by the critical -cells and is an appropriate differential canonically deduced from and .

Proof
A cell basis is canonically divided by the vector field into three components where (resp. , ) is made of the target (resp. source, critical) cells. For the third condition of Definition 9 implies a cell cannot be simultaneously a source cell and a target cell.

The decompositions of the bases induce a corresponding decomposition of the chain groups , so that every differential can be viewed as a matrix

It can be seen that the component is an isomorphism and its inverse map can be given by the recursive formula:

where the incidence number is the coefficient of in the differential . Let us remark that , but the other incidence numbers are arbitrary.

The maps , , and are then given by the formulas:

In this way, the maps , , and which provide the reduction can be explicitly constructed.

Thanks to Theorem 2 and the explicit isomorphism deduced from the reduction , one can compute the homology groups of the big (possibly infinite) chain complex by means of the homology groups of (whenever is of finite type). In [17], this result is used to produce, for example, the effective homology of twisted cartesian products and Eilenberg-MacLane spaces.

Let us present now the first new result of this paper, where we extend the result of Theorem 2 for the computation of persistent homology groups.

Definition 13

Let be a free chain complex provided with a filtration . We say that a generator has filtration index if and .

Theorem 3

Let be a free chain complex with a filtration and be an admissible discrete vector field on such that for every the elements and have the same filtration index. Then the canonical reduction described in Theorem 2 is compatible with the filtration. In particular, the chain homotopy has necessarily order .

Proof
As seen in the proof of Theorem 2, the differential map of the critical chain complex is defined by means of the initial differential and the discrete vector field . In particular, one can observe that if every vector in the discrete vector field is made of two elements with the same filtration index then the new differential is compatible with the filtration and therefore it induces in a natural way a filtration on the critical chain complex .

Similarly, the maps , and in the reduction are also obtained as certain combinations of and the DVF. As before, one can easily deduce from the corresponding formulas that if every vector in the discrete vector field is made of two elements with the same filtration index, then the morphisms , and will be compatible with the filtration.

Corollary 1

Let be a free chain complex with a filtration and be an admissible discrete vector field on such that for every the elements and have the same filtration index. Then all persistent homology groups and of the critical complex and those of are (explicitly) isomorphic.

Proof
We apply Theorem 1 to the reduction . It is a particular case of strong chain equivalence, where the left reduction is trivial ( and are the corresponding identity maps and is null), and .

Our new results Theorem 3 and Corollary 1 allow us to compute persistent homology groups of big chain complexes by means of those of a smaller chain complex obtained from a DVF. Let us see in the following section how we can determine a DVF for the particular case of digital images.

5 Discrete vector fields for digital images

In this section we present an algorithm producing DVFs for digital images. The algorithm is included in the unpublished paper [17] and in [14] has been formally verified by means of the interactive theorem prover Coq [2].

Since we have placed ourselves in an algebraic setting, we identify a digital image with its algebraic counterpart, as explained in the following definition.

Definition 14

A digital image, an image in short, is a finite algebraic cellular complex : every is finite and furthermore every is empty outside an interval , the smallest possible being the dimension of the image.

For example the various techniques of scientific imaging produce images, some finite objects, typically finite sets of pixels. If you are interested in some homological analysis of such an image, you associate to it a geometrical cellular complex, again various techniques can be used, and finally this defines an algebraic cellular complex, the complex defining the homology groups of the geometrical object. There remains to compute the homology groups of this complex; in fact computing the effective homology of this complex is much better.

The Vector-Field Reduction Theorem (Theorem 2) is then particularly welcome. The bases of the initial complex can be enormous, but appropriately choosing a vector field can produce, applying this Reduction Theorem, a new complex which is homology equivalent, with small critical bases ; so that the homology computations are then fast.

We begin by considering the simplest case of a chain complex with only two consecutive chain groups, and then the general case is easily reduced to this one.

Let be a matrix , with rows and columns. Think of as the unique non-null boundary matrix of the chain complex:

A vector field for this matrix is nothing but a set of integer pairs satisfying these conditions:

  1. and .

  2. The entry of the matrix is .

  3. The indices (resp. ) are pairwise different.

This clearly corresponds to a DVF for the chain complex , and constructing such a vector field is very easy. But there remains the main problem: is this vector field admissible? Because the context is finite, it is a matter of avoiding loops. If the vector field is admissible, it defines a partial order between source cells: the relation is satisfied between source cells if and only if a -path goes from to . The non-existence of loops guarantees this is actually a partial order.

Conversely, let be a vector field for our matrix . If , with , we can decide if there is an elementary V-path from to , that is, if a vector is present in and the entry is non-null; for this corresponds to a cell with in particular as regular face and as an arbitrary face. We so obtain a binary relation. Then the vector field is admissible if and only if this binary relation actually transitively generates a partial order, that is, if again there is no loop .

Definition 15

Let be an admissible discrete vector field on a chain complex . is said to be maximal if it is not possible to add a new vector to such that the new vector field is admissible.

Let us remark that a DVF being maximal does not imply its number of vectors is maximal. Finding a vector field of maximal size seems much too difficult in real applications. Finding a maximal admissible vector field, not the same problem, is more reasonable but still serious.

A direct way to quickly construct an admissible DVF for a matrix consists in predefining an order between row indices, and to collect all the indices for which some column is “above this index”. Let us play with this toy-matrix given by our random generator:

If we take simply the index order between row indices, we see the columns 1, 4 and 5 can be selected, giving the vector field . Theorem 2 produces a reduction of to with

In other words, a reduction is constructed from the initial chain complex to a smaller chain complex . See [17] for details on the explicit formula for .

A more sophisticated strategy consists, given an admissible vector field already constructed, in trying to add a new vector to obtain a better reduction. The already available vector field defines a partial order between the source cells with respect to this vector field and the game now is to search a new vector to be added, but keeping the admissibility property. This process is applied by starting from the void DVF.

Let us try to apply this process to the same matrix as before. We start with the void vector field . Running the successive rows in the usual reading order, we find , and we add the vector (1;3), obtaining . Only one source cell 1, but we must note that it is from now on forbidden to add a vector which would produce the relation or : this will generate a loop and the same for 5. In other words, the partial order to be recorded is and , even if 4 and 5 are not yet source cells. Also the row 1 and the column 3 are now used and cannot be used anymore.

We read the row 2 and find , which suggests to add the vector , possible, with the same restrictions as before. Now .

Reading the row 3 suggests to add the vector where 4 has 1 as a face, because . This does not create any cycle, and we define . We note also that .

Reading the row 4, the only possibility would be the new vector , but 2 is a face of 5 and this would generate the loop , forbidden. It is impossible to add a vector .

Finally we can add the vector , convenient, for 1 has no other face than 5; adding this vector certainly keeps the admissibility property.

This leads to the maximal vector field , which generates the partial order on where the only non-trivial relations are and . Reordering the rows and columns in the respective orders and gives the new form for our matrix:

The DVF has 4 components, and the top left-hand submatrix is triangular unimodular. The reduction produces the matrix which of course can be reduced to the void matrix.

The example shows a first step of reduction produces a smaller matrix which in turn can sometimes be also reduced, even if the used vector field is maximal.

We obtain in this way an algorithm producing a maximal admissible DVF which allows one to reduce the number of generators of the initial chain complex.

Theorem 4

An algorithm can be written down:

  • Input: A matrix .

  • Output: A maximal admissible discrete vector field for .

See [17] for more details on the algorithm and some considerations about a graph interpretation which can be useful for more realistic (that is, for bigger matrices) situations.

Let be a chain complex of finite type with only two non-null consecutive chain groups . Applying Theorem 4 to the differential map matrix , we obtain a maximal admissible discrete vector field for . Considering now Theorem 2, we obtain a reduction from the initial chain complex to a smaller (also effective) one . In particular this allows one to compute the homology groups of the big chain complex by means of those of in a more efficient way.

Let us consider now a general digital image (see Definition 14). The first differential map is given by a matrix . Applying Theorem 4, we obtain a maximal admissible discrete vector field for and then Theorem 2 produces a reduction to a smaller chain complex . We consider now the second differential map of , given by a matrix , and apply again Theorems 4 and 2. This produces a new reduction from to a new smaller chain complex . The process can be iterated for every dimensions . The compositions of all reductions produces a reduction from the initial chain complex to the last chain complex . This leads to the following theorem.

Theorem 5

An algorithm can be written down:

  • Input: A digital image .

  • Output: A reduction obtained as a composition of several reductions, each of them deduced from a maximal admissible discrete vector field applying Theorem 2.

The fact that each DVF involved in the construction of is maximal implies that the chain complex is usually significantly smaller than . The homology groups of can then be determined in a more efficient way.

Let us consider the following toy example, a screen with a “resolution” and this image, eight pixels black and one white.

The bases of the corresponding cellular complex are made of 16 vertices, 24 edges and 8 squares. Theorem 5 could produce the following vector field, with only two critical cells (one vertex and one edge).

The reduced chain complex is with the null map between both copies of . The reduction described by Theorem 2 informs that and produces a representant for the generating homology classes.

6 Discrete vector fields for computing effective persistent homology of digital images

Let us suppose now that we have a digital image which is filtered, that is, we consider that each pixel appears at some moment (filtration index) . This gives us a list of images such that all pixels of each image are present in the following one. The associated chain complex is filtered and then it makes sense to compute its persistent homology groups, which can be used to determine some relevant features of the image in contrast with the noise. We can slightly modify the algorithm of Theorem 5 as follows such that the small chain complex obtained by means of the construction of a DVF in each dimension allows us to compute the persistent homology groups of the initial image.

Let be a digital image with a filtration. The set of generators in each dimension , , can be reordered by their filtration index in an increasing way. With this order, each differential matrix can be decomposed in blocks such that each submatrix in the diagonal corresponds to elements of the same filtration index. We begin by considering the first matrix , with the previous decomposition. We apply Theorem 4 separately to each submatrix in the diagonal, which produces several admissible discrete vector fields where both elements in each vector have the same filtration index. The vector fields are disjoint and it is possible to concatenate all of them, obtaining a discrete vector field which is therefore compatible with the filtration (let us note that this opens the possibility of parallel processing, since the computation in each diagonal block is independent from the rest). The DVF so obtained is maximal with respect to the filtration, that is, it is not possible to add a new vector such that the new vector field is admissible and compatible with the filtration. Applying now Theorem 3 and Corollary 1, we obtain a reduction from to a new filtered chain complex where the persistent homology groups of the reduced chain complex are isomorphic to those of the initial one. We repeat the procedure with the second differential map matrix of , , obtaining a new reduction compatible with the filtrations. Iterating the process and composing the different reductions, we obtain a reduction from the initial filtered digital image to a smaller filtered chain complex , which allows one to determine the persistent homology of in an efficient way. This leads to the following theorem.

Theorem 6

An algorithm can be written down:

  • Input: A filtered digital image .

  • Output: A reduction compatible with the filtration, obtained as a composition of several reductions, each of them constructed by applying Theorem 2 to an admissible discrete vector field which is maximal with respect to the filtration.

As explained for Theorem 5, the fact that each DVF involved in the construction of is maximal with respect to the filtration implies that the chain complex is usually significantly smaller than . In this way, this result allows one to compute persistent homology of (big) digital images by means of those of a small chain complex. The technique can be considered as a generalization of [13].

7 Implementation and examples

The algorithm presented in Theorem 6 has been implemented in a new module for the Kenzo system making use of programs previously developed. More concretely, in [15] a new module for the Kenzo system was constructed allowing the computation of persistent homology of filtered chain complexes, making use of spectral sequences and the effective homology technique. On the other hand, discrete vector fields were also previously implemented in Kenzo, including the construction of the associated reduction described in Theorem 2. Moreover, we have also made use of a new module for the Kenzo system for the computation of homology groups of digital images (allowing in particular the construction of the simplicial complex associated with a digital image) developed by Jónathan Heras [7, 9].

Our new module for Kenzo includes the implementation of algorithms described in Theorems 4, 5 and 6 and makes it possible to compute persistent homology groups of a filtered digital image . First of all, the program reorders the generators of the image by increasing filtration index. Then it considers the first differential map and applies Theorem 4 on the different submatrices on the diagonal corresponding to elements of the same filtration index, producing a DVF on . Then it computes by means of Theorem 2 the associated reduction to the critical complex , repeats the process for the second differential map of , and so on. We obtain in this way the reduction described by Theorem 6, which is compatible with the filtration. This reduction is used as the effective homology of the initial image, and thanks to Theorem 1 we can compute the persistent homology groups of in an efficient way.

Figure 1: Filtered digital image.

Let us consider for example the filtered image of Figure 1. The bases of the corresponding simplicial complex are made of vertices, edges and triangles. Applying Theorem 6 one can construct a reduction (compatible with the filtration) to a small chain complex which in this case has only vertices, edges and triangles. The reduction is obtained in two steps (dimensions and ), by using two discrete vector fields (obtained by applying Theorem 4 to different submatrices) with respectively and vectors. The reduction provides an effective homology for the initial image allowing one to compute its persistent homology groups in a more efficient way.

The final homology groups of the image are and . Making use of our programs for computing persistent homology groups (by means of DVFs), we can see the evolution of the corresponding homology classes along the four filtration steps . For example, , which means that in dimension there are classes which are born at the first step and are still alive at (the last) step :

> (prst-hmlg-group K 1 4 0)
Persistent Homology H^{1,4}_0
Component Z
Component Z
Component Z
Component Z

Similarly, means that there are holes at stage which are still alive at step :

> (prst-hmlg-group K 2 4 1)
Persistent Homology H^{2,4}_1
Component Z
Component Z

The toy example of Figure 1 shows the improvement provided by the use of DVFs. This improvement is of course much more significant when working with big digital images. For instance, we have used our programs to compute the persistent homology of several fingerprints extracted from the repository [1]. Given a fingerprint image, we can filter it by taking at the first step some initial horizontal lines, adding at each stage of the filtration some additional lines and ending with the whole image. This filtration produces some persistent homology groups. A similar process could be done in the vertical direction, taking successively the columns of the image, producing in that way different persistent homology groups. It could seem natural to think that given two (different) fingerprint images corresponding to the same person, the so obtained persistent homology groups should be similar.

Figure 2: Fingerprint filtration.

Let us consider for example the fingerprint of Figure 2, with ten steps horizontal filtration. The associated simplicial complex has vertices, edges and triangles. The direct computation in Kenzo of its persistent homology groups is slow and in some cases fails because of memory problems. The discrete vector field reduction produces a small chain complex with only vertices, edges and triangles, and all persistent homology groups can then be computed in less than a second. Figure 3 shows the barcode with the results computed by our programs. In this case, there are not deaths because all homology classes in both degrees and persist along the different stages of the filtration. Let us observe that for degree , the two classes born at stage  correspond to real circles (holes) in the fingerprint; however, classes born at stages and are small holes produced in the image due to the resolution but not present in the fingerprint.

1

2

3

4

5

6

7

8

9

10

1

2

3

4

5

6

7

8

9

10