# Regularization of Persistent Homology Gradient Computation

Persistent homology is a method for computing the topological features present in a given dataset. Recently, there has been much interest in integrating persistent homology as a computational step in neural networks and deep learning. For a computation to be integrated in this way, it must be differentiable. Computing the gradients of persistent homology is an ill-posed inverse problem with infinitely many solutions. Consequently, it is important to perform regularization so that the solution obtained agrees with known priors. In this work we propose a novel method for regularizing persistent homology gradient computation through the addition of a grouping term. This helps to ensure that gradients are defined with respect to larger entities and not individual points.


## 1 Introduction

Persistent homology is a computational method from the field of applied topology for computing the topological features present in a given dataset [edelsbrunner2010]. Informally, the features in question relate to the number and scale of connected components and holes of different dimensions in the data. The direct application of persistent homology has proven useful in the analysis of many different types of data, including image [carlsson2008local], health [nicolau2011topology] and network data [corcoran2020stable].

Given the recent advances and interest in neural networks and deep learning, there is a trend of attempting to integrate existing computational methods as computational steps in this framework. This includes, for example, the integration of integer programming [ferber2020mipaal] and shortest path methods [berthet2020learning] with deep learning. In these works, deep learning usually acts as a preprocessing step to the method in question, where it performs representation learning. Deep learning requires all computational steps to be differentiable so that gradients can be back-propagated through each step and used to update the parameters. Therefore, for a given computational method to be integrated with deep learning, the method in question must be differentiable. Many useful computational methods, such as integer programming, are not differentiable in their native form. Consequently, much work has been invested in making such computational methods differentiable so that they may be integrated with deep learning. That is, in developing methods for computing the gradients of the output with respect to the input of the methods in question.

As discussed above, persistent homology has proven to be a useful computational method, so integrating it with deep learning has much potential. However, in its native form, persistent homology is not differentiable. There has therefore been much recent interest in making it differentiable so that it may be integrated with deep learning [chen2019topological; wangtopogan; gabrielsson2020topology; Moor19Topological]. Computing gradients of persistent homology outputs with respect to inputs is an inverse problem [oudot2020inverse; solomon2020fast]. Like many inverse problems, it is ill-posed with infinitely many solutions. This is evident when one considers that infinitely many different datasets can have the same number and scale of connected components and holes of different dimensions. When attempting to solve an arbitrary ill-posed inverse problem, if one does not consider the ill-posed nature of the problem, the solution obtained may not agree with known priors. A common approach to overcome this issue is to perform regularization, which biases the solution toward known priors. This approach is commonly used to solve inverse problems in the field of image processing. In this field two commonly used regularization approaches are total variation (TV) and total generalized variation (TGV), which lead to piecewise constant and piecewise linear images respectively [lunz2018adversarial].

To the authors' knowledge, the use of regularization when computing gradients of persistent homology has yet to be considered. Consequently, the use of current methods for computing such gradients can lead to solutions which do not agree with known priors. For example, consider the two dimensional point dataset in Figure 1 which contains two compact clusters. If we compute the gradients of persistent homology using current methods and in turn minimize a loss function measuring the distance between the clusters (see Appendix A), we obtain the unregularized result shown in Figure 1. Although this result approaches the minimization of the loss function in question, it does not agree with a reasonable prior that changes to topological features should be made at the level of larger entities and not individual points. In this case the entities in question are the two clusters. As a second example, consider the two dimensional point dataset in Figure 1 which contains a single horseshoe shaped cluster. If we compute the gradients of persistent homology using current methods and in turn minimize a loss function measuring the width of the horseshoe opening (see Appendix A), we obtain the corresponding unregularized result shown in Figure 1. Again, although this result approaches the minimization of the loss function in question, it does not agree with a reasonable prior that changes to topological features should be made at the level of larger entities. In this case the entities in question are the two ends of the horseshoe.

The artefacts in the above two example results are a consequence of the fact that persistent homology gradients are defined with respect to individual points and not larger entities. This motivates the following insight: when computing persistent homology gradients, the computation should be regularized through the addition of a grouping term such that gradients are defined with respect to larger entities and not individual points. In this article we propose a novel method for regularizing the computation of persistent homology gradients which achieves this goal. The results of applying this method to the two datasets in Figure 1, with the respective loss functions described above, are also displayed in Figure 1. In these two examples, the proposed method does not exhibit the artefacts encountered above.

The layout of this paper is as follows. In Section 2 we describe the proposed method for regularizing the computation of persistent homology gradients. In Section 3 we briefly draw some conclusions from this work.

## 2 Proposed Method

This section is structured as follows. In Section 2.1 we briefly review necessary background material on persistent homology and describe a current method for computing gradients with respect to persistent homology. In Section 2.2 we describe the proposed method for performing regularization of this gradient computation.

### 2.1 Persistent Homology & Gradient Computation

An (abstract) simplicial complex $K$ is a finite collection of sets such that for each $\sigma \in K$ all subsets of $\sigma$ are also contained in $K$. Each element $\sigma \in K$ is called a $j$-simplex where $j = |\sigma| - 1$ is the dimension of the simplex. Given a finite set of points $X$ in $\mathbb{R}^d$, the corresponding Rips complex $R_r(X)$ for a specified radius value $r$ is defined as follows:

$$R_r(X) = \{\sigma \subseteq X : \forall x, y \in \sigma,\ \|x - y\| \le r\} \tag{1}$$

The $j$-simplices in this simplicial complex equal unordered $(j+1)$-tuples of points which are pairwise within distance $r$ [ghrist2008barcodes]. Computing the homology of $R_r(X)$ returns the homology groups $H_n(R_r(X))$ for each natural number $n$. An element of $H_n(R_r(X))$ represents the existence of an $n$-dimensional hole in $R_r(X)$. That is, an element of $H_0(R_r(X))$ represents the existence of a path-component in $R_r(X)$ while an element of $H_1(R_r(X))$ represents the existence of a one dimensional hole in $R_r(X)$ [otter2017roadmap]. A Rips filtration of $X$ is a finite sequence of Rips complexes associated with an increasing sequence of radius values $r_1 < r_2 < \dots < r_N$. A Rips filtration induces a sequence of inclusion maps defined as:

$$R_{r_1}(X) \hookrightarrow R_{r_2}(X) \hookrightarrow \dots \hookrightarrow R_{r_N}(X) \tag{2}$$
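The definitions in Equations 1 and 2 can be illustrated with a brute-force sketch. The function name `rips_complex`, the `max_dim` cut-off and the toy point set are assumptions made for this example only; practical libraries use far more efficient constructions.

```python
from itertools import combinations

import numpy as np


def rips_complex(points, r, max_dim=2):
    """Return the simplices of R_r(X) up to dimension max_dim.

    Each simplex is a sorted tuple of point indices. A set of points forms a
    simplex when every pair in it is within distance r (Equation 1).
    """
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    simplices = set()
    for size in range(1, max_dim + 2):  # size = dimension + 1
        for sigma in combinations(range(n), size):
            if all(dist[a, b] <= r for a, b in combinations(sigma, 2)):
                simplices.add(sigma)
    return simplices


# Toy input (hypothetical): the four corners of a unit square.
X = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
radii = [0.5, 1.0, 1.5]
filtration = [rips_complex(X, r) for r in radii]

# Each complex is a subset of the next one, which is the chain of
# inclusions in Equation 2. Simplex counts per radius: 4, 8, 14.
nested = all(a <= b for a, b in zip(filtration, filtration[1:]))
```

At radius 0.5 only the four vertices are present, at radius 1.0 the four sides of the square are added, and at radius 1.5 the diagonals and all four triangles appear.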

Given a Rips filtration, instead of computing the homology of each Rips complex in the sequence independently, persistent homology computes the homology of the inclusions $R_{r_i}(X) \hookrightarrow R_{r_j}(X)$ for all $i < j$ [otter2017roadmap]. The result of this computation is a set of persistence diagrams $\{D_n\}$ where $D_n$ corresponds to the homology group $H_n$. Each persistence diagram is a multiset of points $(p, q)$ where $(p, q)$ represents the existence of an element of $H_n$ appearing in $R_p(X)$ and subsequently disappearing in $R_q(X)$. The value $q - p$ is called the persistence of the element in question.
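For the 0-dimensional diagram of a Rips filtration, the computation reduces to a familiar one: every point is born at radius 0, and a connected component disappears at the length of the edge that merges it into another component. The sketch below uses Kruskal's algorithm with union-find. The function name `h0_persistence` and the toy data are assumptions for illustration, and higher-dimensional diagrams need a general persistence algorithm that is not shown here.

```python
import numpy as np


def h0_persistence(points):
    """Return the finite (birth, death) pairs of the 0-dimensional diagram
    together with the edge (i, j) whose addition causes each death.

    The single component that never dies (infinite persistence) is omitted.
    """
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    edges = sorted((dist[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    diagram, merging_edges = [], []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components
            parent[ri] = rj
            diagram.append((0.0, float(d)))
            merging_edges.append((i, j))
    return diagram, merging_edges


# Two small clusters on a line (hypothetical data).
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [1.1, 0.0]])
diagram, merging_edges = h0_persistence(X)
# The most persistent finite point has death close to 0.9, the gap between
# the clusters, and is caused by the edge between points 1 and 2.
```

The edge recorded for each death is the simplex whose insertion makes the feature disappear, which is the information a gradient computation needs.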

Let $(p, q)$ be an element in a given persistence diagram $D_n$. There exists a map $\pi$ which maps $p$ and $q$ to simplices $\sigma_p$ and $\sigma_q$ in the Rips filtration whose addition results in the appearance and disappearance respectively of the corresponding element in $H_n$ [leygonie2019framework; gabrielsson2020topology]. Therefore, one can adjust the values $p$ and $q$ by adjusting the radius values at which the simplices $\sigma_p$ and $\sigma_q$ respectively are added in the Rips filtration. Specifically, the radius value at which a simplex $\sigma$ is added in the Rips filtration is defined by the following map $\delta$:

$$\delta(\sigma) = \max_{x, y \in \sigma} \|x - y\| \tag{3}$$

Both the maps $\pi$ and $\delta$ are differentiable and in turn the map from the input points $X$ to the diagram point $(p, q)$ is differentiable [leygonie2019framework]. Let $\varrho$ be a real valued differentiable loss function which is a function of $(p, q)$. The composite map from $X$ to the value of $\varrho$ is in turn differentiable and can be minimized using any gradient based optimization technique.
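To make Equation 3 concrete, the following sketch evaluates the insertion radius of a simplex and its gradient with respect to the point coordinates. The function names `delta` and `delta_gradient` are hypothetical, and the gradient formula assumes the maximum is attained by a unique pair of vertices.

```python
import numpy as np


def delta(points, sigma):
    """Radius at which simplex sigma enters the Rips filtration (Equation 3),
    together with the pair of vertices attaining the maximum distance."""
    best, arg = 0.0, (sigma[0], sigma[0])
    for a in sigma:
        for b in sigma:
            d = np.linalg.norm(points[a] - points[b])
            if d > best:
                best, arg = d, (a, b)
    return best, arg


def delta_gradient(points, sigma):
    """Gradient of delta(sigma) with respect to every point.

    Only the two vertices attaining the maximum receive a non-zero gradient
    (assuming that pair is unique).
    """
    value, (a, b) = delta(points, sigma)
    grad = np.zeros_like(points)
    if value > 0:
        direction = (points[a] - points[b]) / value
        grad[a] = direction
        grad[b] = -direction
    return grad


# A 3-4-5 right triangle (hypothetical data); the longest edge joins points 1 and 2.
X = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
value, pair = delta(X, (0, 1, 2))   # value is 5.0, pair is (1, 2)
grad = delta_gradient(X, (0, 1, 2))
# grad[0] is zero, grad[1] is (0.6, -0.8) and grad[2] is (-0.6, 0.8).
```

The gradient is non-zero for two points only, which is the source of the point-level behaviour discussed in the next section.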

### 2.2 Regularization

As described in the previous section, the loss $\varrho$, viewed as a function of the input points $X$, is differentiable. If $\varrho$ is a function of a single element $(p, q)$ in a given persistence diagram $D_n$, then $\varrho$ is locally a function of at most four elements of $X$. The elements in question are the two pairs of points that are the maximum distance apart in the simplices whose introduction resulted in the appearance and disappearance of the topological feature in question. In this case, taking a single step in the direction of the gradient of $\varrho$ will alter the positions of at most these four elements of $X$.
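As a hedged worked example of the statement above, suppose the loss, written $\varrho$ here, depends on a single diagram point $(p, q)$, and that the birth and death values are attained by the point pairs $(x_p, y_p)$ and $(x_q, y_q)$ respectively, so that $p = \|x_p - y_p\|$ and $q = \|x_q - y_q\|$ by Equation 3. These pair labels are notation introduced only for this illustration, and the four points are assumed distinct with each maximum attained by a unique pair. The chain rule then gives:

```latex
\begin{aligned}
\frac{\partial \varrho}{\partial x_p} &= \frac{\partial \varrho}{\partial p}\,\frac{x_p - y_p}{\|x_p - y_p\|}, &
\frac{\partial \varrho}{\partial y_p} &= -\frac{\partial \varrho}{\partial x_p}, \\
\frac{\partial \varrho}{\partial x_q} &= \frac{\partial \varrho}{\partial q}\,\frac{x_q - y_q}{\|x_q - y_q\|}, &
\frac{\partial \varrho}{\partial y_q} &= -\frac{\partial \varrho}{\partial x_q}, \\
\frac{\partial \varrho}{\partial z} &= 0 \quad \text{for every other point } z.
\end{aligned}
```

Every other point has a zero gradient, so a single gradient step moves at most four points. For elements of the 0-dimensional diagram of a Rips filtration the birth value is 0 for every point, so only the death pair contributes.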

The above approach to minimizing a loss function changes topological features at the level of individual points. This does not agree with our prior that changes to topological features should be made at the level of larger entities consisting of sets of points. To overcome this challenge we propose a novel method for regularizing persistent homology gradient computation through the addition of a grouping term. This has the effect of helping to ensure gradients are defined with respect to larger entities and not individual points.

Let $\mathrm{Set}$ be the space of finite sets of points in $\mathbb{R}^d$. Let $X \in \mathrm{Set}$ be the set of input points which we wish to optimize with respect to the loss function $\varrho$. Let $X_0$ be the initial value of $X$ before optimization and $\rho$ be the corresponding bijection from $X$ to $X_0$. Let $\mathcal{X}$ be the set of all unordered pairs of elements in $X$ and $k$ be a given kernel. Broadly speaking, a kernel maps smaller values to a value approaching $1$ and larger values to a value approaching $0$. An example of a kernel is the uniform kernel defined as follows, where $s$ is a specified scale parameter:

$$k(x) = \begin{cases} 1 & \|x\| \le s \\ 0 & \|x\| > s \end{cases} \tag{4}$$

Given a specified kernel, we define the proposed regularization term as follows:

$$\tau(X) = \sum_{(a,b) \in \mathcal{X}} k(\|a-b\|)\,\bigl(\|a-b\| - \|\rho(a)-\rho(b)\|\bigr)^2 \tag{5}$$

This term measures the discrepancy between pairwise distances in $X$ and the corresponding pairwise distances in $X_0$. Minimizing this term helps to ensure that the structure of local groups of points in $X_0$ is preserved in $X$. That is, if a change is made to a single point, this term will help ensure that a similar change is made to all points in a corresponding group, where this group is a local neighbourhood defined by the kernel in question.
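A minimal NumPy sketch of Equations 4 and 5 follows. It assumes that the points are stored as rows of an array and that row i of the initial array is the image of row i of the current array under the bijection, so the bijection is the identity on row indices. The function names and the toy data are hypothetical.

```python
import numpy as np


def uniform_kernel(x, s):
    """Uniform kernel of Equation 4: 1 where |x| <= s, 0 elsewhere."""
    return (np.abs(x) <= s).astype(float)


def grouping_term(X, X0, s):
    """Regularization term of Equation 5.

    X and X0 are (n, d) arrays; row i of X0 is the initial position of row i
    of X.
    """
    i, j = np.triu_indices(len(X), k=1)          # all unordered pairs
    d = np.linalg.norm(X[i] - X[j], axis=1)      # current pairwise distances
    d0 = np.linalg.norm(X0[i] - X0[j], axis=1)   # initial pairwise distances
    return float(np.sum(uniform_kernel(d, s) * (d - d0) ** 2))


X0 = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0]])

# Moving one point away from its neighbour is penalized ...
X_single = X0.copy()
X_single[1] = [0.3, 0.0]
tau_single = grouping_term(X_single, X0, s=0.5)   # approximately 0.04

# ... whereas translating all points together is not.
X_rigid = X0 + np.array([5.0, 5.0])
tau_rigid = grouping_term(X_rigid, X0, s=0.5)     # approximately 0
```

Only the pair of points 0 and 1 lies within the kernel scale in the first case, so it is the only pair that contributes to the sum.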

Let $\varrho$ be a specified loss function which is a function of the persistent homology of the corresponding input. We define a regularized version of this loss, $l$, which integrates the regularization term in Equation 5, as follows:

$$l(X) = \varrho(X) + \lambda\,\tau(X) \tag{6}$$

In this equation $\lambda$ is a specified real valued weighting parameter. This regularized version of the loss will help ensure that changes to topological features will not be made at the level of individual points, but will instead be made at the level of larger entities consisting of sets of points.
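The effect of Equation 6 can be illustrated by comparing two candidate updates that reduce the gap between two clusters by the same amount. This is a sketch under stated assumptions: the persistence loss is taken to be the sum of squared 0-dimensional persistence values above a threshold, and the threshold, kernel scale, weighting value and data are hypothetical choices for this example, not the settings used for Figure 1.

```python
import numpy as np


def h0_deaths(X):
    """Finite death values of the 0-dimensional Rips diagram (births are 0)."""
    n = len(X)
    dist = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    edges = sorted((dist[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # edge merges two components
            parent[ri] = rj
            deaths.append(d)
    return np.array(deaths)


def rho(X, threshold=0.4):
    """Sum of squared H0 persistence values above an assumed threshold."""
    deaths = h0_deaths(X)
    return float(np.sum(deaths[deaths > threshold] ** 2))


def tau(X, X0, s):
    """Grouping term of Equation 5 with the uniform kernel of Equation 4."""
    i, j = np.triu_indices(len(X), k=1)
    d = np.linalg.norm(X[i] - X[j], axis=1)
    d0 = np.linalg.norm(X0[i] - X0[j], axis=1)
    return float(np.sum((d <= s) * (d - d0) ** 2))


def regularized_loss(X, X0, lam=1.0, s=0.4):
    """Equation 6: persistence loss plus weighted grouping term."""
    return rho(X) + lam * tau(X, X0, s)


X0 = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],    # cluster A
               [1.0, 0.0], [1.1, 0.0], [1.0, 0.1]])   # cluster B

# Candidate (a): move only the two closest points of the clusters together.
Xa = X0.copy()
Xa[1, 0] += 0.2
Xa[3, 0] -= 0.2

# Candidate (b): translate each cluster as a whole by the same amount.
Xb = X0.copy()
Xb[:3, 0] += 0.2
Xb[3:, 0] -= 0.2

loss_a = regularized_loss(Xa, X0)
loss_b = regularized_loss(Xb, X0)
```

Both candidates give the same persistence loss, since the gap between the clusters is about 0.5 in each case, but the grouping term is close to zero only for candidate (b). The regularized loss therefore prefers moving the clusters as whole entities.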

The results of applying the proposed regularization method to the two sets of points in Figure 1 are displayed in the same figure. In both cases a uniform kernel was used with a fixed scale parameter $s$ and a fixed weighting parameter $\lambda$. The loss function terms in question equal the squared persistence of the element in $D_0$ corresponding to the merging of the two clusters and the squared persistence of the element in $D_1$ corresponding to the merging of the two ends of the horseshoe respectively. These functions are formally defined in Appendix A. Optimization was performed using the Adam optimizer with a fixed learning rate [kingma2015adam].

From Figure 1, we see that the distance between the clusters and the distance between the ends of the horseshoe are both reduced, and that this is achieved in a manner which agrees with our prior that changes to topological features should be made at the level of larger entities. This contrasts with the unregularized results, also displayed in Figure 1, where changes to topological features are made at the level of individual points.

## 3 Conclusions & Future Work

To the authors' knowledge, this work presents the first attempt to perform regularization of persistent homology gradient computation. Given the increasing usefulness of integrating persistent homology with deep learning, and the need to perform such regularization, we believe this topic has the potential to develop into an active area of research in the field of applied topology.

## Appendix A Loss Functions for Figure 1

In this section we formally define the loss function terms $\varrho$ in Equation 6 that were applied to the two datasets in Figure 1 to obtain the results shown in that figure.

Let $\mathcal{D}$ be the space of persistence diagrams. Let $\varrho_0 : \mathcal{D} \to \mathbb{R}$ be a loss function defined as:

$$\varrho_0(D) = \sum_{\{(p,q) \in D \,:\, q - p > 0.10,\ q - p < \infty\}} (q - p)^2. \tag{7}$$

This loss function was applied to the persistence diagram corresponding to the two-cluster dataset in Figure 1 to compute the unregularized and regularized results shown in that figure.

Let $\varrho_1$ be the loss function on persistence diagrams defined as:

$$\varrho_1(D) = \sum_{\{(p,q) \in D \,:\, q - p > 0.25\}} (q - p)^2. \tag{8}$$

This loss function was applied to the persistence diagram corresponding to the horseshoe dataset in Figure 1 to compute the unregularized and regularized results shown in that figure.
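The two loss functions in Equations 7 and 8 can be sketched as follows, assuming a persistence diagram is supplied as a list of (birth, death) pairs. The function names `rho_0` and `rho_1` and the example diagrams are hypothetical.

```python
import numpy as np


def rho_0(diagram, threshold=0.10):
    """Equation 7: sum of squared persistence over points whose persistence
    is finite and greater than the threshold."""
    D = np.asarray(diagram, dtype=float).reshape(-1, 2)
    pers = D[:, 1] - D[:, 0]
    keep = (pers > threshold) & np.isfinite(pers)
    return float(np.sum(pers[keep] ** 2))


def rho_1(diagram, threshold=0.25):
    """Equation 8: sum of squared persistence over points whose persistence
    is greater than the threshold."""
    D = np.asarray(diagram, dtype=float).reshape(-1, 2)
    pers = D[:, 1] - D[:, 0]
    return float(np.sum(pers[pers > threshold] ** 2))


# Hypothetical diagrams: three finite points and one infinite point,
# then two finite points.
D0 = [(0.0, 0.05), (0.0, 0.08), (0.0, 0.9), (0.0, np.inf)]
D1 = [(0.3, 0.4), (0.5, 1.0)]

loss_0 = rho_0(D0)   # approximately 0.81, from the point with persistence 0.9
loss_1 = rho_1(D1)   # approximately 0.25, from the point with persistence 0.5
```

The thresholds discard low-persistence points, so each loss is driven by the few points that represent large-scale structure.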