Dimension Reduction of Two-Dimensional Persistence via Distance Deformations

03/01/2022
by   Maximilian Neumann, et al.
Apple, Inc.

This article grew out of the application part of my Master's thesis at the Faculty of Mathematics and Information Science at Ruprecht-Karls-Universität Heidelberg under the supervision of PD Dr. Andreas Ott. In the context of time series analyses of RNA virus datasets with persistent homology, this article introduces a new method for reducing two-dimensional persistence to one-dimensional persistence by transforming time information into distances.

1 SNV cycles

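For this article, let $(X, d)$ be a finite distance space, i.e. $X$ is a finite set and $d$ is a metric or, more generally, a semimetric on $X$ (a semimetric satisfies all the axioms of a metric with the exception of the triangle inequality). Recall that for every $r \geq 0$, the Vietoris-Rips complex of $X$ at scale $r$ is the abstract simplicial complex

$$\mathrm{VR}_r(X) = \{\, \emptyset \neq \sigma \subseteq X \mid d(x, y) \leq r \text{ for all } x, y \in \sigma \,\}.$$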
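
The definition of the Vietoris-Rips complex can be made concrete with a small brute-force enumeration. The following is a minimal Python sketch, not an efficient implementation: the function name `vietoris_rips`, the cut-off `max_dim` and the toy point set are illustrative assumptions and do not appear in the article.

```python
from itertools import combinations

def vietoris_rips(points, dist, r, max_dim=2):
    """Return all simplices of the Vietoris-Rips complex at scale r up to max_dim.

    A simplex is stored as a sorted tuple of points whose pairwise
    distances are all <= r. Brute force; only suitable for tiny inputs.
    """
    simplices = []
    for k in range(1, max_dim + 2):  # k = number of vertices of the simplex
        for sigma in combinations(sorted(points), k):
            if all(dist(x, y) <= r for x, y in combinations(sigma, 2)):
                simplices.append(sigma)
    return simplices

# Toy distance space (an assumption for illustration): four points on a line.
points = [0, 1, 2, 4]
dist = lambda x, y: abs(x - y)

complex_at_1 = vietoris_rips(points, dist, r=1)
complex_at_2 = vietoris_rips(points, dist, r=2)
```

At scale 1 the complex consists of the four vertices and the edges (0, 1) and (1, 2). At scale 2 the edges (0, 2) and (2, 4) and the triangle (0, 1, 2) are added, so the complexes are nested as the scale grows.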

Assume that we have a (time-dependent) filtration of the underlying finite set

$$X_1 \subseteq X_2 \subseteq \dots \subseteq X_n = X.$$

For , consider the Vietoris-Rips filtration

Denote by the first simplicial homology with coefficients in a finite prime field applied to the filtration . Then is a finitely generated (f.g.) one-dimensional persistence module.

As in the work of Bleher et al. [topologyidentifies2021] where is a finite set of SARS-CoV-2 RNA sequences with Hamming distance , we are interested in detecting cycles that correspond to bars in the barcode born in the first filtration step. In [topologyidentifies2021], these cycles are called single nucleotide variation (SNV) cycles and are used for a topological recurrence (time series) analysis of SARS-CoV-2. For simplicity, we also call such cycles SNV cycles in our more general setting.
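
As a small illustration of the Hamming distance setting, the following Python sketch computes the pairs of sequences that differ in exactly one position. It assumes that the first filtration step corresponds to Hamming distance 1, as the name "single nucleotide variation" suggests. The sequences are hypothetical toy strings, not real SARS-CoV-2 data, and the names `hamming` and `snv_edges` are chosen here for illustration only.

```python
from itertools import combinations

def hamming(s, t):
    """Number of positions at which two equal-length sequences differ."""
    if len(s) != len(t):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(s, t))

# Hypothetical toy sequences (not real data).
sequences = ["ACGU", "ACGA", "ACCA", "ACCU"]

# Pairs of sequences that differ by exactly one nucleotide.
snv_edges = [(s, t) for s, t in combinations(sequences, 2) if hamming(s, t) == 1]
```

In this toy example the four edges form a closed loop ACGU, ACGA, ACCA, ACCU, while the two remaining pairs have Hamming distance 2. No triangle is present at scale 1, so the loop is a cycle of the kind considered in this article.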

Definition 1.1 (SNV cycle).

The underlying homology class representatives of bars in the barcode born in the first filtration step are called SNV cycles in time step .

For every , denote by a full set of SNV cycle representatives extracted from the barcode . In [topologyidentifies2021], the barcodes are computed with Ripser [bauer2021ripser] and the are extracted from the Ripser output. Ripser is a highly optimised software tool, capable of processing hundreds of thousands of distinct RNA sequences [topologyidentifies2021]. However, this classical approach to a time series analysis has the following issues:

  1. Computing each time step separately can be very time-consuming for large (e.g. a time series analysis over one year on a daily basis).

  2. We are not able to track the time-stability of SNV cycles, i.e. whether the image of the homology class of an SNV cycle under the canonical homomorphism

    is zero or not.

  3. Since each time step is computed separately, the are not automatically compatible: let and assume that the image of under the canonical homomorphism

    is not zero. Then it still may happen that .

In Sections 2 and 3, we present a method that enables the extraction of SNV cycles for each time step with only one barcode computation. The resulting SNV cycles are automatically compatible and we can track their time-stability.
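
The classical per-time-step approach and the time-stability issue can be illustrated on a toy example. The following Python sketch counts, for each time step, the independent first homology classes over the field with two elements that are present at a fixed scale. At the first filtration step this count equals the number of bars born there. The distance table, the three time steps and the helper names `gf2_rank` and `betti_1` are assumptions made for illustration. The sketch uses plain linear algebra instead of Ripser and is not the method of [topologyidentifies2021].

```python
from itertools import combinations

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    basis = {}  # leading-bit position -> reduced row
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in basis:
                basis[top] = row
                break
            row ^= basis[top]  # eliminate the leading bit
    return len(basis)

def betti_1(points, dist, r):
    """First Betti number over GF(2) of the Vietoris-Rips complex at scale r."""
    pts = sorted(points)
    v_idx = {p: i for i, p in enumerate(pts)}
    edges = [e for e in combinations(pts, 2) if dist[frozenset(e)] <= r]
    e_idx = {e: i for i, e in enumerate(edges)}
    triangles = [t for t in combinations(pts, 3)
                 if all(f in e_idx for f in combinations(t, 2))]
    # Boundary matrices as bitmask rows: one row per edge / per triangle.
    d1 = [(1 << v_idx[a]) | (1 << v_idx[b]) for a, b in edges]
    d2 = [sum(1 << e_idx[f] for f in combinations(t, 2)) for t in triangles]
    # dim H_1 = dim ker(d1) - rank(d2) = (#edges - rank(d1)) - rank(d2)
    return len(edges) - gf2_rank(d1) - gf2_rank(d2)

# Toy distances: a square a-b-c-d with side 1 and diagonal 2, plus a centre e
# at distance 1 from every corner.
dist = {frozenset(p): 1 for p in ["ab", "bc", "cd", "ad", "ae", "be", "ce", "de"]}
dist.update({frozenset("ac"): 2, frozenset("bd"): 2})

# Hypothetical time filtration: points are added step by step.
time_steps = [{"a", "b", "c"}, {"a", "b", "c", "d"}, {"a", "b", "c", "d", "e"}]
betti = [betti_1(X_i, dist, r=1) for X_i in time_steps]
```

Here `betti` is `[0, 1, 0]`: the square closes in the second time step and is filled by four triangles once the centre point is added in the third. The counts are obtained by one separate computation per time step, and on their own they do not record which class of one time step maps to which class of a later one. This is the tracking and compatibility problem described in items 2 and 3.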

2 Dimension reduction

The naturally lead to a finite bifiltered simplicial complex . We obtain a f.g. two-dimensional persistence module which contains all the information that occurs within the . Moreover, contains additional information about the behaviour of homology classes along the time filtration parameter. Since we are only interested in detecting SNV cycles and not in determining their lifespan in the barcodes , it suffices to compute the barcode where is the one-dimensional subfiltration

For reasons of notation, we start with . The f.g. one-dimensional persistence module can be viewed as a dimension reduction of . The barcode contains all the information we need to extract SNV cycles for each time step . Moreover, tracks the stability of SNV cycles along the time filtration parameter.

The idea of considering barcodes of subfiltrations follows a more general concept introduced by Cerri et al. [Bettinumbersmultipers] and called the fibered barcode by Lesnick and Wright [lesnick2015interactive]. Fibered barcodes are closely related to the rank invariant introduced by Carlsson and Zomorodian [Carlsson2009multidimensionalpersistence]. In [Bettinumbersmultipers], it is shown that the fibered barcode and the rank invariant determine each other.

3 Distance deformation

In this section, we introduce a distance deformation technique to realise as a Vietoris-Rips filtration such that we have a correspondence between the barcodes and for the bars corresponding to SNV cycles.

For the following, let be the lowest power of such that . For example, if , then . For , let

Definition 3.1 (Distance deformation).

We define a new distance on as follows: let with . Define

and

Example 3.2.

The intuition behind is that time information is transformed into distances. Let . Then we have . Let with and . Assume that . Then we have

and

Figure 1: Correspondence between SNV cycles and their deformed equivalents. Blue points and edges indicate distances that were deformed according to the time step at which they were added. In the example shown, an SNV cycle is destroyed by adding a point along the time filtration parameter.

Consider the Vietoris-Rips filtration , where for ,

with filtration parameters

Then is a f.g. one-dimensional persistence module. By construction, we have the following correspondence (illustrated in Figure 1).

Correspondence 3.3.

Consider the barcodes and . Let . Then bars born in are in one-to-one correspondence with bars born in . Let . If a bar born in dies in , the corresponding bar born in dies in .

Using this correspondence, the definition of SNV cycles translates as follows.

Definition 3.4 (Deformed SNV cycle).

The underlying homology class representatives of bars in the barcode born in are called deformed SNV cycles.

Denote by a full set of deformed SNV cycle representatives extracted from . For , define

By construction, we have a bijection of sets

Moreover, we have compatibility: let and assume that the image of under the canonical homomorphism

is not zero. Then by construction. In addition, we can track the time-stability of SNV cycles and instead of barcode computations of for , only the computation of has to be performed. Since is a Vietoris-Rips filtration, the barcode can be computed with Ripser [bauer2021ripser]. In practical experiments, one could investigate whether this new method provides a performance advantage over the classical approach to a time series analysis, where each time step is computed separately.

References