Vector Summaries of Persistence Diagrams for Permutation-based Hypothesis Testing

06/09/2023
by   Umar Islambekov, et al.

Over the past decade, topological data analysis (TDA) has risen to prominence as a way to describe the shape of data. In recent years, there has been increasing interest in developing statistical methods for TDA, in particular hypothesis testing procedures. From the statistical perspective, persistence diagrams, the central multi-scale topological descriptors of data provided by TDA, are viewed as random observations sampled from some population or process. In this context, one of the earliest works on hypothesis testing focuses on a two-group permutation-based approach in which the associated loss function is defined in terms of within-group pairwise bottleneck or Wasserstein distances between persistence diagrams (Robinson and Turner, 2017). However, when the persistence diagrams are large in size and number, this permutation test becomes computationally costly to apply. To address this limitation, we instead define the loss function using pairwise distances between vectorized functional summaries of persistence diagrams. In the present work, we explore the utility of the Betti function, one of the simplest functional summaries of persistence diagrams, for this purpose. We introduce an alternative vectorization method for the Betti function based on integration and prove stability results with respect to the Wasserstein distance. Moreover, we propose a new technique for shuffling group labels to increase the power of the test. Through several experimental studies on both synthetic and real data, we show that the vectorized Betti function leads to results competitive with the baseline permutation test based on Wasserstein distances.
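The overall idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made here for illustration: the Betti function is taken to count the diagram points (b, d) with b ≤ t < d, the integration-based vector is taken to be the average of that function over consecutive grid intervals, vectors are compared with the L1 distance, and the loss is the sum of mean within-group pairwise distances. Labels are shuffled uniformly at random, so the new shuffling technique mentioned in the abstract is not reproduced. The function names `betti_vector`, `within_group_loss`, and `permutation_test` are hypothetical.

```python
import numpy as np


def betti_vector(diagram, grid):
    """Average the Betti function over each interval [grid[i], grid[i+1]].

    Assumption: the Betti function at t counts points (b, d) with b <= t < d,
    so its integral over an interval is the total overlap length of the bars.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    births = diagram[:, 0][:, None]
    deaths = diagram[:, 1][:, None]
    lo, hi = grid[:-1][None, :], grid[1:][None, :]
    # Overlap length between every bar [b, d) and every grid interval.
    overlap = np.clip(np.minimum(deaths, hi) - np.maximum(births, lo), 0.0, None)
    return overlap.sum(axis=0) / (grid[1:] - grid[:-1])


def within_group_loss(dist, labels):
    """Sum over groups of the mean pairwise distance inside the group."""
    total = 0.0
    for g in np.unique(labels):
        idx = np.flatnonzero(labels == g)
        n = len(idx)
        # dist is symmetric with a zero diagonal, so each pair is counted twice.
        total += dist[np.ix_(idx, idx)].sum() / (2.0 * n * (n - 1))
    return total


def permutation_test(vectors, labels, n_perm=999, seed=0):
    """Two-group permutation test on vector summaries; returns a p-value."""
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    # Pairwise L1 distances, computed once and reused for every shuffle.
    dist = np.abs(vectors[:, None, :] - vectors[None, :, :]).sum(axis=2)
    observed = within_group_loss(dist, labels)
    # Count shuffles whose within-group loss is at least as small as observed.
    count = sum(
        within_group_loss(dist, rng.permutation(labels)) <= observed
        for _ in range(n_perm)
    )
    return (count + 1) / (n_perm + 1)
```

In this sketch, each diagram would first be mapped to a vector with `betti_vector` on a shared grid, and the stacked vectors passed to `permutation_test` with the group labels. Because the distance matrix is computed once on fixed-length vectors, each shuffle only re-sums entries of that matrix, which is where the savings over repeated Wasserstein computations would come from.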

research
12/05/2019

Universality of persistence diagrams and the bottleneck and Wasserstein distances

We undertake a formal study of persistence diagrams and their metrics. W...
research
06/09/2020

Hypothesis Testing for Shapes using Vectorized Persistence Diagrams

Topological data analysis involves the statistical characterization of t...
research
01/16/2020

Understanding the Power of Persistence Pairing via Permutation Test

Recently many efforts have been made to incorporate persistence diagrams...
research
06/04/2021

Bottleneck Profiles and Discrete Prokhorov Metrics for Persistence Diagrams

In topological data analysis (TDA), persistence diagrams have been a suc...
research
07/08/2022

On the Universality of Random Persistence Diagrams

One of the most elusive challenges within the area of topological data a...
research
04/01/2022

Differentiating small-scale subhalo distributions in CDM and WDM models using persistent homology

The spatial distribution of galaxies at sufficiently small scales will e...
research
10/27/2021

Approximating 1-Wasserstein Distance between Persistence Diagrams by Graph Sparsification

Persistence diagrams (PD)s play a central role in topological data analy...
