Persistence Bag-of-Words for Topological Data Analysis

12/21/2018 ∙ by Bartosz Zielinski, et al.

Persistent homology (PH) is a rigorous mathematical theory that provides a robust descriptor of data in the form of persistence diagrams (PDs). PDs are compact 2D representations formed by multisets of points. Their variable size, however, makes them difficult to combine with typical machine learning workflows. In this paper, we introduce persistence bag-of-words, a novel, expressive and discriminative vectorized representation of PDs for topological data analysis. It represents PDs in a form convenient for machine learning and statistical analysis and has a number of favorable practical and theoretical properties, such as 1-Wasserstein stability. We evaluate our representation on several heterogeneous datasets and show its high discriminative power. Our approach matches or exceeds state-of-the-art performance while requiring much less time than alternative approaches. It thereby facilitates the topological analysis of large-scale data sets in the future.
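To make the idea of turning a variable-size PD into a fixed-size vector concrete, here is a minimal sketch of a generic bag-of-words quantization. It is not the authors' implementation: the abstract does not specify the algorithm, so the (birth, persistence) coordinates, the plain k-means codebook, and the function names `to_birth_persistence`, `fit_codebook` and `bag_of_words` are all assumptions made for illustration.

```python
import numpy as np

def to_birth_persistence(diagram):
    """Map (birth, death) pairs to (birth, persistence) coordinates (assumed choice)."""
    d = np.asarray(diagram, dtype=float).reshape(-1, 2)
    return np.column_stack([d[:, 0], d[:, 1] - d[:, 0]])

def fit_codebook(points, n_words, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns an (n_words, 2) array of codewords."""
    rng = np.random.default_rng(seed)
    # Initialize the codewords with distinct points drawn from the data.
    centers = points[rng.choice(len(points), size=n_words, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n_words):
            members = points[labels == k]
            if len(members) > 0:  # keep the old codeword if its cluster is empty
                centers[k] = members.mean(axis=0)
    return centers

def bag_of_words(diagram, codebook):
    """Count how many diagram points fall nearest to each codeword."""
    pts = to_birth_persistence(diagram)
    if len(pts) == 0:
        return np.zeros(len(codebook))
    dists = np.linalg.norm(pts[:, None, :] - codebook[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    return np.bincount(labels, minlength=len(codebook)).astype(float)

# Two toy diagrams of different sizes (made-up values, not data from the paper).
diagram_a = [[0.0, 1.0], [0.2, 0.5], [0.1, 2.0]]
diagram_b = [[0.0, 0.3], [0.5, 2.5]]

all_points = np.vstack([to_birth_persistence(diagram_a),
                        to_birth_persistence(diagram_b)])
codebook = fit_codebook(all_points, n_words=2)

vec_a = bag_of_words(diagram_a, codebook)
vec_b = bag_of_words(diagram_b, codebook)
print(vec_a, vec_b)  # both vectors have length 2 regardless of diagram size
```

Both diagrams end up as vectors of the same length, which is what lets them be passed to a standard classifier. Hard nearest-codeword counting is only the simplest possible variant and is used here for brevity.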



