Convergence of Hierarchical Clustering and Persistent Homology Methods on Directed Networks
While there has been much interest in adapting conventional clustering procedures (and, in higher dimensions, persistent homology methods) to directed networks, little is known about the convergence of such methods. Even to formulate the convergence problem, one needs to stipulate a reasonable model for a directed network together with a flexible sampling theory for that model. In this paper we propose and study a particular model of directed networks, and use it to study the convergence of certain hierarchical clustering and persistent homology methods that accept any matrix of (possibly asymmetric) pairwise relations as input and produce dendrograms and persistence barcodes as outputs. We show that as points are sampled from a probability distribution, the output of each method converges almost surely to a dendrogram or barcode that depends on the structure of the distribution.