Cluster Analysis via Random Partition Distributions

06/05/2021
by David B. Dahl, et al.

Hierarchical and k-medoids clustering are deterministic clustering algorithms based on pairwise distances. Using these same pairwise distances, we propose a novel stochastic clustering method based on random partition distributions. We call our method CaviarPD, for cluster analysis via random partition distributions. CaviarPD first samples clusterings from a random partition distribution and then finds the best cluster estimate based on these samples using algorithms to minimize an expected loss. We compare CaviarPD with hierarchical and k-medoids clustering through eight case studies. Cluster estimates based on our method are competitive with those of hierarchical and k-medoids clustering. They also do not require the subjective choice of the linkage method necessary for hierarchical clustering. Furthermore, our distribution-based procedure provides an intuitive graphical representation to assess clustering uncertainty.
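
The two-step idea in the abstract (sample clusterings from a distance-based random partition distribution, then pick the estimate that minimizes an expected loss) can be illustrated with a minimal sketch. The abstract names neither the distribution nor the loss, so everything specific here is an assumption made for illustration: the sequential allocation rule, the similarity `exp(-temperature * distance)`, the `mass` parameter, Binder loss with equal costs, and the hypothetical function names `sample_partition` and `estimate_clustering`. It is not the CaviarPD implementation.

```python
import numpy as np

def sample_partition(dist, mass, temperature, rng):
    """Draw one clustering by sequential allocation (an assumed, simplified scheme).

    Items are visited in random order. Each item joins an existing cluster with
    weight equal to its summed similarity to that cluster's members, or opens a
    new cluster with weight `mass`.
    """
    n = dist.shape[0]
    sim = np.exp(-temperature * dist)      # assumed similarity: decays with distance
    order = rng.permutation(n)
    labels = np.full(n, -1)
    n_clusters = 0
    for pos, i in enumerate(order):
        placed = order[:pos]               # items already allocated
        weights = np.array(
            [sim[i, placed[labels[placed] == c]].sum() for c in range(n_clusters)]
            + [mass]
        )
        choice = rng.choice(n_clusters + 1, p=weights / weights.sum())
        labels[i] = choice
        if choice == n_clusters:           # the item opened a new cluster
            n_clusters += 1
    return labels

def coclustering(labels):
    """0/1 matrix with entry (i, j) equal to 1 when items i and j share a cluster."""
    return (labels[:, None] == labels[None, :]).astype(float)

def estimate_clustering(samples):
    """Return the sampled clustering with the smallest estimated expected Binder loss.

    For a 0/1 co-clustering indicator a and co-clustering probability p, the
    expected absolute disagreement is |a - p|, so summing |A - psm| over all
    pairs gives the Monte Carlo estimate of the expected loss (equal costs).
    The search is restricted to the sampled clusterings for simplicity.
    """
    psm = np.mean([coclustering(s) for s in samples], axis=0)
    losses = [np.abs(coclustering(s) - psm).sum() for s in samples]
    return samples[int(np.argmin(losses))], psm

# Toy data: two well-separated groups on a line, with absolute-difference distances.
points = np.array([0.0, 0.1, 0.2, 100.0, 100.1, 100.2])
distances = np.abs(points[:, None] - points[None, :])
generator = np.random.default_rng(0)
samples = [sample_partition(distances, mass=0.01, temperature=1.0, rng=generator)
           for _ in range(200)]
estimate, pairwise_probs = estimate_clustering(samples)
print(estimate)
print(np.round(pairwise_probs, 2))
```

The matrix `pairwise_probs` holds the estimated probability that each pair of items is clustered together, which is the kind of quantity a graphical display of clustering uncertainty could be built on. A dedicated search algorithm would normally explore clusterings beyond the sampled ones; restricting the search to the samples keeps the sketch short.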

research
11/03/2016

A-Ward_pβ: Effective hierarchical clustering using the Minkowski metric and a fast k-means initialisation

In this paper we make two novel contributions to hierarchical clustering...
research
06/17/2019

Nested partitions from hierarchical clustering statistical validation

We develop a greedy algorithm that is fast and scalable in the detection...
research
06/18/2020

Guarantees for Hierarchical Clustering by the Sublevel Set method

Meila (2018) introduces an optimization based method called the Sublevel...
research
04/06/2018

Discussion of the article "Bayesian cluster analysis: point estimation and credible balls" by Wade and Ghahramani

We present a discussion of the paper "Bayesian cluster analysis: point e...
research
09/20/2019

A clusterwise supervised learning procedure based on aggregation of distances

Nowadays, many machine learning procedures are available on the shelve a...
research
04/22/2016

The Mean Partition Theorem of Consensus Clustering

To devise efficient solutions for approximating a mean partition in cons...
research
02/26/2020

Compact Representation of Uncertainty in Hierarchical Clustering

Hierarchical clustering is a fundamental task often used to discover mea...
