Compact Representation of Uncertainty in Hierarchical Clustering

02/26/2020
by Craig S. Greenberg, et al.

Hierarchical clustering is a fundamental task often used to discover meaningful structures in data, such as phylogenetic trees, taxonomies of concepts, subtypes of cancer, and cascades of particle decays in particle physics. When multiple hierarchical clusterings of the data are possible, it is useful to represent uncertainty in the clustering through various probabilistic quantities. Existing approaches represent uncertainty for a range of models; however, they only provide approximate inference. This paper presents dynamic-programming algorithms and proofs for exact inference in hierarchical clustering. We are able to compute the partition function, the MAP hierarchical clustering, and the marginal probabilities of sub-hierarchies and clusters. Our method supports a wide range of hierarchical models and requires only a cluster compatibility function. Rather than scaling with the number of hierarchical clusterings of n elements (ω(n n! / 2^(n-1))), our approach runs in time and space proportional to the size of the significantly smaller powerset of the n elements. Although the powerset is still large, these algorithms enable exact inference in small-data applications and are also interesting from a theoretical perspective. We demonstrate the utility of our method and compare its performance with that of existing approximate methods.
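The dynamic-programming idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions not stated in the abstract: hierarchies are taken to be binary trees, and the unnormalized score of a tree is assumed to be the product of a compatibility value `psi(left, right)` over every pair of sibling clusters. The names `partition_function` and `psi`, and the bitmask encoding of clusters, are hypothetical choices made for this example.

```python
from functools import lru_cache


def partition_function(n, psi):
    """Sum, over all binary trees on n leaves, of the product of
    psi(left, right) over the sibling pairs in the tree.

    Clusters are encoded as bitmasks over the elements 0..n-1.
    Assumes n >= 1 and a symmetric psi (illustrative assumption).
    """

    @lru_cache(maxsize=None)  # one memoized value per subset of the elements
    def z(mask):
        # A singleton cluster has exactly one (trivial) hierarchy.
        if mask & (mask - 1) == 0:
            return 1.0
        # Pin the lowest element to the left child so that each
        # unordered split {left, right} is counted exactly once.
        low = mask & -mask
        rest = mask ^ low
        total = 0.0
        # Enumerate every proper subset of `rest` (including the empty
        # set) as the remaining members of the left child.
        sub = (rest - 1) & rest
        while True:
            left = low | sub
            right = mask ^ left
            total += psi(left, right) * z(left) * z(right)
            if sub == 0:
                break
            sub = (sub - 1) & rest
        return total

    return z((1 << n) - 1)


if __name__ == "__main__":
    # With psi == 1 the partition function simply counts binary trees.
    for n in range(1, 6):
        print(n, partition_function(n, lambda a, b: 1.0))
```

With a constant compatibility of 1, the result is the number of binary hierarchies on n elements, which is a convenient sanity check. The memo table holds one entry per non-empty subset, while the trees themselves are never enumerated.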

research
04/14/2021

Exact and Approximate Hierarchical Clustering Using A*

Hierarchical clustering is a critical task in numerous domains. Many app...
research
01/09/2016

Multicuts and Perturb & MAP for Probabilistic Graph Clustering

We present a probabilistic graphical model formulation for the graph clu...
research
06/05/2021

Cluster Analysis via Random Partition Distributions

Hierarchical and k-medoids clustering are deterministic clustering algor...
research
06/16/2023

Nearly-Optimal Hierarchical Clustering for Well-Clustered Graphs

This paper presents two efficient hierarchical clustering (HC) algorithm...
research
01/26/2021

Unsupervised clustering of series using dynamic programming and neural processes

Following the work of arXiv:2101.09512, we are interested in clustering ...
research
05/28/2018

Hierarchical clustering with deep Q-learning

The reconstruction and analyzation of high energy particle physics data ...
research
06/01/2022

Rational partition models under iterative proportional scaling

In this work we investigate partition models, the subset of log-linear m...
