Bayesian contiguity constrained clustering, spanning trees and dendrograms

02/24/2023
by Etienne Côme, et al.

Clustering is a well-known and widely studied problem. One of its variants, contiguity-constrained clustering, takes a graph as a second input; the graph encodes prior information about cluster structure through contiguity constraints, i.e. clusters must form connected subgraphs of this graph. This paper discusses the value of such a setting and proposes a new way to formalise it in a Bayesian framework, using results on spanning trees to compute the posterior probabilities of candidate partitions exactly. An algorithmic solution is then investigated to find a maximum a posteriori (MAP) partition and extract a Bayesian dendrogram from it. The usefulness of this dendrogram, which is reminiscent of the classical output of a simple hierarchical clustering algorithm, is analysed. Finally, the proposed approach is demonstrated on real applications. A reference implementation is available in the R package gtclust that accompanies the paper (http://github.com/comeetie/gtclust).
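
To make the contiguity constraint concrete, here is a minimal Python sketch that checks whether every cluster of a partition induces a connected subgraph of the constraint graph. It is an illustration only and is not taken from the paper or from the gtclust package; the function name `is_contiguous`, the edge-list input, and the label dictionary are assumptions chosen for the example.

```python
from collections import deque


def is_contiguous(edges, labels):
    """Return True if every cluster induces a connected subgraph.

    edges  : iterable of (u, v) pairs describing the constraint graph
    labels : dict mapping each node to its cluster label
    (Both input conventions are assumptions made for this sketch.)
    """
    # Keep only the edges whose endpoints share a cluster label.
    adjacency = {node: set() for node in labels}
    for u, v in edges:
        if labels[u] == labels[v]:
            adjacency[u].add(v)
            adjacency[v].add(u)

    # Group nodes by cluster label.
    clusters = {}
    for node, label in labels.items():
        clusters.setdefault(label, []).append(node)

    # A breadth-first search started inside a cluster must reach all its members.
    for members in clusters.values():
        seen = {members[0]}
        queue = deque([members[0]])
        while queue:
            current = queue.popleft()
            for neighbour in adjacency[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        if len(seen) != len(members):
            return False
    return True


# Path graph 0 - 1 - 2 - 3 used as the constraint graph.
path_edges = [(0, 1), (1, 2), (2, 3)]
valid = is_contiguous(path_edges, {0: "a", 1: "a", 2: "b", 3: "b"})
invalid = is_contiguous(path_edges, {0: "a", 1: "b", 2: "a", 3: "b"})
```

On the path graph, the partition {0, 1} / {2, 3} satisfies the constraint, whereas {0, 2} / {1, 3} does not, because nodes 0 and 2 are not joined by any edge inside their own cluster.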


