
Bayesian contiguity constrained clustering, spanning trees and dendrograms

by Etienne Côme, et al.

Clustering is a well-known and well-studied problem. One of its variants, contiguity-constrained clustering, takes as a second input a graph that encodes prior information about cluster structure through contiguity constraints, i.e. clusters must form connected subgraphs of this graph. This paper discusses the value of such a setting and proposes a new way to formalise it in a Bayesian framework, using results on spanning trees to compute the posterior probabilities of candidate partitions exactly. An algorithmic solution is then investigated to find a maximum a posteriori (MAP) partition and extract a Bayesian dendrogram from it. The usefulness of this tool, which is reminiscent of the classical output of a simple hierarchical clustering algorithm, is analysed. Finally, the proposed approach is demonstrated on real applications. A reference implementation of this work is available in the R package gtclust that accompanies the paper (available at
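
To make the contiguity constraint concrete, the following minimal Python sketch checks whether a candidate partition is admissible, i.e. whether every cluster induces a connected subgraph of the constraint graph. It is an illustration only and is not taken from the paper or from gtclust; the function name `is_contiguous` and the input conventions (a node-to-label dictionary and an edge list) are assumptions made for this example.

```python
from collections import deque


def is_contiguous(partition, edges):
    """Return True if every cluster induces a connected subgraph.

    partition: dict mapping node -> cluster label (hypothetical convention)
    edges: iterable of (u, v) pairs of the undirected constraint graph
    """
    # Keep only edges whose two endpoints belong to the same cluster.
    adj = {node: [] for node in partition}
    for u, v in edges:
        if partition[u] == partition[v]:
            adj[u].append(v)
            adj[v].append(u)

    # Group nodes by cluster label.
    clusters = {}
    for node, label in partition.items():
        clusters.setdefault(label, []).append(node)

    # A breadth-first search from one member must reach the whole cluster.
    for members in clusters.values():
        seen = {members[0]}
        queue = deque([members[0]])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        if len(seen) != len(members):
            return False
    return True


# Constraint graph: a path 0 - 1 - 2 - 3.
path_edges = [(0, 1), (1, 2), (2, 3)]
ok = is_contiguous({0: "a", 1: "a", 2: "b", 3: "b"}, path_edges)
bad = is_contiguous({0: "a", 1: "b", 2: "a", 3: "b"}, path_edges)
```

On the path graph above, the first partition splits the path into two contiguous segments and is admissible, while the second interleaves the labels and is rejected because nodes 0 and 2 are not linked inside cluster "a".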




Bayesian Rose Trees

Hierarchical structure is ubiquitous in data across many domains. There ...

Edge Partitions of Complete Geometric Graphs (Part 1)

In this paper, we disprove the long-standing conjecture that any complet...

Revisiting k-means: New Algorithms via Bayesian Nonparametrics

Bayesian models offer great flexibility for clustering applications---Ba...

Spectral Clustering, Spanning Forest, and Bayesian Forest Process

Spectral clustering algorithms are very popular. Starting from a pairwis...

Genie: A new, fast, and outlier-resistant hierarchical clustering algorithm

The time needed to apply a hierarchical clustering algorithm is most oft...

Noisy Voronoi: a Simple Framework for Terminal-Clustering Problems

We reprove three known (algorithmic) bounds for terminal-clustering prob...

Learning big Gaussian Bayesian networks: partition, estimation, and fusion

Structure learning of Bayesian networks has always been a challenging pr...