Hierarchical Clustering with Structural Constraints

05/24/2018
by Vaggos Chatziafratis, et al.

Hierarchical clustering is a popular unsupervised data analysis method. In many real-world applications, we would like to exploit prior information about the data that imposes constraints on the clustering hierarchy and is not captured by the set of features available to the algorithm. This gives rise to the problem of "hierarchical clustering with structural constraints". Structural constraints pose major challenges for bottom-up approaches such as average and single linkage. Although they can be naturally incorporated into top-down divisive algorithms, no formal guarantees exist on the quality of the output of those algorithms. In this paper, we provide provable approximation guarantees for two simple top-down algorithms, using a recently introduced optimization viewpoint of hierarchical clustering with pairwise similarity information [Dasgupta, 2016]. We show how to find good solutions even in the presence of conflicting prior information by formulating a "constraint-based regularization" of the objective. Finally, we explore a variation of this objective for dissimilarity information [Cohen-Addad et al., 2018] and improve upon current techniques.
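
For readers unfamiliar with the optimization viewpoint of [Dasgupta, 2016]: that objective charges each pair of points its similarity multiplied by the number of leaves under the pair's lowest common ancestor in the tree, and a good hierarchy is one with low total cost. The following is a minimal sketch of evaluating that cost, not the paper's algorithms. It assumes a binary hierarchy written as nested tuples and a symmetric similarity matrix `w`; the names `leaves` and `dasgupta_cost` are illustrative and do not come from the paper.

```python
from itertools import product


def leaves(tree):
    """Return the leaf labels under a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def dasgupta_cost(tree, w):
    """Sum of w[i][j] * (number of leaves under the lowest common
    ancestor of i and j) over all pairs of leaves in a binary tree."""
    if not isinstance(tree, tuple):
        return 0.0  # a single leaf separates no pairs
    left, right = tree
    left_leaves, right_leaves = leaves(left), leaves(right)
    size = len(left_leaves) + len(right_leaves)
    # Pairs separated at this node have this node as their lowest
    # common ancestor, so each is charged `size` times its similarity.
    split_weight = sum(w[i][j] for i, j in product(left_leaves, right_leaves))
    return size * split_weight + dasgupta_cost(left, w) + dasgupta_cost(right, w)


# Hypothetical toy data: points 0 and 1 are similar, as are 2 and 3.
w = [
    [0.0, 1.0, 0.1, 0.1],
    [1.0, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 1.0],
    [0.1, 0.1, 1.0, 0.0],
]
good_tree = ((0, 1), (2, 3))  # merges similar points low in the tree
bad_tree = ((0, 2), (1, 3))   # separates similar points at the root

print(dasgupta_cost(good_tree, w), dasgupta_cost(bad_tree, w))
```

On this toy input the tree that keeps similar points together until low in the hierarchy receives the lower cost, which is the behavior the objective is meant to reward.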


