An objective function for order preserving hierarchical clustering

09/09/2021
by Daniel Bakkelund, et al.

We present an objective function for similarity-based hierarchical clustering of partially ordered data that preserves the partial order in the sense that if x ≤ y, and if [x] and [y] are the respective clusters of x and y, then there is an order relation ≤' on the clusters for which [x] ≤' [y]. The model distinguishes itself from existing methods and models for clustering of ordered data in two ways: the order relation and the similarity are combined to obtain an optimal hierarchical clustering seeking to satisfy both, and the order relation is equipped with a pairwise level of comparability in the range [0,1]. In particular, if the similarity and the order relation are not aligned, then order preservation may have to yield in favor of clustering. Finding an optimal solution is NP-hard, so we provide a polynomial-time approximation algorithm, with a relative performance guarantee of O(log^{3/2} n), based on successive applications of directed sparsest cut. The model is an extension of the Dasgupta cost function for divisive hierarchical clustering.
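For readers unfamiliar with the Dasgupta cost function that the model extends, the following is a minimal sketch of the original, order-free cost only: each pair of elements contributes its similarity multiplied by the size of the smallest cluster in the tree that contains both. It does not implement the order-preserving objective of the paper. The nested-tuple tree encoding, the `sim` dictionary, and the function names are assumptions made for illustration.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaves of a tree encoded as nested tuples (hypothetical encoding)."""
    if not isinstance(tree, tuple):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def dasgupta_cost(tree, sim):
    """Sum, over all leaf pairs, of similarity times the size of the
    smallest cluster (subtree) containing both leaves."""
    if not isinstance(tree, tuple):
        return 0.0
    child_leaves = [leaves(child) for child in tree]
    n = sum(len(c) for c in child_leaves)  # number of leaves under this node
    cost = 0.0
    # Pairs split between two different children have this node as their
    # least common ancestor, so they are charged n times their similarity.
    for a, b in combinations(child_leaves, 2):
        for x in a:
            for y in b:
                cost += n * sim.get(frozenset((x, y)), 0.0)
    # Pairs inside a single child are charged further down the tree.
    return cost + sum(dasgupta_cost(child, sim) for child in tree)


# Illustrative data (assumed): a and b are similar, c and d are similar,
# a and c are weakly similar; missing pairs have similarity 0.
sim = {
    frozenset(("a", "b")): 1.0,
    frozenset(("c", "d")): 1.0,
    frozenset(("a", "c")): 0.1,
}
good = (("a", "b"), ("c", "d"))  # similar elements are separated late
bad = (("a", "c"), ("b", "d"))   # similar elements are separated at the root

print(dasgupta_cost(good, sim), dasgupta_cost(bad, sim))
```

A tree that keeps similar elements together until deep in the hierarchy, such as `good`, receives a lower cost than one that splits them at the root, such as `bad`; minimizing this cost is what makes it usable as an objective for divisive hierarchical clustering.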


research
04/26/2020

Order preserving hierarchical agglomerative clustering of strict posets

We present a method for hierarchical clustering of directed acyclic grap...
research
12/06/2018

An Improved Cost Function for Hierarchical Cluster Trees

Hierarchical clustering has been a popular method in various data analys...
research
11/12/2021

Hierarchical Clustering: New Bounds and Objective

Hierarchical Clustering has been studied and used extensively as a metho...
research
10/16/2015

A cost function for similarity-based hierarchical clustering

The development of algorithms for hierarchical clustering has been hampe...
research
12/13/2022

A (Slightly) Improved Deterministic Approximation Algorithm for Metric TSP

We show that the max entropy algorithm can be derandomized (with respect...
research
05/14/2018

Algorithms and Complexity of Range Clustering

We introduce a novel criterion in clustering that seeks clusters with li...
research
02/09/2020

Bi-objective Optimization of Biclustering with Binary Data

Clustering consists of partitioning data objects into subsets called clu...
