Hierarchical Clustering better than Average-Linkage

08/07/2018
by Moses Charikar, et al.

Hierarchical Clustering (HC) is a widely studied problem in exploratory data analysis, usually tackled by simple agglomerative procedures like average-linkage, single-linkage or complete-linkage. In this paper we focus on two objectives, introduced recently to give insight into the performance of average-linkage clustering: a similarity-based HC objective proposed by [Moseley and Wang, 2017] and a dissimilarity-based HC objective proposed by [Cohen-Addad et al., 2018]. In both cases, we present tight counterexamples showing that, in the worst case, average-linkage cannot obtain approximations better than 1/3 and 2/3 respectively, settling an open question raised in [Moseley and Wang, 2017]. These ratios match those of a random solution, raising a natural question: can we beat average-linkage for these objectives? We answer this in the affirmative, giving two new algorithms based on semidefinite programming with provably better guarantees.
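The abstract does not spell out the two objectives, so the following is only a minimal sketch under stated assumptions. It assumes the objectives as they are commonly stated: for a binary tree over n points, the similarity-based objective sums w_ij * (n - |leaves under the lowest common ancestor of i and j|) over all pairs, and the dissimilarity-based objective sums w_ij * |leaves under that ancestor|; both are maximized. The function names, the nested-tuple tree encoding, and the toy weight matrix are illustrative choices, not taken from the paper, and the greedy routine is plain similarity-based average-linkage, not the paper's semidefinite programming algorithms.

```python
from itertools import combinations


def leaves(tree):
    """Return the leaf labels under a nested-tuple binary tree."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def hc_objectives(tree, w, n):
    """Return (similarity_value, dissimilarity_value) of a binary tree.

    Every pair (i, j) is separated at exactly one internal node (its lowest
    common ancestor), so we charge the pair at the node where it is split.
    """
    if not isinstance(tree, tuple):
        return 0, 0
    left, right = tree
    size = len(leaves(tree))
    cross = sum(w[i][j] for i in leaves(left) for j in leaves(right))
    sim_l, dis_l = hc_objectives(left, w, n)
    sim_r, dis_r = hc_objectives(right, w, n)
    return sim_l + sim_r + cross * (n - size), dis_l + dis_r + cross * size


def average_linkage(w):
    """Greedy average-linkage on a similarity matrix (merge the most similar pair)."""
    clusters = [(i, [i]) for i in range(len(w))]  # (subtree, leaf list)
    while len(clusters) > 1:
        def avg(pair):
            a, b = clusters[pair[0]][1], clusters[pair[1]][1]
            return sum(w[i][j] for i in a for j in b) / (len(a) * len(b))

        a, b = max(combinations(range(len(clusters)), 2), key=avg)
        (tree_a, leaves_a), (tree_b, leaves_b) = clusters[a], clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(((tree_a, tree_b), leaves_a + leaves_b))
    return clusters[0][0]


# Hypothetical toy input: a symmetric similarity matrix on four points.
w = [
    [0, 3, 1, 1],
    [3, 0, 1, 1],
    [1, 1, 0, 2],
    [1, 1, 2, 0],
]
tree = average_linkage(w)
sim_value, dis_value = hc_objectives(tree, w, len(w))
print(tree, sim_value, dis_value)
```

On a dissimilarity matrix, average-linkage would instead merge the pair with the smallest average dissimilarity; the sketch evaluates both objectives on the same tree only to show how the two formulas differ.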

research
05/24/2018

Hierarchical Clustering with Structural Constraints

Hierarchical clustering is a popular unsupervised data analysis method. ...
research
12/27/2018

Hierarchical Clustering for Euclidean Data

Recent works on Hierarchical Clustering (HC), a well-studied problem in ...
research
12/15/2019

Bisect and Conquer: Hierarchical Clustering via Max-Uncut Bisection

Hierarchical Clustering is an unsupervised data analysis method which ha...
research
01/26/2021

Hierarchical Clustering via Sketches and Hierarchical Correlation Clustering

Recently, Hierarchical Clustering (HC) has been considered through the l...
research
10/04/2020

Inapproximability for Local Correlation Clustering and Dissimilarity Hierarchical Clustering

We present hardness of approximation results for Correlation Clustering ...
research
04/19/2023

The Price of Explainability for Clustering

Given a set of points in d-dimensional space, an explainable clustering ...
research
02/17/2017

Threshold Constraints with Guarantees for Parity Objectives in Markov Decision Processes

The beyond worst-case synthesis problem was introduced recently by Bruyè...
