Bisect and Conquer: Hierarchical Clustering via Max-Uncut Bisection

12/15/2019
by   Sara Ahmadian, et al.

Hierarchical Clustering is an unsupervised data analysis method that has been widely used for decades. Despite its popularity, its analytical foundation remained underdeveloped. To address this, Dasgupta recently introduced an optimization viewpoint of hierarchical clustering with pairwise similarity information, which spurred a line of work that both sheds light on old algorithms (e.g., Average-Linkage) and designs new ones. Here, for the maximization dual of Dasgupta's objective (introduced by Moseley-Wang), we present polynomial-time 0.4246-approximation algorithms that use Max-Uncut Bisection as a subroutine. The previous best worst-case approximation factor in polynomial time was 0.336, improving only slightly over Average-Linkage, which achieves 1/3. Finally, we complement our positive results by providing APX-hardness (even for 0-1 similarities) under the Small Set Expansion hypothesis.
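
The abstract does not spell out the two objectives it refers to. As a rough illustration only, the sketch below assumes their usual formulations: Dasgupta's cost charges each pair (i, j) its similarity times the number of leaves under the pair's lowest common ancestor, and the Moseley-Wang reward credits the same pair with its similarity times the number of leaves outside that subtree. The nested-tuple tree encoding, the function names, and the toy similarities are all hypothetical choices made for this example, not taken from the paper.

```python
def leaves(tree):
    """Return the leaf labels under a nested-tuple binary tree."""
    if not isinstance(tree, tuple):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def hc_objectives(tree, w):
    """Return (dasgupta_cost, moseley_wang_reward) for a binary tree.

    `w` maps frozenset({i, j}) to a nonnegative similarity; missing
    pairs are treated as similarity 0 (an assumption of this sketch).
    """
    n = len(leaves(tree))
    cost = 0.0
    reward = 0.0

    def visit(node):
        nonlocal cost, reward
        if not isinstance(node, tuple):
            return
        left, right = node
        left_leaves, right_leaves = leaves(left), leaves(right)
        size = len(left_leaves) + len(right_leaves)
        # Pairs split at this node have this node as lowest common ancestor.
        for i in left_leaves:
            for j in right_leaves:
                w_ij = w.get(frozenset((i, j)), 0.0)
                cost += w_ij * size          # Dasgupta: minimize
                reward += w_ij * (n - size)  # Moseley-Wang: maximize
        visit(left)
        visit(right)

    visit(tree)
    return cost, reward


# Toy instance: four points with three nonzero similarities.
w = {
    frozenset(("a", "b")): 3.0,
    frozenset(("c", "d")): 2.0,
    frozenset(("a", "c")): 1.0,
}
balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "c"), "b"), "d")

cost_b, reward_b = hc_objectives(balanced, w)
cost_c, reward_c = hc_objectives(caterpillar, w)
total_weight = sum(w.values())
```

Under these definitions the cost and the reward of any tree sum to n times the total similarity, which is why maximizing one is equivalent to minimizing the other at optimality, although approximation guarantees do not transfer between the two.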


research
05/24/2018

Hierarchical Clustering with Structural Constraints

Hierarchical clustering is a popular unsupervised data analysis method. ...
research
08/07/2018

Hierarchical Clustering better than Average-Linkage

Hierarchical Clustering (HC) is a widely studied problem in exploratory ...
research
11/21/2022

Lattice Problems Beyond Polynomial Time

We study the complexity of lattice problems in a world where algorithms,...
research
09/12/2011

Modern hierarchical, agglomerative clustering algorithms

This paper presents algorithms for hierarchical, agglomerative clusterin...
research
12/16/2021

Hierarchical Clustering: O(1)-Approximation for Well-Clustered Graphs

Hierarchical clustering studies a recursive partition of a data set into...
research
12/27/2018

Hierarchical Clustering for Euclidean Data

Recent works on Hierarchical Clustering (HC), a well-studied problem in ...
research
12/27/2021

Faster Algorithms and Constant Lower Bounds for the Worst-Case Expected Error

The study of statistical estimation without distributional assumptions o...
