Merging K-means with hierarchical clustering for identifying general-shaped groups

12/23/2017
by Anna D. Peterson, et al.

Clustering partitions a dataset so that observations placed together in a group are similar to one another but different from those in other groups. Hierarchical and K-means clustering are two common approaches with different strengths and weaknesses. For instance, hierarchical clustering identifies groups in a tree-like structure but is computationally expensive on large datasets, while K-means clustering is efficient but designed to identify homogeneous, spherically shaped clusters. We present a hybrid non-parametric clustering approach that combines the two methods to identify general-shaped clusters and that can be applied to larger datasets. Specifically, we first partition the dataset into spherical groups using K-means. We then merge these groups using hierarchical methods, with a data-driven distance measure serving as the stopping criterion. Our proposal has the potential to reveal groups with general shapes and structure in a dataset. We demonstrate good performance on several simulated and real datasets.
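The two-stage idea in the abstract (partition with K-means, then merge the resulting groups hierarchically) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the data-driven distance measure, so the sketch substitutes a hypothetical one, the smallest point-to-point gap between two groups (single linkage), and a fixed `threshold` as the stopping rule. The helper names `farthest_first`, `kmeans`, and `merge_groups` are illustrative, and the deterministic farthest-first seeding is a simplification chosen only to make the example reproducible.

```python
import math


def farthest_first(points, k):
    """Deterministic seeding (an assumption, not from the paper):
    repeatedly pick the point farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, n_iter=50):
    """Plain Lloyd iterations; returns one group label per point."""
    centers = farthest_first(points, k)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Update step: move each non-empty group's center to its mean.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def merge_groups(points, labels, threshold):
    """Merge K-means groups whose smallest point-to-point gap is below
    `threshold` (a stand-in for the paper's data-driven criterion)."""
    k = max(labels) + 1
    groups = [[p for p, lab in zip(points, labels) if lab == j] for j in range(k)]
    parent = list(range(k))  # union-find forest over group indices

    def find(j):
        while parent[j] != j:
            parent[j] = parent[parent[j]]
            j = parent[j]
        return j

    for a in range(k):
        for b in range(a + 1, k):
            if not groups[a] or not groups[b]:
                continue
            gap = min(math.dist(p, q) for p in groups[a] for q in groups[b])
            if gap < threshold:
                parent[find(a)] = find(b)
    return [find(lab) for lab in labels]


# Toy data: two long, thin, well-separated strips (non-spherical clusters).
line = [i * 0.5 for i in range(21)]
points = [(x, 0.0) for x in line] + [(x, 20.0) for x in line]

labels = kmeans(points, k=6)                          # stage 1: over-partition
merged = merge_groups(points, labels, threshold=1.0)  # stage 2: merge groups
print(len(set(labels)), "K-means groups ->", len(set(merged)), "merged clusters")
```

On this toy data K-means cuts each strip into several compact pieces, and the merge step should join the pieces of each strip back into a single elongated cluster. Merging only the K-means groups, rather than individual observations, keeps the hierarchical stage small, which is the scalability argument the abstract makes.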


research
01/17/2022

Tk-merge: Computationally Efficient Robust Clustering Under General Assumptions

We address general-shaped clustering problems under very weak parametric...
research
02/03/2023

Unsupervised hierarchical clustering using the learning dynamics of RBMs

Datasets in the real world are often complex and to some degree hierarch...
research
04/21/2019

TiK-means: K-means clustering for skewed groups

The K-means algorithm is extended to allow for partitioning of skewed gr...
research
05/22/2017

Improved Clustering with Augmented k-means

Identifying a set of homogeneous clusters in a heterogeneous dataset is ...
research
11/28/2018

A comparison of cluster algorithms as applied to unsupervised surveys

When considering answering important questions with data, unsupervised d...
research
10/10/2019

Detecting organized eCommerce fraud using scalable categorical clustering

Online retail, eCommerce, frequently falls victim to fraud conducted by ...
research
07/11/2014

Biclustering Via Sparse Clustering

In many situations it is desirable to identify clusters that differ with...
