The Three Ensemble Clustering (3EC) Algorithm for Pattern Discovery in Unsupervised Learning

07/08/2021
by Kundu, et al.

This paper presents a multiple-learner algorithm, the 'Three Ensemble Clustering' (3EC) algorithm, that groups unlabeled data into quality clusters as part of unsupervised learning. It offers the flexibility to explore the context of new clusters formed by an ensemble of algorithms, based on internal validation indices. The input data set is considered to be a cluster of clusters, and an anomaly can manifest as a cluster as well. Each partitioned cluster is treated as a new data set and becomes a candidate for finding the best algorithm and number of partition splits, until a predefined stopping criterion is met. The algorithms independently partition the data set into clusters, and the quality of each partitioning is assessed by an ensemble of internal cluster validation indices. The 3EC algorithm presents the validation index scores for each choice of algorithm and number of partitions in what it calls the Tau Grid, and chooses the configuration with the best score. The algorithm owes its name to its two input ensembles (algorithms and internal validation indices) and its output ensemble of final clusters. Quality plays an important role in this clustering approach and also acts as a stopping criterion against further partitioning. It is determined from the quality of the clusters produced by an algorithm and its optimal number of splits, which 3EC derives from the scores of the ensemble of validation indices. The user can configure the stopping criteria by providing quality thresholds for the score range of each validation index and the optimal size of the output cluster. Users can experiment with different sets of stopping criteria and choose the most 'sensible group' of quality clusters.
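
The loop described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the algorithms or indices in the ensembles, so the sketch assumes two simple partitioners (k-means and a basic k-medoids, both with deterministic farthest-first seeding) and a single validation index (the mean silhouette) standing in for the index ensemble. The function names `tau_grid` and `three_ec`, and the thresholds `min_score` and `min_size`, are hypothetical and chosen only for illustration.

```python
import math

def farthest_first(points, k):
    # Deterministic seeding (an assumption of this sketch): start at the
    # first point, then repeatedly add the point farthest from all seeds.
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(math.dist(p, s) for s in seeds)))
    return seeds

def assign(points, centers):
    # Label each point with the index of its nearest center.
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def kmeans(points, k, iters=20):
    centers = farthest_first(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign(points, centers)

def kmedoids(points, k, iters=20):
    medoids = farthest_first(points, k)
    for _ in range(iters):
        labels = assign(points, medoids)
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # new medoid minimizes total distance to its members
                medoids[c] = min(members,
                                 key=lambda m: sum(math.dist(m, q) for q in members))
    return assign(points, medoids)

def silhouette(points, labels):
    # Mean silhouette width; returns -1.0 when there is only one cluster.
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def tau_grid(points, algorithms, ks):
    # One score per (algorithm, number of partitions) configuration.
    return {(name, k): silhouette(points, algo(points, k))
            for name, algo in algorithms.items() for k in ks}

def three_ec(points, algorithms, ks, min_score=0.5, min_size=10):
    # Stop when the cluster is small enough or no configuration is good enough.
    if len(points) <= min_size:
        return [points]
    grid = tau_grid(points, algorithms, ks)
    (name, k), score = max(grid.items(), key=lambda kv: kv[1])
    if score < min_score:
        return [points]
    labels = algorithms[name](points, k)
    final = []
    for c in sorted(set(labels)):  # each part becomes a new data set
        part = [p for p, l in zip(points, labels) if l == c]
        final.extend(three_ec(part, algorithms, ks, min_score, min_size))
    return final

# Toy data: three well-separated 3x3 blobs of nine points each.
offsets = (-0.5, 0.0, 0.5)
data = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 0), (0, 10)]
        for dx in offsets for dy in offsets]
ALGORITHMS = {"kmeans": kmeans, "kmedoids": kmedoids}

grid = tau_grid(data, ALGORITHMS, ks=(2, 3, 4))
best = max(grid, key=grid.get)
final_clusters = three_ec(data, ALGORITHMS, ks=(2, 3, 4))
print(best, len(final_clusters))
```

On this toy data the best-scoring configuration uses three partitions, and each resulting blob falls under the assumed `min_size`, so the recursion stops after one split. Changing `min_score` or `min_size` is the sketch's analogue of experimenting with different stopping criteria.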

research
06/05/2018

A Visual Quality Index for Fuzzy C-Means

Cluster analysis is widely used in the areas of machine learning and dat...
research
02/13/2021

HAWKS: Evolving Challenging Benchmark Sets for Cluster Analysis

Comprehensive benchmarking of clustering algorithms is rendered difficul...
research
11/03/2021

Selecting the number of clusters, clustering models, and algorithms. A unifying approach based on the quadratic discriminant score

Cluster analysis requires many decisions: the clustering method and the ...
research
07/17/2012

Ensemble Clustering with Logic Rules

In this article, the logic rule ensembles approach to supervised learnin...
research
07/24/2019

On the bias of H-scores for comparing biclusters, and how to correct it

In the last two decades several biclustering methods have been developed...
research
04/17/2019

SCE: A manifold regularized set-covering method for data partitioning

Cluster analysis plays a very important role in data analysis. In these ...
research
12/23/2021

Ensemble Method for Cluster Number Determination and Algorithm Selection in Unsupervised Learning

Unsupervised learning, and more specifically clustering, suffers from th...
