A Hybrid Algorithm Based Robust Big Data Clustering for Solving Unhealthy Initialization, Dynamic Centroid Selection and Empty clustering Problems with Analysis

02/21/2020
by Y. A. Joarder, et al.

Big Data is a massive volume of both structured and unstructured data that is too large and too difficult to process using traditional techniques. Clustering algorithms have emerged as a powerful learning tool for accurately analyzing the volume of data produced by modern applications. Clustering in data mining is the grouping of a set of objects based on their characteristics. The main aim of clustering is to classify data into clusters such that objects fall in the same cluster when they are similar in their features. To date, K-MEANS is the most widely used algorithm, applied across a wide range of areas to identify groups in which within-cluster distances are much smaller than between-cluster distances. Our algorithm works with K-MEANS to achieve high-quality clustering of big data. The proposed algorithm, EG K-MEANS (Extended Generation K-MEANS), addresses three main issues of K-MEANS: unhealthy initialization, dynamic centroid selection, and empty clustering. It is designed to prevent these three problems in order to obtain high-quality clustering.
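
To make two of the named problems concrete, the following is a minimal sketch of plain K-MEANS with two common safeguards. It is not the EG K-MEANS algorithm, whose details are not given in this abstract. As assumptions for illustration only, poor initialization is handled with a farthest-point heuristic and empty clusters are repaired by moving the point farthest from its current centroid into the empty cluster. The function names `farthest_point_init` and `kmeans` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k, seed=0):
    """Assumed heuristic: start from a random point, then repeatedly add
    the point whose nearest chosen centroid is farthest away."""
    rng = random.Random(seed)
    centroids = [points[rng.randrange(len(points))]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, iters=50, seed=0):
    """Plain K-MEANS with an empty-cluster repair step (illustrative only)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = farthest_point_init(points, k, seed)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Repair step (assumption): fill an empty cluster with the point
        # that lies farthest from its centroid, taken from a cluster
        # that has more than one member so no new empty cluster appears.
        for j in range(k):
            if j not in labels:
                donors = [i for i in range(len(points))
                          if labels.count(labels[i]) > 1]
                i = max(donors,
                        key=lambda i: sq_dist(points[i], centroids[labels[i]]))
                labels[i] = j
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centroids.append(
                tuple(sum(c) / len(members) for c in zip(*members)))
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels
```

Calling `kmeans(points, k)` on a list of coordinate tuples returns the final centroids and one cluster label per point. The repair step guarantees that every cluster keeps at least one member whenever `k` does not exceed the number of points, which is the property the empty-clustering problem concerns.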

research
05/13/2015

Hybrid data clustering approach using K-Means and Flower Pollination Algorithm

Data clustering is a technique for clustering set of objects into known ...
research
04/14/2022

Big-means: Less is More for K-means Clustering

K-means clustering plays a vital role in data mining. However, its perfo...
research
08/15/2023

Parametric entropy based Cluster Centriod Initialization for k-means clustering of various Image datasets

One of the most employed yet simple algorithm for cluster analysis is th...
research
10/08/2020

Clustering Analysis of Interactive Learning Activities Based on Improved BIRCH Algorithm

Group tendency is a research branch of computer assisted learning. The c...
research
07/09/2018

Using Multi-Core HW/SW Co-design Architecture for Accelerating K-means Clustering Algorithm

The capability of classifying and clustering a desired set of data is an...
research
10/17/2016

High-performance K-means Implementation based on a Simplified Map-Reduce Architecture

The k-means algorithm is one of the most common clustering algorithms an...
research
05/27/2023

Dynamic User Segmentation and Usage Profiling

Usage data of a group of users distributed across a number of categories...
