An Effective Evolutionary Clustering Algorithm: Hepatitis C Case Study

by M. H. Marghny, et al.

Cluster analysis plays an important role in scientific research and commercial applications. The K-means algorithm is a widely used partitioning method for clustering. However, K-means is known to get stuck at suboptimal solutions, depending on the choice of the initial cluster centers. In this article, we propose a technique for handling large-scale data that uses genetic algorithms (GAs) to select the initial cluster centers purposefully, reduces sensitivity to isolated points, avoids splitting large clusters, and to some degree overcomes the data skew caused by disproportionate partitioning under multi-sampling. We applied our method to several public datasets, which show the advantages of the proposed approach; one example is the Hepatitis C dataset taken from the machine learning repository of the University of California. Our aim is to evaluate the hepatitis dataset. To do so, we performed some preprocessing, whose purpose is to summarize the data in the form best suited to our algorithm. Missing values in the instances are filled in using the local mean method.
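The idea of using a genetic algorithm to choose the starting centers for K-means can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no details of the encoding, operators, or sampling scheme. So the chromosome layout (a set of k row indices), the elitist selection, the union-based crossover, the single-gene mutation, and all names and parameters (`ga_initial_centers`, `pop_size`, `generations`, `mutation_rate`) are assumptions made for illustration only.

```python
import numpy as np

def sse(X, centers):
    # Sum of squared distances from each point to its nearest center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def ga_initial_centers(X, k, pop_size=20, generations=30,
                       mutation_rate=0.2, seed=0):
    # Hypothetical GA: each chromosome is k distinct row indices of X
    # that serve as candidate initial centers; fitness is the SSE.
    rng = np.random.default_rng(seed)
    n = len(X)
    pop = [rng.choice(n, size=k, replace=False) for _ in range(pop_size)]
    for _ in range(generations):
        # Elitist selection: keep the better half of the population.
        pop.sort(key=lambda idx: sse(X, X[idx]))
        survivors = pop[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.choice(len(survivors), size=2, replace=False)
            # Crossover: draw k distinct genes from the parents' union.
            pool = np.union1d(survivors[a], survivors[b])
            child = rng.choice(pool, size=k, replace=False)
            if rng.random() < mutation_rate:
                # Mutation: swap one gene for an index not yet in the child.
                unused = np.setdiff1d(np.arange(n), child)
                child[rng.integers(k)] = rng.choice(unused)
            children.append(child)
        pop = survivors + children
    best = min(pop, key=lambda idx: sse(X, X[idx]))
    return X[best].copy()

def kmeans(X, centers, iters=50):
    # Standard Lloyd iterations starting from the supplied centers.
    centers = centers.copy()
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

Calling `kmeans(X, ga_initial_centers(X, k))` on a floating-point array `X` with no missing values runs K-means from the GA-selected seeds. Because the GA only scores candidate seeds by their SSE, it lowers the chance of a poor start but does not guarantee a global optimum.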




