
An Analysis of Gene Expression Data using Penalized Fuzzy C-Means Approach

by P. K. Nizar Banu, et al.

With the rapid advance of microarray technologies, large amounts of high-dimensional gene expression data are being generated, posing significant computational challenges. A first step towards addressing these challenges is clustering, an essential part of the data mining process that reveals natural structures and identifies interesting patterns in the underlying data. This paper proposes a robust gene expression clustering approach that minimizes undesirable clustering. The Penalized Fuzzy C-Means (PFCM) clustering algorithm is described and compared with the most representative off-line clustering techniques: K-Means, Rough K-Means and Fuzzy C-Means clustering. These techniques are implemented and tested on a Brain Tumor gene expression dataset, and the performance of the proposed approach is analyzed through qualitative validation experiments. The experimental results indicate that the Penalized Fuzzy C-Means algorithm is considerably more usable than the other clustering algorithms in the comparison study. Significant and promising clustering results are presented for the Brain Tumor gene expression dataset. Patterns seen in genome-wide expression experiments can thus be interpreted as indications of the status of cellular processes. These clustering results suggest that the Penalized Fuzzy C-Means algorithm provides useful information as an aid to diagnosis in oncology.
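The abstract does not give the PFCM update equations, so the following is only a minimal sketch of what a penalized fuzzy c-means loop might look like. It assumes one common formulation in which a penalty term `-w * ln(alpha_i)` is added to the squared distance, where `alpha_i` is the fuzzy proportion of cluster `i`; setting `w = 0` reduces it to standard Fuzzy C-Means. The function name `penalized_fcm` and all parameter names are hypothetical, and this is not claimed to be the authors' implementation.

```python
import numpy as np


def penalized_fcm(X, c, m=2.0, w=0.0, n_iter=100, tol=1e-6, seed=0):
    """Sketch of fuzzy c-means with an optional penalty term.

    X: (n_samples, n_features) array, e.g. expression profiles.
    c: number of clusters. m: fuzzifier (> 1).
    w: penalty weight (assumed form); w = 0 gives plain FCM.
    Returns (centers, U) where each row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        Um = U ** m

        # Cluster centers: membership-weighted means of the samples.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]

        # Fuzzy cluster proportions used by the assumed penalty term.
        alpha = np.maximum(Um.sum(axis=0) / Um.sum(), 1e-12)

        # Squared Euclidean distance from every sample to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

        # Penalized cost; ln(alpha) <= 0, so the penalty is non-negative.
        cost = np.maximum(d2 - w * np.log(alpha)[None, :], 1e-12)

        # Membership update: inverse cost raised to 1 / (m - 1), normalized.
        inv = cost ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U
```

Calling `penalized_fcm(X, c=2, w=0.0)` and then `penalized_fcm(X, c=2, w=1.0)` on the same matrix gives a quick way to see how the assumed penalty shifts memberships relative to plain Fuzzy C-Means; hard labels can be read off with `U.argmax(axis=1)`.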


Interval Type-2 Enhanced Possibilistic Fuzzy C-Means Clustering for Gene Expression Data Analysis

Both FCM and PCM clustering methods have been widely applied to pattern ...

A New Gene Selection Algorithm using Fuzzy-Rough Set Theory for Tumor Classification

In statistics and machine learning, feature selection is the process of ...

EXCLUVIS: A MATLAB GUI Software for Comparative Study of Clustering and Visualization of Gene Expression Data

Clustering is a popular data mining technique that aims to partition an ...

Feasibility of Haralick's Texture Features for the Classification of Chromogenic In-situ Hybridization Images

This paper presents a proof of concept for the usefulness of second-orde...

Modelling-based experiment retrieval: A case study with gene expression clustering

Motivation: Public and private repositories of experimental data are gro...

Solution Path Clustering with Adaptive Concave Penalty

Fast accumulation of large amounts of complex data has created a need fo...