Solving clustering as ill-posed problem: experiments with K-Means algorithm

by Alberto Arturo Vergani, et al.

In this contribution, the clustering procedure based on the K-Means algorithm is studied as an inverse problem, which is a special case of ill-posed problems. Attempts to improve the quality of the clustering inverse problem lead to reducing the input data via Principal Component Analysis (PCA). Since a theorem by Ding and He links the cardinality of the optimal clusters found with K-Means to the cardinality of the selected informative PCA components, the computational experiments tested the theorem with two quantitative feature selection methods: the Kaiser criterion (based on imperative decision) versus the Wishart criterion (based on random matrix theory). The results suggest that PCA reduction with feature selection by the Wishart criterion leads to a low matrix condition number and satisfies the relation between clusters and components predicted by the theorem. The data used for the computations come from a neuroscientific repository: they concern healthy young subjects who performed a task-oriented functional Magnetic Resonance Imaging (fMRI) paradigm.
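The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline, and it rests on several assumptions: the data are synthetic Gaussian blobs rather than fMRI recordings; the Kaiser criterion is taken in its usual form (keep correlation-matrix eigenvalues above 1); the Wishart criterion is approximated by the Marchenko-Pastur upper edge (1 + sqrt(p/n))^2, which may differ from the rule used in the paper; and the Ding and He relation is taken in its commonly stated form, where K clusters correspond to K - 1 principal components. All function names are hypothetical.

```python
import numpy as np

def correlation_pca(X):
    """Standardize X (n samples x p features) and eigendecompose its correlation matrix."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = (Z.T @ Z) / (X.shape[0] - 1)
    evals, evecs = np.linalg.eigh(R)       # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1]        # reorder to descending
    return Z, evals[order], evecs[:, order]

def kaiser_count(evals):
    """Kaiser criterion: keep components whose eigenvalue exceeds 1."""
    return int(np.sum(evals > 1.0))

def wishart_count(evals, n, p):
    """Assumed stand-in for the Wishart criterion: keep eigenvalues above
    the Marchenko-Pastur upper edge for pure-noise correlation matrices."""
    edge = (1.0 + np.sqrt(p / n)) ** 2
    return int(np.sum(evals > edge))

# Synthetic data (an assumption, not the paper's data): 3 clusters in 20 dimensions.
rng = np.random.default_rng(0)
n, p, k_true = 300, 20, 3
centers = rng.normal(scale=4.0, size=(k_true, p))
labels = rng.integers(0, k_true, size=n)
X = centers[labels] + rng.normal(size=(n, p))

Z, evals, evecs = correlation_pca(X)
m_kaiser = kaiser_count(evals)
m_wishart = wishart_count(evals, n, p)

# Project onto the retained components and compare condition numbers.
# Both counts are at least 1 for this strongly clustered data set.
cond_kaiser = np.linalg.cond(Z @ evecs[:, :m_kaiser])
cond_wishart = np.linalg.cond(Z @ evecs[:, :m_wishart])

# Cluster count suggested by the K - 1 relation, under the assumptions above.
k_from_kaiser = m_kaiser + 1
k_from_wishart = m_wishart + 1

print("Kaiser: ", m_kaiser, "components, cond =", cond_kaiser, ", K =", k_from_kaiser)
print("Wishart:", m_wishart, "components, cond =", cond_wishart, ", K =", k_from_wishart)
```

Because the Marchenko-Pastur edge is always above 1, this stand-in never retains more components than the Kaiser rule, so the projected data matrix cannot be worse conditioned. This is consistent with the direction of the reported result, but it does not reproduce it.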


Principal Component Analysis versus Factor Analysis

The article discusses selected problems related to both principal compon...

Group Linear non-Gaussian Component Analysis with Applications to Neuroimaging

Independent component analysis (ICA) is an unsupervised learning method ...

Feature selection or extraction decision process for clustering using PCA and FRSD

This paper concerns the critical decision process of extracting or selec...

Linear Optimal Low Rank Projection for High-Dimensional Multi-Class Data

Classification of individual samples into one or more categories is crit...

PCA-aided Fully Convolutional Networks for Semantic Segmentation of Multi-channel fMRI

Semantic segmentation of functional magnetic resonance imaging (fMRI) ma...

Phase Transitions for High Dimensional Clustering and Related Problems

Consider a two-class clustering problem where we observe X_i = ℓ_i μ + Z...