On the clustering of correlated random variables
This work examines the clustering of correlated random variables, based both on their mutual similarity and on their similarity to the principal components. The k-means algorithm and spectral algorithms were used for clustering. For the spectral methods, two similarity matrices were considered: the matrix of a relation established on the level of correlation, and the matrix of coefficients of determination. For four different data sets, different ways of measuring the dissimilarity of variables were analyzed, along with the impact of the diversity of initial points on the efficiency of the k-means algorithm.
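As a rough illustration of the ideas summarized above, the following sketch clusters the columns of a data matrix in two ways: plain Lloyd k-means with several random restarts (to reduce sensitivity to the initial points), and a simple spectral bipartition that uses the coefficients of determination as the similarity matrix. Everything here is an assumption made for illustration and is not taken from the paper: the synthetic two-factor data, the function names `correlation_dissimilarity`, `kmeans_variables`, and `spectral_bipartition`, the choice of 1 - r^2 as the dissimilarity, and the use of the unnormalized graph Laplacian.

```python
import numpy as np


def correlation_dissimilarity(X):
    """Return (r^2 similarity, 1 - r^2 dissimilarity) between the columns of X."""
    r = np.corrcoef(X, rowvar=False)
    r2 = r ** 2
    return r2, 1.0 - r2


def kmeans_variables(X, k, n_init=10, n_iter=100, seed=0):
    """Lloyd k-means on variables (columns of X), keeping the best of n_init restarts.

    Each variable is standardized and scaled to unit norm, so the squared
    Euclidean distance between two variables equals 2 * (1 - r).
    """
    rng = np.random.default_rng(seed)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    P = Z.T / np.sqrt(X.shape[0])  # one unit-norm row per variable
    best_labels, best_inertia = None, np.inf
    for _ in range(n_init):
        # Initial centers: k distinct variables chosen at random.
        centers = P[rng.choice(len(P), size=k, replace=False)]
        for _ in range(n_iter):
            d = ((P[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = d.argmin(axis=1)
            new_centers = np.array([
                P[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        d = ((P[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        inertia = d[np.arange(len(P)), labels].sum()
        if inertia < best_inertia:
            best_labels, best_inertia = labels, inertia
    return best_labels, best_inertia


def spectral_bipartition(similarity):
    """Split variables into two groups by the sign of the Fiedler vector."""
    W = similarity.copy()
    np.fill_diagonal(W, 0.0)            # no self-loops
    L = np.diag(W.sum(axis=1)) - W      # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    return (vecs[:, 1] > 0).astype(int)


# Synthetic data (hypothetical): two latent factors, three noisy variables each.
rng = np.random.default_rng(42)
n = 500
f1, f2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack(
    [f1 + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [f2 + 0.3 * rng.normal(size=n) for _ in range(3)]
)

r2, diss = correlation_dissimilarity(X)
labels, inertia = kmeans_variables(X, k=2)
spec = spectral_bipartition(r2)
print("k-means labels: ", labels)
print("spectral labels:", spec)
```

Comparing the inertia obtained with `n_init=1` under different seeds against the value from several restarts gives a simple way to see how much the result depends on the initial points. Note that 1 - r^2 treats strongly negatively correlated variables as similar, whereas the unit-norm representation used by the k-means step does not; which behavior is preferable depends on the application.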