Classifying variable-structures: a general framework
In this work, we unify recent variable-clustering techniques within a common geometric framework that extends clustering to variable-structures, i.e. subsets of variables within which the links between variables are taken into account in a specified way. All variables being measured on the same n statistical units, we first represent every variable-structure by a unit-norm operator in R^{n×n}. We then consider either the Euclidean chord distance or the geodesic distance on the unit sphere of R^{n×n}. Next, we introduce the notion of a rank-H average of such operators, defined as the rank-H solution of a compound distance-minimisation program. Finally, we propose a K-means-type algorithm that uses the rank-H average as centroid to perform variable-structure clustering. The method is tested on simulated data and applied to wine data.
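The geometric ingredients named above can be illustrated with a minimal NumPy sketch. It rests on assumptions that the abstract does not confirm: each variable-structure is represented here by the orthogonal projector onto the span of its centred variables, scaled to unit Frobenius norm, and the rank-H average is taken for the squared chord distance, in which case it reduces to a normalised truncated eigendecomposition of the mean operator. The function names (`unit_operator`, `chord_distance`, `geodesic_distance`, `rank_h_average`) are hypothetical, and the paper's own operators and minimisation program may differ.

```python
import numpy as np

def unit_operator(X):
    """Assumed representation: the projector onto the span of the centred
    columns of X (n x p), scaled to unit Frobenius norm."""
    X = X - X.mean(axis=0)            # centre each variable
    Q, _ = np.linalg.qr(X)            # orthonormal basis (assumes full column rank)
    P = Q @ Q.T                       # n x n orthogonal projector
    return P / np.linalg.norm(P)      # Frobenius norm is the default for 2-D arrays

def chord_distance(A, B):
    """Euclidean (Frobenius) distance between two unit-norm operators."""
    return np.linalg.norm(A - B)

def geodesic_distance(A, B):
    """Arc length on the unit sphere: the angle between A and B."""
    return np.arccos(np.clip(np.sum(A * B), -1.0, 1.0))

def rank_h_average(ops, H):
    """Unit-norm operator of rank at most H minimising the sum of squared
    chord distances to `ops` (an assumption about the averaging criterion)."""
    M = np.mean(ops, axis=0)
    M = (M + M.T) / 2                 # symmetrise against round-off
    vals, vecs = np.linalg.eigh(M)
    top = np.argsort(vals)[::-1][:H]  # indices of the H largest eigenvalues
    A = (vecs[:, top] * vals[top]) @ vecs[:, top].T
    return A / np.linalg.norm(A)
```

With these pieces, a K-means-type loop would alternate between assigning each operator to the nearest centroid under one of the two distances and recomputing each centroid with `rank_h_average`. For unit-norm operators the two distances are related by chord = 2 sin(geodesic / 2), so they order neighbours identically and differ only in how they weight large separations.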