On the Discrepancy Between Kleinberg's Clustering Axioms and k-Means Clustering Algorithm Behavior
This paper investigates the validity of Kleinberg's axioms for clustering functions with respect to the popular k-means clustering algorithm. While Kleinberg's axioms have been discussed extensively in the past, we concentrate here on the case most relevant to k-means, namely its behavior in Euclidean space. We point out some contradictions and counterintuitive aspects of this axiom set within R^m that have evidently not been discussed so far. Our results suggest that, without clearly defining what kind of clusters we expect, we will not be able to construct a valid axiomatic system. In particular, we look at the shape of the clusters and the gaps between them. Finally, we demonstrate that there exist several ways to reconcile the formulation of the axioms with their intended meaning, and that under this reformulation the axioms cease to be contradictory and the real-world k-means algorithm conforms to the axiomatic system.