Towards Continuous Consistency Axiom

02/12/2022
by Mieczysław A. Kłopotek, et al.

Developing new machine learning algorithms, especially clustering algorithms, comparing such algorithms, and testing them according to software engineering principles all require labeled data sets. Standard benchmarks are available, but a broader range of such data sets is needed to avoid overfitting to them. In this context, theoretical work on the axiomatization of clustering algorithms, especially axioms on clustering-preserving transformations, offers a cheap way to produce new labeled data sets from existing ones. However, as we show in this paper, the frequently cited axiomatic system of Kleinberg (2002) is not applicable to finite-dimensional Euclidean spaces, in which many algorithms, such as k-means, operate. In particular, the so-called outer-consistency axiom fails when small changes are made to datapoint positions, and the inner-consistency axiom is, in general settings, valid only for the identity transformation. We therefore propose an alternative axiomatic system in which Kleinberg's inner-consistency axiom is replaced by a centric consistency axiom and the outer-consistency axiom is replaced by a motion consistency axiom. We demonstrate that the new system is satisfiable for a hierarchical version of k-means with auto-adjusted k, hence it is not contradictory. Additionally, since k-means creates only convex clusters, we demonstrate that a version detecting concave clusters can be constructed that still satisfies the axiomatic system. A practical application of such an axiomatic system may be the generation of new labeled test data from existing data for testing clustering algorithms.
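To illustrate how a clustering-preserving transformation could turn one labeled data set into another, here is a minimal sketch. The abstract does not define centric consistency, so the sketch assumes one plausible reading: every point of a chosen cluster is moved towards that cluster's own centroid by a factor `lam` with 0 < `lam` <= 1, while the labels stay the same. The function names `centroid` and `shrink_towards_centroid` and the toy data are hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def centroid(points: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))


def shrink_towards_centroid(
    points: Sequence[Point], labels: Sequence[int], target: int, lam: float
) -> List[Point]:
    """Assumed centric-style transformation (not the paper's formal definition):
    each point x of cluster `target` becomes c + lam * (x - c), where c is the
    centroid of that cluster. All other points are returned unchanged."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    members = [p for p, lab in zip(points, labels) if lab == target]
    if not members:
        raise ValueError("no point carries the target label")
    c = centroid(members)
    moved = []
    for p, lab in zip(points, labels):
        if lab == target:
            moved.append(tuple(ci + lam * (pi - ci) for pi, ci in zip(p, c)))
        else:
            moved.append(tuple(p))
    return moved


# Toy labeled data set: two well-separated clusters on a line.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
labels = [0, 0, 1, 1]

# Contract cluster 0 halfway towards its centroid; the labels are reused as-is.
new_points = shrink_towards_centroid(points, labels, target=0, lam=0.5)
```

Under this assumption the centroid of the contracted cluster does not move, so the original labels can be reused as the expected output when testing a centroid-based algorithm on `new_points`. Whether a given algorithm actually reproduces those labels is what such a test would check.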

