k-sums: another side of k-means

05/19/2020
by Wan-Lei Zhao, et al.

In this paper, the decades-old clustering method k-means is revisited. The original distortion minimization model of k-means is addressed by a purely stochastic minimization procedure. In each step of the iteration, one sample is tentatively reallocated from its current cluster to another. The move is accepted whenever it brings the sample closer to the centroid of the new cluster. This optimization procedure converges faster, and to a better local minimum, than k-means and many of its variants. This fundamental modification to the k-means loop leads to the redefinition of a family of k-means variants. Moreover, a new target function that minimizes the sum of pairwise distances within clusters is presented. We show that it can be solved under the same stochastic optimization procedure. Built upon these two minimization models, the procedure considerably outperforms k-means and its variants under different settings and on different datasets.
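The reallocation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`stochastic_reallocation`, `distortion`), the balanced random initialization, the choice of the nearest centroid as the candidate cluster, the squared Euclidean distance, and the stopping rule are all assumptions made for illustration. The full text may define the acceptance test and the candidate selection differently.

```python
import numpy as np

def stochastic_reallocation(X, k, n_epochs=10, seed=0):
    """Hypothetical sketch: move one sample at a time to a closer centroid."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Assumption: balanced random initial partition (requires n >= k).
    labels = rng.permutation(n) % k
    sums = np.zeros((k, X.shape[1]))
    np.add.at(sums, labels, X)              # per-cluster coordinate sums
    counts = np.bincount(labels, minlength=k)

    for _ in range(n_epochs):
        moved = 0
        for i in rng.permutation(n):        # visit samples in random order
            u = labels[i]
            if counts[u] <= 1:              # never empty a cluster
                continue
            centroids = sums / counts[:, None]
            d = ((centroids - X[i]) ** 2).sum(axis=1)
            v = int(np.argmin(d))           # assumption: candidate = nearest centroid
            if v != u and d[v] < d[u]:
                # Accept the move and update both clusters immediately.
                sums[u] -= X[i]
                counts[u] -= 1
                sums[v] += X[i]
                counts[v] += 1
                labels[i] = v
                moved += 1
        if moved == 0:                      # no sample changed cluster
            break
    return labels, sums / counts[:, None]

def distortion(X, labels, centroids):
    """Sum of squared distances from each sample to its cluster centroid."""
    return float(((X - centroids[labels]) ** 2).sum())
```

Unlike the batch update of classic k-means, each accepted move here changes the two affected centroids before the next sample is examined. The second objective mentioned in the abstract, the sum of pairwise distances within clusters, is not covered by this sketch.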


