Fast Kernel k-means Clustering Using Incomplete Cholesky Factorization

02/07/2020
by   Li Chen, et al.

Kernel-based clustering algorithms can identify and capture non-linear structure in datasets, and can thereby achieve better performance than linear clustering. However, computing and storing the entire kernel matrix requires so much memory that kernel-based clustering struggles with large-scale datasets. In this paper, we employ incomplete Cholesky factorization to accelerate kernel clustering and save memory. The key idea of the proposed kernel k-means clustering using incomplete Cholesky factorization is to approximate the entire kernel matrix by the product of a low-rank matrix and its transpose. Linear k-means clustering is then applied to the columns of the transpose of the low-rank matrix. We show both analytically and empirically that the performance of the proposed algorithm is similar to that of the kernel k-means clustering algorithm, while our method can handle large-scale datasets.
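
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an RBF kernel, a greedy largest-diagonal pivoting rule, and a plain Lloyd iteration, and the names `rbf_kernel_column`, `incomplete_cholesky`, `lloyd_kmeans`, `gamma`, `max_rank`, and `tol` are hypothetical choices made for illustration. The sketch builds a low-rank factor G with K ≈ G Gᵀ one kernel column at a time, so the full kernel matrix is never formed, and then runs linear k-means on the rows of G (equivalently, the columns of Gᵀ).

```python
import numpy as np


def rbf_kernel_column(X, i, gamma):
    """Column i of the RBF kernel matrix, computed on demand (assumed kernel)."""
    diff = X - X[i]
    return np.exp(-gamma * np.sum(diff * diff, axis=1))


def incomplete_cholesky(X, gamma, max_rank, tol=1e-6):
    """Pivoted incomplete Cholesky: returns G (n x r) with K ~= G @ G.T."""
    n = X.shape[0]
    G = np.zeros((n, max_rank))
    d = np.ones(n)  # residual diagonal; the RBF kernel has k(x, x) = 1
    for j in range(max_rank):
        p = int(np.argmax(d))  # greedy pivot: largest residual diagonal entry
        if d[p] <= tol:
            return G[:, :j]  # residual is small enough, stop early
        col = rbf_kernel_column(X, p, gamma)
        # Subtract the part of the column already explained by earlier factors.
        G[:, j] = (col - G[:, :j] @ G[p, :j]) / np.sqrt(d[p])
        d -= G[:, j] ** 2
        np.maximum(d, 0.0, out=d)  # clip small negatives caused by rounding
    return G


def lloyd_kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd k-means on the rows of Z; returns one label per row."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(Z.shape[0], size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            Z[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```

Calling `incomplete_cholesky(X, gamma, max_rank)` followed by `lloyd_kmeans(G, k)` gives approximate kernel k-means labels. In this sketch the memory cost is that of the n × r factor rather than the n × n kernel matrix, which is the saving the abstract describes.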

research
06/09/2017

Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds

Kernel k-means clustering can correctly identify and extract a far more ...
research
01/19/2017

On the Existence of Kernel Function for Kernel-Trick of k-Means

This paper corrects the proof of the Theorem 2 from the Gower's paper [p...
research
12/23/2022

Using MM principles to deal with incomplete data in K-means clustering

Among many clustering algorithms, the K-means clustering algorithm is wi...
research
08/23/2019

QuicK-means: Acceleration of K-means by learning a fast transform

K-means -- and the celebrated Lloyd algorithm -- is more than the cluste...
research
08/26/2016

A Randomized Approach to Efficient Kernel Clustering

Kernel-based K-means clustering has gained popularity due to its simplic...
research
02/07/2017

Sparse Algorithm for Robust LSSVM in Primal Space

As enjoying the closed form solution, least squares support vector machi...
research
06/06/2016

On Robustness of Kernel Clustering

Clustering is one of the most important unsupervised problems in machine...
