A Randomized Approach to Efficient Kernel Clustering

Kernel-based K-means clustering has gained popularity due to its simplicity and the power of its implicit non-linear representation of the data. A dominant concern is the memory requirement, which scales as the square of the number of data points. We provide a new analysis of a class of approximate kernel methods that have more modest memory requirements, and propose a specific one-pass randomized kernel approximation followed by standard K-means on the transformed data. The analysis and experiments suggest that the method is accurate while requiring drastically less memory than standard kernel K-means and significantly less memory than Nyström-based approximations.
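To make the pipeline in the abstract concrete, the following is a minimal sketch of the general idea: approximate the kernel matrix in a single pass, then run standard K-means on the resulting low-dimensional features. It is not the authors' algorithm, whose details the abstract does not give. As a stand-in it assumes a Gaussian (RBF) kernel and a Gaussian-sketch approximation of the form K ≈ Y (ΩᵀY)⁺ Yᵀ with Y = KΩ. The names `rbf_block`, `one_pass_features`, `kmeans`, and the parameters `gamma`, `rank`, and `block` are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_block(X_rows, X_all, gamma):
    # Gaussian kernel between a block of rows and the full data set (assumed kernel).
    sq = (np.sum(X_rows**2, axis=1)[:, None]
          + np.sum(X_all**2, axis=1)[None, :]
          - 2.0 * X_rows @ X_all.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def one_pass_features(X, rank, gamma=1.0, block=256, seed=0):
    # Build Y = K @ omega one row block at a time, so each kernel entry is
    # computed once and the full n-by-n matrix K is never stored.
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    omega = rng.standard_normal((n, rank))
    Y = np.empty((n, rank))
    for start in range(0, n, block):
        stop = min(start + block, n)
        Y[start:stop] = rbf_block(X[start:stop], X, gamma) @ omega
    # core = omega^T K omega is a small rank-by-rank symmetric PSD matrix.
    core = omega.T @ Y
    core = (core + core.T) / 2.0
    w, V = np.linalg.eigh(core)
    keep = w > w.max() * 1e-10  # drop numerically null directions
    # K ~= Y core^+ Y^T = Z Z^T with Z = Y V diag(w^{-1/2}).
    return Y @ (V[:, keep] / np.sqrt(w[keep]))

def kmeans(Z, k, iters=50, seed=0):
    # Plain Lloyd iterations on the transformed data.
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(iters):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

# Usage on two synthetic blobs (illustrative data only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (30, 2)), rng.normal(5, 0.1, (30, 2))])
Z = one_pass_features(X, rank=10, gamma=0.5, block=16)
labels = kmeans(Z, 2)
```

In this sketch the working memory is on the order of n × (block + rank) numbers rather than n², which is the kind of saving the abstract refers to; the accuracy of any particular sketch depends on the kernel spectrum and the chosen rank.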

research
08/07/2017

Multiresolution Kernel Approximation for Gaussian Process Regression

Gaussian process regression generally does not scale to beyond a few tho...
research
10/09/2017

Distributed Kernel K-Means for Large Scale Clustering

Clustering samples according to an effective metric and/or vector space ...
research
02/07/2020

Fast Kernel k-means Clustering Using Incomplete Cholesky Factorization

Kernel-based clustering algorithm can identify and capture the non-linea...
research
06/09/2017

Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds

Kernel k-means clustering can correctly identify and extract a far more ...
research
06/06/2016

On Robustness of Kernel Clustering

Clustering is one of the most important unsupervised problems in machine...
research
05/16/2017

Kernel clustering: density biases and solutions

Kernel methods are popular in clustering due to their generality and dis...
research
09/12/2017

PQk-means: Billion-scale Clustering for Product-quantized Codes

Data clustering is a fundamental operation in data analysis. For handlin...
