Monte Carlo approximation certificates for k-means clustering

10/03/2017
by Dustin G. Mixon, et al.

Efficient algorithms for k-means clustering frequently converge to suboptimal partitions, and given a partition, it is difficult to detect k-means optimality. In this paper, we develop an a posteriori certifier of approximate optimality for k-means clustering. The certifier is a sub-linear Monte Carlo algorithm based on Peng and Wei's semidefinite relaxation of k-means. In particular, solving the relaxation for small random samples of the dataset produces a high-confidence lower bound on the k-means objective, and being sub-linear, our algorithm is faster than k-means++ when the number of data points is large. We illustrate the performance of our algorithm with both numerical experiments and a performance guarantee: If the data points are drawn independently from any mixture of two Gaussians over R^m with identity covariance, then with probability 1-O(1/m), our poly(m)-time algorithm produces a 3-approximation certificate with 99
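The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: instead of solving the Peng-Wei semidefinite relaxation on each sample, it assumes the samples are tiny enough to solve k-means exactly by brute force, which plays the same role of bounding each sample's optimum from below. The function names (`kmeans_value`, `exact_kmeans_value`, `monte_carlo_lower_bound`) and all parameter defaults are hypothetical. The averaged value is a lower bound on the normalized k-means optimum only in expectation; the high-confidence guarantee described above needs a concentration argument that is not reproduced here.

```python
import itertools
import random


def kmeans_value(points, labels, k):
    """Normalized k-means objective: mean squared distance to the cluster centroid."""
    dim = len(points[0])
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        centroid = [sum(p[j] for p in members) / len(members) for j in range(dim)]
        total += sum((p[j] - centroid[j]) ** 2 for p in members for j in range(dim))
    return total / len(points)


def exact_kmeans_value(points, k):
    """Brute-force optimum over labelings; feasible only for tiny samples.

    Stand-in (an assumption of this sketch) for a relaxation-based lower bound.
    """
    n = len(points)
    # Fix the first point's label to 0 to skip some symmetric labelings.
    return min(
        kmeans_value(points, (0,) + rest, k)
        for rest in itertools.product(range(k), repeat=n - 1)
    )


def monte_carlo_lower_bound(points, k, sample_size=8, trials=30, seed=0):
    """Average the exact optimum over small uniform samples drawn without replacement."""
    rng = random.Random(seed)
    values = [
        exact_kmeans_value(rng.sample(points, sample_size), k)
        for _ in range(trials)
    ]
    return sum(values) / trials


if __name__ == "__main__":
    rng = random.Random(1)
    # Two unit-variance Gaussians in R^2 centred at (-3, 0) and (3, 0).
    data = [
        (rng.gauss(cx, 1.0), rng.gauss(0.0, 1.0))
        for cx in (-3.0, 3.0)
        for _ in range(200)
    ]
    labels = [0] * 200 + [1] * 200  # the planted partition
    value = kmeans_value(data, labels, 2)
    bound = monte_carlo_lower_bound(data, 2)
    print(f"partition value {value:.3f}, estimated lower bound {bound:.3f}")
```

Dividing the partition's value by the estimated lower bound gives an approximation ratio for that partition. The bound holds in expectation because restricting an optimal partition of the full dataset to a uniform sample yields a feasible clustering of the sample whose expected normalized cost is at most the full optimum. The cost per sample depends only on the sample size, not on the number of data points, which is where the sub-linear running time comes from.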
