Noisy k-means++ Revisited

07/25/2023
by Christoph Grunau, et al.

The k-means++ algorithm by Arthur and Vassilvitskii [SODA 2007] is a classical and time-tested algorithm for the k-means problem. Besides being very practical, the algorithm has good theoretical guarantees: its solution is O(log k)-approximate in expectation. In recent work, Bhattacharya, Eube, Röglin, and Schmidt [ESA 2020] considered the following question: does the algorithm retain its guarantees if we allow slight adversarial noise in the sampling probability distributions it uses? One motivation is that computations with real numbers in k-means++ implementations are inexact. Surprisingly, the analysis in this setting becomes substantially more difficult, and the authors were able to prove only a weaker approximation guarantee of O(log^2 k). In this paper, we close the gap by proving a tight O(log k)-approximation guarantee for the k-means++ algorithm with noise.
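To make the setting concrete, here is a minimal sketch of k-means++ seeding in which the sampling distribution is perturbed before each draw. It is an illustration under stated assumptions, not the construction analyzed in the paper: the function name `noisy_kmeanspp`, the parameter `eps`, and the choice to scale each D² probability by a factor in [1 - eps, 1 + eps] are assumptions made here, and the perturbation is random, whereas the paper considers an adversary who may pick the worst one.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def noisy_kmeanspp(points, k, eps=0.1, rng=None):
    """Pick k centers by D^2 sampling with perturbed probabilities.

    Hypothetical helper for illustration. Random noise stands in for
    the adversarial noise discussed in the abstract.
    """
    rng = rng or random.Random()
    # First center: uniformly at random, as in standard k-means++.
    centers = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]

    while len(centers) < k:
        total = sum(d2)
        if total == 0:
            # Every point coincides with a center; fall back to uniform.
            centers.append(rng.choice(points))
            continue
        # Exact D^2 distribution used by noiseless k-means++.
        probs = [d / total for d in d2]
        # Assumed noise model: scale each probability by a factor in
        # [1 - eps, 1 + eps], then renormalize. After renormalizing, the
        # effective factor can fall slightly outside that interval.
        noisy = [p * rng.uniform(1 - eps, 1 + eps) for p in probs]
        norm = sum(noisy)
        noisy = [w / norm for w in noisy]
        idx = rng.choices(range(len(points)), weights=noisy, k=1)[0]
        centers.append(points[idx])
        # Update nearest-center distances with the new center.
        d2 = [min(d, sq_dist(p, points[idx])) for d, p in zip(d2, points)]
    return centers
```

Setting `eps=0` recovers ordinary D² sampling, which makes it easy to compare the cost of seeded solutions with and without perturbation on the same data.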

research
07/16/2022

A Nearly Tight Analysis of Greedy k-means++

The famous k-means++ algorithm of Arthur and Vassilvitskii [SODA 2007] i...
research
02/18/2020

k-means++: few more steps yield constant approximation

The k-means++ algorithm of Arthur and Vassilvitskii (SODA 2007) is a sta...
research
07/02/2020

Adapting k-means algorithms for outliers

This paper shows how to adapt several simple and classical sampling-base...
research
12/22/2020

Fast and Accurate k-means++ via Rejection Sampling

k-means++ is a widely used clustering algorithm that is easy to i...
research
12/02/2019

Noisy, Greedy and Not So Greedy k-means++

The k-means++ algorithm due to Arthur and Vassilvitskii has become the m...
research
03/05/2020

Fast Noise Removal for k-Means Clustering

This paper considers k-means clustering in the presence of noise. It is ...
research
08/07/2020

A Sub-linear Time Algorithm for Approximating k-Nearest-Neighbor with Full Quality Guarantee

In this paper we propose an algorithm for the approximate k-Nearest-Neig...
