Simple and sharp analysis of k-means||

03/05/2020
by Václav Rozhoň, et al.

We present a truly simple analysis of k-means|| (Bahmani et al., PVLDB 2012), a distributed variant of the k-means++ algorithm (Arthur and Vassilvitskii, SODA 2007), and improve the bound on its number of rounds from O(log Var X), where Var X is the variance of the input data set, to O(log Var X / log log Var X), which we show to be tight.
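For readers unfamiliar with k-means||, the following is a minimal single-machine sketch of its oversampling loop as commonly described for Bahmani et al.'s algorithm; it is not code from the paper. The function name `kmeans_parallel_oversample`, the parameter names `ell` (oversampling factor) and `rounds`, and the helper `sq_dist` are assumptions made for illustration. The final step that reclusters the oversampled candidates down to k centers is omitted. The `rounds` parameter is the quantity that a round bound such as the one in the abstract refers to.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_parallel_oversample(points, ell, rounds, seed=0):
    """Sketch of the k-means|| oversampling phase (illustrative only).

    Returns the list of candidate centers and the final clustering cost,
    i.e. the sum of squared distances to the nearest candidate.
    """
    rng = random.Random(seed)
    # Start from one center chosen uniformly at random.
    centers = [rng.choice(points)]
    # cost[i] is the squared distance from points[i] to its nearest center.
    cost = [sq_dist(p, centers[0]) for p in points]
    for _ in range(rounds):
        total = sum(cost)
        if total == 0:
            break  # every point already coincides with a center
        # Sample each point independently with probability
        # min(1, ell * d^2(x, C) / cost(C)); in a distributed setting
        # this step runs in parallel across machines.
        new = [p for p, c in zip(points, cost)
               if rng.random() < min(1.0, ell * c / total)]
        centers.extend(new)
        for q in new:
            cost = [min(c, sq_dist(p, q)) for p, c in zip(points, cost)]
    return centers, sum(cost)
```

In this sketch the cost can only decrease from one round to the next, since each round only adds candidate centers.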


research
07/02/2020

Adapting k-means algorithms for outliers

This paper shows how to adapt several simple and classical sampling-base...
research
12/10/2021

Collecting Coupons is Faster with Friends

In this note, we introduce a distributed twist on the classic coupon col...
research
07/16/2022

A Nearly Tight Analysis of Greedy k-means++

The famous k-means++ algorithm of Arthur and Vassilvitskii [SODA 2007] i...
research
10/30/2019

On a Decentralized (Δ+1)-Graph Coloring Algorithm

We consider a decentralized graph coloring model where each vertex only ...
research
09/06/2021

An axiomatization of Λ-quantiles

We give an axiomatic foundation to Λ-quantiles, a family of generalized ...
research
09/28/2017

A Simple and Efficient MapReduce Algorithm for Data Cube Materialization

Data cube materialization is a classical database operator introduced in...
research
10/09/2022

Coresets for Relational Data and The Applications

A coreset is a small set that can approximately preserve the structure o...
