An Empirical Evaluation of k-Means Coresets

07/03/2022
by Chris Schwiegelshohn, et al.

Coresets are among the most popular paradigms for summarizing data. In particular, there exist many high-performance coresets for clustering problems such as k-means, in both theory and practice. Curiously, no prior work compares the quality of the available k-means coresets. In this paper we perform such an evaluation. No algorithm is currently known for measuring the distortion of a candidate coreset, and we provide some evidence as to why this might be computationally difficult. To complement this, we propose a benchmark for which we argue that computing coresets is challenging, and which also allows an easy (heuristic) evaluation of coresets. Using this benchmark and real-world data sets, we conduct an exhaustive evaluation of the most commonly used coreset algorithms from theory and practice.
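
The abstract does not define distortion. As a minimal sketch, assume the common convention: for one fixed set of centers, distortion is the larger of the two ratios between the k-means cost on the full data and the weighted cost on the coreset. The snippet below checks only a single candidate set of centers, which is why it is a heuristic and not a measurement of worst-case distortion. The function names, the uniform-sampling coreset, and the synthetic Gaussian data are illustrative assumptions, not the construction or benchmark used in the paper.

```python
import numpy as np


def kmeans_cost(points, centers, weights=None):
    """Weighted sum of squared distances from each point to its nearest center."""
    # Pairwise squared distances, shape (n_points, n_centers).
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.min(axis=1)
    if weights is None:
        return float(nearest.sum())
    return float((weights * nearest).sum())


def distortion(points, coreset, coreset_weights, centers):
    """Assumed definition: max of the two cost ratios for one set of centers."""
    full = kmeans_cost(points, centers)
    approx = kmeans_cost(coreset, centers, coreset_weights)
    return max(full / approx, approx / full)


def uniform_coreset(points, m, rng):
    """Hypothetical baseline: sample m points uniformly, each weighted n / m."""
    idx = rng.choice(len(points), size=m, replace=False)
    weights = np.full(m, len(points) / m)
    return points[idx], weights


rng = np.random.default_rng(0)
points = rng.normal(size=(2000, 2))      # synthetic data, for illustration only
coreset, weights = uniform_coreset(points, 200, rng)
centers = rng.normal(size=(3, 2))        # one arbitrary candidate solution
print(distortion(points, coreset, weights, centers))
```

A value of 1 means the coreset reproduces the cost exactly for these centers, and larger values mean a worse approximation. A low value for one set of centers says nothing about other centers, which is the gap between this kind of check and a true distortion guarantee.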


