Quantization/clustering: when and why does k-means work?

01/11/2018
by Clément Levrard, et al.

Though mostly used as a clustering algorithm, k-means was originally designed as a quantization algorithm: it aims to compress a probability distribution into k points. Building upon [21, 33], we investigate how and when these two approaches are compatible. We show that, provided the sample distribution satisfies a margin-like condition (in the sense of [27] for supervised learning), both the associated empirical risk minimizer and the output of Lloyd's algorithm provide almost optimal classification in certain cases (in the sense of [6]). We also show that they achieve fast and optimal convergence rates in terms of sample size and compression risk.
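To make the two readings of k-means concrete, here is a minimal pure-Python sketch of Lloyd's algorithm. It is not taken from the paper: the function names (`lloyd`, `distortion`, `nearest`), the random initialization from the sample, and the toy two-blob data are all assumptions chosen for illustration. The returned codebook is the quantization output, its empirical distortion stands in for the compression risk, and the nearest-codepoint labels give the induced clustering.

```python
import random


def squared_distance(x, c):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def nearest(x, codebook):
    """Index of the codepoint closest to x (ties go to the lowest index)."""
    return min(range(len(codebook)), key=lambda j: squared_distance(x, codebook[j]))


def distortion(sample, codebook):
    """Empirical distortion: mean squared distance to the nearest codepoint."""
    total = sum(squared_distance(x, codebook[nearest(x, codebook)]) for x in sample)
    return total / len(sample)


def lloyd(sample, k, n_iter=50, seed=0):
    """Plain Lloyd iterations; the initial codebook is k random sample points
    (an assumption of this sketch, not a recommendation from the paper)."""
    rng = random.Random(seed)
    codebook = rng.sample(sample, k)
    for _ in range(n_iter):
        # Assignment step: group sample points by their nearest codepoint.
        cells = [[] for _ in range(k)]
        for x in sample:
            cells[nearest(x, codebook)].append(x)
        # Update step: move each codepoint to the mean of its cell.
        new_codebook = []
        for j, cell in enumerate(cells):
            if cell:
                new_codebook.append(tuple(sum(coord) / len(cell) for coord in zip(*cell)))
            else:
                new_codebook.append(codebook[j])  # keep the codepoint of an empty cell
        if new_codebook == codebook:  # fixed point reached
            break
        codebook = new_codebook
    return codebook


# Toy data (hypothetical): two well-separated Gaussian blobs in the plane.
data_rng = random.Random(1)
sample = [(data_rng.gauss(0, 0.1), data_rng.gauss(0, 0.1)) for _ in range(100)]
sample += [(data_rng.gauss(5, 0.1), data_rng.gauss(5, 0.1)) for _ in range(100)]

codebook = lloyd(sample, k=2)                       # quantization view: 2 codepoints
labels = [nearest(x, codebook) for x in sample]     # clustering view: induced cells
risk = distortion(sample, codebook)                 # empirical compression risk
```

On data this well separated, a small distortion and a correct partition go together, which is the kind of compatibility the abstract refers to. Without such separation, a codebook with low distortion need not recover any meaningful clusters.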


