Mini-batch k-means terminates within O(d/ε) iterations

04/02/2023
by   Gregory Schwartzman, et al.
We answer the question: "Does local progress (on batches) imply global progress (on the entire dataset) for mini-batch k-means?" Specifically, we consider mini-batch k-means that terminates only when the improvement in the quality of the clustering on the sampled batch falls below some threshold. Although at first glance it appears that this algorithm might execute forever, we answer the above question in the affirmative and show that if the batch is of size Ω̃((d/ε)^2), it must terminate within O(d/ε) iterations with high probability, where d is the dimension of the input and ε is the threshold parameter for termination. This holds regardless of how the centers are initialized. When the algorithm is initialized with the k-means++ initialization scheme, it achieves an approximation ratio of O(log k) (the same as the full-batch version). Finally, we show the applicability of our results to the mini-batch k-means algorithm implemented in the scikit-learn (sklearn) Python library.
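The stopping rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' exact algorithm: the function names, the fixed learning rate `lr`, the uniform random initialization, the per-point normalization of the batch cost, and the `max_iter` safety cap are all assumptions made for illustration. The only element taken from the abstract is the termination test, which stops as soon as the improvement measured on the sampled batch drops below the threshold `eps`.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(batch, centers):
    """Return (mean squared distance to the nearest center, nearest-center index per point)."""
    labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j])) for p in batch]
    cost = sum(sq_dist(p, centers[l]) for p, l in zip(batch, labels)) / len(batch)
    return cost, labels


def minibatch_kmeans(points, k, batch_size, eps, lr=1.0, max_iter=1000, seed=0):
    """Hypothetical mini-batch k-means that stops on small batch improvement."""
    rng = random.Random(seed)
    # Assumption: uniform random initialization (the abstract allows any initialization).
    centers = [list(p) for p in rng.sample(points, k)]
    for it in range(1, max_iter + 1):
        # Sample a batch uniformly at random, with replacement.
        batch = rng.choices(points, k=batch_size)
        cost_before, labels = assign(batch, centers)
        new_centers = []
        for j, c in enumerate(centers):
            members = [p for p, l in zip(batch, labels) if l == j]
            if not members:
                new_centers.append(c)  # no batch points assigned: keep the center
                continue
            mean = [sum(col) / len(members) for col in zip(*members)]
            # Move the center toward the mean of its batch cluster.
            new_centers.append([(1 - lr) * a + lr * m for a, m in zip(c, mean)])
        cost_after, _ = assign(batch, new_centers)
        centers = new_centers
        # Terminate when the improvement on this batch is below the threshold.
        if cost_before - cost_after < eps:
            return centers, it
    return centers, max_iter
```

Calling `minibatch_kmeans(points, k=3, batch_size=64, eps=0.01)` on a list of points in the unit square returns the final centers together with the number of iterations executed. The cap `max_iter` is only a safeguard in this sketch; the paper's result is that, for a sufficiently large batch, the threshold test alone triggers within O(d/ε) iterations with high probability.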

research
07/22/2019

Properties of the Stochastic Approximation EM Algorithm with Mini-batch Sampling

To speed up convergence a mini-batch version of the Monte Carlo Markov C...
research
02/09/2016

Nested Mini-Batch K-Means

A new algorithm is proposed which accelerates the mini-batch k-means alg...
research
06/18/2021

An Investigation into Mini-Batch Rule Learning

We investigate whether it is possible to learn rule sets efficiently in ...
research
07/31/2017

Mini-batch Tempered MCMC

In this paper we propose a general framework of performing MCMC with onl...
research
05/19/2017

EE-Grad: Exploration and Exploitation for Cost-Efficient Mini-Batch SGD

We present a generic framework for trading off fidelity and cost in comp...
research
11/17/2017

A Resizable Mini-batch Gradient Descent based on a Randomized Weighted Majority

Determining the appropriate batch size for mini-batch gradient descent i...
research
03/02/2021

Mini-Batch Consistent Slot Set Encoder for Scalable Set Encoding

Most existing set encoding algorithms operate under the assumption that ...
