A Fast Approximation Scheme for Low-Dimensional k-Means
We consider the popular k-means problem in d-dimensional Euclidean space. Recently, Friggstad, Rezapour, and Salavatipour [FOCS'16] and Cohen-Addad, Klein, and Mathieu [FOCS'16] showed that the standard local search algorithm yields a (1+ϵ)-approximation in time (n · k)^(1/ϵ^O(d)), giving the first polynomial-time approximation scheme for the problem in low-dimensional Euclidean space. While local search achieves optimal approximation guarantees, it is not competitive with state-of-the-art heuristics such as the famous k-means++ and D^2-sampling algorithms. In this paper, we aim to bridge the gap between theory and practice by giving a (1+ϵ)-approximation algorithm for low-dimensional k-means running in time n · k · (log n)^((dϵ^-1)^O(d)), thus matching the running time of the k-means++ and D^2-sampling heuristics up to polylogarithmic factors. We speed up the local search approach by making a non-standard use of randomized dissections that allows us to find the best local move efficiently using a quite simple dynamic program. We hope that our techniques will help design better local search heuristics for geometric problems. We note that the doubly exponential dependency on d is necessary, as k-means is APX-hard in dimension d = ω(log n).
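For readers unfamiliar with the baseline, the sketch below illustrates the standard single-swap local search for k-means that the FOCS'16 analyses build on: repeatedly try swapping one current center for a candidate point, and keep the swap if it reduces the cost by a noticeable factor. This is a minimal illustrative sketch, not the accelerated algorithm of this paper; the restriction of centers to input points, the random initialization, and the (1 − ϵ/k) improvement threshold are common simplifying assumptions, and the function names are ours.

```python
import numpy as np

def kmeans_cost(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    # points: (n, d), centers: (k, d)
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()

def single_swap_local_search(points, k, eps=0.1, seed=0):
    """Baseline single-swap local search for k-means (illustrative sketch).

    Centers are restricted to the input points. A swap is accepted only if it
    improves the cost by at least a (1 - eps/k) factor, which guarantees
    termination after polynomially many accepted swaps.
    """
    rng = np.random.default_rng(seed)
    # Start from k distinct input points chosen uniformly at random.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    cost = kmeans_cost(points, centers)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for c in points:  # candidate replacement centers
                trial = centers.copy()
                trial[i] = c
                new_cost = kmeans_cost(points, trial)
                if new_cost < (1 - eps / k) * cost:
                    centers, cost = trial, new_cost
                    improved = True
    return centers, cost

# Example usage on synthetic 2-D data:
# pts = np.random.default_rng(1).random((200, 2))
# centers, cost = single_swap_local_search(pts, k=5)
```

Each iteration of this baseline scans all k · n candidate swaps and re-evaluates the cost, which is what makes it slow in practice; the paper's contribution is to find the best local move much faster via randomized dissections and dynamic programming.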