# A Quantum Approximation Scheme for k-Means

We give a quantum approximation scheme (i.e., a (1 + ε)-approximation for every ε > 0) for the classical k-means clustering problem in the QRAM model, with a running time that has only polylogarithmic dependence on the number of data points. More specifically, given a dataset V with N points in ℝ^d stored in a QRAM data structure, our quantum algorithm runs in time Õ(2^{Õ(k/ε)} η^2 d) and with high probability outputs a set C of k centers such that cost(V, C) ≤ (1+ε) · cost(V, C_OPT). Here C_OPT denotes an optimal set of k centers, cost(·) denotes the standard k-means cost function (i.e., the sum of squared distances from each point to its closest center), and η is the aspect ratio (i.e., the ratio of the maximum distance to the minimum distance). This is the first quantum algorithm with a polylogarithmic running time that gives a provable approximation guarantee of (1+ε) for the k-means problem. Also, unlike previous works on unsupervised learning, our quantum algorithm does not require quantum linear algebra subroutines and has a running time independent of the parameters (e.g., the condition number) that appear in such procedures.
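To make the quantities in the guarantee concrete, the following is a minimal classical sketch of the k-means cost function and the aspect ratio. It is not the quantum algorithm; it only evaluates the definitions on a small dataset. The function names (`kmeans_cost`, `aspect_ratio`, `is_within_factor`) are hypothetical, and the sketch assumes that η is taken over pairwise distances between distinct data points, which the abstract does not spell out.

```python
from itertools import combinations
from math import dist


def kmeans_cost(points, centers):
    """Sum over all points of the squared distance to the closest center."""
    return sum(min(dist(p, c) ** 2 for c in centers) for p in points)


def aspect_ratio(points):
    """Ratio of the largest to the smallest nonzero pairwise distance.

    Assumption: distances are measured between pairs of data points,
    and coincident points (distance 0) are ignored.
    """
    positive = [d for p, q in combinations(points, 2) if (d := dist(p, q)) > 0]
    return max(positive) / min(positive)


def is_within_factor(points, centers, reference_centers, eps):
    """Check cost(V, C) <= (1 + eps) * cost(V, C_ref) for given center sets."""
    return kmeans_cost(points, centers) <= (1 + eps) * kmeans_cost(
        points, reference_centers
    )


# Toy dataset: two well-separated pairs of points in the plane.
V = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
C = [(0.0, 1.0), (10.0, 1.0)]          # midpoint of each pair
C_shifted = [(0.0, 1.5), (10.0, 1.5)]  # slightly worse centers
```

On this toy input, `kmeans_cost(V, C)` sums four unit squared distances, and `is_within_factor(V, C_shifted, C, eps)` tests the (1+ε) inequality against `C` as the reference solution. Computing the aspect ratio this way takes time quadratic in N, so it serves only to illustrate the definition.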
