EPTAS for k-means Clustering of Affine Subspaces

10/19/2020
by Eduard Eiben, et al.

We consider a generalization of the fundamental k-means clustering problem to data with incomplete or corrupted entries. When data objects are represented by points in ℝ^d, a data point is said to be incomplete when some of its entries are missing or unspecified. An incomplete data point with at most Δ unspecified entries corresponds to an axis-parallel affine subspace of dimension at most Δ, called a Δ-point. We therefore seek a partition of n input Δ-points into k clusters that minimizes the k-means objective. For Δ=0, when all coordinates of each point are specified, this is the usual k-means clustering problem. We give an algorithm that finds a (1+ϵ)-approximate solution in time f(k,ϵ,Δ) · n^2 · d, for some function f of k, ϵ, and Δ only.
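
To make the objective concrete, here is a minimal Python sketch of how the k-means cost of a set of Δ-points could be evaluated. It assumes, as the abstract suggests but does not spell out, that the distance from a center to a Δ-point is the Euclidean distance to the corresponding axis-parallel subspace, which only involves the specified coordinates. The names `sq_dist_to_subspace`, `kmeans_cost`, and `best_center`, and the use of `None` for unspecified entries, are illustrative choices and not taken from the paper. This only evaluates the objective; it is not the approximation algorithm itself.

```python
from typing import List, Optional, Sequence

# A Delta-point: a vector in which None marks an unspecified entry.
DeltaPoint = Sequence[Optional[float]]


def sq_dist_to_subspace(center: Sequence[float], x: DeltaPoint) -> float:
    """Squared distance from `center` to the axis-parallel subspace of `x`.

    Unspecified coordinates can be matched to the center for free,
    so only the specified coordinates contribute (assumed objective).
    """
    return sum((c - v) ** 2 for c, v in zip(center, x) if v is not None)


def kmeans_cost(points: Sequence[DeltaPoint],
                centers: Sequence[Sequence[float]]) -> float:
    """Sum over all Delta-points of the squared distance to the nearest center."""
    return sum(min(sq_dist_to_subspace(c, x) for c in centers) for x in points)


def best_center(cluster: Sequence[DeltaPoint], d: int) -> List[float]:
    """Coordinate-wise mean over the specified entries of one cluster.

    If no point in the cluster specifies coordinate j, any value is
    optimal there; 0.0 is used as an arbitrary placeholder.
    """
    center = []
    for j in range(d):
        vals = [x[j] for x in cluster if x[j] is not None]
        center.append(sum(vals) / len(vals) if vals else 0.0)
    return center


points = [[0.0, None], [2.0, 1.0], [10.0, 3.0]]
centers = [best_center(points[:2], 2), best_center(points[2:], 2)]
cost = kmeans_cost(points, centers)
```

With Δ=0 (no `None` entries) both functions reduce to the usual k-means cost and centroid computations.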
