The Johnson-Lindenstrauss Lemma for Clustering and Subspace Approximation: From Coresets to Dimension Reduction
We study the effect of Johnson-Lindenstrauss transforms in various Euclidean optimization problems. We ask, for a particular problem and an accuracy parameter ϵ ∈ (0, 1), what is the smallest target dimension t ∈ ℕ such that a Johnson-Lindenstrauss transform Π : ℝ^d → ℝ^t preserves the cost of the optimal solution up to a (1+ϵ)-factor.

∙ For center-based (k,z)-clustering, we show t = O((log k + z log(1/ϵ))/ϵ^2) suffices, improving on O((log k + z log(1/ϵ) + z^2)/ϵ^2) [MMR19].

∙ For (k,z)-subspace approximation, we show t = Õ(zk^2/ϵ^3) suffices. The prior best bound of O(k/ϵ^2) applied only to the case z = 2 [CEMMP15].

∙ For (k,z)-flat approximation, we show t = Õ(zk^2/ϵ^3) suffices, improving on a bound of Õ(zk^2 log n/ϵ^3) [KR15].

∙ For (k,z)-line approximation, we show t = O((k log log n + z + log(1/ϵ))/ϵ^3) suffices. No prior results were known.

All the above results follow from one general technique: we use algorithms for constructing coresets as an analytical tool in randomized dimensionality reduction.
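To make the question concrete, below is a minimal sketch (not the paper's method) that empirically compares the k-means cost, i.e., the z = 2 case of (k,z)-clustering, before and after a Gaussian Johnson-Lindenstrauss projection. The synthetic data, the particular target dimension t, and the use of scikit-learn's KMeans are illustrative assumptions; t is not computed from the bounds above.

```python
# A minimal sketch, assuming a Gaussian random projection as the JL
# transform: compare the k-means cost in R^d and in the projected
# space R^t. The value of t here is an illustrative choice, not the
# O((log k + z log(1/eps))/eps^2) bound from the paper.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n, d, k, t = 2000, 500, 10, 40  # t chosen for illustration only

# Synthetic data: k well-separated Gaussian blobs in R^d.
centers = rng.normal(scale=10.0, size=(k, d))
X = centers[rng.integers(k, size=n)] + rng.normal(size=(n, d))

# Gaussian JL transform Pi : R^d -> R^t with i.i.d. N(0, 1/t) entries.
Pi = rng.normal(scale=1.0 / np.sqrt(t), size=(d, t))
Y = X @ Pi

# .inertia_ is the sum of squared distances to the nearest center,
# i.e., the (k, 2)-clustering cost of the solution found.
cost_orig = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
cost_proj = KMeans(n_clusters=k, n_init=10, random_state=0).fit(Y).inertia_
print(f"k-means cost in R^{d}: {cost_orig:.1f}")
print(f"k-means cost in R^{t}: {cost_proj:.1f} "
      f"(ratio {cost_proj / cost_orig:.3f})")
```

With t large enough, the cost ratio should concentrate near 1, which is exactly the (1+ϵ)-preservation the abstract quantifies.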