The Johnson-Lindenstrauss Lemma for Clustering and Subspace Approximation: From Coresets to Dimension Reduction

05/01/2022
by Moses Charikar, et al.

We study the effect of Johnson-Lindenstrauss transforms in various Euclidean optimization problems. We ask, for a particular problem and an accuracy parameter ϵ ∈ (0, 1), what is the smallest target dimension t ∈ ℕ such that a Johnson-Lindenstrauss transform Π: ℝ^d → ℝ^t preserves the cost of the optimal solution up to a (1+ϵ)-factor.

∙ For center-based (k,z)-clustering, we show t = O((log k + z log(1/ϵ)) / ϵ^2) suffices, improving on O((log k + z log(1/ϵ) + z^2) / ϵ^2) [MMR19].
∙ For (k,z)-subspace approximation, we show t = Õ(zk^2 / ϵ^3) suffices. The prior best bound, of O(k / ϵ^2), only applied to the case z = 2 [CEMMP15].
∙ For (k,z)-flat approximation, we show t = Õ(zk^2 / ϵ^3) suffices, improving on a bound of Õ(zk^2 log n / ϵ^3) [KR15].
∙ For (k,z)-line approximation, we show t = O((k log log n + z + log(1/ϵ)) / ϵ^3) suffices. No prior results were known.

All the above results follow from one general technique: we use algorithms for constructing coresets as an analytical tool in randomized dimensionality reduction.
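As a rough illustration of the center-based (k,z)-clustering bound, the sketch below projects synthetic points with a dense Gaussian Johnson-Lindenstrauss map and compares the cost of one fixed partition before and after projection. It is not the paper's method or analysis: the constant `C` in `target_dim`, the helper names, and the synthetic data are all assumptions made for the example, and checking a single partition is much weaker than the statement about the optimal solution.

```python
import numpy as np

def target_dim(k, z, eps, C=1.0):
    """Target dimension t = C * (log k + z log(1/eps)) / eps^2.
    C is a hypothetical constant; the abstract only states the O(.) bound."""
    return int(np.ceil(C * (np.log(k) + z * np.log(1.0 / eps)) / eps**2))

def jl_project(X, t, rng):
    """Apply a dense Gaussian map Pi: R^d -> R^t with i.i.d. N(0, 1/t) entries."""
    d = X.shape[1]
    Pi = rng.normal(0.0, 1.0 / np.sqrt(t), size=(d, t))
    return X @ Pi

def partition_cost(X, labels, z=2):
    """Cost of a fixed partition, using each cluster's mean as its center.
    The mean is the optimal center only for z = 2; for other z this is a simplification."""
    cost = 0.0
    for c in np.unique(labels):
        P = X[labels == c]
        center = P.mean(axis=0)
        cost += np.sum(np.linalg.norm(P - center, axis=1) ** z)
    return cost

# Synthetic instance (assumed for illustration): k Gaussian blobs in d dimensions.
rng = np.random.default_rng(0)
n, d, k, z, eps = 300, 1000, 5, 2, 0.5
labels = rng.integers(0, k, size=n)
centers = 5.0 * rng.normal(size=(k, d))
X = centers[labels] + rng.normal(size=(n, d))

t = target_dim(k, z, eps)
Y = jl_project(X, t, rng)
ratio = partition_cost(Y, labels, z) / partition_cost(X, labels, z)
print(f"d = {d}, t = {t}, projected/original cost ratio = {ratio:.3f}")
```

The target dimension here depends only on k, z, and ϵ, not on n or d, which is the point of the bound; a ratio near 1 for this one partition is consistent with the claim but does not verify it.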

research
07/05/2021

Randomized Dimensionality Reduction for Facility Location and Single-Linkage Clustering

Random dimensionality reduction is a versatile tool for speeding up algo...
research
11/08/2018

Performance of Johnson-Lindenstrauss Transform for k-Means and k-Medians Clustering

Consider an instance of Euclidean k-means or k-medians clustering. We sh...
research
12/09/2021

Improved approximation algorithms for two Euclidean k-Center variants

The k-Center problem is one of the most popular clustering problems. Aft...
research
04/27/2021

Exponentially Improved Dimensionality Reduction for ℓ_1: Subspace Embeddings and Independence Testing

Despite many applications, dimensionality reduction in the ℓ_1-norm is m...
research
12/19/2021

Parameterized Approximation Algorithms for k-Center Clustering and Variants

k-center is one of the most popular clustering models. While it admits a...
research
10/31/2022

Improved Learning-augmented Algorithms for k-means and k-medians Clustering

We consider the problem of clustering in the learning-augmented setting,...
research
06/17/2022

Scalable Differentially Private Clustering via Hierarchically Separated Trees

We study the private k-median and k-means clustering problem in d dimens...
