Optimality of the Johnson-Lindenstrauss Dimensionality Reduction for Practical Measures

07/14/2021
by   Yair Bartal, et al.

It is well known that the Johnson-Lindenstrauss (JL) dimensionality reduction method is optimal for worst-case distortion. While many other methods and heuristics are used in practice, little is known about bounds on their performance. The question of whether the JL method is optimal for practical measures of distortion was recently raised in <cit.> (NeurIPS'19). They provided upper bounds on its quality for a wide range of practical measures and showed that these are best possible in many cases. Yet some of the most important cases, including the fundamental case of average distortion, were left open. In particular, they show that the JL transform has 1+ϵ average distortion for embedding into k-dimensional Euclidean space, where k = O(1/ϵ^2), and for more general q-norms of distortion, k = O(max{1/ϵ^2, q/ϵ}), whereas tight lower bounds were established only for large values of q via reduction to the worst case. In this paper we prove that these bounds are best possible for any dimensionality reduction method, for any 1 ≤ q ≤ O(log(2ϵ^2 n)/ϵ) and ϵ ≥ 1/√(n), where n is the size of the subset of Euclidean space. Our results imply that the JL method is optimal for various distortion measures commonly used in practice, such as stress, energy and relative error. We prove that if any of these measures is bounded by ϵ, then k = Ω(1/ϵ^2), for any ϵ ≥ 1/√(n), matching the upper bounds of <cit.> and extending their tightness results to the full range of moment analysis. Our results may indicate that the JL dimensionality reduction method should be considered more often in practical applications, and that the bounds we provide for its quality should serve as a baseline for comparison when evaluating the performance of other methods and heuristics.
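To make the quantities in the abstract concrete, the following is a minimal sketch, not the authors' construction. It assumes a dense Gaussian random matrix as the JL map and takes the distortion of a pair to be the maximum of its expansion and contraction ratios, a common definition that the abstract itself does not spell out. The names `jl_project` and `lq_distortion`, the constant 1 in k = 1/ϵ^2, and the random test points are all hypothetical choices made for illustration.

```python
import itertools
import math
import random


def jl_project(points, k, rng):
    """Map each d-dimensional point to k dimensions with a random Gaussian matrix."""
    d = len(points[0])
    # Entries are N(0, 1/k), so squared lengths are preserved in expectation.
    scale = 1.0 / math.sqrt(k)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]
    return [[sum(m * x for m, x in zip(row, p)) for row in matrix] for p in points]


def lq_distortion(original, embedded, q=1.0):
    """q-th moment of pairwise distortion: (mean of dist^q)^(1/q).

    Assumption: the distortion of a pair is max(expansion, contraction),
    so it is always at least 1. With q=1 this is the average distortion.
    """
    total, pairs = 0.0, 0
    for i, j in itertools.combinations(range(len(original)), 2):
        before = math.dist(original[i], original[j])
        after = math.dist(embedded[i], embedded[j])
        total += max(after / before, before / after) ** q
        pairs += 1
    return (total / pairs) ** (1.0 / q)


rng = random.Random(0)
n, d, eps = 30, 200, 0.25
k = math.ceil(1.0 / eps**2)  # hypothetical constant 1 in k = O(1/eps^2)
points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
embedded = jl_project(points, k, rng)
avg = lq_distortion(points, embedded, q=1.0)
print(f"k={k}, average distortion={avg:.3f}")
```

Rerunning with a smaller ϵ (and hence larger k) should bring the printed average closer to 1, which is the trade-off that the upper and lower bounds in the abstract quantify.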


