Sparse Dimensionality Reduction Revisited
The sparse Johnson-Lindenstrauss transform is one of the central techniques in dimensionality reduction. It supports embedding a set of n points in ℝ^d into m = O(ε^{-2} lg n) dimensions while preserving all pairwise distances to within 1 ± ε. Each input point x is embedded to Ax, where A is an m × d matrix having s non-zeros per column, allowing for an embedding time of O(s‖x‖_0). Since the sparsity of A governs the embedding time, much work has gone into improving the sparsity s. The current state-of-the-art by Kane and Nelson (JACM'14) shows that s = O(ε^{-1} lg n) suffices. This is almost matched by a lower bound of s = Ω(ε^{-1} lg n / lg(1/ε)) by Nelson and Nguyen (STOC'13). Previous work thus suggests that we have near-optimal embeddings. In this work, we revisit sparse embeddings and identify a loophole in the lower bound. Concretely, it requires d ≥ n, which in many applications is unrealistic. We exploit this loophole to give a sparser embedding when d = o(n), achieving s = O(ε^{-1}(lg n / lg(1/ε) + lg^{2/3} n · lg^{1/3} d)). We also complement our analysis by strengthening the lower bound of Nelson and Nguyen to hold also when d ≪ n, thereby matching the first term in our new sparsity upper bound. Finally, we also improve the sparsity of the best oblivious subspace embeddings for optimal embedding dimensionality.
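As a rough illustration of the object the abstract describes (not code from the paper), the sketch below samples a sparse JL matrix via the block construction analyzed by Kane and Nelson: the m rows are split into s blocks, and each column receives one random ±1/√s entry per block, giving exactly s non-zeros per column. The function name, parameter values, and the use of numpy are illustrative assumptions.

```python
import numpy as np

def sparse_jl_matrix(m, d, s, rng=None):
    """Sample an m x d matrix with exactly s non-zeros per column.

    Block construction (as analyzed by Kane and Nelson): split the m rows
    into s blocks of m // s rows each; every column gets one entry of
    value +/- 1/sqrt(s) at a uniformly random row within each block.
    """
    rng = np.random.default_rng() if rng is None else rng
    if m % s != 0:
        raise ValueError("this sketch assumes s divides m")
    block = m // s
    A = np.zeros((m, d))
    for j in range(d):
        # one non-zero row per block, with an independent random sign
        rows = block * np.arange(s) + rng.integers(0, block, size=s)
        A[rows, j] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    return A

# Demo: the distance between two random points survives the embedding
# up to a factor close to 1 (within 1 +/- eps with high probability).
rng = np.random.default_rng(0)
d, m, s = 10_000, 512, 8   # illustrative sizes: m ~ eps^-2 lg n, s ~ eps^-1 lg n
A = sparse_jl_matrix(m, d, s, rng)
x, y = rng.standard_normal(d), rng.standard_normal(d)
print(np.linalg.norm(A @ (x - y)) / np.linalg.norm(x - y))
```

Storing A in column-sparse form (e.g. scipy.sparse.csc_matrix) would realize the O(s‖x‖_0) embedding time the abstract refers to; the dense matrix above is kept only for readability, and the choices m = 512 and s = 8 are placeholders rather than values tuned to the paper's bounds.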