Near neighbor preserving dimension reduction for doubling subsets of ℓ_1

02/23/2019
by   Ioannis Z. Emiris, et al.

Randomized dimensionality reduction is recognized as one of the fundamental techniques for handling high-dimensional data. Starting with the celebrated Johnson-Lindenstrauss Lemma, such reductions have been studied in depth for the Euclidean (ℓ_2) metric and, to a much lesser extent, for the Manhattan (ℓ_1) metric. Our primary motivation is the approximate nearest neighbor problem in ℓ_1. We exploit its reduction to the decision-with-witness version, called approximate near neighbor, which incurs a roughly logarithmic overhead. In 2007, Indyk and Naor, in the context of approximate nearest neighbors, introduced the notion of nearest neighbor-preserving embeddings. These are randomized embeddings between two metric spaces whose distortion is guaranteed to be bounded only for the distances between a query point and the points of a point set. Such embeddings are known to exist for both the ℓ_2 and ℓ_1 metrics, as well as for doubling subsets of ℓ_2. In this paper, we propose a dimension-reducing, near neighbor-preserving embedding for doubling subsets of ℓ_1. Our approach is to represent the point set by a carefully chosen covering set and then apply a random projection to that covering set. We study two kinds of covering sets, c-approximate r-nets and randomly shifted grids, and we discuss the tradeoff between them in terms of preprocessing time and target dimension.
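The two-step idea in the abstract (cover the point set, then randomly project the cover) can be illustrated with a minimal Python sketch. It makes several assumptions that the abstract does not confirm: the covering set is obtained by snapping each point to the nearest vertex of a randomly shifted grid, the random projection uses i.i.d. Cauchy entries (the standard 1-stable technique for ℓ_1, swapped in here for illustration), and distances are read off with a median estimator. The function names, the cell width and the target dimension are hypothetical and are not taken from the paper.

```python
import math
import random
import statistics


def snap_to_shifted_grid(points, width, rng):
    """Snap each point to the nearest vertex of a grid with cell width
    `width`, shifted by a random offset drawn once per coordinate.
    Assumption: this stands in for the 'randomly shifted grid' covering set."""
    dim = len(points[0])
    shift = [rng.uniform(0.0, width) for _ in range(dim)]
    return [
        tuple(round((x - s) / width) * width + s for x, s in zip(p, shift))
        for p in points
    ]


def cauchy_matrix(target_dim, dim, rng):
    """Matrix with i.i.d. standard Cauchy entries (inverse-CDF sampling).
    Assumption: a 1-stable projection, not necessarily the paper's choice."""
    return [
        [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(dim)]
        for _ in range(target_dim)
    ]


def project(matrix, point):
    """Apply the linear map given by `matrix` to `point`."""
    return [sum(a * x for a, x in zip(row, point)) for row in matrix]


def l1(p, q):
    """Exact l1 (Manhattan) distance."""
    return sum(abs(a - b) for a, b in zip(p, q))


def estimate_l1(proj_p, proj_q):
    """Median of absolute coordinate differences. Each coordinate of the
    projected difference is Cauchy-distributed with scale ||p - q||_1, and
    the median of |standard Cauchy| is 1, so the median estimates that scale."""
    return statistics.median(abs(a - b) for a, b in zip(proj_p, proj_q))


if __name__ == "__main__":
    rng = random.Random(1)
    dim, target_dim, width = 50, 301, 0.05  # hypothetical parameters
    points = [tuple(rng.uniform(-1, 1) for _ in range(dim)) for _ in range(5)]
    query = tuple(rng.uniform(-1, 1) for _ in range(dim))

    cover = snap_to_shifted_grid(points, width, rng)
    matrix = cauchy_matrix(target_dim, dim, rng)
    proj_query = project(matrix, query)

    for original, covered in zip(points, cover):
        estimate = estimate_l1(project(matrix, covered), proj_query)
        print(f"true l1 = {l1(original, query):.3f}, estimate = {estimate:.3f}")
```

In this sketch, snapping moves every coordinate by at most half a cell width, so the covering step changes any ℓ_1 distance by at most dim · width / 2 per point. The median estimator is not itself a norm on the projected space, so the sketch only illustrates how query-to-point distances can survive the two steps; it does not reproduce the paper's embedding or its guarantees.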

research
09/26/2018

An Algorithm for Reducing Approximate Nearest Neighbor to Approximate Near Neighbor with O(logn) Query Time

This paper proposes a new algorithm for reducing Approximate Nearest Nei...
research
08/21/2017

Approximate nearest neighbors search without false negatives for l_2 for c>√(n)

In this paper, we report progress on answering the open problem presente...
research
02/16/2020

Coresets for the Nearest-Neighbor Rule

The problem of nearest-neighbor condensation deals with finding a subset...
research
01/25/2019

Metric Spaces with Expensive Distances

In algorithms for finite metric spaces, it is common to assume that the ...
research
03/24/2022

Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction

Dimensionality reduction is crucial both for visualization and preproces...
research
11/16/2017

A New Method for Performance Analysis in Nonlinear Dimensionality Reduction

In this paper, we develop a local rank correlation measure which quantif...
research
09/17/2016

ADAGIO: Fast Data-aware Near-Isometric Linear Embeddings

Many important applications, including signal reconstruction, parameter ...
