On the I/O complexity of the k-nearest neighbor problem

02/12/2020
by Mayank Goswami, et al.

We consider static, external memory indexes for exact and approximate versions of the k-nearest neighbor (k-NN) problem, and show new lower bounds under a standard indivisibility assumption:
- Polynomial space indexing schemes for high-dimensional k-NN in Hamming space cannot take advantage of block transfers: Ω(k) block reads are needed to answer a query.
- For the ℓ_∞ metric the lower bound holds even if we allow c-approximate nearest neighbors to be returned, for c ∈ (1, 3).
- The restriction to c < 3 is necessary: for every metric there exists an indexing scheme in the indexability model of Hellerstein et al. using space O(kn), where n is the number of points, that can retrieve k 3-approximate nearest neighbors using k/B I/Os, which is optimal.
- For specific metrics, data structures with better approximation factors are possible. For k-NN in Hamming space and every approximation factor c > 1 there exists a polynomial space data structure that returns k c-approximate nearest neighbors in k/B I/Os.
To show these lower bounds we develop two new techniques. First, because approximation algorithms have more freedom in deciding which result set to return, we develop a relaxed version of the λ-set workload technique of Hellerstein et al. This technique allows us to show lower bounds that hold in d ≥ n dimensions. Second, to extend the lower bounds down to d = O(k log(n/k)) dimensions, we develop a new deterministic dimension reduction technique that may be of independent interest.
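To make the quantities in the abstract concrete, the following is a minimal sketch, not the paper's construction. It assumes points are bit vectors stored as Python integers, laid out in input order with B points per block and no replication, and it uses one common definition of a c-approximate k-NN answer (k distinct points, each within c times the distance of the true k-th nearest neighbor). The function names `hamming`, `knn`, `blocks_read`, and `is_c_approximate` are hypothetical and chosen only for illustration.

```python
from math import ceil


def hamming(a: int, b: int) -> int:
    """Hamming distance between two bit vectors encoded as integers."""
    return bin(a ^ b).count("1")


def knn(points, query, k):
    """Brute-force exact k-NN: indices of the k closest points (ties broken by index)."""
    order = sorted(range(len(points)), key=lambda i: (hamming(points[i], query), i))
    return order[:k]


def blocks_read(indices, B):
    """Number of distinct blocks touched when point i lives in block i // B.

    Assumption: a naive layout in input order without replication.
    """
    return len({i // B for i in indices})


def is_c_approximate(points, query, result, k, c):
    """Check the assumed definition: k distinct points, each within c * r_k of the query,
    where r_k is the distance to the true k-th nearest neighbor."""
    exact = knn(points, query, k)
    r_k = hamming(points[exact[-1]], query)
    return len(set(result)) == k and all(
        hamming(points[i], query) <= c * r_k for i in result
    )


if __name__ == "__main__":
    pts = [0b0000, 0b1111, 0b0001, 0b1110, 0b0011, 0b1100, 0b0111, 0b1000]
    k, B = 3, 2
    answer = knn(pts, 0b0000, k)
    # The naive layout touches one block per returned point here,
    # while k/B rounded up is the fewest blocks any layout could need.
    print("answer:", answer)
    print("blocks read:", blocks_read(answer, B), "ideal:", ceil(k / B))
```

In this toy instance the naive layout reads 3 blocks for k = 3, whereas ⌈k/B⌉ = 2 would suffice if the answer were stored contiguously. The lower bounds in the abstract say that, for exact k-NN in high-dimensional Hamming space, no polynomial space layout can close this gap for all queries.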


