Improved approximate near neighbor search without false negatives for l_2

09/28/2017
by Piotr Wygocki, et al.

We present a new algorithm for c-approximate near neighbor search without false negatives in l_2^d. We enhance the dimension reduction method presented in [wygos_red] and combine it with the standard results of Indyk and Motwani [motwani]. The result is an efficient algorithm with Las Vegas guarantees for any c>1. This improves over the previous results, which require c=ω(n) [wygos_red], where n is the number of input points. Moreover, we improve both the query time and the pre-processing time. Our algorithm is tunable, which allows for different trade-offs between the query time and the pre-processing time. To illustrate this flexibility, we present two variants of the algorithm. The "efficient query" variant has a query time of O(d^2) and polynomial pre-processing time. The "efficient pre-processing" variant has a pre-processing time of O(d^{ω-1} n) and a query time sub-linear in n, where ω is the exponent of fast matrix multiplication. In addition, we introduce batch versions of these algorithms, in which the queries arrive in batches of size d. In this case, the amortized query time of the "efficient query" algorithm is reduced to O(d^{ω-1}).
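To make the problem statement concrete, here is a minimal Python sketch of the guarantee that "without false negatives" refers to, assuming the usual (c, r) formulation of the near neighbor problem: if some input point lies within distance r of the query, a point within distance c*r must always be returned. This is a brute-force linear scan in O(nd) time, not the algorithm of the paper. The function names and the parameter r are assumptions introduced only for illustration.

```python
import math
from typing import Optional, Sequence

Point = Sequence[float]


def l2_distance(p: Point, q: Point) -> float:
    """Euclidean (l_2) distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def near_neighbor_scan(
    points: Sequence[Point], query: Point, r: float, c: float
) -> Optional[Point]:
    """Hypothetical brute-force baseline, NOT the paper's algorithm.

    Returns the first point within distance c * r of the query, or None
    if there is no such point. Any point within distance r is also within
    c * r (since c > 1), so a near point is never missed: the scan has
    no false negatives, at the cost of O(n * d) time per query.
    """
    for p in points:
        if l2_distance(p, query) <= c * r:
            return p
    return None


if __name__ == "__main__":
    data = [(0.0, 0.0), (3.0, 4.0), (10.0, 10.0)]
    print(near_neighbor_scan(data, (0.0, 1.0), r=1.0, c=2.0))
    print(near_neighbor_scan(data, (100.0, 100.0), r=1.0, c=2.0))
```

The scan is deterministic and always correct, which is the behavior a Las Vegas algorithm must match on every query; the abstract's contribution is reaching that guarantee with query time sub-linear in n or O(d^2).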


