Practical Near Neighbor Search via Group Testing

06/22/2021
by Joshua Engels, et al.

We present a new algorithm for the approximate near neighbor problem that combines classical ideas from group testing with locality-sensitive hashing (LSH). We reduce the near neighbor search problem to a group testing problem by designating neighbors as "positives," non-neighbors as "negatives," and approximate membership queries as group tests. We instantiate this framework using distance-sensitive Bloom Filters to Identify Near-Neighbor Groups (FLINNG). We prove that FLINNG has sub-linear query time and show that our algorithm comes with a variety of practical advantages. For example, FLINNG can be constructed in a single pass through the data, consists entirely of efficient integer operations, and does not require any distance computations. We conduct large-scale experiments on high-dimensional search tasks such as genome search, URL similarity search, and embedding search over the massive YFCC100M dataset. In our comparison with leading algorithms such as HNSW and FAISS, we find that FLINNG can provide up to a 10x query speedup with substantially smaller indexing time and memory.
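The reduction described above can be illustrated with a small toy sketch. It is not the authors' implementation: the names `build_groups`, `group_test`, and `query_neighbors` are hypothetical, the data is one-dimensional, and the group test is an exact distance check standing in for the distance-sensitive Bloom filter (FLINNG itself, per the abstract, requires no distance computations). The sketch only shows the group testing logic: points are randomly split into groups several times, each group is tested for containing a neighbor of the query, and the candidates are the points that land in a positive group in every repetition.

```python
import random


def build_groups(n_points, n_groups, n_reps, seed=0):
    """Randomly partition point indices into n_groups groups, n_reps times."""
    rng = random.Random(seed)
    reps = []
    for _ in range(n_reps):
        groups = [set() for _ in range(n_groups)]
        for idx in range(n_points):
            groups[rng.randrange(n_groups)].add(idx)
        reps.append(groups)
    return reps


def group_test(group, data, query, radius):
    """Return True if the group holds at least one point within radius.

    ASSUMPTION: this exact check is a stand-in for an approximate
    membership query; it is used here only to keep the sketch short.
    """
    return any(abs(data[i] - query) <= radius for i in group)


def query_neighbors(reps, data, query, radius):
    """Intersect, across repetitions, the members of all positive groups."""
    candidates = set(range(len(data)))
    for groups in reps:
        positive_members = set()
        for group in groups:
            if group_test(group, data, query, radius):
                positive_members |= group
        candidates &= positive_members
    return candidates


if __name__ == "__main__":
    data = [10.0 * i for i in range(100)]  # toy 1-D "dataset"
    reps = build_groups(len(data), n_groups=10, n_reps=5)
    print(sorted(query_neighbors(reps, data, query=10.2, radius=0.5)))
```

Because the stand-in test is exact, every true neighbor survives the intersection, and non-neighbors are pruned unless they happen to share a positive group in every repetition. With an approximate test, as in the paper, both false positives and false negatives become possible, which is what the paper's analysis has to account for.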

research
06/11/2019

Similarity Problems in High Dimensions

The main contribution of this dissertation is the introduction of new or...
research
02/18/2019

RACE: Sub-Linear Memory Sketches for Approximate Near-Neighbor Search on Streaming Data

We demonstrate the first possibility of a sub-linear memory sketch for s...
research
04/07/2021

Graph Reordering for Cache-Efficient Near Neighbor Search

Graph search is one of the most successful algorithmic trends in near ne...
research
07/25/2018

Local Orthogonal-Group Testing

This work addresses approximate nearest neighbor search applied in the d...
research
10/27/2022

DESSERT: An Efficient Algorithm for Vector Set Search with Vector Set Queries

We study the problem of vector set search with vector set queries. This ...
research
11/13/2020

Kernel Density Estimation through Density Constrained Near Neighbor Search

In this paper we revisit the kernel density estimation problem: given a ...
