Experimental Analysis of Locality Sensitive Hashing Techniques for High-Dimensional Approximate Nearest Neighbor Searches

06/19/2020
by Omid Jafari, et al.

Finding nearest neighbors in high-dimensional spaces is a fundamental operation in many multimedia retrieval applications. Exact tree-based indexing approaches are known to suffer from the curse of dimensionality on high-dimensional data. Approximate search techniques trade some accuracy for faster performance while still returning good-enough results. Locality Sensitive Hashing (LSH) is a very popular technique for finding approximate nearest neighbors in high-dimensional spaces. Apart from providing theoretical guarantees on the query results, one of the main benefits of LSH techniques is that they scale well to large datasets because they are external-memory based. The two dominant costs for existing LSH techniques are the algorithm time and the index I/Os required to find candidate points. Existing works do not compare both of these costs in their evaluations. In this experimental survey paper, we show the impact of both costs on the overall performance of an LSH technique. We compare three state-of-the-art techniques on four real-world datasets and show that, in contrast to the findings of recent works, C2LSH is still the state-of-the-art algorithm in terms of performance while achieving accuracy similar to that of its recent competitors.
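To make the candidate-finding step concrete, the following is a minimal in-memory sketch of a generic LSH index using Gaussian random-projection hashes of the form floor((a·x + b) / w). It is not C2LSH or any of the techniques compared in the paper, and it does not model external-memory I/O. The class name `LSHIndex`, the helper `make_hash`, and all parameter values (`num_tables`, `hashes_per_table`, `w`) are assumptions chosen for illustration.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, w, rng):
    """Draw one hash function h(x) = floor((a.x + b) / w) with Gaussian a."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, w)
    return lambda x: math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / w)


class LSHIndex:
    """Hypothetical illustrative index: several tables, each keyed by a tuple of hashes."""

    def __init__(self, dim, num_tables=8, hashes_per_table=4, w=4.0, seed=0):
        rng = random.Random(seed)
        self.funcs = [[make_hash(dim, w, rng) for _ in range(hashes_per_table)]
                      for _ in range(num_tables)]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.points = []

    def _key(self, t, x):
        # Compound bucket key for table t: one integer per hash function.
        return tuple(h(x) for h in self.funcs[t])

    def add(self, x):
        idx = len(self.points)
        self.points.append(x)
        for t, table in enumerate(self.tables):
            table[self._key(t, x)].append(idx)
        return idx

    def query(self, q):
        """Return (index of closest candidate or None, number of candidates examined)."""
        candidates = set()
        for t, table in enumerate(self.tables):
            # Only points sharing q's bucket in at least one table become candidates.
            candidates.update(table.get(self._key(t, q), ()))
        if not candidates:
            return None, 0
        best = min(candidates, key=lambda i: math.dist(q, self.points[i]))
        return best, len(candidates)
```

In this sketch the number of candidates returned by `query` is only a rough stand-in for the work done to find candidate points; a disk-based technique would additionally pay for each index page it reads.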


research
02/17/2021

A Survey on Locality Sensitive Hashing Algorithms and their Applications

Finding nearest neighbors in high-dimensional spaces is a fundamental op...
research
03/13/2020

mmLSH: A Practical and Efficient Technique for Processing Approximate Nearest Neighbor Queries on Multimedia Data

Many large multimedia applications require efficient processing of neare...
research
12/15/2019

Drawbacks and Proposed Solutions for Real-time Processing on Existing State-of-the-art Locality Sensitive Hashing Techniques

Nearest-neighbor query processing is a fundamental operation for many im...
research
07/06/2021

PM-LSH: a fast and accurate in-memory framework for high-dimensional approximate NN and closest pair search

Nearest neighbor (NN) search is inherently computationally expensive in ...
research
10/10/2018

Technical Report: KNN Joins Using a Hybrid Approach: Exploiting CPU/GPU Workload Characteristics

This paper studies finding the K nearest neighbors (KNN) of all points i...
research
04/11/2020

Locality-Sensitive Hashing Scheme based on Longest Circular Co-Substring

Locality-Sensitive Hashing (LSH) is one of the most popular methods for ...
research
11/16/2014

Revisiting Kernelized Locality-Sensitive Hashing for Improved Large-Scale Image Retrieval

We present a simple but powerful reinterpretation of kernelized locality...
