Pyramid: A General Framework for Distributed Similarity Search

06/25/2019
by Shiyuan Deng, et al.
Similarity search is a core component of applications such as image matching, product recommendation and low-shot classification. However, single-machine solutions are usually insufficient because modern datasets have large cardinality and on-line query processing has stringent latency requirements. We present Pyramid, a general and efficient framework for distributed similarity search. Pyramid supports search with popular similarity functions, including Euclidean distance, angular distance and inner product. Unlike existing distributed solutions, which are based on KD-trees or locality sensitive hashing (LSH), Pyramid is based on the Hierarchical Navigable Small World (HNSW) graph, the state-of-the-art similarity search algorithm on a single machine. To achieve high query processing throughput, Pyramid partitions a dataset into sub-datasets that contain similar items, builds an index on each sub-dataset, and assigns each query to only some of the sub-datasets for processing. To provide the robustness required by production deployment, Pyramid also supports failure recovery and straggler mitigation. Pyramid offers a concise set of APIs, so users can use it without knowing the details of distributed execution. Experiments on large-scale datasets show that Pyramid produces high-quality similarity search results, achieves high query processing throughput, and is robust to node failures and stragglers.
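The partition-and-route idea in the abstract can be illustrated with a small single-process sketch. It is not Pyramid's implementation or API: the function names `build_partitions` and `search` and the parameter `probe` are hypothetical, a few rounds of k-means stand in for whatever partitioning Pyramid actually uses, and a brute-force scan stands in for the per-partition HNSW index. Only Euclidean distance is shown.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(items, centers):
    """Place each item index into the partition of its nearest center."""
    parts = [[] for _ in centers]
    for idx, v in enumerate(items):
        nearest = min(range(len(centers)), key=lambda c: sq_dist(v, centers[c]))
        parts[nearest].append(idx)
    return parts

def build_partitions(items, num_parts, iters=10, seed=0):
    """Group similar items with k-means (an illustrative stand-in, assumed here)."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(items, num_parts)]
    dim = len(items[0])
    for _ in range(iters):
        parts = assign(items, centers)
        for c, members in enumerate(parts):
            if members:  # keep the old center if a partition is empty
                centers[c] = [
                    sum(items[i][d] for i in members) / len(members)
                    for d in range(dim)
                ]
    return centers, assign(items, centers)

def search(query, items, centers, parts, k=3, probe=1):
    """Route the query to its `probe` closest partitions, then scan only those."""
    order = sorted(range(len(centers)), key=lambda c: sq_dist(query, centers[c]))
    candidates = [i for c in order[:probe] for i in parts[c]]
    # Brute-force scan; a real system would query a per-partition index instead.
    return sorted(candidates, key=lambda i: sq_dist(query, items[i]))[:k]
```

Raising `probe` scans more partitions per query, which costs throughput but makes the result closer to an exact search; with `probe` equal to the number of partitions the sketch degenerates to a full scan.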

research
10/22/2018

Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS)

Recently, locality sensitive hashing (LSH) was shown to be effective for...
research
07/21/2015

Clustering is Efficient for Approximate Maximum Inner Product Search

Efficient Maximum Inner Product Search (MIPS) is an important task that ...
research
12/21/2017

The Pyramid Scheme: Oblivious RAM for Trusted Processors

Modern processors, e.g., Intel SGX, allow applications to isolate secret...
research
09/24/2018

Norm-Ranging LSH for Maximum Inner Product Search

Neyshabur and Srebro proposed Simple-LSH, which is the state-of-the-art ...
research
08/30/2019

Fashion Retrieval via Graph Reasoning Networks on a Similarity Pyramid

Matching clothing images from customers and online shopping stores has r...
research
02/24/2020

Relaxing Relationship Queries on Graph Data

In many domains we have witnessed the need to search a large entity-rela...
