Zoom: SSD-based Vector Search for Optimizing Accuracy, Latency and Memory

09/11/2018
by Minjia Zhang, et al.

With the advancement of machine learning and deep learning, vector search has become instrumental to many information retrieval systems, which use it to find the best matches to user queries based on semantic similarity. These online services require a search architecture that is both effective, with high accuracy, and efficient, with low latency and a small memory footprint, which existing work fails to offer. We develop Zoom, a new vector search solution that collaboratively optimizes accuracy, latency, and memory based on a multiview approach. (1) A "preview" step generates a small set of good candidates, leveraging compressed vectors in memory for reduced footprint and fast lookup. (2) A "fullview" step on SSDs reranks those candidates with their full-length vectors, achieving high accuracy. Our evaluation shows that Zoom achieves an order-of-magnitude improvement in efficiency while attaining equal or higher accuracy compared with the state of the art.
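
The two-step flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how vectors are compressed or how candidates are selected, so the sketch makes several assumptions: a simple scalar quantizer stands in for the compression scheme, a brute-force scan stands in for candidate generation, and a plain dictionary (`full_vector_store`) stands in for full-length vectors stored on SSD. All function and parameter names are hypothetical.

```python
import heapq
import random


def quantize(vec, levels=16, lo=-1.0, hi=1.0):
    """Lossy scalar quantization (an assumed stand-in for the real compression)."""
    step = (hi - lo) / (levels - 1)
    return [min(levels - 1, max(0, round((x - lo) / step))) for x in vec]


def dequantize(codes, levels=16, lo=-1.0, hi=1.0):
    """Map bucket indices back to approximate float values."""
    step = (hi - lo) / (levels - 1)
    return [lo + c * step for c in codes]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def preview(query, codes_in_memory, num_candidates):
    """Step 1: pick candidates using only the compressed, in-memory vectors."""
    scored = ((sq_dist(query, dequantize(codes)), vid)
              for vid, codes in codes_in_memory.items())
    return [vid for _, vid in heapq.nsmallest(num_candidates, scored)]


def fullview(query, candidate_ids, full_vector_store, k):
    """Step 2: rerank the candidates with their full-length vectors.

    In this sketch `full_vector_store` is a dict; it only stands in for SSD reads.
    """
    scored = ((sq_dist(query, full_vector_store[vid]), vid)
              for vid in candidate_ids)
    return [vid for _, vid in heapq.nsmallest(k, scored)]


def search(query, codes_in_memory, full_vector_store, k=3, num_candidates=20):
    """Run preview, then fullview, and return the ids of the k best matches."""
    candidates = preview(query, codes_in_memory, num_candidates)
    return fullview(query, candidates, full_vector_store, k)


# Toy data: 200 random 8-dimensional vectors with components in [-1, 1].
random.seed(0)
full_vector_store = {i: [random.uniform(-1, 1) for _ in range(8)]
                     for i in range(200)}
codes_in_memory = {i: quantize(v) for i, v in full_vector_store.items()}

query = full_vector_store[7]  # query with a stored vector
top = search(query, codes_in_memory, full_vector_store)
print(top)  # the first id is 7, since the query equals stored vector 7
```

The point of the split is that only `num_candidates` full-length vectors are ever fetched per query, so the exact distance computation touches a small fraction of the data while the compressed copies keep the in-memory footprint low.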


research
04/07/2023

Similarity search in the blink of an eye with compressed indices

Nowadays, data is represented by vectors. Retrieving those vectors, amon...
research
10/21/2017

An efficient deep learning hashing neural network for mobile visual search

Mobile visual search applications are emerging that enable users to sens...
research
10/22/2022

OOD-DiskANN: Efficient and Scalable Graph ANNS for Out-of-Distribution Queries

State-of-the-art algorithms for Approximate Nearest Neighbor Search (ANN...
research
12/02/2021

ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction

Neural information retrieval (IR) has greatly advanced search and other ...
research
04/12/2018

On Using Non-Volatile Memory in Apache Lucene

Apache Lucene is a widely popular information retrieval library used to ...
research
11/23/2021

End-to-End Optimized Arrhythmia Detection Pipeline using Machine Learning for Ultra-Edge Devices

Atrial fibrillation (AF) is the most prevalent cardiac arrhythmia worldw...
research
10/07/2010

Profile Based Sub-Image Search in Image Databases

Sub-image search with high accuracy in natural images still remains a ch...
