Vector and Line Quantization for Billion-scale Similarity Search on GPUs

01/02/2019
by Wei Chen, et al.

Billion-scale high-dimensional approximate nearest neighbour (ANN) search has become an important problem for finding similar objects among the vast number of images and videos available online. Existing ANN methods are usually characterized by their indexing structures, which include the inverted index and the inverted multi-index. The inverted index structure is amenable to GPU-based implementations, and state-of-the-art systems such as Faiss are able to exploit the massive parallelism offered by GPUs. However, the inverted index incurs a high memory overhead to index the dataset effectively. The inverted multi-index is difficult to implement on GPUs, and is also ineffective in dealing with databases that have different data distributions. In this paper, we propose a novel hierarchical inverted index structure generated by vector and line quantization methods. Our quantization method improves both search efficiency and accuracy, while maintaining comparable memory consumption. This is achieved by reducing the search space and increasing the number of indexed regions. We introduce a new ANN search system, VLQ-ADC, that is based on the proposed inverted index, and perform an extensive evaluation on two public billion-scale benchmark datasets, SIFT1B and DEEP1B. Our evaluation shows that VLQ-ADC significantly outperforms state-of-the-art GPU- and CPU-based systems in terms of both accuracy and search speed.
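
To make the baseline concrete, here is a minimal sketch of a plain inverted index of the kind the abstract contrasts with, not the proposed VLQ-ADC structure. It assumes a fixed set of coarse centroids and stores exact vectors; real systems learn the centroids (for example with k-means) and store compressed codes instead. The function names and the `nprobe` parameter are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_inverted_index(vectors, centroids):
    """Assign each vector id to the list of its nearest centroid (its cell)."""
    lists = defaultdict(list)
    for vec_id, v in enumerate(vectors):
        cell = min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
        lists[cell].append(vec_id)
    return lists


def search(query, vectors, centroids, lists, nprobe=1, k=1):
    """Scan only the nprobe cells closest to the query, then rank candidates."""
    cells = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))[:nprobe]
    candidates = [i for c in cells for i in lists.get(c, [])]
    return sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))[:k]


# Toy data (hypothetical): two centroids, four 2-D vectors.
centroids = [(0.0, 0.0), (10.0, 10.0)]
vectors = [(0.5, 0.2), (9.0, 11.0), (1.0, 1.5), (10.5, 9.5)]

index = build_inverted_index(vectors, centroids)
result = search((9.8, 10.1), vectors, centroids, index, nprobe=1, k=2)
print(dict(index), result)
```

In this sketch, using more centroids shrinks each list and so reduces the search space per query, but every extra centroid must itself be stored and compared against, which is the memory and compute overhead the abstract refers to.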

