Link and code: Fast indexing with graphs and compact regression codes

04/26/2018
by Matthijs Douze, et al.

Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, setting aside their memory requirements. In this paper, we revisit these approaches by additionally considering the memory constraint imposed by indexing billions of images on a single server. This leads us to propose a method based on both graph traversal and compact representations. We encode the indexed vectors using quantization and exploit the graph structure to refine the similarity estimation. In essence, our method takes the best of these two worlds: the search strategy is based on nested graphs, thereby providing high precision with a relatively small set of comparisons. At the same time, it offers significant memory compression. As a result, our approach outperforms the state of the art at operating points of 64 to 128 bytes per vector, as demonstrated by our results on two billion-scale public benchmarks.
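To make the combination of graph traversal and compact codes concrete, here is a minimal toy sketch in Python. It is not the paper's method: the flat k-nearest-neighbor graph stands in for the nested graphs, the grid-based scalar quantizer stands in for the actual quantization, and no refinement from neighbors is performed. The names `encode`, `decode`, `build_knn_graph`, `greedy_search` and the constant `STEP` are hypothetical and chosen only for illustration.

```python
import random

STEP = 0.25  # grid size of the toy scalar quantizer (an assumption, not from the paper)

def encode(vec):
    """Toy compact code: one small integer per coordinate."""
    return tuple(round(x / STEP) for x in vec)

def decode(code):
    """Approximate reconstruction of the original vector from its code."""
    return [c * STEP for c in code]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_knn_graph(vectors, k):
    """Brute-force k-nearest-neighbor graph; only suitable for a toy data set."""
    graph = []
    for i, v in enumerate(vectors):
        others = sorted((j for j in range(len(vectors)) if j != i),
                        key=lambda j: sq_dist(v, vectors[j]))
        graph.append(others[:k])
    return graph

def greedy_search(query, codes, graph, entry):
    """Walk the graph, moving to the neighbor whose decoded code is closest
    to the query, and stop at a local minimum."""
    current = entry
    current_d = sq_dist(query, decode(codes[current]))
    comparisons = 1
    while True:
        scored = [(sq_dist(query, decode(codes[n])), n) for n in graph[current]]
        comparisons += len(scored)
        best_d, best_n = min(scored)
        if best_d >= current_d:
            return current, current_d, comparisons
        current, current_d = best_n, best_d

random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(200)]
codes = [encode(v) for v in vectors]   # only the codes are consulted at search time
graph = build_knn_graph(vectors, k=8)  # the graph is built once, at indexing time
query = [random.random() for _ in range(8)]
found, dist, comparisons = greedy_search(query, codes, graph, entry=0)
```

In this sketch the search only ever reads `codes` and `graph`, which mirrors the memory argument of the abstract: the full-precision vectors are needed to build the index but not to query it. The returned `comparisons` counter records how many distance estimates the walk computed, which can be compared against the 200 an exhaustive scan would need.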


research
01/02/2019

Vector and Line Quantization for Billion-scale Similarity Search on GPUs

Billion-scale high-dimensional approximate nearest neighbour (ANN) searc...
research
12/12/2016

FastText.zip: Compressing text classification models

We consider the problem of producing compact architectures for text clas...
research
10/31/2017

A multi-layer network based on Sparse Ternary Codes for universal vector compression

We present the multi-layer extension of the Sparse Ternary Codes (STC) f...
research
08/10/2016

Approximate search with quantized sparse representations

This paper tackles the task of storing a large collection of vectors, su...
research
12/18/2019

Interleaved Composite Quantization for High-Dimensional Similarity Search

Similarity search retrieves the nearest neighbors of a query vector from...
research
07/17/2018

RiffleScrambler - a memory-hard password storing function

We introduce RiffleScrambler: a new family of directed acyclic graphs an...
research
02/18/2011

Searching in one billion vectors: re-rank with source coding

Recent indexing techniques inspired by source coding have been shown suc...
