Searching in one billion vectors: re-rank with source coding

02/18/2011
by   Hervé Jégou, et al.

Recent indexing techniques inspired by source coding have proven successful at indexing billions of high-dimensional vectors in memory. In this paper, we propose an approach that re-ranks the neighbor hypotheses obtained by these compressed-domain indexing methods. In contrast to the usual post-verification scheme, which performs exact distance calculations on the short-list of hypotheses, our approach refines the estimated distances using short quantization codes, which avoids reading the full vectors from disk. We have released a new public dataset of one billion 128-dimensional vectors and proposed an experimental setup to evaluate high-dimensional indexing algorithms on a realistic scale. Experiments show that our method re-ranks the neighbor hypotheses accurately and efficiently while using little memory compared to the full vector representation.
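The two-stage idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the quantizers, so this example assumes a small product quantizer (trained with plain k-means) for the first-level codes and a second product quantizer on the residuals as the short refinement code. All function names and parameters (`train_pq`, `encode_pq`, `decode_pq`, `search_rerank`, `shortlist`, `topk`) are hypothetical and chosen for illustration only; this is not the authors' implementation.

```python
import numpy as np

def kmeans(x, k, iters, rng):
    """Plain Lloyd's k-means; returns a (k, d) array of centroids."""
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(1)
        for j in range(k):
            members = x[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(0)
    return centroids

def train_pq(x, m, k, rng, iters=10):
    """Train one k-centroid codebook per sub-vector (d must be divisible by m)."""
    return [kmeans(sub, k, iters, rng) for sub in np.split(x, m, axis=1)]

def encode_pq(x, codebooks):
    """Map each sub-vector to the index of its nearest centroid."""
    codes = []
    for sub, cb in zip(np.split(x, len(codebooks), axis=1), codebooks):
        d2 = ((sub[:, None, :] - cb[None, :, :]) ** 2).sum(-1)
        codes.append(d2.argmin(1))
    return np.stack(codes, axis=1)

def decode_pq(codes, codebooks):
    """Reconstruct approximate vectors from their codes."""
    return np.hstack([cb[codes[:, i]] for i, cb in enumerate(codebooks)])

def search_rerank(query, codes1, cb1, codes2, cb2, shortlist, topk):
    """Stage 1: rank by first-level estimates. Stage 2: re-rank the
    short-list with estimates refined by the residual codes."""
    # For clarity this decodes every first-level code; a real system would
    # use per-query lookup tables instead (an implementation detail this
    # sketch deliberately skips).
    approx = decode_pq(codes1, cb1)
    d1 = ((approx - query) ** 2).sum(1)
    cand = np.argsort(d1)[:shortlist]
    # Refined reconstruction = first-level estimate + quantized residual.
    refined = approx[cand] + decode_pq(codes2[cand], cb2)
    d2 = ((refined - query) ** 2).sum(1)
    return cand[np.argsort(d2)[:topk]]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    db = rng.standard_normal((1000, 16)).astype(np.float32)  # toy data
    cb1 = train_pq(db, m=4, k=16, rng=rng)
    codes1 = encode_pq(db, cb1)
    residuals = db - decode_pq(codes1, cb1)
    cb2 = train_pq(residuals, m=4, k=16, rng=rng)
    codes2 = encode_pq(residuals, cb2)
    print(search_rerank(db[0], codes1, cb1, codes2, cb2, shortlist=50, topk=5))
```

Only the two sets of codes and their codebooks are consulted at query time, so the original vectors never need to be read back, which is the property the abstract emphasizes. The short-list size trades accuracy against the cost of the second stage.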


research
03/15/2016

Scalable Image Retrieval by Sparse Product Quantization

Fast Approximate Nearest Neighbor (ANN) search technique for high-dimens...
research
04/18/2018

HD-Index: Pushing the Scalability-Accuracy Boundary for Approximate kNN Search in High-Dimensional Spaces

Nearest neighbor searching of large databases in high-dimensional spaces...
research
12/22/2014

Language Recognition using Random Indexing

Random Indexing is a simple implementation of Random Projections with a ...
research
04/24/2017

Accelerated Nearest Neighbor Search with Quick ADC

Efficient Nearest Neighbor (NN) search in high-dimensional spaces is a f...
research
10/17/2021

Low-Precision Quantization for Efficient Nearest Neighbor Search

Fast k-Nearest Neighbor search over real-valued vector spaces (KNN) is a...
research
08/02/2015

Indexing of CNN Features for Large Scale Image Search

Convolutional neural network (CNN) features which represent images with ...
research
04/26/2018

Link and code: Fast indexing with graphs and compact regression codes

Similarity search approaches based on graph walks have recently attained...
