Learning to Index for Nearest Neighbor Search

07/09/2018
by Chih-Yi Chiu, et al.

In this study, we present a novel ranking model based on learning the nearest neighbor relationships embedded in the index space. Given a query point, a conventional nearest neighbor search approach calculates the distances from the query to the cluster centroids and then ranks the clusters from near to far according to these distances. The data indexed in the top-ranked clusters are retrieved and treated as the nearest neighbor candidates for the query. However, the quantization loss between the data and the cluster centroids inevitably harms the search accuracy. To address this problem, the proposed model ranks clusters by their nearest neighbor probabilities rather than by the query-centroid distances. The nearest neighbor probabilities are estimated by employing neural networks to characterize the neighborhood relationships as a nonlinear function, i.e., the density distribution of nearest neighbors with respect to the query. The proposed probability-based ranking model can replace the conventional distance-based ranking model as a coarse filter for candidate clusters, and the nearest neighbor probability can be used to determine the quantity of data to retrieve from each candidate cluster. Our experimental results demonstrate that applying the proposed ranking model to two state-of-the-art nearest neighbor quantization and search methods effectively boosts search performance on billion-scale datasets.
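The contrast between the two coarse filters described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the network here is an untrained one-hidden-layer placeholder with random weights, the use of query-centroid distances as its input features is an assumption, and the names `nn_probabilities` and `allocate_budget`, along with all sizes, are hypothetical choices made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_clusters, hidden_units = 8, 16, 32   # illustrative sizes (assumptions)

centroids = rng.normal(size=(n_clusters, dim))
query = rng.normal(size=dim)

def rank_by_distance(query, centroids):
    """Conventional coarse filter: clusters ordered from nearest to farthest centroid."""
    dists = np.linalg.norm(centroids - query, axis=1)
    return np.argsort(dists)

def softmax(z):
    z = z - z.max()            # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Placeholder weights: a real model would be trained on (query, true-neighbor cluster) data.
W1 = rng.normal(size=(n_clusters, hidden_units))
W2 = rng.normal(size=(hidden_units, n_clusters))

def nn_probabilities(query, centroids):
    """Hypothetical stand-in for the learned model: one probability per cluster."""
    feats = np.linalg.norm(centroids - query, axis=1)  # assumed input features
    hidden = np.maximum(0.0, feats @ W1)               # ReLU hidden layer
    return softmax(hidden @ W2)

def rank_by_probability(probs):
    """Probability-based coarse filter: clusters ordered from most to least probable."""
    return np.argsort(-probs)

def allocate_budget(probs, top_clusters, budget):
    """Split a candidate budget across the chosen clusters in proportion to probability."""
    p = probs[top_clusters]
    return np.floor(p / p.sum() * budget).astype(int)

dist_rank = rank_by_distance(query, centroids)
probs = nn_probabilities(query, centroids)
prob_rank = rank_by_probability(probs)
quotas = allocate_budget(probs, prob_rank[:4], budget=100)
```

With trained weights, `prob_rank` would take the place of `dist_rank` when choosing which clusters to scan, and `quotas` would cap how many items are read from each chosen cluster.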
