Scaling Graph-Based ANNS Algorithms to Billion-Size Datasets: A Comparative Analysis

05/07/2023
by Magdalen Dobson, et al.

Algorithms for approximate nearest-neighbor search (ANNS) have been the topic of significant recent interest in the research community. However, evaluations of such algorithms are usually restricted to a small number of datasets with millions or tens of millions of points, whereas real-world applications require algorithms that work at the scale of billions of points. Furthermore, existing evaluations of ANNS algorithms typically focus heavily on measuring and optimizing queries per second (QPS) at a given accuracy, a metric that can be hardware-dependent and that ignores important factors such as build time. In this paper, we propose a set of principled measures for evaluating ANNS algorithms that refocus on their scalability to billion-size datasets. These measures include the ability to be parallelized efficiently, build times, and scaling relationships as dataset size increases. We also complement the QPS measure with machine-agnostic measures such as the number of distance computations per query, and we evaluate the accuracy of ANNS data structures in the more demanding settings required by modern applications, such as answering range queries and running on out-of-distribution data. We optimize four graph-based algorithms for the billion-scale setting, and in the process provide a general framework for making many incremental ANNS graph algorithms lock-free. We use our framework to evaluate these graph-based ANNS algorithms as well as two alternative approaches.
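To make the machine-agnostic measures mentioned in the abstract concrete, the following is a minimal sketch of how the number of distance computations per query and recall could be recorded for a graph search. It is not one of the four algorithms the paper optimizes: the toy exact k-NN graph, the generic beam search, and every function name here (`build_knn_graph`, `beam_search`, `recall_at_k`, `evaluate`) are assumptions made purely for illustration.

```python
import heapq
import math
import random


def brute_force_knn(points, query, k):
    """Exact k nearest neighbors by linear scan (used as ground truth)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return order[:k]


def build_knn_graph(points, degree):
    """Toy index (an assumption): link each point to its exact nearest neighbors."""
    graph = []
    for i, p in enumerate(points):
        nearest = brute_force_knn(points, p, degree + 1)
        graph.append([j for j in nearest if j != i][:degree])
    return graph


def beam_search(points, graph, query, k, beam_width, start=0):
    """Best-first graph search; returns (top-k ids, distance computations)."""
    dist_comps = 0

    def dist(i):
        nonlocal dist_comps
        dist_comps += 1  # hardware-independent cost counter
        return math.dist(points[i], query)

    d0 = dist(start)
    visited = {start}
    frontier = [(d0, start)]  # min-heap of candidates still to expand
    beam = [(-d0, start)]     # max-heap holding the best beam_width points so far
    while frontier:
        d, u = heapq.heappop(frontier)
        if len(beam) >= beam_width and d > -beam[0][0]:
            break  # closest unexpanded candidate is worse than the whole beam
        for v in graph[u]:
            if v in visited:
                continue
            visited.add(v)
            dv = dist(v)
            if len(beam) < beam_width or dv < -beam[0][0]:
                heapq.heappush(frontier, (dv, v))
                heapq.heappush(beam, (-dv, v))
                if len(beam) > beam_width:
                    heapq.heappop(beam)  # evict the farthest point
    best = sorted((-nd, i) for nd, i in beam)
    return [i for _, i in best[:k]], dist_comps


def recall_at_k(found, truth):
    """Fraction of the true k nearest neighbors that the search returned."""
    return len(set(found) & set(truth)) / len(truth)


def evaluate(points, graph, queries, k, beam_width):
    """Average recall and average distance computations per query."""
    total_recall = 0.0
    total_comps = 0
    for q in queries:
        found, comps = beam_search(points, graph, q, k, beam_width)
        total_recall += recall_at_k(found, brute_force_knn(points, q, k))
        total_comps += comps
    return total_recall / len(queries), total_comps / len(queries)


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]
queries = [(rng.random(), rng.random()) for _ in range(20)]
graph = build_knn_graph(points, degree=8)
for width in (5, 10, 20):
    recall, comps = evaluate(points, graph, queries, k=5, beam_width=width)
    print(f"beam_width={width}: recall={recall:.2f}, dist comps/query={comps:.1f}")
```

Sweeping `beam_width` as above traces a recall versus distance-computation curve, which can be compared across machines in a way that a raw QPS figure cannot.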


