LANNS: A Web-Scale Approximate Nearest Neighbor Lookup System

10/19/2020
by Ishita Doshi, et al.

Nearest neighbor search (NNS) has a wide range of applications in information retrieval, computer vision, machine learning, databases, and other areas. The existing state-of-the-art algorithm for nearest neighbor search, Hierarchical Navigable Small World Networks (HNSW), is unable to scale to large datasets of 100M records in high dimensions. In this paper, we propose LANNS, an end-to-end platform for Approximate Nearest Neighbor Search that scales to web-scale datasets. LANNS (Library for Large Scale Approximate Nearest Neighbor Search) is deployed in multiple production systems for identifying topK (100 ≤ topK ≤ 200) approximate nearest neighbors with a latency of a few milliseconds per query and a high throughput of 2.5k queries per second (QPS) on a single node, on large (∼180M data points), high-dimensional (50-2048 dimensional) datasets.
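
To make the topK task concrete, here is a minimal sketch of exact (brute-force) nearest neighbor search, the baseline that approximate methods such as HNSW and LANNS trade accuracy against. It is not the LANNS or HNSW algorithm. The function name `top_k_nearest` and the choice of Euclidean distance are assumptions made only for illustration.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def top_k_nearest(
    query: Sequence[float],
    points: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the k points closest to query.

    Hypothetical helper: exact linear scan with Euclidean distance
    (an assumption; the abstract does not name a metric).
    Results are ordered nearest first.
    """
    if k <= 0:
        return []
    # Score every point against the query; cost grows linearly with the
    # dataset size, which is why web-scale systems use approximate search.
    scored = ((i, math.dist(query, p)) for i, p in enumerate(points))
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
    print(top_k_nearest((0.0, 0.0), data, k=2))
```

A linear scan like this touches every record for every query, so at the scale the abstract describes (∼180M points, up to 2048 dimensions) it would not be expected to meet millisecond latencies; that gap is the motivation for approximate indexes.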

research
06/26/2018

Approximate Nearest Neighbor Search in High Dimensions

The nearest neighbor problem is defined as follows: Given a set P of n p...
research
08/25/2017

Subspace Approximation for Approximate Nearest Neighbor Search in NLP

Most natural language processing tasks can be formulated as the approxim...
research
01/17/2023

Custom 8-bit floating point value format for reducing shared memory bank conflict in approximate nearest neighbor search

The k-nearest neighbor search is used in various applications such as ma...
research
07/17/2019

The Role of Local Intrinsic Dimensionality in Benchmarking Nearest Neighbor Search

This paper reconsiders common benchmarking approaches to nearest neighbo...
research
07/12/2022

Accelerating Large-Scale Graph-based Nearest Neighbor Search on a Computational Storage Platform

K-nearest neighbor search is one of the fundamental tasks in various app...
research
12/02/2019

scikit-hubness: Hubness Reduction and Approximate Neighbor Search

This paper introduces scikit-hubness, a Python package for efficient nea...
research
09/09/2020

KNN-DBSCAN: a DBSCAN in high dimensions

Clustering is a fundamental task in machine learning. One of the most su...
