Norm-Ranging LSH for Maximum Inner Product Search

09/24/2018
by Xiao Yan, et al.

Neyshabur and Srebro proposed Simple-LSH, the state-of-the-art hashing method for maximum inner product search (MIPS) with a performance guarantee. We found that the performance of Simple-LSH, in both theory and practice, suffers from long tails in the 2-norm distribution of real datasets. We propose Norm-ranging LSH, which addresses the excessive normalization problem that long tails cause in Simple-LSH by partitioning a dataset into multiple sub-datasets and building a hash index for each sub-dataset independently. We prove that Norm-ranging LSH has lower query time complexity than Simple-LSH. We also show that the idea of partitioning the dataset can improve other hashing-based methods for MIPS. To support efficient query processing on the hash indexes of the sub-datasets, we formulate a novel similarity metric. Experiments show that Norm-ranging LSH achieves an order of magnitude speedup over Simple-LSH for the same recall, thus significantly benefiting applications that involve MIPS.
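
The partitioning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard Simple-LSH transform (scale each item by a maximum norm U, then append sqrt(1 - ||x||^2) so that every item has unit norm) followed by sign random projection. The function names, the equal-size split by sorted norm, and the use of independent random hyperplanes per sub-dataset are illustrative assumptions. The similarity metric used to rank buckets across sub-datasets is not reproduced here.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    return math.sqrt(dot(v, v))


def simple_lsh_transform(x, max_norm):
    """Scale x by max_norm, then append sqrt(1 - ||x||^2) to reach unit norm."""
    scaled = [xi / max_norm for xi in x]
    tail = math.sqrt(max(0.0, 1.0 - dot(scaled, scaled)))
    return scaled + [tail]


def query_transform(q):
    """Normalize the query to unit norm and append a zero."""
    n = norm(q)
    return [qi / n for qi in q] + [0.0]


def sign_hash(v, planes):
    """Sign random projection: one bit per hyperplane."""
    return tuple(1 if dot(p, v) >= 0 else 0 for p in planes)


def norm_range_partition(items, num_parts):
    """Sort items by 2-norm and cut them into roughly equal-size norm ranges."""
    ordered = sorted(items, key=norm)
    size = math.ceil(len(ordered) / num_parts)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def build_indexes(items, num_parts, num_bits, seed=0):
    """Build one hash table per sub-dataset, normalized by its own max norm."""
    rng = random.Random(seed)
    dim = len(items[0])
    indexes = []
    for part in norm_range_partition(items, num_parts):
        local_max = max(norm(x) for x in part)
        # Assumption: each sub-dataset draws its own hyperplanes.
        planes = [[rng.gauss(0.0, 1.0) for _ in range(dim + 1)]
                  for _ in range(num_bits)]
        table = {}
        for x in part:
            code = sign_hash(simple_lsh_transform(x, local_max), planes)
            table.setdefault(code, []).append(x)
        indexes.append({"max_norm": local_max, "planes": planes, "table": table})
    return indexes
```

With a global maximum norm, a single very large item forces almost all of the unit norm of a small item into the appended coordinate, so the transformed items look nearly identical to the hash functions. Normalizing each norm range by its own maximum keeps that coordinate smaller, which is the effect the partitioning is meant to achieve.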

research
10/22/2018

Norm-Range Partition: A Univiseral Catalyst for LSH based Maximum Inner Product Search (MIPS)

Recently, locality sensitive hashing (LSH) was shown to be effective for...
research
07/21/2015

Clustering is Efficient for Approximate Maximum Inner Product Search

Efficient Maximum Inner Product Search (MIPS) is an important task that ...
research
10/14/2021

Reverse Maximum Inner Product Search: How to efficiently find users who would like to buy my item?

The MIPS (maximum inner product search), which finds the item with the h...
research
09/30/2019

Understanding and Improving Proximity Graph based Maximum Inner Product Search

The inner-product navigable small world graph (ip-NSW) represents the st...
research
07/02/2020

Climbing the WOL: Training for Cheaper Inference

Efficient inference for wide output layers (WOLs) is an essential yet ch...
research
06/25/2019

Pyramid: A General Framework for Distributed Similarity Search

Similarity search is a core component in various applications such as im...
research
12/21/2020

Sublinear Maximum Inner Product Search using Concomitants of Extreme Order Statistics

We propose a novel dimensionality reduction method for maximum inner pro...
