Pigeonring: A Principle for Faster Thresholded Similarity Search

04/04/2018
by   Jianbin Qin, et al.

The pigeonhole principle states that if n items are contained in m boxes, then at least one box has no fewer than n/m items. It is used to solve many data management problems, especially thresholded similarity searches. Despite the many pigeonhole principle-based solutions proposed in the last few decades, the condition stated by the principle is weak: it only constrains the number of items in a single box. By organizing the boxes in a ring, we observe that the number of items in multiple boxes is also constrained. We propose a new principle, called the pigeonring principle, which formally captures such constraints and yields stronger conditions. To utilize the pigeonring principle, we focus on problems defined in the form of identifying data objects whose similarities or distances to the query are constrained by a threshold. Many solutions to these problems use the pigeonhole principle to find candidates that satisfy a filtering condition. By the pigeonring principle, stronger filtering conditions can be established. We show that the pigeonhole principle is a special case of the pigeonring principle. This suggests that all solutions based on the pigeonhole principle can potentially be accelerated by the pigeonring principle. We introduce a universal filtering framework that encompasses the pigeonring-based solutions to these problems. In addition, we discuss how to quickly find the candidates specified by the pigeonring principle with minor modifications to existing algorithms. Experimental results on real datasets demonstrate the applicability of the pigeonring principle as well as the superior performance of the algorithms based on it.
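The abstract does not spell out the formal statement, so the following is only an illustrative sketch of how a ring can constrain several boxes at once. It assumes one natural reading: some box exists from which every run of l consecutive boxes (wrapping around the ring) holds at least l·n/m items. The function names `pigeonhole_boxes` and `ring_starts` are hypothetical and do not come from the paper.

```python
def pigeonhole_boxes(boxes):
    """Indices of boxes holding at least the average n/m items."""
    m, n = len(boxes), sum(boxes)
    # Compare b >= n / m using integers only, to avoid rounding issues.
    return [i for i, b in enumerate(boxes) if b * m >= n]


def ring_starts(boxes):
    """Indices i such that every wrapped run of l boxes starting at i
    (for l = 1..m) holds at least l * n / m items.

    Assumed reading of the ring constraint, checked by brute force.
    """
    m, n = len(boxes), sum(boxes)
    starts = []
    for i in range(m):
        running = 0
        ok = True
        for l in range(1, m + 1):
            running += boxes[(i + l - 1) % m]
            # running >= l * n / m, again in integer arithmetic
            if running * m < l * n:
                ok = False
                break
        if ok:
            starts.append(i)
    return starts


boxes = [3, 0, 2, 1, 4]          # n = 10 items in m = 5 boxes
print(pigeonhole_boxes(boxes))   # [0, 2, 4]
print(ring_starts(boxes))        # [4]
```

In this toy ring the single-box condition admits three boxes, while the multi-box condition admits only one. With l = 1 the ring check reduces to the single-box check, which is consistent with the claim that the pigeonhole principle is a special case.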


