The SuperM-Tree: Indexing metric spaces with sized objects

01/31/2019
by Jörg P. Bachmann, et al.

A common approach to implementing similarity search applications is to use distance functions, where small distances indicate high similarity. For metric distance functions, metric index structures can be used to accelerate nearest neighbor queries. Many applications, however, ask for approximate subsequences or subsets, e.g. searching for a similar partial sequence of a gene, for a similar scene in a movie, or for a similar object in a picture that is represented by a set of multidimensional features. Metric index structures such as the M-Tree cannot be used for these tasks because metric distance functions are symmetric. In this work, we propose the SuperM-Tree as an extension of the M-Tree in which approximate subsequence and subset queries become nearest neighbor queries. To this end, we introduce metric subset spaces as a generalization of metric spaces. Various metric distance functions can be extended to metric subset distance functions, e.g. the Euclidean distance (on windows), the Hausdorff distance (on subsets), and the Edit distance and the Dog-Keeper distance (on subsequences). We show that these examples subsume the applications mentioned above.
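To make the role of asymmetry concrete, here is a minimal sketch of two one-sided distances in the spirit of the examples named above: a directed Hausdorff distance (query set against a larger set) and a best-window Euclidean distance (query against a longer sequence). These are common textbook constructions used here only for illustration; they are an assumption and not necessarily the definitions of metric subset distance functions given in the paper. The function names `directed_hausdorff` and `window_distance` are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def directed_hausdorff(query, data):
    """Largest distance from any query point to its nearest data point.

    Illustrative assumption, not the paper's definition: the value is 0
    whenever every query point also occurs in `data`, so the function is
    not symmetric in its arguments.
    """
    return max(min(euclidean(q, x) for x in data) for q in query)


def window_distance(query, series):
    """Smallest Euclidean distance between `query` and any window of
    `series` that has the same length as `query` (illustrative assumption)."""
    n = len(query)
    return min(
        euclidean(query, series[i:i + n])
        for i in range(len(series) - n + 1)
    )


# A small "object" that is fully contained in a larger "picture".
picture = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
obj = [(0.0, 0.0), (1.0, 0.0)]

print(directed_hausdorff(obj, picture))   # 0.0: the object is a subset
print(directed_hausdorff(picture, obj))   # > 0: the picture is not a subset

# A short query that occurs exactly as a window of a longer sequence.
print(window_distance([2.0, 3.0], [1.0, 2.0, 3.0, 4.0]))  # 0.0
```

In both cases the distance from the small query to the larger object is zero while the reverse direction is positive (or undefined for windows), which is the kind of behavior a symmetric metric cannot express.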

research
06/27/2023

Approximate Nearest Neighbor Searching with Non-Euclidean and Weighted Distances

We present a new approach to approximate nearest-neighbor queries in fix...
research
01/12/2018

Toward Metric Indexes for Incremental Insertion and Querying

In this work we explore the use of metric index structures, which accele...
research
12/20/2021

The Cascading Metric Tree

This paper presents the Cascaded Metric Tree (CMT) for efficient satisfa...
research
01/25/2019

Metric Spaces with Expensive Distances

In algorithms for finite metric spaces, it is common to assume that the ...
research
01/15/2020

Complete and Sufficient Spatial Domination of Multidimensional Rectangles

Rectangles are used to approximate objects, or sets of objects, in a ple...
research
10/08/2019

Accurate and Fast Retrieval for Complex Non-metric Data via Neighborhood Graphs

We demonstrate that a graph-based search algorithm-relying on the constr...
research
08/18/2022

Learned Indexing in Proteins: Extended Work on Substituting Complex Distance Calculations with Embedding and Clustering Techniques

Despite the constant evolution of similarity searching research, it cont...
