Navigable Proximity Graph-Driven Native Hybrid Queries with Structured and Unstructured Constraints

03/25/2022
by Mengzhao Wang, et al.

As research interest surges, vector similarity search is being applied in many fields, including data mining, computer vision, and information retrieval. Given a set of objects (e.g., a set of images) and a query object, each object can be transformed into a feature vector, and vector similarity search then retrieves the most similar objects. However, plain vector similarity search does not support hybrid queries well. In a hybrid query, the user supplies not only an unstructured constraint (i.e., the feature vector of the query object) but also a structured constraint (i.e., the desired attributes of interest). Hybrid query processing aims to identify the objects whose feature vectors are similar to that of the query object and that also satisfy the given attribute constraints. Recent efforts answer a hybrid query by performing attribute filtering and vector similarity search separately and then merging the results, which limits efficiency and accuracy because these components are not purpose-built for hybrid queries. In this paper, we propose a native hybrid query (NHQ) framework based on the proximity graph (PG), which provides a specialized composite index and joint pruning modules for hybrid queries. Existing PGs can be readily deployed on this framework to process hybrid queries efficiently. Moreover, we present two novel navigable PGs (NPGs) with optimized edge selection and routing strategies, which obtain better overall performance than existing PGs. We then deploy the proposed NPGs in NHQ to form two hybrid query methods, which significantly outperform the state-of-the-art competitors on all experimental datasets (10× faster at the same recall), comprising eight public datasets and one in-house real-world dataset. Our code and datasets have been released at <https://github.com/AshenOn3/NHQ>.
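
To make the hybrid query definition above concrete, here is a minimal brute-force sketch in Python. It is not the NHQ framework and not code from the released repository. It only illustrates what a hybrid query asks for: the objects that match the structured constraint, ranked by vector distance. The record layout, the exact-match attribute test, the Euclidean distance, and the function names are all assumptions made for this illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Hypothetical record layout: (feature vector, attribute dictionary).
Record = Tuple[Sequence[float], Dict[str, str]]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hybrid_query_bruteforce(
    records: List[Record],
    query_vec: Sequence[float],
    query_attrs: Dict[str, str],
    k: int,
) -> List[int]:
    """Return indices of the k nearest records that satisfy query_attrs."""
    # Structured constraint: keep records whose attributes all match exactly.
    candidates = [
        i
        for i, (_, attrs) in enumerate(records)
        if all(attrs.get(key) == val for key, val in query_attrs.items())
    ]
    # Unstructured constraint: rank the survivors by distance to the query.
    candidates.sort(key=lambda i: euclidean(records[i][0], query_vec))
    return candidates[:k]


records = [
    ([0.0, 0.0], {"color": "red"}),
    ([1.0, 1.0], {"color": "blue"}),
    ([2.0, 2.0], {"color": "red"}),
    ([0.1, 0.1], {"color": "blue"}),
]
nearest_blue = hybrid_query_bruteforce(records, [0.0, 0.0], {"color": "blue"}, k=2)
```

This exhaustive scan touches every record, so it serves only as a correctness reference for small data. The abstract's argument is that indexes built for hybrid queries can avoid this cost while keeping recall high.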

research
07/16/2022

HQANN: Efficient and Robust Similarity Search for Hybrid Queries with Structured and Unstructured Constraints

The in-memory approximate nearest neighbor search (ANNS) algorithms have...
research
04/04/2023

High-Throughput Vector Similarity Search in Knowledge Graphs

There is an increasing adoption of machine learning for encoding data in...
research
12/18/2018

Index-based, High-dimensional, Cosine Threshold Querying with Optimality Guarantees

Given a database of vectors, a cosine threshold query returns all vector...
research
08/08/2021

Fairest Neighbors: Tradeoffs Between Metric Queries

Metric search commonly involves finding objects similar to a given sampl...
research
02/28/2023

WISK: A Workload-aware Learned Index for Spatial Keyword Queries

Spatial objects often come with textual information, such as Points of I...
research
07/05/2021

PandaDB: Understanding Unstructured Data in Graph Database

At present, graph model is widely used in many applications, such as kno...
research
11/03/2020

Memory-Efficient RkNN Retrieval by Nonlinear k-Distance Approximation

The reverse k-nearest neighbor (RkNN) query is an established query type...
