Eclipse: Generalizing kNN and Skyline

06/14/2019
by Jinfei Liu, et al.

k nearest neighbor (kNN) queries and skyline queries are important operators on multi-dimensional data points. Given a query point, a kNN query returns the k nearest neighbors based on a scoring function such as a weighted sum of the attributes, which requires predefined attribute weights (or preferences). A skyline query returns all possible nearest neighbors for any monotonic scoring function without requiring attribute weights, but the number of returned points can be prohibitively large. We observe that both kNN and skyline are inflexible and cannot be easily customized. In this paper, we propose a novel eclipse operator that generalizes the classic 1NN and skyline queries and provides a more flexible and customizable query solution for users. In eclipse, users can specify rough and customizable attribute preferences and control the number of returned points. We show that both 1NN and skyline are instantiations of eclipse. To process eclipse queries, we propose a baseline algorithm with time complexity $O(n^2 2^{d-1})$ and an improved transformation-based algorithm with time complexity $O(n \log^{d-1} n)$, where n is the number of points and d is the number of dimensions. Furthermore, we propose a novel index-based algorithm utilizing duality transform with much better efficiency. The experimental results on the real NBA dataset and the synthetic datasets demonstrate the effectiveness of the eclipse operator and the efficiency of our eclipse algorithms.
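
To make the three operators concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm. It assumes the query point is the origin, that smaller attribute values are better, and that the "rough" preferences are given as ranges on the weight ratios w[j] / w[d-1]. The abstract does not spell out this formulation, and the function names and sample points are hypothetical. Because a weighted sum is linear in the ratios, comparing two points at the corners of the ratio box is enough to compare them over the whole box.

```python
from itertools import product

def score(p, w):
    """Weighted sum of attributes; lower means closer to the origin."""
    return sum(wi * pi for wi, pi in zip(w, p))

def nearest_neighbor(points, w):
    """1NN under one fixed weight vector w."""
    return min(points, key=lambda p: score(p, w))

def dominates(p, q):
    """Classic skyline dominance: p is no worse everywhere and better somewhere."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    """Points not dominated by any other point (naive pairwise check)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

def eclipse_dominates(p, q, ratio_ranges):
    """Assumed eclipse dominance: p scores no worse than q for every weight
    vector whose ratios w[j] / w[d-1] lie in ratio_ranges[j] = (lo, hi).
    Only the 2^(d-1) corners of the ratio box are checked. Requiring one
    strict inequality is a choice made here so duplicates do not eliminate
    each other."""
    strictly_better = False
    for corner in product(*ratio_ranges):
        w = (*corner, 1.0)  # last weight fixed to 1, the others are ratios
        sp, sq = score(p, w), score(q, w)
        if sp > sq:
            return False
        if sp < sq:
            strictly_better = True
    return strictly_better

def eclipse(points, ratio_ranges):
    """Points not eclipse-dominated by any other point (naive pairwise check)."""
    return [p for p in points
            if not any(eclipse_dominates(q, p, ratio_ranges) for q in points)]

# Made-up 2-D sample data, for illustration only.
points = [(1, 6), (2, 3), (4, 2), (6, 1), (5, 5)]

nn = nearest_neighbor(points, (1.0, 1.0))   # exact weights
sky = skyline(points)                       # no weights at all
ecl = eclipse(points, [(0.25, 2.0)])        # rough preference: ratio in [0.25, 2]
ecl_exact = eclipse(points, [(1.0, 1.0)])   # degenerate range: a single ratio
```

On this sample data, the exact weights select the single point (2, 3), the skyline keeps four of the five points, and the ratio range [0.25, 2] keeps three, which sits between the two extremes. Collapsing the range to a single ratio reproduces the 1NN answer. The pairwise check is quadratic in n and exponential in d, so it is suitable only for small examples.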


