Improved Search of Relevant Points for Nearest-Neighbor Classification

03/07/2022
by Alejandro Flores-Velazco, et al.

Given a training set P ⊂ ℝ^d, the nearest-neighbor classifier assigns any query point q ∈ ℝ^d to the class of its closest point in P. To answer these classification queries, some training points are more relevant than others. We say a training point is relevant if its omission from the training set could induce the misclassification of some query point in ℝ^d. These relevant points are commonly known as border points, as they define the boundaries of the Voronoi diagram of P that separate points of different classes. Being able to compute this set of points efficiently is crucial to reduce the size of the training set without affecting the accuracy of the nearest-neighbor classifier. Improving over a decades-old result by Clarkson, a recent paper by Eppstein proposed an output-sensitive algorithm that finds the set of border points of P in O(n^2 + nk^2) time, where k is the size of this set. In this paper, we improve this algorithm to run in O(nk^2) time by proving that its first steps, which require O(n^2) time, are unnecessary.
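To make the definition of border points concrete, here is a minimal Python sketch restricted to the one-dimensional case (d = 1). It is not the algorithm of Clarkson, Eppstein, or this paper. It relies on the assumption that the training points are distinct, so that Voronoi cells are intervals and two cells are adjacent exactly when their points are consecutive in sorted order. The names `nn_classify` and `border_points_1d`, and the sample data, are hypothetical and chosen only for illustration.

```python
from math import dist


def nn_classify(training, q):
    """Return the label of the training point closest to query q."""
    _, label = min(training, key=lambda pl: dist(pl[0], q))
    return label


def border_points_1d(training):
    """Border points of a 1-D training set with distinct coordinates.

    A point is kept when a neighbor in sorted order has a different class,
    i.e. when its Voronoi cell (an interval) touches a cell of another class.
    """
    pts = sorted(training, key=lambda pl: pl[0][0])
    border = []
    for i, (p, label) in enumerate(pts):
        left_differs = i > 0 and pts[i - 1][1] != label
        right_differs = i + 1 < len(pts) and pts[i + 1][1] != label
        if left_differs or right_differs:
            border.append((p, label))
    return border


# Hypothetical training set: (point, class) pairs on the real line.
training = [
    ((0.0,), "red"), ((1.0,), "red"), ((2.0,), "blue"),
    ((3.0,), "blue"), ((4.0,), "blue"), ((5.0,), "red"),
]
border = border_points_1d(training)
print([p[0] for p, _ in border])  # [1.0, 2.0, 4.0, 5.0]
```

In this example the points at 0.0 and 3.0 are not border points: dropping them leaves every class boundary (at 1.5 and 4.5) where it was, so the reduced set classifies every query as the full set does. In higher dimensions, adjacency of Voronoi cells can no longer be read off a sorted order, which is where the output-sensitive algorithms discussed above come in.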


research
10/12/2021

Finding Relevant Points for Nearest-Neighbor Classification

In nearest-neighbor classification problems, a set of d-dimensional trai...
research
06/14/2019

Eclipse: Generalizing kNN and Skyline

k nearest neighbor (kNN) queries and skyline queries are important opera...
research
12/14/2017

Adaptive kNN using Expected Accuracy for Classification of Geo-Spatial Data

The k-Nearest Neighbor (kNN) classification approach is conceptually sim...
research
12/04/2018

Skyline Diagram: Efficient Space Partitioning for Skyline Queries

Skyline queries are important in many application domains. In this paper...
research
09/28/2018

Predicting Destinations by Nearest Neighbor Search on Training Vessel Routes

The DEBS Grand Challenge 2018 is set in the context of maritime route pr...
research
03/18/2021

Nearest-Neighbor Queries in Customizable Contraction Hierarchies and Applications

Customizable contraction hierarchies are one of the most popular route p...
research
10/19/2018

Stochastic temporal data upscaling using the generalized k-nearest neighbor algorithm

Three methods of temporal data upscaling, which may collectively be call...
