Social Distancing is Good for Points too!

06/28/2020
by Alejandro Flores-Velazco, et al.

The nearest-neighbor rule is a well-known classification technique that, given a training set P of labeled points, classifies any unlabeled query point with the label of its closest point in P. The nearest-neighbor condensation problem aims to reduce the size of the training set without harming the accuracy of the nearest-neighbor rule. FCNN is the most popular algorithm for condensation. It is heuristic in nature, and theoretical results for it are scarce. In this paper, we settle the question of whether reasonable upper bounds can be proven for the size of the subset selected by FCNN. First, we show that the algorithm can behave poorly when points are too close to each other, forcing it to select many more points than necessary. We then modify the algorithm to avoid such cases by requiring that selected points "keep some distance" from one another. This modification is sufficient to prove useful upper bounds on the size of the selected subset, along with approximation guarantees for the algorithm.
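
As an illustration of the nearest-neighbor rule described above, the following is a minimal sketch in Python. It is not the FCNN algorithm or the modification proposed in the paper. The names `nn_label` and `is_consistent` are hypothetical, the Euclidean metric is an assumption, and "without harming the accuracy" is interpreted here as the common consistency requirement (the reduced set still classifies every point of P correctly), which is also an assumption.

```python
import math

def nn_label(train, query):
    """Nearest-neighbor rule: return the label of the point in `train`
    closest to `query` (Euclidean distance is an assumption)."""
    _, label = min(train, key=lambda pl: math.dist(pl[0], query))
    return label

def is_consistent(subset, train):
    """Check whether `subset` classifies every point of `train` correctly
    under the nearest-neighbor rule (an assumed reading of 'no harm')."""
    return all(nn_label(subset, point) == label for point, label in train)

# Toy training set P: two well-separated classes in the plane.
P = [
    ((0.0, 0.0), "red"),
    ((0.2, 0.1), "red"),
    ((1.0, 1.0), "blue"),
    ((1.1, 0.9), "blue"),
]

# A candidate condensed subset R with one point per class.
R = [P[0], P[2]]

print(nn_label(P, (0.1, 0.0)))   # label of the closest training point
print(is_consistent(R, P))       # does R still classify all of P correctly?
```

In this toy example one point per class suffices. The abstract's observation is that when points of different classes lie very close together, a heuristic may be forced to keep many more points than necessary.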


research
09/29/2013

An upper bound on prototype set size for condensed nearest neighbor

The condensed nearest neighbor (CNN) algorithm is a heuristic for reduci...
research
04/27/2019

Guarantees on Nearest-Neighbor Condensation heuristics

The problem of nearest-neighbor (NN) condensation aims to reduce the siz...
research
02/16/2020

Coresets for the Nearest-Neighbor Rule

The problem of nearest-neighbor condensation deals with finding a subset...
research
02/04/2023

Reducing Nearest Neighbor Training Sets Optimally and Exactly

In nearest-neighbor classification, a training set P of points in ℝ^d wi...
research
09/25/2019

On the Contact and Nearest-Neighbor Distance Distributions for the n-Dimensional Matern Cluster Process

This letter provides exact characterization of the contact and nearest-n...
research
09/28/2018

Predicting Destinations by Nearest Neighbor Search on Training Vessel Routes

The DEBS Grand Challenge 2018 is set in the context of maritime route pr...
research
06/16/2022

Active Nearest Neighbor Regression Through Delaunay Refinement

We introduce an algorithm for active function approximation based on nea...
