Choice of neighbor order in nearest-neighbor classification

10/29/2008
by Peter Hall, et al.

The kth-nearest neighbor rule is arguably the simplest and most intuitively appealing nonparametric classification procedure. However, application of this method is inhibited by lack of knowledge about its properties, in particular, about the manner in which it is influenced by the value of k; and by the absence of techniques for empirical choice of k. In the present paper we detail the way in which the value of k determines the misclassification error. We consider two models, Poisson and Binomial, for the training samples. Under the first model, data are recorded in a Poisson stream and are "assigned" to one or other of the two populations in accordance with the prior probabilities. In particular, the total number of data in both training samples is a Poisson-distributed random variable. Under the Binomial model, however, the total number of data in the training samples is fixed, although again each data value is assigned in a random way. Although the values of risk and regret associated with the Poisson and Binomial models are different, they are asymptotically equivalent to first order, and also to the risks associated with kernel-based classifiers that are tailored to the case of two derivatives. These properties motivate new methods for choosing the value of k.
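As a companion to the abstract's point about empirical choice of k, the sketch below shows one common surrogate for that choice: picking the neighbor order with the smallest cross-validated misclassification rate on a synthetic two-population sample whose labels are assigned at random according to a prior, loosely mirroring the Binomial model described above. This is an illustrative assumption-laden example, not the authors' proposed method; the Gaussian class-conditional densities, the prior of 0.5, the sample size, and the candidate range of k are all made up for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Synthetic training sample loosely following the Binomial model described above:
# the total sample size n is fixed, and each observation is assigned to
# population 0 or 1 at random according to the prior probability.
rng = np.random.default_rng(0)
n, prior = 500, 0.5                      # assumed values, for illustration only
labels = rng.binomial(1, prior, size=n)  # random population assignment
# Assumed class-conditional densities: two shifted bivariate Gaussians.
X = rng.normal(loc=0.0, scale=1.0, size=(n, 2)) + labels[:, None] * 1.5
y = labels

# Empirical choice of k: select the neighbor order with the smallest
# cross-validated misclassification rate (a simple surrogate for risk).
candidate_ks = range(1, 51, 2)           # odd k avoids ties in a two-class problem
cv_error = {
    k: 1.0 - cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    for k in candidate_ks
}
best_k = min(cv_error, key=cv_error.get)
print(f"chosen k = {best_k}, estimated misclassification rate = {cv_error[best_k]:.3f}")
```

Cross-validation is only one practical stand-in for the risk and regret quantities analyzed in the paper; the theoretical results there motivate choices of k beyond this simple data-driven procedure.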

