k-Nearest Neighbour Classification of Datasets with a Family of Distances

11/29/2015
by Stan Hatko, et al.

The k-nearest neighbour (k-NN) classifier is one of the oldest and most important supervised learning algorithms for classifying datasets. Traditionally, the Euclidean norm is used as the distance for the k-NN classifier. In this thesis we investigate the use of alternative distances for the k-NN classifier. We start by introducing some background notions in statistical machine learning. We define the k-NN classifier and discuss Stone's theorem and the proof that k-NN is universally consistent on the normed space R^d. We then prove that k-NN is universally consistent if we take a sequence of random norms (independent of the sample and the query) from a family of norms that satisfies a particular boundedness condition. We extend this result by replacing norms with distances based on uniformly locally Lipschitz functions that satisfy certain conditions. We discuss the limitations of Stone's lemma and Stone's theorem, particularly with respect to quasinorms and to choosing a distance for k-NN adaptively based on the labelled sample. We show the universal consistency of a two-stage k-NN-type classifier in which the distance is selected adaptively based on a split labelled sample and the query. We conclude with examples in which the above techniques improve the accuracy of classifying various datasets.
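
As a concrete illustration of the setting described above (and not the thesis's own construction), the following sketch implements a two-stage k-NN classifier in Python with NumPy: one half of the labelled sample is used to select a norm from a small family of l_p norms, and the held-out half is used for the final k-nearest-neighbour vote on the query. The function names knn_predict and two_stage_knn, the family (p = 1, 2, 3, infinity) and the leave-one-out selection rule are illustrative assumptions, not details taken from the thesis.

    import numpy as np

    def knn_predict(X_train, y_train, x_query, k, p):
        # Majority vote among the k nearest neighbours of x_query under the l_p norm.
        dists = np.linalg.norm(X_train - x_query, ord=p, axis=1)
        nearest = np.argsort(dists)[:k]
        votes = np.bincount(y_train[nearest])
        return int(np.argmax(votes))

    def two_stage_knn(X, y, x_query, k=5, p_family=(1.0, 2.0, 3.0, np.inf)):
        # Stage 1: choose a distance on one half of the sample.
        # Stage 2: k-NN vote with that distance on the other half.
        n = len(X) // 2
        X_sel, y_sel = X[:n], y[:n]    # half used to select the distance
        X_cls, y_cls = X[n:], y[n:]    # half used for the final vote
        best_p, best_acc = p_family[0], -1.0
        for p in p_family:
            # Leave-one-out accuracy on the selection half, used as a simple selection criterion.
            hits = sum(
                knn_predict(np.delete(X_sel, i, axis=0), np.delete(y_sel, i),
                            X_sel[i], k, p) == y_sel[i]
                for i in range(n)
            )
            if hits / n > best_acc:
                best_p, best_acc = p, hits / n
        return knn_predict(X_cls, y_cls, x_query, k, best_p)

    # Toy usage on synthetic two-class Gaussian data.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (50, 4)), rng.normal(1.5, 1.0, (50, 4))])
    y = np.repeat([0, 1], 50)
    perm = rng.permutation(100)
    X, y = X[perm], y[perm]
    print(two_stage_knn(X, y, x_query=np.full(4, 1.5)))

Selecting the distance on a portion of the sample that is disjoint from the portion used for the final vote mirrors the sample-splitting device mentioned in the abstract; keeping the two roles separate is what allows the distance to depend on the data without breaking the consistency argument.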


Related research

02/28/2020
Universal consistency of the k-NN rule in metric spaces and Nagata dimension
The k nearest neighbour learning rule (under the uniform distance tie br...

12/21/2021
Combining Minkowski and Chebyshev: New distance proposal and survey of distance metrics using k-nearest neighbours classifier
This work proposes a distance that combines Minkowski and Chebyshev dist...

10/19/2011
Is the k-NN classifier in high dimensions affected by the curse of dimensionality?
There is an increasing body of evidence suggesting that exact nearest ne...

08/29/2022
Learned k-NN Distance Estimation
Big data mining is well known to be an important task for data science, ...

09/10/2020
Universal consistency of Wasserstein k-NN classifier
The Wasserstein distance provides a notion of dissimilarities between pr...

07/01/2014
A Bayes consistent 1-NN classifier
We show that a simple modification of the 1-nearest neighbor classifier ...

06/17/2008
Supervised functional classification: A theoretical remark and some comparisons
The problem of supervised classification (or discrimination) with functi...
