Optimal choice of k for k-nearest neighbor regression

09/12/2019
by Mona Azadkia, et al.

The k-nearest neighbor algorithm (k-NN) is a widely used non-parametric method for classification and regression. We study the mean squared error of the k-NN regression estimator when k is chosen by leave-one-out cross-validation (LOOCV). Although this choice of k was known to be asymptotically consistent, it was not previously known to be optimal. We show that, with high probability, the mean squared error of the resulting estimator is close to the minimum mean squared error attainable by a k-NN estimator, where the minimum is taken over all choices of k.
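The selection rule analyzed in the abstract is simple to state concretely: fit the k-NN regressor for each candidate k, estimate its risk by leave-one-out cross-validation, and keep the minimizer. Below is a minimal sketch of that procedure, assuming scikit-learn; the synthetic data, the candidate grid of k values, and the helper loocv_mse are illustrative choices of ours, not taken from the paper.

```python
# Sketch: choose k for k-NN regression by leave-one-out cross-validation
# (LOOCV), the selection rule whose MSE the paper analyzes.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 1))                                  # covariates
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=200)   # noisy responses

def loocv_mse(k):
    """Leave-one-out estimate of the MSE of the k-NN regressor."""
    model = KNeighborsRegressor(n_neighbors=k)
    scores = cross_val_score(model, X, y, cv=LeaveOneOut(),
                             scoring="neg_mean_squared_error")
    return -scores.mean()

ks = range(1, 51)                     # candidate values of k
errors = [loocv_mse(k) for k in ks]
k_hat = ks[int(np.argmin(errors))]    # LOOCV choice of k
print(f"LOOCV-selected k = {k_hat}, estimated MSE = {min(errors):.4f}")
```

One design note: for k-NN specifically, LOOCV does not require refitting per fold, since the leave-one-out prediction at a training point is just the average over its k nearest neighbors excluding the point itself; the sketch above trades that shortcut for clarity.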


