Stabilized Nearest Neighbor Classifier and Its Statistical Properties

05/26/2014
by Wei Sun, et al.

The stability of a statistical analysis is an important indicator of reproducibility, a core principle of the scientific method. Stability entails that similar statistical conclusions can be reached from independent samples drawn from the same underlying population. In this paper, we introduce a general measure of classification instability (CIS) to quantify the sampling variability of the predictions made by a classification method. Interestingly, the asymptotic CIS of any weighted nearest neighbor classifier turns out to be proportional to the Euclidean norm of its weight vector. Based on this concise form, we propose a stabilized nearest neighbor (SNN) classifier, which distinguishes itself from other nearest neighbor classifiers by taking stability into consideration. In theory, we prove that SNN attains the minimax optimal convergence rate in risk and a sharp convergence rate in CIS. The latter rate result is established for general plug-in classifiers under a low-noise condition. Extensive simulated and real examples demonstrate that SNN achieves a considerable improvement in CIS over existing nearest neighbor classifiers, with comparable classification accuracy. We implement the algorithm in a publicly available R package snn.
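To make the link between a weight vector and instability concrete, here is a minimal Python sketch, not the authors' implementation and not the API of the snn package. It assumes binary labels in {0, 1}, and it reads CIS as the expected disagreement rate between two classifiers trained on independent samples from the same population; that reading, the toy Gaussian population, and the helper names `wnn_predict`, `knn_weights`, `sample`, and `empirical_cis` are all assumptions made for illustration. Only plain k-nearest-neighbor weights are shown, since the abstract does not state the SNN weights.

```python
import numpy as np


def wnn_predict(X_train, y_train, X_test, weights):
    """Weighted nearest neighbor rule for labels in {0, 1}.

    weights[i] is the weight given to the (i+1)-th nearest neighbor;
    the weights are assumed to be nonnegative and to sum to 1.
    """
    weights = np.asarray(weights, dtype=float)
    k = len(weights)
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k nearest training points, nearest first.
    nn = np.argsort(d2, axis=1)[:, :k]
    # Weighted vote over the neighbor labels; predict 1 if it reaches 1/2.
    score = (y_train[nn] * weights).sum(axis=1)
    return (score >= 0.5).astype(int)


def knn_weights(k):
    """Uniform weights of the ordinary k-nearest-neighbor classifier."""
    return np.full(k, 1.0 / k)


def sample(n, rng):
    """Hypothetical toy population: two overlapping Gaussian classes in 2D."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 2)) + 1.0 * y[:, None]
    return X, y


def empirical_cis(weights, n=200, n_test=500, reps=20, seed=0):
    """Monte Carlo estimate of instability (assumed definition).

    Trains the same rule on two independent samples of size n and
    averages the fraction of test points on which predictions differ.
    """
    rng = np.random.default_rng(seed)
    X_test, _ = sample(n_test, rng)
    disagreements = []
    for _ in range(reps):
        X1, y1 = sample(n, rng)
        X2, y2 = sample(n, rng)
        p1 = wnn_predict(X1, y1, X_test, weights)
        p2 = wnn_predict(X2, y2, X_test, weights)
        disagreements.append(np.mean(p1 != p2))
    return float(np.mean(disagreements))


if __name__ == "__main__":
    for k in (1, 5, 25):
        w = knn_weights(k)
        print(k, np.linalg.norm(w), empirical_cis(w))
```

For uniform k-nearest-neighbor weights the Euclidean norm is 1/sqrt(k), so printing the norm next to the estimated disagreement rate for several values of k gives an informal way to compare the two quantities on this toy population. It is a finite-sample illustration only and does not verify the asymptotic result.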

research
09/03/2019

Rates of Convergence for Large-scale Nearest Neighbor Classification

Nearest neighbor is a popular class of classification methods with many ...
research
06/22/2022

Nearest Neighbor Classification based on Imbalanced Data: A Statistical Approach

In a classification problem, where the competing classes are not of comp...
research
08/20/2019

Multi-hypothesis classifier

Accuracy is the most important parameter among few others which defines ...
research
11/19/2022

A Two-Stage Active Learning Algorithm for k-Nearest Neighbors

We introduce a simple and intuitive two-stage active learning algorithm ...
research
05/29/2019

An adaptive nearest neighbor rule for classification

We introduce a variant of the k-nearest neighbor classifier in which k i...
research
08/03/2023

Minimax Optimal Q Learning with Nearest Neighbors

Q learning is a popular model free reinforcement learning method. Most o...
research
02/26/2022

Enhanced Nearest Neighbor Classification for Crowdsourcing

In machine learning, crowdsourcing is an economical way to label a large...