Stabilized Nearest Neighbor Classifier and Its Statistical Properties

by Wei Sun, et al.
Purdue University
Binghamton University

The stability of a statistical analysis is an important indicator of reproducibility, which is a main principle of the scientific method. Stability entails that similar statistical conclusions can be reached from independent samples drawn from the same underlying population. In this paper, we introduce a general measure of classification instability (CIS) to quantify the sampling variability of the predictions made by a classification method. Interestingly, the asymptotic CIS of any weighted nearest neighbor classifier turns out to be proportional to the Euclidean norm of its weight vector. Based on this concise form, we propose a stabilized nearest neighbor (SNN) classifier, which distinguishes itself from other nearest neighbor classifiers by taking stability into consideration. In theory, we prove that SNN attains the minimax optimal convergence rate in risk and a sharp convergence rate in CIS. The latter rate result is established for general plug-in classifiers under a low-noise condition. Extensive simulated and real examples demonstrate that SNN achieves a considerable improvement in CIS over existing nearest neighbor classifiers, with comparable classification accuracy. We implement the algorithm in a publicly available R package snn.
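
To make the link between instability and the weight vector concrete, the following is a minimal Python sketch, not the snn package and not the authors' procedure. It assumes that instability can be approximated empirically as the rate at which two classifiers, trained on independent samples from the same population, disagree on a common test set. The function names (`wnn_predict`, `knn_weights`, `estimate_instability`), the two-class Gaussian data model, and all default sizes are hypothetical choices made for illustration only.

```python
import numpy as np


def wnn_predict(X_train, y_train, X_test, weights):
    """Weighted nearest neighbor prediction for binary labels in {0, 1}.

    weights[i] is the weight given to the (i+1)-th nearest neighbor;
    the weights are assumed to be nonnegative and to sum to one.
    """
    weights = np.asarray(weights, dtype=float)
    k = len(weights)
    # Squared Euclidean distances between every test and training point.
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k nearest training points, ordered by distance.
    nn_idx = np.argsort(d2, axis=1)[:, :k]
    # Weighted vote for class 1; predict 1 when the vote reaches 1/2.
    votes = (y_train[nn_idx] * weights).sum(axis=1)
    return (votes >= 0.5).astype(int)


def knn_weights(k):
    """Uniform weights of the ordinary k-nearest neighbor classifier."""
    return np.full(k, 1.0 / k)


def sample(n, rng):
    """Hypothetical population: two unit-variance Gaussians in 2D."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(loc=y[:, None] * 1.0, scale=1.0, size=(n, 2))
    return X, y


def estimate_instability(weights, n_train=200, n_test=500, n_rep=20, seed=0):
    """Average disagreement rate between classifiers fitted on two
    independent training samples (an assumed empirical proxy for CIS)."""
    rng = np.random.default_rng(seed)
    X_test, _ = sample(n_test, rng)
    rates = []
    for _ in range(n_rep):
        X1, y1 = sample(n_train, rng)
        X2, y2 = sample(n_train, rng)
        p1 = wnn_predict(X1, y1, X_test, weights)
        p2 = wnn_predict(X2, y2, X_test, weights)
        rates.append(np.mean(p1 != p2))
    return float(np.mean(rates))


if __name__ == "__main__":
    for k in (1, 5, 25):
        w = knn_weights(k)
        print(k, np.linalg.norm(w), estimate_instability(w))
```

For uniform k-nearest neighbor weights the Euclidean norm equals 1/sqrt(k), so under the proportionality stated in the abstract a larger k should give a smaller disagreement rate in this simulation. Other weight vectors of the same length can be passed to `estimate_instability` to compare their norms against the observed disagreement.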



Rates of Convergence for Large-scale Nearest Neighbor Classification

Nearest neighbor is a popular class of classification methods with many ...

Nearest Neighbor Classification based on Imbalanced Data: A Statistical Approach

In a classification problem, where the competing classes are not of comp...

Multi-hypothesis classifier

Accuracy is the most important parameter among few others which defines ...

A Two-Stage Active Learning Algorithm for k-Nearest Neighbors

We introduce a simple and intuitive two-stage active learning algorithm ...

An adaptive nearest neighbor rule for classification

We introduce a variant of the k-nearest neighbor classifier in which k i...

Minimax Optimal Q Learning with Nearest Neighbors

Q learning is a popular model free reinforcement learning method. Most o...

Enhanced Nearest Neighbor Classification for Crowdsourcing

In machine learning, crowdsourcing is an economical way to label a large...