Enhanced Nearest Neighbor Classification for Crowdsourcing

by Jiexin Duan, et al.

In machine learning, crowdsourcing is an economical way to label large amounts of data. However, noise in the resulting labels can degrade the accuracy of any classification method applied to the labelled data. We propose an enhanced nearest neighbor classifier (ENN) to overcome this issue. Two algorithms are developed to estimate worker quality, which is often unknown in practice: the first constructs the estimate from denoised worker labels obtained by applying the kNN classifier to the expert data; the second is an iterative algorithm that works even without access to expert data. In addition to strong numerical evidence, our proposed methods are proven to achieve the same regret as their oracle version based on high-quality expert data. As a technical by-product, we derive a lower bound on the sample size assigned to each worker that is needed to reach the optimal convergence rate of the regret.
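The abstract does not spell out the estimator, so the following is only a minimal sketch of one plausible reading of the first algorithm: a kNN classifier fitted on the expert data produces reference labels at a worker's points, and the worker's quality is taken to be the fraction of that worker's labels that agree with them. Binary labels in {0, 1}, Euclidean distance, the agreement-rate definition of quality, and the function names `knn_predict` and `estimate_worker_quality` are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def knn_predict(X_train, y_train, X_query, k=5):
    """Plain kNN majority vote; assumes binary labels in {0, 1}."""
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest training points for each query point.
    nn_idx = np.argsort(d2, axis=1)[:, :k]
    # Predict 1 when more than half of the k neighbors are labelled 1.
    return (y_train[nn_idx].mean(axis=1) > 0.5).astype(int)


def estimate_worker_quality(X_expert, y_expert, X_worker, y_worker, k=5):
    """Hypothetical quality estimate: agreement rate between a worker's
    labels and kNN reference labels built from the expert data."""
    reference = knn_predict(X_expert, y_expert, X_worker, k=k)
    return float((reference == y_worker).mean())
```

Under these assumptions, a worker who labels every point the way the expert-based kNN does gets a quality of 1.0, and a worker who flips a fraction of labels gets a correspondingly lower score. How the paper turns such estimates into the final ENN classifier is not described in the abstract and is not sketched here.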





Rates of Convergence for Large-scale Nearest Neighbor Classification

Nearest neighbor is a popular class of classification methods with many ...

Distributed Nearest Neighbor Classification

Nearest neighbor is a popular nonparametric method for classification an...

Stabilized Nearest Neighbor Classifier and Its Statistical Properties

The stability of statistical analysis is an important indicator for repr...

Distributed Adaptive Nearest Neighbor Classifier: Algorithm and Theory

When data is of an extraordinarily large size or physically stored in di...

Minimax Rate Optimal Adaptive Nearest Neighbor Classification and Regression

k Nearest Neighbor (kNN) method is a simple and popular statistical meth...

Evaluating Classifiers Without Expert Labels

This paper considers the challenge of evaluating a set of classifiers, a...

Deep k-NN for Noisy Labels

Modern machine learning models are often trained on examples with noisy ...