Enhanced Nearest Neighbor Classification for Crowdsourcing

02/26/2022
by Jiexin Duan, et al.

In machine learning, crowdsourcing is an economical way to label large amounts of data. However, noise in the resulting labels can degrade the accuracy of any classification method applied to the labelled data. We propose an enhanced nearest neighbor classifier (ENN) to overcome this issue. Two algorithms are developed to estimate worker quality (which is often unknown in practice): the first constructs the estimate from denoised worker labels obtained by applying the kNN classifier to the expert data; the second is an iterative algorithm that works even without access to the expert data. In addition to strong numerical evidence, our proposed methods are proven to achieve the same regret as their oracle version based on high-quality expert data. As a technical by-product, we derive a lower bound on the sample size assigned to each worker that is needed to reach the optimal convergence rate of the regret.
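The first worker-quality idea in the abstract (compare a worker's labels against kNN predictions built from expert data) can be illustrated with a minimal sketch. This is not the authors' ENN algorithm: the agreement-rate score, the function names `knn_predict` and `estimate_worker_quality`, the choice `k=5`, and the synthetic two-cluster data are all assumptions made here for illustration.

```python
import random
from collections import Counter

def knn_predict(train_x, train_y, query, k=5):
    # Sort training indices by squared Euclidean distance to the query point.
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )[:k]
    # Majority vote among the k nearest labels.
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def estimate_worker_quality(expert_x, expert_y, worker_x, worker_labels, k=5):
    # Hypothetical score: "denoise" each worker-labelled point with a kNN
    # prediction from expert data, then report the agreement rate.
    agree = sum(
        knn_predict(expert_x, expert_y, x, k) == y
        for x, y in zip(worker_x, worker_labels)
    )
    return agree / len(worker_labels)

def sample(rng, n):
    # Synthetic data: class 0 clustered near (0, 0), class 1 near (6, 6).
    xs, ys = [], []
    for i in range(n):
        label = i % 2
        center = 6.0 * label
        xs.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
        ys.append(label)
    return xs, ys

rng = random.Random(0)
expert_x, expert_y = sample(rng, 100)
worker_x, true_y = sample(rng, 100)

# Synthetic workers: A flips 1 label in 10, B flips 4 in 10 (fixed pattern).
labels_a = [1 - y if i % 10 == 0 else y for i, y in enumerate(true_y)]
labels_b = [1 - y if i % 10 < 4 else y for i, y in enumerate(true_y)]

quality_a = estimate_worker_quality(expert_x, expert_y, worker_x, labels_a)
quality_b = estimate_worker_quality(expert_x, expert_y, worker_x, labels_b)
print(quality_a, quality_b)
```

On this toy data the noisier worker B should receive a lower score than worker A. Scores of this kind could then be used to weight worker labels in a nearest neighbor vote, although how the paper does so is not stated in the abstract.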


09/03/2019

Rates of Convergence for Large-scale Nearest Neighbor Classification

Nearest neighbor is a popular class of classification methods with many ...
12/12/2018

Distributed Nearest Neighbor Classification

Nearest neighbor is a popular nonparametric method for classification an...
05/26/2014

Stabilized Nearest Neighbor Classifier and Its Statistical Properties

The stability of statistical analysis is an important indicator for repr...
05/20/2021

Distributed Adaptive Nearest Neighbor Classifier: Algorithm and Theory

When data is of an extraordinarily large size or physically stored in di...
10/22/2019

Minimax Rate Optimal Adaptive Nearest Neighbor Classification and Regression

k Nearest Neighbor (kNN) method is a simple and popular statistical meth...
12/05/2012

Evaluating Classifiers Without Expert Labels

This paper considers the challenge of evaluating a set of classifiers, a...
04/26/2020

Deep k-NN for Noisy Labels

Modern machine learning models are often trained on examples with noisy ...