Label noise detection under the Noise at Random model with ensemble filters

12/02/2021
by Kecia G. Moura, et al.

Label noise detection has been widely studied in Machine Learning because of its importance in improving training data quality. Satisfactory noise detection has been achieved by adopting ensembles of classifiers. In this approach, an instance is flagged as mislabeled if a high proportion of the members in the pool misclassify it. Previous authors have empirically evaluated this approach; nevertheless, they mostly assumed that label noise is generated completely at random in a dataset. This is a strong assumption, since other types of label noise are feasible in practice and can influence noise detection results. This work investigates the performance of ensemble noise detection under two noise models: Noisy at Random (NAR), in which the probability of label noise depends on the instance class, and Noisy Completely at Random (NCAR), in which the probability of label noise is independent of both the class and the features of an instance. In this setting, we investigate the effect of class distribution on noise detection performance, since it changes the total noise level observed in a dataset under the NAR assumption. Further, the ensemble vote threshold is evaluated and contrasted with the most common approaches in the literature. In many of the experiments, choosing one noise generation model over another leads to different results when considering aspects such as class imbalance and the noise level ratio among different classes.
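
The two ideas in the abstract, class-dependent noise injection and vote-based filtering, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the uniform choice of the replacement label, and the `>=` comparison against the vote threshold are assumptions made for illustration. How the pool members produce their predictions is outside the sketch, which takes them as given.

```python
import random


def expected_noise_level(class_priors, flip_prob_by_class):
    """Expected fraction of flipped labels: sum over classes of prior * flip probability."""
    return sum(class_priors[c] * flip_prob_by_class[c] for c in class_priors)


def inject_label_noise(labels, flip_prob_by_class, classes, rng):
    """Return a noisy copy of labels.

    Each label flips with the probability assigned to its class (NAR).
    Giving every class the same probability yields the NCAR special case.
    """
    noisy = []
    for y in labels:
        if rng.random() < flip_prob_by_class[y]:
            # Assumption: the wrong label is drawn uniformly from the other classes.
            y = rng.choice([c for c in classes if c != y])
        noisy.append(y)
    return noisy


def vote_filter(member_predictions, labels, threshold=0.5):
    """Flag instance i when the fraction of pool members whose prediction
    differs from labels[i] is at least `threshold` (assumed comparison)."""
    n_members = len(member_predictions)
    flags = []
    for i, y in enumerate(labels):
        errors = sum(1 for preds in member_predictions if preds[i] != y)
        flags.append(errors / n_members >= threshold)
    return flags


# Imbalanced binary problem: class 0 is the majority.
priors = {0: 0.9, 1: 0.1}
ncar_level = expected_noise_level(priors, {0: 0.2, 1: 0.2})
nar_minority = expected_noise_level(priors, {0: 0.0, 1: 0.4})
nar_majority = expected_noise_level(priors, {0: 0.4, 1: 0.0})

# Three pool members, three instances; rows are members, columns are instances.
member_predictions = [[0, 1, 1], [0, 0, 1], [0, 1, 0]]
observed_labels = [0, 0, 1]
majority_flags = vote_filter(member_predictions, observed_labels, threshold=0.5)
consensus_flags = vote_filter(member_predictions, observed_labels, threshold=1.0)
```

With the same per-class flip probability, the expected total noise level does not depend on the class priors. Under NAR, moving the noise from the minority to the majority class changes the total considerably, which is why class distribution matters in that setting. In `vote_filter`, a threshold of 1.0 corresponds to a consensus vote, and lower values make the filter more aggressive.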

research
06/02/2022

Robustness to Label Noise Depends on the Shape of the Noise Distribution in Feature Space

Machine learning classifiers have been demonstrated, both empirically an...
research
09/23/2020

Using Under-trained Deep Ensembles to Learn Under Extreme Label Noise

Improper or erroneous labelling can pose a hindrance to reliable general...
research
04/16/2022

IIFNet: A Fusion based Intelligent Service for Noisy Preamble Detection in 6G

In this article, we present our vision of preamble detection in a physic...
research
04/20/2018

An Ensemble Generation Method Based on Instance Hardness

In Machine Learning, ensemble methods have been receiving a great deal o...
research
10/18/2021

Noise-Resilient Ensemble Learning using Evidence Accumulation Clustering

Ensemble Learning methods combine multiple algorithms performing the sam...
research
05/22/2021

Two-stage Training for Learning from Label Proportions

Learning from label proportions (LLP) aims at learning an instance-level...
