Generation and Analysis of Feature-Dependent Pseudo Noise for Training Deep Neural Networks

by Sree Ram Kamabattula, et al.

Training deep neural networks (DNNs) on noisily labeled datasets is a challenging problem, because learning from mislabeled examples degrades network performance. Since ground truth is rarely available for real-world noisy datasets, prior work has created synthetic noisy datasets by randomly modifying the labels of training examples in clean datasets. However, no firm conclusions can be drawn from such random noise alone, since it excludes feature-dependent noise. It is therefore important to generate feature-dependent noisy datasets that also provide ground truth. We propose an intuitive approach to creating such datasets: we use the training predictions of DNNs on clean datasets as noisy labels, while retaining the true label information. We refer to these datasets as "Pseudo noisy datasets". We conduct several experiments to establish that Pseudo noisy datasets resemble feature-dependent noisy datasets across different conditions. We further generate random synthetic noisy datasets with the same noise distribution as the Pseudo noise (referred to as "Randomized noise") to show empirically that i) learning is easier with feature-dependent label noise than with random noise, ii) irrespective of the noise distribution, Pseudo noisy datasets mimic feature-dependent label noise, and iii) current training methods do not generalize to feature-dependent label noise. We therefore believe that Pseudo noisy datasets will be quite helpful for studying and developing robust training methods.
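The two constructions described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' reference implementation: `make_pseudo_noisy_labels` assumes you have already collected a DNN's (imperfect) training-time predictions on a clean dataset and simply adopts them as noisy labels while keeping the true labels, and `make_randomized_noise` assumes the "same noise distribution" is matched via the class-conditional confusion matrix of the pseudo noise, resampling labels independently of the input features. Function and variable names are our own.

```python
import numpy as np

def make_pseudo_noisy_labels(train_preds, true_labels):
    """Pseudo noise: adopt a DNN's training predictions on a clean dataset
    as the (feature-dependent) noisy labels; keep true labels for ground truth.
    Returns the noisy labels and the overall noise rate."""
    noisy = np.asarray(train_preds).copy()
    noise_rate = float(np.mean(noisy != np.asarray(true_labels)))
    return noisy, noise_rate

def make_randomized_noise(true_labels, pseudo_labels, num_classes, seed=None):
    """Randomized noise: sample new labels with the same class-conditional
    noise distribution as the pseudo noise, but independently of the features."""
    rng = np.random.default_rng(seed)
    true_labels = np.asarray(true_labels)
    # Estimate the confusion matrix C[i, j] = P(noisy = j | true = i)
    # from the pseudo-noisy labels.
    C = np.zeros((num_classes, num_classes))
    for t, p in zip(true_labels, pseudo_labels):
        C[t, p] += 1
    C /= C.sum(axis=1, keepdims=True)
    # Resample each label from its class-conditional noise distribution,
    # ignoring the example's features entirely.
    return np.array([rng.choice(num_classes, p=C[t]) for t in true_labels])
```

Because the randomized labels are drawn per class rather than per example, the two datasets share the same noise distribution while differing in whether the mislabeling depends on the input features, which is exactly the comparison the paper's experiments rely on.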

