A Survey of Label-noise Representation Learning: Past, Present and Future

11/09/2020
by   Bo Han, et al.

Classical machine learning implicitly assumes that the labels of the training data are sampled from a clean distribution, an assumption that can be too restrictive for real-world scenarios, where labels are often noisy. However, statistical-learning-based methods may not train deep learning models robustly with such noisy labels. It is therefore urgent to design Label-Noise Representation Learning (LNRL) methods for robustly training deep models with noisy labels. To fully understand LNRL, we conduct a survey study. We first clarify a formal definition of LNRL from the perspective of machine learning. Then, through the lens of learning theory and empirical study, we explain why noisy labels affect deep models' performance. Based on this theoretical guidance, we categorize LNRL methods into three directions. Under this unified taxonomy, we provide a thorough discussion of the pros and cons of the different categories. More importantly, we summarize the essential components of robust LNRL, which can spark new directions. We then propose possible research directions within LNRL, such as new datasets, instance-dependent LNRL, and adversarial LNRL. Finally, we envision potential directions beyond LNRL, such as learning with feature-noise, preference-noise, domain-noise, similarity-noise, graph-noise, and demonstration-noise.

