NeuCrowd: Neural Sampling Network for Representation Learning with Crowdsourced Labels

03/21/2020
by   Yang Hao, et al.

Representation learning approaches require a massive amount of discriminative training data, which is unavailable in many scenarios, such as healthcare, small cities, and education. In practice, people resort to crowdsourcing to obtain annotated labels. However, due to issues such as data privacy, budget limitations, and a shortage of domain-specific annotators, the number of crowdsourced labels is still very limited. Moreover, because annotators differ in expertise, crowdsourced labels are often inconsistent. Thus, directly applying existing representation learning algorithms may easily lead to overfitting and yield suboptimal solutions. In this paper, we propose NeuCrowd, a unified framework for representation learning from crowdsourced labels. The proposed framework (1) creates a sufficient number of high-quality n-tuplet training samples by utilizing safety-aware sampling and robust anchor generation; and (2) automatically learns a neural sampling network that adaptively learns to select effective samples for the representation learning network. The proposed framework is evaluated on both synthetic and real-world data sets. The results show that our approach outperforms a wide range of state-of-the-art baselines in terms of prediction accuracy and AUC.
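
To make the idea of building n-tuplet training samples from inconsistent crowdsourced labels more concrete, the following is a minimal sketch in Python. It is not the NeuCrowd algorithm: the abstract does not specify how safety-aware sampling or robust anchor generation work, so this example substitutes a simple, assumed stand-in in which each item's label is the majority vote of its annotators and items are drawn with probability proportional to annotator agreement. The function names `vote_confidence` and `build_ntuplets`, the binary-label setting, and the tuplet layout (one anchor, one positive, several negatives) are all hypothetical choices made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def vote_confidence(votes: Sequence[int]) -> Tuple[int, float]:
    """Return (majority label, fraction of annotators agreeing with it).

    Assumes binary votes (0 or 1); ties are resolved in favor of label 1.
    """
    positives = sum(votes)
    label = 1 if positives * 2 >= len(votes) else 0
    agree = positives if label == 1 else len(votes) - positives
    return label, agree / len(votes)


def build_ntuplets(
    crowd_votes: Sequence[Sequence[int]],
    n_negatives: int,
    num_tuplets: int,
    seed: int = 0,
) -> List[Tuple[int, int, List[int]]]:
    """Sample (anchor, positive, negatives) index tuplets.

    Assumption: agreement-weighted sampling stands in for the paper's
    safety-aware sampling, whose details the abstract does not give.
    """
    rng = random.Random(seed)
    labeled = [vote_confidence(v) for v in crowd_votes]
    pos = [i for i, (y, _) in enumerate(labeled) if y == 1]
    neg = [i for i, (y, _) in enumerate(labeled) if y == 0]
    if len(pos) < 2 or len(neg) < n_negatives:
        raise ValueError("not enough positive or negative items")

    pos_weights = [labeled[i][1] for i in pos]
    tuplets = []
    for _ in range(num_tuplets):
        # Prefer anchors and positives that annotators agree on.
        anchor = rng.choices(pos, weights=pos_weights, k=1)[0]
        candidates = [i for i in pos if i != anchor]
        cand_weights = [labeled[i][1] for i in candidates]
        positive = rng.choices(candidates, weights=cand_weights, k=1)[0]
        # Negatives are drawn uniformly without replacement.
        negatives = rng.sample(neg, n_negatives)
        tuplets.append((anchor, positive, negatives))
    return tuplets


# Six items, each labeled by three hypothetical annotators.
votes = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 0, 0], [0, 0, 1], [0, 1, 0]]
tuplets = build_ntuplets(votes, n_negatives=2, num_tuplets=5)
```

Because many tuplets can be formed from a handful of labeled items, a scheme of this kind increases the number of training samples available from a small labeled set, which is the motivation the abstract gives for n-tuplet construction. A learned sampling network, as the abstract describes, would replace the fixed agreement weights used here.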

research
09/23/2020

Representation Learning from Limited Educational Data with Crowdsourced Labels

Representation learning has been proven to play an important role in the...
research
07/15/2021

Temporal-aware Language Representation Learning From Crowdsourced Labels

Learning effective language representations from crowdsourced labels is ...
research
07/18/2019

Learning Effective Embeddings From Crowdsourced Labels: An Educational Case Study

Learning representation has been proven to be helpful in numerous machin...
research
06/07/2021

Multi-task Transformation Learning for Robust Out-of-Distribution Detection

Detecting out-of-distribution (OOD) samples plays a key role in open-wor...
research
12/02/2022

Generative Reasoning Integrated Label Noise Robust Deep Image Representation Learning in Remote Sensing

The development of deep learning based image representation learning (IR...
research
04/08/2021

Detecting of a Patient's Condition From Clinical Narratives Using Natural Language Representation

This paper proposes a joint clinical natural language representation lea...
research
08/08/2023

PUG: Photorealistic and Semantically Controllable Synthetic Data for Representation Learning

Synthetic image datasets offer unmatched advantages for designing and ev...
