Learning Effective Embeddings From Crowdsourced Labels: An Educational Case Study

07/18/2019
by Guowei Xu, et al.

Representation learning has proven helpful in numerous machine learning tasks. Most existing representation learning approaches, however, depend on a large number of consistent, noise-free labels. Such labels are unavailable in many real-world scenarios, where annotation is instead done by crowd workers. In practice, crowdsourced labels are often inconsistent because workers differ in expertise, and the number of labels is very limited. Feeding crowdsourced labels directly into existing representation learning algorithms is therefore suboptimal. In this paper, we investigate this problem and propose a novel framework, Representation Learning with crowdsourced Labels ("RLL"), which learns data representations from crowdsourced labels by jointly addressing the challenges of limited and inconsistent labels. We evaluate the framework on two real-world education applications. The experimental results demonstrate the benefits of our approach for learning representations from limited crowd-labeled data and show that RLL outperforms state-of-the-art baselines. We also conduct detailed experiments on RLL to understand its key components and their effect on performance.

