Leveraging Crowdsourcing Data For Deep Active Learning - An Application: Learning Intents in Alexa

03/12/2018
by Jie Yang, et al.

This paper presents a generic Bayesian framework that enables any deep learning model to actively learn from targeted crowds. Our framework builds on recent advances in Bayesian deep learning and extends existing work to the targeted crowdsourcing setting, where multiple annotators with unknown expertise each contribute an uncontrolled, and often limited, number of annotations. The framework leverages the low-rank structure in annotations to learn individual annotator expertise, which then helps to infer the true labels from noisy and sparse annotations. It provides a unified Bayesian model that simultaneously infers the true labels and trains the deep learning model in order to reach optimal learning efficacy. Finally, the framework exploits both the uncertainty of the deep learning model during prediction and the annotators' estimated expertise to minimize the number of annotations and annotators required to optimally train the deep learning model. We evaluate the effectiveness of our framework for intent classification in Alexa (Amazon's personal assistant), using both synthetic and real-world datasets. Experiments show that our framework can accurately learn annotator expertise, infer true labels, and effectively reduce the number of annotations needed for model training compared to state-of-the-art approaches. We further discuss the potential of the proposed framework for bridging machine learning and crowdsourcing towards improved human-in-the-loop systems.
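
To make the last step of the abstract concrete, here is a minimal sketch of how model uncertainty and annotator expertise might be combined to choose the next query. It is not the paper's method. It assumes that class probabilities come from several stochastic forward passes (for example, Monte Carlo dropout), that uncertainty is measured by predictive entropy, and that each annotator has a single estimated accuracy. The paper instead learns expertise from the low-rank structure of the annotations. The names `predictive_entropy`, `select_query`, `mc_probs`, and `annotator_accuracy` are hypothetical.

```python
import numpy as np


def predictive_entropy(mc_probs):
    """Entropy of the mean predictive distribution for each item.

    mc_probs: array of shape (T, N, C) holding class probabilities from
    T stochastic forward passes over N unlabeled items and C classes
    (assumed input; the sampling scheme is not taken from the paper).
    """
    mean_probs = mc_probs.mean(axis=0)  # shape (N, C)
    # Small constant avoids log(0) for classes with zero probability.
    return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=1)


def select_query(mc_probs, annotator_accuracy):
    """Pick the most uncertain item and the most reliable annotator.

    annotator_accuracy: array of shape (A,) with one estimated accuracy
    per annotator (a simplification made for this sketch).
    Returns a tuple (item_index, annotator_index).
    """
    entropy = predictive_entropy(mc_probs)
    item = int(np.argmax(entropy))
    annotator = int(np.argmax(annotator_accuracy))
    return item, annotator
```

In a full active learning loop, the selected annotator would label the selected item, the expertise estimates and the model would be updated, and the selection would repeat until the annotation budget runs out.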


