Learning Ambiguity from Crowd Sequential Annotations

01/04/2023
by Xiaolei Lu, et al.

Most crowdsourcing learning methods treat disagreement between annotators as label noise, yet inter-disagreement among experts is often a good indicator of the ambiguity and uncertainty inherent in natural language. In this paper, we propose a framework called Learning Ambiguity from Crowd Sequential Annotations (LA-SCA) to explore the inter-disagreement between reliable annotators and effectively preserve confusing label information. First, a hierarchical Bayesian model is developed to infer ground truth from crowds and to group annotators with similar reliability together. By modeling the relationship among the size of the group an annotator belongs to, the annotator's reliability, and each element's unambiguity in a sequence, the inter-disagreement between reliable annotators on ambiguous elements is computed to obtain label confusion information, which is then incorporated into cost-sensitive sequence labeling. Experimental results on POS tagging and NER tasks show that the proposed framework achieves competitive performance in inferring ground truth from crowds and in predicting unknown sequences, and that interpreting the hierarchical clustering results helps discover the labeling patterns of annotators with similar reliability.
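
To make the idea of disagreement between reliable annotators concrete, the following is a minimal sketch and not the LA-SCA model itself. It assumes that each annotator already has a scalar reliability score (in the paper this would come from the hierarchical Bayesian model), keeps only annotators above a hypothetical threshold `min_reliability`, and scores each token by the reliability-weighted Shannon entropy of its labels. The function name, the threshold, and the use of entropy are all illustrative assumptions.

```python
import math
from collections import defaultdict


def token_disagreement(annotations, reliability, min_reliability=0.5):
    """Per-token disagreement among reliable annotators (illustrative only).

    annotations: dict mapping annotator id -> list of labels for one sequence.
    reliability: dict mapping annotator id -> score in [0, 1] (assumed given).
    Returns one entropy value (in bits) per token; 0.0 means full agreement.
    """
    # Keep only annotators whose assumed reliability passes the threshold.
    reliable = [a for a in annotations if reliability.get(a, 0.0) >= min_reliability]
    if not reliable:
        raise ValueError("no annotator meets the reliability threshold")

    length = len(annotations[reliable[0]])
    if any(len(annotations[a]) != length for a in reliable):
        raise ValueError("all label sequences must have the same length")

    scores = []
    for i in range(length):
        # Accumulate reliability-weighted votes for each label at position i.
        weights = defaultdict(float)
        for a in reliable:
            weights[annotations[a][i]] += reliability[a]
        total = sum(weights.values())
        # Shannon entropy of the normalized vote distribution.
        entropy = sum(-(w / total) * math.log2(w / total) for w in weights.values())
        scores.append(entropy)
    return scores


# Toy POS-tagging example with made-up labels and reliability scores.
crowd = {
    "a1": ["DET", "NOUN", "VERB"],
    "a2": ["DET", "VERB", "VERB"],
    "a3": ["NOUN", "NOUN", "NOUN"],  # low reliability, filtered out
}
scores = token_disagreement(crowd, {"a1": 0.9, "a2": 0.9, "a3": 0.2})
```

In this toy example only the second token receives a non-zero score, because the two reliable annotators split between NOUN and VERB there. A score like this could serve as a rough proxy for the label confusion information that the abstract feeds into cost-sensitive training.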

research
09/20/2022

Modeling sequential annotations for sequence labeling with crowds

Crowd sequential annotations can be an efficient and cost-effective way ...
research
01/09/2017

Crowdsourcing Ground Truth for Medical Relation Extraction

Cognitive computing systems require human labeled data for evaluation, a...
research
09/20/2022

Partial sequence labeling with structured Gaussian Processes

Existing partial sequence labeling models mainly focus on max-margin fra...
research
04/26/2018

Weak Labeling for Crowd Learning

Crowdsourcing has become very popular among the machine learning communi...
research
09/24/2021

Rethinking Crowd Sourcing for Semantic Similarity

Estimation of semantic similarity is crucial for a variety of natural la...
research
10/09/2019

Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling

Sequence labeling is a fundamental framework for various natural languag...
research
12/31/2022

Approaching Peak Ground Truth

Machine learning models are typically evaluated by computing similarity ...
