Empirical Methodology for Crowdsourcing Ground Truth

09/24/2018
by Anca Dumitrache, et al.

The process of gathering ground truth data through human annotation is a major bottleneck in the use of information extraction methods for populating the Semantic Web. Crowdsourcing-based approaches are gaining popularity as a way to address the volume of data and the shortage of annotators. These practices typically use inter-annotator agreement as a measure of quality. However, in many domains, such as event detection, the data are inherently ambiguous and admit a multitude of valid perspectives. We present an empirically derived methodology for efficiently gathering ground truth data across a diverse set of use cases covering a variety of domains and annotation tasks. Central to our approach are the CrowdTruth metrics, which capture inter-annotator disagreement. We show that measuring disagreement is essential for acquiring a high-quality ground truth. We demonstrate this by comparing the quality of data aggregated with the CrowdTruth metrics against majority vote, over a set of diverse crowdsourcing tasks: Medical Relation Extraction, Twitter Event Identification, News Event Extraction, and Sound Interpretation. We also show that an increased number of crowd workers leads to growth and stabilization in the quality of annotations, contrary to the common practice of employing a small number of annotators.
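To make the contrast concrete, the following is a minimal Python sketch (not the authors' implementation) of the two aggregation strategies the abstract compares: a binary majority vote versus a CrowdTruth-style, disagreement-aware unit-annotation score, computed here as the cosine similarity between a label's indicator vector and the vector of worker choices. The toy worker annotations and label names are hypothetical.

from collections import Counter
import math

def majority_vote(worker_annotations):
    """Labels selected by more than half of the workers (binary decision)."""
    counts = Counter(label for chosen in worker_annotations for label in chosen)
    n_workers = len(worker_annotations)
    return {label for label, count in counts.items() if count > n_workers / 2}

def unit_annotation_scores(worker_annotations, candidate_labels):
    """Disagreement-aware scores in the spirit of the CrowdTruth
    unit-annotation score: cosine similarity between each label's
    indicator vector and the aggregated vector of worker choices."""
    unit_vector = [sum(label in chosen for chosen in worker_annotations)
                   for label in candidate_labels]
    norm = math.sqrt(sum(v * v for v in unit_vector))
    return {label: (unit_vector[i] / norm if norm else 0.0)
            for i, label in enumerate(candidate_labels)}

# Hypothetical toy unit: five workers annotating relations in one sentence.
workers = [{"cause"}, {"cause", "treat"}, {"treat"}, {"cause"}, {"other"}]
labels = ["cause", "treat", "other"]

print(majority_vote(workers))                    # {'cause'}
print(unit_annotation_scores(workers, labels))   # graded scores keep the ambiguity

On this toy unit, majority vote keeps only "cause" and discards the disagreement, while the graded scores preserve the partial support for "treat", which is the kind of ambiguity signal the disagreement-aware metrics are designed to exploit.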
