
Identifying Chinese Opinion Expressions with Extremely-Noisy Crowdsourcing Annotations

04/22/2022
by Xin Zhang, et al.

Recent work on opinion expression identification (OEI) relies heavily on the quality and scale of manually constructed training corpora, requirements that can be extremely difficult to satisfy. Crowdsourcing is one practical solution to this problem: it yields a large-scale corpus, though without quality guarantees. In this work, we investigate Chinese OEI with extremely noisy crowdsourcing annotations, constructing a dataset at very low cost. Following Zhang et al. (2021), we train an annotator-adapter model by treating every annotation as gold-standard with respect to the crowd annotator who produced it, and we test the model with a synthetic expert, which is a mixture of all annotators. Because this annotator mixture is never modeled explicitly during training, we propose generating synthetic training samples with a tailored mixup strategy so that training and testing are highly consistent. Simulation experiments on our constructed dataset show that crowdsourcing is highly promising for OEI, and that the proposed annotator-mixup further improves crowdsourcing modeling.
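
The abstract does not spell out how the synthetic expert or the mixup samples are built, so the following is only a minimal sketch of one plausible reading. It assumes each annotator is represented by an embedding vector, that the synthetic expert is the plain average of those vectors, and that a mixup sample interpolates two annotators' embeddings and their tag sequences for the same sentence with a Beta-distributed weight. The function names, the tag set, and the `alpha` parameter are hypothetical and are not taken from the paper.

```python
import random


def synthetic_expert(annotator_embs):
    """Average all annotator embeddings into a single mixture vector.

    Assumption: a uniform average; the paper may weight annotators differently.
    """
    vectors = list(annotator_embs.values())
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def one_hot(tags, tagset):
    """Turn a tag sequence into one one-hot row per token."""
    return [[1.0 if tag == t else 0.0 for t in tagset] for tag in tags]


def annotator_mixup(emb_a, tags_a, emb_b, tags_b, tagset, alpha=0.5, rng=random):
    """Interpolate two annotators' embeddings and labels for one sentence.

    lam ~ Beta(alpha, alpha) is the usual mixup weight (assumed here).
    Returns the mixed embedding, per-token soft label rows, and lam.
    """
    lam = rng.betavariate(alpha, alpha)
    mixed_emb = [lam * x + (1.0 - lam) * y for x, y in zip(emb_a, emb_b)]
    rows_a, rows_b = one_hot(tags_a, tagset), one_hot(tags_b, tagset)
    mixed_tags = [
        [lam * p + (1.0 - lam) * q for p, q in zip(row_a, row_b)]
        for row_a, row_b in zip(rows_a, rows_b)
    ]
    return mixed_emb, mixed_tags, lam


# Hypothetical BIO tag set for positive/negative opinion expressions.
TAGSET = ["O", "B-POS", "I-POS", "B-NEG", "I-NEG"]
```

Under these assumptions, training would see both the original (annotator, annotation) pairs and mixed pairs, so the averaged embedding used at test time is no longer far from anything the model was trained on.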

05/31/2021

Crowdsourcing Learning as Domain Adaptation: A Case Study on Named Entity Recognition

Crowdsourcing is regarded as one prospective solution for effective supe...
05/05/2017

Crowdsourcing Argumentation Structures in Chinese Hotel Reviews

Argumentation mining aims at automatically extracting the premises-claim...
11/08/2019

Crowdsourcing a High-Quality Gold Standard for QA-SRL

Question-answer driven Semantic Role Labeling (QA-SRL) has been proposed...
03/10/2019

Deep Robust Subjective Visual Property Prediction in Crowdsourcing

The problem of estimating subjective visual properties (SVP) of images (...
03/31/2021

CrowdTeacher: Robust Co-teaching with Noisy Answers Sample-specific Perturbations for Tabular Data

Samples with ground truth labels may not always be available in numerous...
01/16/2018

Adversarial Learning for Chinese NER from Crowd Annotations

To quickly obtain new labeled data, we can choose crowdsourcing as an al...
10/25/2020

A Crowdsourcing Extension of the ITU-T Recommendation P.835 with Validation

The quality of the speech communication systems, which include noise sup...