Identifying Chinese Opinion Expressions with Extremely-Noisy Crowdsourcing Annotations

04/22/2022
by Xin Zhang, et al.

Recent work on opinion expression identification (OEI) relies heavily on the quality and scale of manually constructed training corpora, requirements that can be extremely difficult to satisfy. Crowdsourcing is one practical solution to this problem: it yields a large-scale corpus, but without any guarantee of quality. In this work, we investigate Chinese OEI with extremely noisy crowdsourcing annotations, constructing a dataset at very low cost. Following Zhang et al. (2021), we train the annotator-adapter model by treating every annotation as gold-standard with respect to the crowd annotator who produced it, and test the model with a synthetic expert, which is a mixture of all annotators. Because this annotator mixture used for testing is never modeled explicitly during training, we propose generating synthetic training samples with a pertinent mixup strategy, making training and testing highly consistent. Simulation experiments on our constructed dataset show that crowdsourcing is highly promising for OEI, and that our proposed annotator-mixup further enhances crowdsourcing modeling.
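The annotator-mixture idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify how annotators are represented or how samples are mixed, so the code assumes that each annotator is a plain vector, that the synthetic expert is the unweighted mean of those vectors, and that mixup draws a Beta-distributed weight to interpolate two annotators' vectors together with their per-token label distributions for the same sentence. The names `synthetic_expert`, `annotator_mixup`, and `alpha` are hypothetical.

```python
import random


def synthetic_expert(annotator_vecs):
    """Assumed test-time 'expert': the unweighted mean of all annotator vectors."""
    n = len(annotator_vecs)
    dim = len(annotator_vecs[0])
    return [sum(vec[d] for vec in annotator_vecs) / n for d in range(dim)]


def annotator_mixup(vec_a, vec_b, labels_a, labels_b, alpha=0.5, rng=random):
    """Interpolate two annotators' vectors and their per-token label distributions.

    labels_a / labels_b hold one probability distribution per token (for example
    one-hot tag vectors) given by each annotator to the same sentence.
    The Beta(alpha, alpha) weight is an assumption borrowed from standard mixup.
    """
    lam = rng.betavariate(alpha, alpha)
    mixed_vec = [lam * a + (1 - lam) * b for a, b in zip(vec_a, vec_b)]
    mixed_labels = [
        [lam * pa + (1 - lam) * pb for pa, pb in zip(tok_a, tok_b)]
        for tok_a, tok_b in zip(labels_a, labels_b)
    ]
    return lam, mixed_vec, mixed_labels


if __name__ == "__main__":
    # Two hypothetical annotators with 2-dimensional vectors.
    annotators = [[0.0, 2.0], [2.0, 0.0]]
    print(synthetic_expert(annotators))

    # One-hot tags over (B, I, O) for a three-token sentence; the annotators disagree.
    tags_a = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    tags_b = [[0, 0, 1], [1, 0, 0], [0, 0, 1]]
    lam, vec, soft_tags = annotator_mixup(annotators[0], annotators[1], tags_a, tags_b)
    print(lam, vec, soft_tags)
```

Under these assumptions, training on interpolated annotators exposes the model to points between individual annotators, which is the region where the averaged test-time expert lies.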


