CODA-19: Reliably Annotating Research Aspects on 10,000+ CORD-19 Abstracts Using a Non-Expert Crowd

05/05/2020
by Ting-Hao Kenneth Huang, et al.

This paper introduces CODA-19, a human-annotated dataset that codes the Background, Purpose, Method, Finding/Contribution, and Other sections of 10,966 English abstracts in the COVID-19 Open Research Dataset (CORD-19). CODA-19 was created by 248 crowd workers from Amazon Mechanical Turk within 10 days, achieving label quality comparable to that of experts. Each abstract was annotated by nine different workers, and the final labels were obtained by majority vote. The inter-annotator agreement (Cohen's kappa) between the crowd and a biomedical expert (0.741) is comparable to inter-expert agreement (0.788). CODA-19's labels reach an accuracy of 82.2% against the expert labels, while the accuracy between experts was 85.0%. Reliable human annotations help scientists understand the rapidly accelerating coronavirus literature and also serve as a battery for AI/NLP research, but obtaining expert annotations can be slow. We demonstrate that a non-expert crowd can be rapidly employed at scale to join the fight against COVID-19.
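The aggregation and evaluation steps described above are straightforward to reproduce. Below is a minimal Python sketch, using scikit-learn, of nine-vote majority aggregation followed by a Cohen's kappa and accuracy check against expert labels. All variable names and the toy data are hypothetical, not drawn from the CODA-19 release, and the tie-breaking rule is an assumption.

# Minimal sketch (hypothetical data): aggregate nine crowd labels per
# segment by majority vote, then compare the result against expert
# labels with Cohen's kappa and accuracy, as the paper does at scale.
from collections import Counter
from sklearn.metrics import cohen_kappa_score

ASPECTS = ["Background", "Purpose", "Method", "Finding/Contribution", "Other"]

def majority_vote(worker_labels):
    # Most frequent label wins; Counter breaks ties by first occurrence
    # (the paper's actual tie-breaking rule is not stated here).
    return Counter(worker_labels).most_common(1)[0][0]

# Toy data: three text segments, nine worker labels each (hypothetical).
crowd_annotations = [
    ["Background"] * 6 + ["Purpose"] * 3,
    ["Method"] * 5 + ["Finding/Contribution"] * 4,
    ["Finding/Contribution"] * 7 + ["Method"] * 2,
]
crowd_labels = [majority_vote(seg) for seg in crowd_annotations]

# Hypothetical expert labels for the same three segments.
expert_labels = ["Background", "Method", "Method"]

kappa = cohen_kappa_score(crowd_labels, expert_labels, labels=ASPECTS)
accuracy = sum(c == e for c, e in zip(crowd_labels, expert_labels)) / len(expert_labels)
print(f"kappa={kappa:.3f}, accuracy={accuracy:.1%}")  # kappa=0.500, accuracy=66.7%

On the full corpus, the same crowd-versus-expert comparison yields the 0.741 kappa and 82.2% accuracy reported above.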

