ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks

03/27/2023
by Fabrizio Gilardi, et al.

Many NLP applications require manual data annotations for a variety of tasks, notably to train classifiers or to evaluate the performance of unsupervised models. Depending on the size and complexity of the task, the annotations may be produced by crowd-workers on platforms such as MTurk or by trained annotators, such as research assistants. Using a sample of 2,382 tweets, we demonstrate that ChatGPT outperforms crowd-workers for several annotation tasks, including relevance, stance, topics, and frames detection. Specifically, the zero-shot accuracy of ChatGPT exceeds that of crowd-workers for four out of five tasks, while ChatGPT's intercoder agreement exceeds that of both crowd-workers and trained annotators for all tasks. Moreover, the per-annotation cost of ChatGPT is less than $0.003 – about twenty times cheaper than MTurk. These results show the potential of large language models to drastically increase the efficiency of text classification.
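The abstract does not specify which intercoder agreement statistic the authors use, but the general idea can be illustrated with a simple percent-agreement measure: the share of items on which two annotation passes assign the same label. The labels below are hypothetical, purely for illustration.

```python
def percent_agreement(labels_a, labels_b):
    """Share of items on which two annotators assign the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("annotation runs must cover the same items")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)

# Hypothetical stance labels from two annotation passes over the same tweets.
run_1 = ["pro", "con", "neutral", "pro", "con"]
run_2 = ["pro", "con", "pro", "pro", "con"]
print(percent_agreement(run_1, run_2))  # 0.8
```

In practice, chance-corrected statistics such as Cohen's kappa or Krippendorff's alpha are preferred over raw percent agreement, since they discount matches that would occur by chance.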


Related research

07/05/2023
Open-Source Large Language Models Outperform Crowd Workers and Approach ChatGPT in Text-Annotation Tasks
This study examines the performance of open-source Large Language Models...

06/13/2023
Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks
Large language models (LLMs) are remarkable data annotators. They can be...

04/13/2023
ChatGPT-4 Outperforms Experts and Crowd Workers in Annotating Political Twitter Messages with Zero-Shot Learning
This paper assesses the accuracy, reliability and bias of the Large Lang...

09/03/2023
How Crowd Worker Factors Influence Subjective Annotations: A Study of Tagging Misogynistic Hate Speech in Tweets
Crowdsourced annotation is vital to both collecting labelled data to tra...

04/10/2022
Re-Examining Human Annotations for Interpretable NLP
Explanation methods in Interpretable NLP often explain the model's decis...

02/24/2019
Truth Inference at Scale: A Bayesian Model for Adjudicating Highly Redundant Crowd Annotations
Crowd-sourcing is a cheap and popular means of creating training and eva...

01/17/2019
Beyond Monetary Incentives: Experiments in Paid Microtask Contests Modelled as Continuous-Time Markov Chains
In this paper, we aim to gain a better understanding into how paid micro...
