AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators

03/29/2023
by Xingwei He, et al.

Many natural language processing (NLP) tasks rely on labeled data to train machine learning models that achieve high performance. However, data annotation can be a time-consuming and expensive process, especially when the task involves a large amount of data or requires specialized domain expertise. Recently, GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks. In this paper, we first claim that large language models (LLMs), such as GPT-3.5, can serve as excellent crowdsourced annotators when provided with sufficient guidance and demonstrated examples. To make LLMs better annotators, we propose a two-step approach, 'explain-then-annotate'. More precisely, we begin by creating a prompt for every demonstrated example, which we then use to prompt an LLM to explain why the specific ground truth answer/label was chosen for that particular example. Following this, we construct the few-shot chain-of-thought prompt with the self-generated explanations and employ it to annotate the unlabeled data. We conduct experiments on three tasks: user input and keyword relevance assessment, BoolQ, and WiC. The annotation results from GPT-3.5 surpass those from crowdsourced annotation for user input and keyword relevance assessment. For the other two tasks, GPT-3.5 achieves results comparable to those obtained through crowdsourced annotation.
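The two-step 'explain-then-annotate' idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the `Input:`/`Explanation:`/`Label:` layout, the function names, and the `llm` callable (any function that maps a prompt string to a completion string) are all assumptions made for this example.

```python
from typing import Callable, Dict, List

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


def explanation_prompt(task: str, example: Dict[str, str]) -> str:
    """Step 1 prompt: ask the model why the gold label fits one demonstration."""
    return (
        f"{task}\n"
        f"Input: {example['input']}\n"
        f"Label: {example['label']}\n"
        "Briefly explain why this label is correct."
    )


def explain_demonstrations(
    llm: LLM, task: str, demos: List[Dict[str, str]]
) -> List[Dict[str, str]]:
    """Step 1: attach a self-generated explanation to every demonstration."""
    return [
        {**d, "explanation": llm(explanation_prompt(task, d)).strip()}
        for d in demos
    ]


def annotation_prompt(
    task: str, explained: List[Dict[str, str]], query: str
) -> str:
    """Step 2 prompt: few-shot chain-of-thought, explanation before label."""
    parts = [task]
    for d in explained:
        parts.append(
            f"Input: {d['input']}\n"
            f"Explanation: {d['explanation']}\n"
            f"Label: {d['label']}"
        )
    # The unlabeled item ends with an open "Explanation:" slot for the model.
    parts.append(f"Input: {query}\nExplanation:")
    return "\n\n".join(parts)


def annotate(
    llm: LLM, task: str, explained: List[Dict[str, str]], query: str
) -> str:
    """Step 2: annotate one unlabeled item and return the text after 'Label:'."""
    completion = llm(annotation_prompt(task, explained, query))
    # If the model never emits "Label:", the whole completion is returned.
    return completion.rsplit("Label:", 1)[-1].strip()
```

In this sketch, `explain_demonstrations` is called once per task, and the explained demonstrations are then reused by `annotate` for every unlabeled item. A real setup would also need to handle completions that do not follow the assumed `Label:` format.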


research
02/08/2023

Is ChatGPT a General-Purpose Natural Language Processing Task Solver?

Spurred by advancements in scale, large language models (LLMs) have demo...
research
09/14/2023

An Empirical Evaluation of Prompting Strategies for Large Language Models in Zero-Shot Clinical Natural Language Processing

Large language models (LLMs) have shown remarkable capabilities in Natur...
research
12/20/2022

Is GPT-3 a Good Data Annotator?

GPT-3 (Generative Pre-trained Transformer 3) is a large-scale autoregres...
research
09/20/2023

Making Small Language Models Better Multi-task Learners with Mixture-of-Task-Adapters

Recently, Large Language Models (LLMs) have achieved amazing zero-shot l...
research
02/03/2023

Towards Few-Shot Identification of Morality Frames using In-Context Learning

Data scarcity is a common problem in NLP, especially when the annotation...
research
07/20/2023

Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification

Recent work has shown that language models' (LMs) prompt-based learning ...
research
05/24/2023

Using Natural Language Explanations to Rescale Human Judgments

The rise of large language models (LLMs) has brought a critical need for...
