Leveraging Label Variation in Large Language Models for Zero-Shot Text Classification

The zero-shot learning capabilities of large language models (LLMs) make them ideal for text classification without annotation or supervised training. Many studies have shown impressive results across multiple tasks. While tasks, data, and results differ widely, their similarities to human annotation can help us tackle new tasks at minimal expense. We evaluate 5 state-of-the-art LLMs as "annotators" on 5 different tasks (age, gender, topic, sentiment prediction, and hate speech detection), across 4 languages: English, French, German, and Spanish. No single model excels at all tasks, across all languages, or across all labels within a task. However, aggregation techniques designed for human annotators perform substantially better than any individual model. Overall, though, LLMs do not rival even simple supervised models, so they do not (yet) replace the need for human annotation. We also discuss the tradeoffs between speed, accuracy, cost, and bias in aggregated model labeling versus human annotation.
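
The idea of treating several LLMs as annotators and aggregating their labels can be illustrated with a minimal sketch. The abstract does not say which aggregation techniques were used, so plain majority voting is assumed here as the simplest stand-in; the model names, labels, and function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    counts = Counter(labels)
    top = max(counts.values())
    for label in labels:
        if counts[label] == top:
            return label


def aggregate(annotations):
    """Aggregate per-model label lists into one label per text.

    annotations: dict mapping a model name to a list of labels,
    one label per text, all lists in the same text order.
    """
    per_item = zip(*annotations.values())
    return [majority_vote(list(votes)) for votes in per_item]


# Hypothetical zero-shot sentiment labels from three models on three texts.
annotations = {
    "model_a": ["pos", "neg", "neg"],
    "model_b": ["pos", "pos", "neg"],
    "model_c": ["neg", "pos", "neg"],
}
aggregated = aggregate(annotations)
print(aggregated)
```

Majority voting weights every model equally. Aggregation methods built for human annotators often go further and estimate how reliable each annotator is, which may matter when, as the abstract notes, no single model excels across all labels.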

research
05/03/2023

The Benefits of Label-Description Training for Zero-Shot Text Classification

Large language models have improved zero-shot text classification by all...
research
04/17/2023

Testing the Reliability of ChatGPT for Text Annotation and Classification: A Cautionary Remark

Recent studies have demonstrated promising potential of ChatGPT for vari...
research
05/22/2023

Automated stance detection in complex topics and small languages: the challenging case of immigration in polarizing news media

Automated stance detection and related machine learning methods can prov...
research
03/13/2023

Large Language Models in the Workplace: A Case Study on Prompt Engineering for Job Type Classification

This case study investigates the task of job classification in a real-wo...
research
10/23/2022

Conformal Predictor for Improving Zero-shot Text Classification Efficiency

Pre-trained language models (PLMs) have been shown effective for zero-sh...
research
04/04/2020

Knowledge Guided Metric Learning for Few-Shot Text Classification

The training of deep-learning-based text classification models relies he...
research
12/01/2022

Improving Zero-Shot Models with Label Distribution Priors

Labeling large image datasets with attributes such as facial age or obje...
