How (Not) to Use Sociodemographic Information for Subjective NLP Tasks

09/13/2023
by   Tilman Beck, et al.

Annotators' sociodemographic backgrounds (i.e., the individual composition of their gender, age, educational background, etc.) have a strong impact on their decisions when working on subjective NLP tasks such as hate speech detection. Often, heterogeneous backgrounds result in high disagreement. To model this variation, recent work has explored sociodemographic prompting, a technique that steers the output of prompt-based models towards the answers that humans with specific sociodemographic profiles would give. However, the available NLP literature disagrees on the efficacy of this technique: it remains unclear for which tasks and scenarios it can help, and evaluations are limited to specific tasks only. We address this research gap by presenting the largest and most comprehensive study of sociodemographic prompting to date. Concretely, we evaluate several prompt formulations across seven datasets and six instruction-tuned model families. We find that (1) sociodemographic prompting can be beneficial for improving zero-shot learning in subjective NLP tasks, but (2) its outcomes vary largely across model types, sizes, and datasets, and (3) are subject to large variance with regard to prompt formulations. Thus, sociodemographic prompting is not a reliable proxy for traditional data annotation with a sociodemographically heterogeneous group of annotators. Instead, we propose (4) using it to identify ambiguous instances, resulting in more informed annotation efforts.
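To make the idea concrete, the following is a minimal sketch of how a sociodemographic prompt can be constructed: a persona built from profile attributes is prepended to the task instruction before the input text. The attribute names, template wording, and function name are illustrative assumptions, not the paper's actual prompt formulations (which the study varies deliberately).

```python
def sociodemographic_prompt(profile: dict, instruction: str, text: str) -> str:
    """Build a zero-shot prompt steered by a sociodemographic persona.

    NOTE: illustrative template only; the exact phrasing and attribute
    set are assumptions, not taken from the paper.
    """
    # Render the profile as a readable attribute list, e.g. "gender: female, age: 25-34"
    persona = ", ".join(f"{k}: {v}" for k, v in profile.items())
    return (
        f"Imagine you are a person with the following profile: {persona}. "
        f"{instruction}\n"
        f"Text: {text}\n"
        f"Answer:"
    )

# Hypothetical usage for a hate speech detection task:
prompt = sociodemographic_prompt(
    {"gender": "female", "age": "25-34", "education": "college degree"},
    "Does the following text contain hate speech? Answer yes or no.",
    "Example input text.",
)
print(prompt)
```

Because the study finds large variance across prompt formulations, any real evaluation would need to test several such templates rather than commit to one.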


