Inducing anxiety in large language models increases exploration and bias

04/21/2023
by Julian Coda-Forno et al.

Large language models are transforming machine learning research while galvanizing public debate. Understanding not only when these models succeed but also why they fail and misbehave is of great societal relevance. We propose to turn the lens of computational psychiatry, a framework used to computationally describe and modify aberrant behavior, onto the outputs produced by these models. We focus on the Generative Pre-Trained Transformer 3.5 (GPT-3.5) and subject it to tasks commonly studied in psychiatry. Our results show that GPT-3.5 responds robustly to a common anxiety questionnaire, producing higher anxiety scores than human subjects. Moreover, GPT-3.5's responses can be predictably changed by using emotion-inducing prompts. Emotion induction not only changes GPT-3.5's behavior in a cognitive task measuring exploratory decision-making but also changes its behavior in a previously established task measuring biases such as racism and ableism. Crucially, GPT-3.5 shows a strong increase in biases when prompted with anxiety-inducing text. Thus, how prompts are communicated to large language models is likely to have a strong influence on their behavior in applied settings. These results advance our understanding of prompt engineering and demonstrate the usefulness of methods taken from computational psychiatry for studying the capable algorithms to which we increasingly delegate authority and autonomy.
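
The emotion-induction manipulation described above can be approximated with a short script: prepend an anxiety-inducing or a neutral passage to a questionnaire item and compare the model's ratings across conditions. The sketch below is illustrative only; it assumes the pre-1.0 `openai` Python client and the `gpt-3.5-turbo` chat endpoint, and the induction texts and the sample item are placeholders, not the prompts or instruments used in the paper.

```python
import openai  # assumes openai<1.0 (legacy ChatCompletion API)

openai.api_key = "YOUR_API_KEY"  # placeholder

# Illustrative induction prefixes (hypothetical, not the study's prompts).
INDUCTIONS = {
    "anxiety": "Tell me about something that makes you feel sad and anxious.",
    "neutral": "Tell me about a typical day in your routine.",
}

# One anxiety-questionnaire-style item rated 1-4 (illustrative wording).
ITEM = (
    "On a scale from 1 (not at all) to 4 (very much so), "
    "how much does the statement 'I feel calm' apply to you? "
    "Answer with a single number."
)

def rate_item(induction_text: str) -> str:
    """Prepend an emotion-induction passage, then ask for a numeric rating."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": induction_text},
            {"role": "user", "content": ITEM},
        ],
        temperature=0,  # reduce sampling noise when comparing conditions
        max_tokens=5,
    )
    return response["choices"][0]["message"]["content"].strip()

if __name__ == "__main__":
    for condition, text in INDUCTIONS.items():
        print(f"{condition}: {rate_item(text)}")
```

A fuller replication along these lines would score reverse-keyed items, average ratings over many sampled runs, and compare the resulting score distributions between the anxiety-induced and neutral conditions.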


Related research

06/06/2023 · Turning large language models into cognitive models
Large language models are powerful systems that excel at many tasks, ran...

10/06/2020 · On the Branching Bias of Syntax Extracted from Pre-trained Language Models
Many efforts have been devoted to extracting constituency trees from pre...

04/18/2021 · Worst of Both Worlds: Biases Compound in Pre-trained Vision-and-Language Models
Numerous works have analyzed biases in vision and pre-trained language m...

05/11/2023 · Think Twice: Measuring the Efficiency of Eliminating Prediction Shortcuts of Question Answering Models
While the Large Language Models (LLMs) dominate a majority of language u...

02/24/2022 · Capturing Failures of Large Language Models via Human Cognitive Biases
Large language models generate complex, open-ended outputs: instead of o...

03/25/2022 · GPT-D: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models
Deep learning (DL) techniques involving fine-tuning large numbers of mod...
