Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias

08/01/2023
by   Itay Itzhak, et al.

Recent studies show that instruction tuning and learning from human feedback dramatically improve the abilities of large language models (LMs). While these tuning methods can make models generate high-quality text, we conjecture that more implicit cognitive biases may arise in the fine-tuned models. Our work provides evidence that fine-tuned models exhibit biases that were absent or less pronounced in their pretrained predecessors. We examine the extent of this phenomenon in three cognitive biases known to influence human decision-making and reasoning: the decoy effect, the certainty effect, and the belief bias. Our findings highlight the presence of these biases in various models, especially those that have undergone instruction tuning, such as Flan-T5, GPT-3.5, and GPT-4. This research constitutes a step toward comprehending cognitive biases in instruction-tuned LMs, which is crucial for the development of more reliable and unbiased language models.


Related research

research · 07/19/2023
Can Instruction Fine-Tuned Language Models Identify Social Bias through Prompting?

As the breadth and depth of language model applications continue to expa...
research · 12/15/2021

Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases

Detecting social bias in text is challenging due to nuance, subjectivity...
research · 09/07/2023

OpinionGPT: Modelling Explicit Biases in Instruction-Tuned LLMs

Instruction-tuned Large Language Models (LLMs) have recently showcased r...
research · 07/08/2023

Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators

Large language models that exhibit instruction-following behaviour repre...
research · 06/12/2023

On the Amplification of Linguistic Bias through Unintentional Self-reinforcement Learning by Generative Language Models – A Perspective

Generative Language Models (GLMs) have the potential to significantly sh...
research · 09/05/2023

Making Large Language Models Better Reasoners with Alignment

Reasoning is a cognitive process of using evidence to reach a sound conc...
research · 07/24/2023

Interpretable Stereotype Identification through Reasoning

Given that language models are trained on vast datasets that may contain...
