Can Instruction Fine-Tuned Language Models Identify Social Bias through Prompting?

07/19/2023
by Omkar Dige, et al.

As the breadth and depth of language model applications continue to expand rapidly, it is increasingly important to build efficient frameworks for measuring and mitigating the learned or inherited social biases of these models. In this paper, we evaluate the ability of instruction fine-tuned language models to identify bias through zero-shot prompting, including Chain-of-Thought (CoT) prompts. Across LLaMA and its two instruction fine-tuned versions, Alpaca 7B performs best on the bias identification task with an accuracy of 56.7%. We also demonstrate that scaling up LLM size and data diversity could lead to further performance gain. This is a work in progress that presents the first component of our bias mitigation framework. We will keep updating this work as we get more results.
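
To make the evaluation setup concrete, the following is a minimal sketch of how zero-shot and CoT prompts for bias identification might be built and scored. The prompt wording, the yes/no answer format, and the helper names (`build_prompt`, `parse_label`, `accuracy`) are illustrative assumptions and are not taken from the paper; the model call itself is left out, so the functions operate on response strings obtained elsewhere.

```python
from typing import List, Optional

# Hypothetical prompt templates (assumed wording, not the paper's prompts).
ZERO_SHOT_TEMPLATE = (
    "Does the following sentence contain social bias? Answer yes or no.\n"
    "Sentence: {sentence}\n"
    "Answer:"
)
COT_TEMPLATE = (
    "Does the following sentence contain social bias?\n"
    "Sentence: {sentence}\n"
    "Let's think step by step, then finish with 'Answer: yes' or 'Answer: no'."
)


def build_prompt(sentence: str, chain_of_thought: bool = False) -> str:
    """Fill the zero-shot or CoT template with the sentence to classify."""
    template = COT_TEMPLATE if chain_of_thought else ZERO_SHOT_TEMPLATE
    return template.format(sentence=sentence)


def parse_label(response: str) -> Optional[bool]:
    """Map free-text output to True (biased), False (not biased), or None."""
    text = response.lower()
    # Prefer the text after the last "answer:" so CoT reasoning is skipped.
    if "answer:" in text:
        text = text.rsplit("answer:", 1)[1]
    for word in text.replace(".", " ").replace(",", " ").split():
        if word == "yes":
            return True
        if word == "no":
            return False
    return None  # The response could not be interpreted.


def accuracy(responses: List[str], gold: List[bool]) -> float:
    """Fraction of responses whose parsed label matches the gold label.

    Unparseable responses count as incorrect.
    """
    if not gold or len(responses) != len(gold):
        raise ValueError("responses and gold must be non-empty and equal in length")
    correct = sum(parse_label(r) == g for r, g in zip(responses, gold))
    return correct / len(gold)
```

Counting unparseable responses as errors is one possible design choice; an evaluation could also report them separately, since instruction-following failures and misclassifications are different kinds of mistakes.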

research
02/28/2023

In-Context Instruction Learning

Instruction learning of Large Language Models (LLMs) has enabled zero-sh...
research
08/01/2023

Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias

Recent studies show that instruction tuning and learning from human feed...
research
12/15/2021

Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases

Detecting social bias in text is challenging due to nuance, subjectivity...
research
09/07/2023

OpinionGPT: Modelling Explicit Biases in Instruction-Tuned LLMs

Instruction-tuned Large Language Models (LLMs) have recently showcased r...
research
05/23/2023

Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science

Instruction-tuned Large Language Models (LLMs) have exhibited impressive...
research
12/20/2022

Is GPT-3 a Psychopath? Evaluating Large Language Models from a Psychological Perspective

Are large language models (LLMs) like GPT-3 psychologically safe? In thi...
research
09/11/2023

Flesch or Fumble? Evaluating Readability Standard Alignment of Instruction-Tuned Language Models

Readability metrics and standards such as Flesch Kincaid Grade Level (FK...