In-Contextual Gender Bias Suppression for Large Language Models

09/13/2023
by Daisuke Oba, et al.

Despite their impressive performance on a wide range of NLP tasks, Large Language Models (LLMs) have been reported to encode worrying levels of gender bias. Prior work has proposed debiasing methods that require human-labelled examples, data augmentation, and fine-tuning of the LLMs, all of which are computationally costly. Moreover, one might not even have access to the internal parameters needed for debiasing, as is the case with commercially available LLMs such as GPT-4. To address this challenge, we propose bias suppression, a novel alternative to debiasing that does not require access to model parameters. We show that text-based preambles, generated from manually designed templates covering counterfactual statements, can accurately suppress gender biases in LLMs. Moreover, we find that descriptive sentences about occupations can further suppress gender biases. Interestingly, bias suppression has a minimal adverse effect on downstream task performance while effectively mitigating gender biases.
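To make the idea concrete, the following is a minimal sketch of preamble-based bias suppression in Python. The occupation list, template string, and helper names are illustrative placeholders rather than the paper's actual templates, and a masked language model is used here only as a convenient probe of gender-token probabilities; the same kind of preamble could be prepended to prompts sent to a black-box LLM such as GPT-4.

```python
from transformers import pipeline

# Illustrative occupation list and counterfactual template; the paper's
# actual preamble templates are manually designed and more extensive.
OCCUPATIONS = ["nurse", "engineer", "doctor", "receptionist"]
TEMPLATE = "The {occupation} can be a man or a woman."

def build_preamble(occupations=OCCUPATIONS, template=TEMPLATE):
    """Concatenate counterfactual statements into a single text preamble."""
    return " ".join(template.format(occupation=o) for o in occupations)

def suppressed(prompt):
    """Prepend the bias-suppression preamble; no fine-tuning or parameter
    access is required, since only the input text is modified."""
    return build_preamble() + " " + prompt

if __name__ == "__main__":
    # Probe gender association with a masked LM by comparing the
    # probabilities of "he" vs. "she" with and without the preamble.
    fill = pipeline("fill-mask", model="bert-base-uncased")
    query = "The nurse said that [MASK] would be back soon."
    for variant in (query, suppressed(query)):
        scores = fill(variant, targets=["he", "she"])
        print({d["token_str"]: round(d["score"], 4) for d in scores})
```

Because the suppression operates purely on the input text, this sketch works unchanged in the black-box setting the abstract describes: the preamble is simply prepended to whatever prompt is sent to the model.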


Related research

- 06/16/2023: Politeness Stereotypes and Attack Vectors: Gender Stereotypes in Japanese and Korean Language Models
  In efforts to keep up with the rapid progress and use of large language ...
- 05/28/2023: Mitigating Label Biases for In-context Learning
  Various design settings for in-context learning (ICL), such as the choic...
- 01/28/2023: Comparing Intrinsic Gender Bias Evaluation Measures without using Human Annotated Examples
  Numerous types of social biases have been identified in pre-trained lang...
- 02/28/2021: Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
  When trained on large, unfiltered crawls from the internet, language mod...
- 04/14/2022: How Gender Debiasing Affects Internal Model Representations, and Why It Matters
  Common studies of gender bias in NLP focus either on extrinsic bias meas...
- 11/10/2019: Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation
  Models often easily learn biases present in the training data, and their...
- 02/14/2023: AutoBiasTest: Controllable Sentence Generation for Automated and Open-Ended Social Bias Testing in Language Models
  Social bias in Pretrained Language Models (PLMs) affects text generation...
