Quantifying Social Biases Using Templates is Unreliable

10/09/2022
by Preethi Seshadri, et al.

Recently, there has been an increase in efforts to understand how large language models (LLMs) propagate and amplify social biases. Several works have utilized templates for fairness evaluation, which allow researchers to quantify social biases in the absence of test sets with protected attribute labels. While template evaluation can be a convenient and helpful diagnostic tool to understand model deficiencies, it often uses a simplistic and limited set of templates. In this paper, we study whether bias measurements are sensitive to the choice of templates used for benchmarking. Specifically, we investigate the instability of bias measurements by manually modifying templates proposed in previous works in a semantically-preserving manner and measuring bias across these modifications. We find that bias values and resulting conclusions vary considerably across template modifications on four tasks, ranging from an 81% reduction (NLI) to a 162% increase (MLM) in bias measurements. Our results indicate that quantifying fairness in LLMs, as done in current practice, can be brittle and needs to be approached with more care and caution.
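
To make the instability concrete, here is a minimal sketch of the kind of measurement the abstract describes: recomputing a masked-LM bias score across semantically-preserving paraphrases of one template. This is not the paper's exact protocol; the model choice, the templates, and the log-ratio bias score are all illustrative assumptions.

```python
import math

from transformers import pipeline

# Illustrative masked language model (assumption, not the paper's setup).
fill = pipeline("fill-mask", model="bert-base-uncased")

# Semantically-preserving paraphrases of a single bias-probing template.
templates = [
    "[MASK] is a nurse.",
    "[MASK] works as a nurse.",
    "[MASK] is employed as a nurse.",
]

for template in templates:
    # Restrict the fill-mask predictions to the two pronouns of interest.
    preds = fill(template, targets=["he", "she"])
    scores = {p["token_str"]: p["score"] for p in preds}
    # A simple illustrative bias score: log-ratio of pronoun probabilities.
    bias = math.log(scores["she"] / scores["he"])
    print(f"{template!r}: log P(she)/P(he) = {bias:+.3f}")
```

If the log-ratio swings noticeably across the three paraphrases, the measurement exhibits exactly the template sensitivity the paper reports.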

Related research

12/14/2021
Measuring Fairness with Biased Rulers: A Survey on Quantifying Biases in Pretrained Language Models
An increasing awareness of biased patterns in natural language processin...

12/03/2022
Towards Robust NLG Bias Evaluation with Syntactically-diverse Prompts
We present a robust methodology for evaluating biases in natural languag...

02/14/2023
AutoBiasTest: Controllable Sentence Generation for Automated and Open-Ended Social Bias Testing in Language Models
Social bias in Pretrained Language Models (PLMs) affects text generation...

05/22/2023
Language-Agnostic Bias Detection in Language Models
Pretrained language models (PLMs) are key components in NLP, but they co...

08/05/2017
Quantifying homologous proteins and proteoforms
Many proteoforms - arising from alternative splicing, post-translational...

09/19/2021
Towards Automatic Bias Detection in Knowledge Graphs
With the recent surge in social applications relying on knowledge graphs...

09/19/2023
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Large language models (LLMs) have recently experienced tremendous popula...
