FairPy: A Toolkit for Evaluation of Social Biases and their Mitigation in Large Language Models

02/10/2023
by Hrishikesh Viswanath et al.

Studies have shown that large pretrained language models exhibit biases against social groups based on race, gender, and other attributes, which they inherit from the datasets they are trained on. Various researchers have proposed mathematical tools for quantifying and identifying these biases, and methods have been proposed to mitigate them. In this paper, we present a comprehensive quantitative evaluation of different kinds of biases, such as those based on race, gender, ethnicity, and age, exhibited by popular pretrained language models such as BERT and GPT-2. We also present a toolkit that provides plug-and-play interfaces for connecting these bias-identification tools to large pretrained language models such as BERT and GPT-2, and that lets users test custom models against the same metrics. The toolkit further allows users to debias existing and custom models using the debiasing techniques proposed so far. The toolkit is available at https://github.com/HrishikeshVish/Fairpy.
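As a concrete illustration of the kind of measurement such a toolkit automates, the sketch below probes a masked language model for a gender-occupation association by comparing the fill-in probabilities it assigns to demographic terms. This is a minimal sketch built on the Hugging Face transformers fill-mask pipeline; the template and candidate words are illustrative assumptions, not FairPy's own test sets or API.

```python
# Minimal sketch of a masked-LM bias probe: compare the probabilities a
# model assigns to demographic terms in a stereotype-laden template.
# The template and candidate words below are illustrative only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

template = "The [MASK] worked as a nurse."
candidates = ["man", "woman"]

# `targets` restricts scoring to the candidate tokens of interest.
for result in fill_mask(template, targets=candidates):
    print(f"{result['token_str']:>6}: p = {result['score']:.4f}")
```

A large asymmetry between the two scores, aggregated over many such templates and social groups, is the kind of signal that the bias metrics surveyed in the paper turn into a single quantitative score.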


Related research

04/20/2020 - StereoSet: Measuring stereotypical bias in pretrained language models
A stereotype is an over-generalized belief about a particular group of p...

05/23/2022 - Challenges in Measuring Bias via Open-Ended Language Generation
Researchers have devised numerous ways to quantify social biases vested ...

09/24/2022 - Moral Mimicry: Large Language Models Produce Moral Rationalizations Tailored to Political Identity
Large Language Models (LLMs) have recently demonstrated impressive capab...

05/26/2023 - Finspector: A Human-Centered Visual Inspection Tool for Exploring and Comparing Biases among Foundation Models
Pre-trained transformer-based language models are becoming increasingly ...

03/23/2022 - Can Prompt Probe Pretrained Language Models? Understanding the Invisible Risks from a Causal View
Prompt-based probing has been widely used in evaluating the abilities of...

06/06/2023 - LEACE: Perfect linear concept erasure in closed form
Concept erasure aims to remove specified features from a representation....

11/15/2021 - Assessing gender bias in medical and scientific masked language models with StereoSet
NLP systems use language models such as Masked Language Models (MLMs) th...
