Towards Controllable Biases in Language Generation

05/01/2020
by Emily Sheng, et al.

We present a general approach towards controllable societal biases in natural language generation (NLG). Building upon the idea of adversarial triggers, we develop a method to induce or avoid biases in generated text containing mentions of specified demographic groups. We then analyze two scenarios: 1) inducing biases for one demographic and avoiding biases for another, and 2) mitigating biases between demographic pairs (e.g., man and woman). The former scenario gives us a tool for detecting the types of biases present in the model, and the latter is useful for mitigating biases in downstream applications (e.g., dialogue generation). Specifically, our approach makes biases more explainable by allowing us to 1) use the relative effectiveness of inducing biases for different demographics as a new dimension for bias evaluation, and 2) discover topics that correspond to demographic inequalities in generated text. Furthermore, our mitigation experiments demonstrate our technique's effectiveness at equalizing the amount of bias across demographics while simultaneously generating less negatively biased text overall.

research
05/28/2023

KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Application

Large language models (LLMs) learn not only natural text generation abil...
research
09/03/2019

The Woman Worked as a Babysitter: On Biases in Language Generation

We present a systematic study of biases in natural language generation (...
research
04/18/2021

Revealing Persona Biases in Dialogue Systems

Dialogue systems in the form of chatbots and personal assistants are bei...
research
10/18/2021

Demographic Biases of Crowd Workers in Key Opinion Leaders Finding

Key Opinion Leaders (KOLs) are people that have a strong influence and t...
research
02/14/2023

A Friendly Face: Do Text-to-Image Systems Rely on Stereotypes when the Input is Under-Specified?

As text-to-image systems continue to grow in popularity with the general...
research
04/29/2020

Demographics Should Not Be the Reason of Toxicity: Mitigating Discrimination in Text Classifications with Instance Weighting

With the recent proliferation of the use of text classifications, resear...
research
10/26/2020

PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction

Unconscious biases continue to be prevalent in modern text and media, ca...
