PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction

10/26/2020
by   Xinyao Ma, et al.

Unconscious biases continue to be prevalent in modern text and media, calling for algorithms that can assist writers with bias correction. For example, a female character in a story is often portrayed as passive and powerless ("She daydreams about being a doctor"), while a male character is portrayed as more proactive and powerful ("He pursues his dream of being a doctor"). We formulate *Controllable Debiasing*, a new revision task that aims to rewrite a given text to correct the implicit and potentially undesirable bias in character portrayals. We then introduce PowerTransformer, an approach that debiases text through the lens of connotation frames (Sap et al., 2017), which encode pragmatic knowledge of the power dynamics implied by verb predicates. One key challenge of our task is the lack of parallel corpora. To address this challenge, we adopt an unsupervised approach that combines auxiliary supervision from related tasks such as paraphrasing with self-supervision based on a reconstruction loss, building on pretrained language models. Through comprehensive experiments based on automatic and human evaluations, we demonstrate that our approach outperforms ablations and existing methods from related tasks. Furthermore, we demonstrate the use of PowerTransformer as a step toward mitigating the well-documented gender bias in character portrayal in movie scripts.
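The core intuition can be illustrated with a toy sketch (this is NOT the actual PowerTransformer model, which is a trained neural rewriter): connotation frames assign each verb an implied agency level, and revision replaces a low-agency verb phrase with a higher-agency alternative while leaving the rest of the sentence intact. The mini-lexicon entries and the `boost_agency` helper below are hypothetical, for illustration only.

```python
# Toy illustration of connotation-frame-guided revision.
# Hypothetical mini-lexicon: verb phrase -> (agency level, higher-agency alternative).
# The real connotation frames (Sap et al., 2017) cover thousands of verbs.
AGENCY_LEXICON = {
    "daydreams about": ("low", "pursues"),
    "wishes for": ("low", "works toward"),
}

def boost_agency(sentence: str) -> str:
    """Replace low-agency verb phrases with higher-agency alternatives."""
    for verb, (level, alternative) in AGENCY_LEXICON.items():
        if level == "low" and verb in sentence:
            sentence = sentence.replace(verb, alternative)
    return sentence

print(boost_agency("She daydreams about being a doctor"))
# -> "She pursues being a doctor"
```

The actual system performs this kind of controlled rewriting with a pretrained language model rather than a lookup table, so it can rephrase fluently (e.g., "He pursues his dream of being a doctor") instead of substituting words in place.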
