PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction

by Xinyao Ma et al.

Unconscious biases continue to be prevalent in modern text and media, calling for algorithms that can assist writers with bias correction. For example, a female character in a story is often portrayed as passive and powerless ("She daydreams about being a doctor") while a man is portrayed as more proactive and powerful ("He pursues his dream of being a doctor"). We formulate *Controllable Debiasing*, a new revision task that aims to rewrite a given text to correct the implicit and potentially undesirable bias in character portrayals. We then introduce PowerTransformer as an approach that debiases text through the lens of connotation frames (Sap et al., 2017), which encode pragmatic knowledge of implied power dynamics with respect to verb predicates. One key challenge of our task is the lack of parallel corpora. To address this challenge, we adopt an unsupervised approach using auxiliary supervision with related tasks such as paraphrasing and self-supervision based on a reconstruction loss, building on pretrained language models. Through comprehensive experiments based on automatic and human evaluations, we demonstrate that our approach outperforms ablations and existing methods from related tasks. Furthermore, we demonstrate the use of PowerTransformer as a step toward mitigating the well-documented gender bias in character portrayal in movie scripts.
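The core intuition can be illustrated with a toy sketch (not the paper's actual model): a connotation-frame-style lexicon tags each verb with the power it implies for its agent, and a revision step swaps a low-agency verb phrase for a higher-agency alternative. The lexicon entries, the substitution table, and the hardcoded phrasing below are invented for illustration; the real system learns revisions with a pretrained language model rather than string substitution.

```python
# Toy sketch of connotation-frame-guided revision (illustrative only).
# POWER_LEXICON mimics the agent-power annotations of connotation frames
# (Sap et al., 2017); HIGH_POWER_ALTERNATIVE is a hand-written substitution
# table using the example sentence from the abstract.

POWER_LEXICON = {
    "daydreams": "low_power",   # passive portrayal of the agent
    "pursues": "high_power",    # proactive portrayal of the agent
}

HIGH_POWER_ALTERNATIVE = {
    # low-agency phrase -> higher-agency rewrite (hardcoded for this example)
    "daydreams about being": "pursues her dream of being",
}

def revise(sentence: str) -> str:
    """Replace known low-agency verb phrases with higher-agency ones."""
    for low, high in HIGH_POWER_ALTERNATIVE.items():
        if low in sentence:
            sentence = sentence.replace(low, high)
    return sentence

print(revise("She daydreams about being a doctor."))
# -> She pursues her dream of being a doctor.
```

A real implementation must handle pronoun agreement, verb tense, and open-vocabulary verbs, which is why the paper turns to unsupervised training with paraphrase auxiliary supervision and a reconstruction loss instead of a fixed lookup table.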
