BiasFinder: Metamorphic Test Generation to Uncover Bias for Sentiment Analysis Systems

02/03/2021
by Muhammad Hilmi Asyrofi, et al.

Artificial Intelligence (AI) software systems, such as Sentiment Analysis (SA) systems, typically learn from large amounts of data that may reflect human biases. Consequently, the machine learning model in such a system may exhibit unintended demographic bias based on specific characteristics (e.g., gender, occupation, country of origin). Such bias manifests in an SA system when it predicts different sentiments for similar texts that differ only in the characteristic of the individuals described. Existing studies on revealing bias in SA systems rely on producing sentences from a small set of short, predefined templates. To address this limitation, we present BiasFinder, an approach to discover biased predictions in SA systems via metamorphic testing. A key feature of BiasFinder is the automatic curation of suitable templates from pieces of text in a large corpus, using various Natural Language Processing (NLP) techniques to identify words that describe demographic characteristics. Next, BiasFinder instantiates new texts from these templates by filling in placeholders with words associated with a class of a characteristic (e.g., gender-specific words such as female names, "she", "her"). These texts are used to tease out bias in an SA system. BiasFinder identifies a bias-uncovering test case when it detects that the SA system exhibits demographic bias for a pair of texts, i.e., it predicts different sentiments for texts that differ only in words associated with different classes (e.g., male vs. female) of a target characteristic (e.g., gender). Our empirical evaluation showed that BiasFinder can effectively create a large number of realistic and diverse test cases that uncover various biases in an SA system, with a high true positive rate of up to 95.8%.
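The metamorphic relation described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the template, the word lists in `GENDER_WORDS`, and both stand-in classifiers (`toy_biased_predict`, `toy_fair_predict`) are hypothetical and exist only to show the idea. The sketch assumes a template already exists, so it skips the automatic template curation step entirely. It fills the placeholders with words from each gender class and reports every pair of texts for which the classifier returns different labels.

```python
from itertools import product
from typing import Callable, List, Tuple

# Hypothetical word lists for one target characteristic (gender).
# A real system would derive these from curated lexicons, not hard-code them.
GENDER_WORDS = {
    "male": {"names": ["John", "Ahmed"], "pronoun": "he", "possessive": "his"},
    "female": {"names": ["Mary", "Aisha"], "pronoun": "she", "possessive": "her"},
}

# Hypothetical template; the placeholders mark gender-specific words.
TEMPLATE = "{name} said {pronoun} had a great time with {possessive} friends."


def instantiate(template: str, gender: str) -> List[str]:
    """Fill the template once per name in the given gender class."""
    words = GENDER_WORDS[gender]
    return [
        template.format(
            name=name,
            pronoun=words["pronoun"],
            possessive=words["possessive"],
        )
        for name in words["names"]
    ]


def find_bias_uncovering_pairs(
    template: str, predict: Callable[[str], int]
) -> List[Tuple[str, str]]:
    """Return (male_text, female_text) pairs whose predicted labels differ.

    The texts in a pair differ only in gender-associated words, so the
    metamorphic relation expects the same sentiment label for both.
    """
    male_texts = instantiate(template, "male")
    female_texts = instantiate(template, "female")
    return [
        (m, f)
        for m, f in product(male_texts, female_texts)
        if predict(m) != predict(f)
    ]


def toy_fair_predict(text: str) -> int:
    """Stand-in classifier: 1 (positive) if the text contains 'great'."""
    return 1 if "great" in text.lower() else 0


def toy_biased_predict(text: str) -> int:
    """Deliberately biased stand-in: forces 0 whenever 'she' appears."""
    if "she" in text.lower().split():
        return 0
    return toy_fair_predict(text)
```

Calling `find_bias_uncovering_pairs(TEMPLATE, toy_biased_predict)` returns the pairs that violate the relation, while the same call with `toy_fair_predict` returns an empty list. Replacing the stand-in with a call to a real SA model is the only change needed to reuse the pairing logic.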

research
05/11/2018

Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems

Automatic machine learning systems can inadvertently accentuate and perp...
research
05/31/2021

BiasRV: Uncovering Biased Sentiment Predictions at Runtime

Sentiment analysis (SA) systems, though widely applied in many domains, ...
research
12/03/2022

Towards Robust NLG Bias Evaluation with Syntactically-diverse Prompts

We present a robust methodology for evaluating biases in natural languag...
research
02/04/2023

Rating Sentiment Analysis Systems for Bias through a Causal Lens

Sentiment Analysis Systems (SASs) are data-driven Artificial Intelligenc...
research
09/03/2019

The Woman Worked as a Babysitter: On Biases in Language Generation

We present a systematic study of biases in natural language generation (...
research
05/09/2022

Behind the Mask: Demographic bias in name detection for PII masking

Many datasets contain personally identifiable information, or PII, which...
research
10/09/2019

Perturbation Sensitivity Analysis to Detect Unintended Model Biases

Data-driven statistical Natural Language Processing (NLP) techniques lev...
