Counterfactual Reasoning for Bias Evaluation and Detection in a Fairness under Unawareness setting

02/16/2023
by   Giandomenico Cornacchia, et al.

Current AI regulations require discarding sensitive features (e.g., gender, race, religion) from the algorithm's decision-making process to prevent unfair outcomes. However, even without sensitive features in the training set, algorithms can still discriminate. Indeed, when sensitive features are omitted (fairness under unawareness), they can be inferred through non-linear relations with so-called proxy features. In this work, we propose a way to reveal the potential hidden bias of a machine learning model that can persist even when sensitive features are discarded. This study shows that it is possible to unveil whether a black-box predictor is still biased by exploiting counterfactual reasoning. In detail, when the predictor provides a negative classification outcome, our approach first builds counterfactual examples for a discriminated user category to obtain a positive outcome. Then, the same counterfactual samples feed an external classifier (that targets a sensitive feature), which reveals whether the modifications to the user characteristics needed for a positive outcome moved the individual to the non-discriminated group. When this occurs, it can be a warning sign of discriminatory behavior in the decision process. Furthermore, we leverage the deviation of counterfactuals from the original sample to determine which features are proxies of specific sensitive information. Our experiments show that, even if the model is trained without sensitive features, it often suffers from discriminatory biases.
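The pipeline the abstract describes can be sketched end to end: train a predictor without the sensitive attribute, generate counterfactuals for rejected individuals of the disadvantaged group, and check with an external sensitive-attribute classifier whether those counterfactuals cross into the non-discriminated group. The following is a minimal illustrative sketch, not the authors' implementation: the synthetic data, the feature names, and the naive gradient-direction counterfactual search (a stand-in for a real counterfactual generator) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (illustrative assumption): the sensitive attribute s is never
# shown to the black-box model, but it leaks through the proxy feature x1.
n = 2000
s = rng.integers(0, 2, n)                     # sensitive attribute
x1 = s + rng.normal(0.0, 0.3, n)              # proxy feature correlated with s
x2 = rng.normal(0.0, 1.0, n)                  # neutral feature
X = np.column_stack([x1, x2])
y = (x1 + 0.1 * x2 + rng.normal(0.0, 0.2, n) > 0.8).astype(int)

# Minimal logistic regression (NumPy only, to keep the sketch self-contained).
def fit_logreg(X, t, lr=0.5, iters=3000):
    Xb = np.column_stack([X, np.ones(len(X))])   # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - t) / len(t)        # gradient step on log-loss
    return w

def predict(w, X):
    Xb = np.column_stack([X, np.ones(len(X))])
    return (Xb @ w > 0).astype(int)

black_box = fit_logreg(X, y)      # trained WITHOUT the sensitive feature
sensitive_clf = fit_logreg(X, s)  # external classifier targeting s

# Naive counterfactual search: nudge a rejected sample along the black box's
# weight direction until the predicted outcome flips to positive.
def counterfactual(x, w, step=0.02, max_iter=2000):
    x = x.copy()
    direction = w[:-1] / np.linalg.norm(w[:-1])  # drop the bias weight
    for _ in range(max_iter):
        if predict(w, x.reshape(1, -1))[0] == 1:
            break
        x += step * direction
    return x

# Rejected individuals from the group s = 0.
rejected = X[(predict(black_box, X) == 0) & (s == 0)][:100]
cfs = np.array([counterfactual(x, black_box) for x in rejected])

# Warning sign: the counterfactuals cross into the predicted group s = 1.
flipped = (predict(sensitive_clf, rejected) == 0) & (predict(sensitive_clf, cfs) == 1)
flip_rate = flipped.mean()

# Proxy detection: the feature the counterfactuals changed most (here x1).
deviation = np.abs(cfs - rejected).mean(axis=0)
print(f"flip rate: {flip_rate:.2f}, mean |delta| per feature: {deviation}")
```

On this toy data the flip rate is high and the deviation concentrates on the proxy feature x1, mirroring the two signals the abstract mentions: the group flip as a warning sign of hidden bias, and the per-feature deviation as a way to identify proxies of the sensitive attribute.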


Related research

06/14/2020 · Fairness Under Feature Exemptions: Counterfactual and Observational Measures
With the growing use of AI in highly consequential domains, the quantifi...

01/29/2019 · Repairing without Retraining: Avoiding Disparate Impact with Counterfactual Distributions
When the average performance of a prediction model varies significantly ...

10/09/2020 · CryptoCredit: Securely Training Fair Models
When developing models for regulated decision making, sensitive features...

08/30/2020 · Adversarial Learning for Counterfactual Fairness
In recent years, fairness has become an important topic in the machine l...

09/07/2020 · Fairness in Risk Assessment Instruments: Post-Processing to Achieve Counterfactual Equalized Odds
Algorithmic fairness is a topic of increasing concern both within resear...

01/07/2020 · Revealing Neural Network Bias to Non-Experts Through Interactive Counterfactual Examples
AI algorithms are not immune to biases. Traditionally, non-experts have ...
