Model Explanation Disparities as a Fairness Diagnostic

03/03/2023
by Peter W. Chang, et al.

In recent years, there has been a flurry of research on the fairness of machine learning models, and in particular on quantifying and eliminating bias against protected subgroups. One line of work generalizes the notion of protected subgroups beyond simple discrete classes by introducing "rich subgroups", and seeks to train models that are calibrated or that equalize error rates with respect to these richer subgroup classes. Largely orthogonally, local model explanation methods have been developed that, given a classifier h and a test point x, attribute influence for the prediction h(x) to the individual features of x. This raises a natural question: do local model explanation methods attribute different feature importance values on average across different protected subgroups, and can we detect these disparities efficiently? If the model places high weight on a given feature in a specific protected subgroup, but not on the dataset overall (or vice versa), this could be an indicator of bias in the predictive model or the underlying data-generating process, and is at the very least a useful diagnostic that signals the need for a domain expert to delve deeper. In this paper, we formally introduce the notion of feature importance disparity (FID) in the context of rich subgroups, design oracle-efficient algorithms to identify subgroups with large FID, and conduct a thorough empirical analysis that establishes auditing for FID as an important method for investigating dataset bias. Our experiments show that, across 4 datasets and 4 common feature importance methods, our algorithms find (feature, subgroup) pairs that simultaneously: (i) have subgroup feature importance that is often an order of magnitude different from the importance on the dataset as a whole, (ii) generalize out of sample, and (iii) yield interesting discussions about potential bias inherent in these datasets.
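
The sketch below is a minimal illustration of the core quantity, not the paper's method: it reads the FID of a fixed (feature, subgroup) pair as the gap between the subgroup's average absolute attribution for that feature and the dataset-wide average. A simple linear attribution (coefficient times centered feature value) stands in for the LIME/SHAP-style explainers studied in the paper, a plain indicator column stands in for a learned rich subgroup, and the synthetic data, variable names, and `fid` helper are all illustrative assumptions.

```python
# Hypothetical sketch of feature importance disparity (FID) for one
# (feature, subgroup) pair. The paper's rich-subgroup search and
# oracle-efficient algorithms are not reproduced here.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data: feature 0 is far more spread out inside the subgroup,
# so its local attributions are larger there than on the dataset overall.
n = 5000
group = (rng.random(n) < 0.3).astype(int)   # protected-subgroup indicator
X = rng.normal(size=(n, 4))
X[:, 0] *= 1 + 2 * group
logits = 1.0 * X[:, 0] + 0.5 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Local attribution of feature j at point x: |w_j * (x_j - mean_j)|,
# i.e. the magnitude of a linear attribution under an independence assumption.
attr = np.abs(model.coef_[0] * (X - X.mean(axis=0)))

def fid(attr, subgroup_mask, feature):
    """Gap between mean attribution on the subgroup and on the full dataset."""
    return attr[subgroup_mask, feature].mean() - attr[:, feature].mean()

for j in range(X.shape[1]):
    print(f"feature {j}: FID = {fid(attr, group == 1, j):+.3f}")
```

Under this toy setup, feature 0 shows a clearly positive FID while the remaining features sit near zero; the paper's contribution is to search efficiently over the much larger space of rich subgroups rather than a single pre-specified indicator.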

Related research

07/09/2020
Transparency Tools for Fairness in AI (Luskin)
We propose new tools for policy-makers to use when assessing and correct...

03/15/2022
Distraction is All You Need for Fairness
With the recent growth in artificial intelligence models and its expandi...

03/16/2022
Measuring Fairness of Text Classifiers via Prediction Sensitivity
With the rapid growth in language processing applications, fairness has ...

05/16/2023
Measuring Implicit Bias Using SHAP Feature Importance and Fuzzy Cognitive Maps
In this paper, we integrate the concepts of feature importance with impl...

06/01/2022
How Biased is Your Feature?: Computing Fairness Influence Functions with Global Sensitivity Analysis
Fairness in machine learning has attained significant focus due to the w...

10/27/2021
Feature and Label Embedding Spaces Matter in Addressing Image Classifier Bias
This paper strives to address image classifier bias, with a focus on bot...

06/16/2022
Quantifying Feature Contributions to Overall Disparity Using Information Theory
When a machine-learning algorithm makes biased decisions, it can be help...
