CERTIFAI: Counterfactual Explanations for Robustness, Transparency, Interpretability, and Fairness of Artificial Intelligence models

05/20/2019
by   Shubham Sharma, et al.

As artificial intelligence plays an increasingly important role in our society, there are ethical and moral obligations for both businesses and researchers to ensure that their machine learning models are designed, deployed, and maintained responsibly. These models need to be rigorously audited for fairness, robustness, transparency, and interpretability. A variety of methods have been developed that focus on these issues in isolation; however, managing these methods in conjunction with model development can be cumbersome and time-consuming. In this paper, we introduce a unified and model-agnostic approach to address these issues: Counterfactual Explanations for Robustness, Transparency, Interpretability, and Fairness of Artificial Intelligence models (CERTIFAI). Unlike previous methods in this domain, CERTIFAI is a general tool that can be applied to any black-box model and any type of input data. Given a model and an input instance, CERTIFAI uses a custom genetic algorithm to generate counterfactuals: instances close to the input that change the prediction of the model. We demonstrate how these counterfactuals can be used to examine issues of robustness, interpretability, transparency, and fairness. Additionally, we introduce CERScore, the first black-box model robustness score that performs comparably to methods that have access to model internals.
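To make the core idea concrete, here is a minimal sketch of a genetic-algorithm counterfactual search in the spirit the abstract describes. This is not the paper's actual CERTIFAI algorithm (its fitness function, operators, and constraint handling are defined in the paper); the function name, parameters, and toy model below are illustrative assumptions. Fitness rewards candidates that flip the black-box prediction while staying close (L1 distance) to the input.

```python
import random

def counterfactual_search(predict, x, n_pop=50, n_gen=100,
                          mutation_scale=0.5, seed=0):
    """Evolve an instance near x whose prediction differs from predict(x).

    Toy genetic-algorithm sketch: selection keeps the fittest half,
    children are produced by uniform crossover plus Gaussian mutation.
    """
    rng = random.Random(seed)
    target = predict(x)

    def fitness(c):
        dist = sum(abs(a - b) for a, b in zip(c, x))
        # Candidates that do not flip the prediction are heavily penalized.
        return dist if predict(c) != target else dist + 1e6

    # Initialize the population with random perturbations of x.
    pop = [[v + rng.gauss(0, mutation_scale) for v in x] for _ in range(n_pop)]
    for _ in range(n_gen):
        pop.sort(key=fitness)
        survivors = pop[: n_pop // 2]
        children = []
        while len(survivors) + len(children) < n_pop:
            p1, p2 = rng.sample(survivors, 2)
            # Uniform crossover, then Gaussian mutation.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [v + rng.gauss(0, mutation_scale) for v in child]
            children.append(child)
        pop = survivors + children
    best = min(pop, key=fitness)
    return best if predict(best) != target else None

# Hypothetical black-box classifier: predicts 1 when the feature sum exceeds 1.
model = lambda inst: int(sum(inst) > 1.0)
cf = counterfactual_search(model, [0.2, 0.3])
```

Because the search treats `predict` as an opaque callable, the same sketch applies to any black-box model, which is the model-agnostic property the abstract emphasizes.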

