Generative causal explanations of black-box classifiers

06/24/2020
by Matthew O'Shaughnessy, et al.

We develop a method for generating causal post-hoc explanations of black-box classifiers based on a learned low-dimensional representation of the data. The explanation is causal in the sense that changing learned latent factors produces a change in the classifier output statistics. To construct these explanations, we design a learning framework that leverages a generative model and information-theoretic measures of causal influence. Our objective function encourages both the generative model to faithfully represent the data distribution and the latent factors to have a large causal influence on the classifier output. Our method learns both global and local explanations, is compatible with any classifier that admits class probabilities and a gradient, and does not require labeled attributes or knowledge of causal structure. Using carefully controlled test cases, we provide intuition that illuminates the function of our causal objective. We then demonstrate the practical utility of our method on image recognition tasks.
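To make the idea of an information-theoretic causal influence measure concrete, the following is a minimal sketch rather than the authors' implementation. It assumes mutual information between one latent factor and the classifier output as the influence measure, which the abstract does not specify. The names `toy_decoder`, `toy_classifier`, and `causal_influence` are hypothetical stand-ins for a trained generative model, a black-box classifier, and the influence term of the objective.

```python
import numpy as np


def mutual_information(class_probs):
    """Estimate I(alpha; Y) in nats from rows p(y | alpha_i), alpha_i ~ p(alpha)."""
    p = np.clip(class_probs, 1e-12, 1.0)
    marginal = p.mean(axis=0)  # p(y), averaged over the alpha samples
    h_y = -np.sum(marginal * np.log(marginal))  # H(Y)
    h_y_given_alpha = -np.mean(np.sum(p * np.log(p), axis=1))  # H(Y | alpha)
    return h_y - h_y_given_alpha


def toy_decoder(z):
    """Hypothetical generative model: maps latent codes to data (identity here)."""
    return z


def toy_classifier(x):
    """Hypothetical black-box classifier that only looks at data dimension 0."""
    p1 = 1.0 / (1.0 + np.exp(-4.0 * x[:, 0]))
    return np.stack([1.0 - p1, p1], axis=1)


def causal_influence(k, n_alpha=200, n_beta=50, latent_dim=2, seed=0):
    """Monte Carlo estimate of the influence of latent factor k on the output.

    For each sampled value of factor k, the remaining factors are resampled
    and the class probabilities are averaged, approximating p(y | alpha).
    """
    rng = np.random.default_rng(seed)
    probs = np.empty((n_alpha, 2))
    for i in range(n_alpha):
        z = rng.standard_normal((n_beta, latent_dim))
        z[:, k] = rng.standard_normal()  # hold factor k fixed within this batch
        probs[i] = toy_classifier(toy_decoder(z)).mean(axis=0)
    return mutual_information(probs)


if __name__ == "__main__":
    for k in range(2):
        print(f"factor {k}: estimated influence = {causal_influence(k):.3f} nats")
```

In this toy setup, factor 0 should score much higher than factor 1, because only factor 0 changes what the classifier sees. An objective of the kind described above would reward latent factors with high scores of this sort while also penalizing a generative model that drifts from the data distribution.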


