DeepAI AI Chat
Log In Sign Up

Explaining The Behavior Of Black-Box Prediction Algorithms With Causal Learning

by   Numair Sani, et al.
Johns Hopkins University

We propose to explain the behavior of black-box prediction methods (e.g., deep neural networks trained on image pixel data) using causal graphical models. Specifically, we explore learning the structure of a causal graph where the nodes represent prediction outcomes along with a set of macro-level "interpretable" features, while allowing for arbitrary unmeasured confounding among these variables. The resulting graph may indicate which of the interpretable features, if any, are possible causes of the prediction outcome and which may be merely associated with prediction outcomes due to confounding. The approach is motivated by a counterfactual theory of causal explanation wherein good explanations point to factors which are "difference-makers" in an interventionist sense. The resulting analysis may be useful in algorithm auditing and evaluation, by identifying features which make a causal difference to the algorithm's output.


page 5

page 6


A Causal Lens for Peeking into Black Box Predictive Models: Predictive Model Interpretation via Causal Attribution

With the increasing adoption of predictive models trained using machine ...

What's in the box? Explaining the black-box model through an evaluation of its interpretable features

Algorithms are powerful and necessary tools behind a large part of the i...

Inducing Causal Structure for Interpretable Neural Networks

In many areas, we have well-founded insights about causal structure that...

Causal Analysis for Robust Interpretability of Neural Networks

Interpreting the inner function of neural networks is crucial for the tr...

Extracting Causal Visual Features for Limited label Classification

Neural networks trained to classify images do so by identifying features...

Using Interpretable Machine Learning to Predict Maternal and Fetal Outcomes

Most pregnancies and births result in a good outcome, but complications ...

Unsupervised Causal Binary Concepts Discovery with VAE for Black-box Model Explanation

We aim to explain a black-box classifier with the form: `data X is class...