Born Identity Network: Multi-way Counterfactual Map Generation to Explain a Classifier's Decision

by Kwanseok Oh, et al.

There exists an apparent negative correlation between the performance and the interpretability of deep learning models. In an effort to reduce this trade-off, we propose the Born Identity Network (BIN), a post-hoc approach for producing multi-way counterfactual maps. A counterfactual map transforms an input sample so that it is classified as a target label, which resembles how humans process knowledge through counterfactual thinking. Thus, producing a better counterfactual map may be a step toward explanation at the level of human knowledge. For example, a counterfactual map can localize hypothetical abnormalities in a normal brain image that would cause it to be diagnosed with a disease. Specifically, our proposed BIN consists of two core components: a Counterfactual Map Generator and a Target Attribution Network. The Counterfactual Map Generator is a variant of a conditional GAN that can synthesize a counterfactual map conditioned on an arbitrary target label. The Target Attribution Network works in a complementary manner, enforcing attribution of the target label to the synthesized map. We validate BIN through qualitative and quantitative analyses on the MNIST, 3D Shapes, and ADNI datasets, and demonstrate the comprehensibility and fidelity of our method through various ablation studies.
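The core idea of an additive counterfactual map, i.e. a perturbation that moves an input across a classifier's decision boundary toward an arbitrary target label, can be illustrated on a toy scale. The sketch below is not the authors' conditional-GAN architecture; it substitutes a simple gradient-based optimization against a fixed linear classifier, and every name in it (`classify`, `counterfactual_map`) is an illustrative assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def classify(x, w, b):
    """Toy binary classifier: probability that x belongs to class 1."""
    return sigmoid(w @ x + b)

def counterfactual_map(x, w, b, target, steps=200, lr=0.5):
    """Additive map m such that classify(x + m) approaches `target`.

    Here the map is found by gradient descent on the binary cross-entropy
    between the classifier's output and the target label; BIN instead
    *generates* such maps with a conditional GAN.
    """
    m = np.zeros_like(x)
    for _ in range(steps):
        p = classify(x + m, w, b)
        # d BCE(target, p) / d m = (p - target) * w  for this linear model
        m -= lr * (p - target) * w
    return m

rng = np.random.default_rng(0)
w = rng.normal(size=4)
b = 0.0
x = -w  # an input firmly on the class-0 side of the boundary

m = counterfactual_map(x, w, b, target=1.0)
print(classify(x, w, b) < 0.5)      # original prediction: class 0
print(classify(x + m, w, b) > 0.5)  # counterfactual prediction: class 1
```

In BIN, the map additionally has to be spatially sparse and label-conditioned, which is what the Target Attribution Network enforces on the generator's output.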




Related papers:

- Learn-Explain-Reinforce: Counterfactual Reasoning and Its Guidance to Reinforce an Alzheimer's Disease Diagnosis Model
- Motif-guided Time Series Counterfactual Explanations
- Counterfactual Explanation with Multi-Agent Reinforcement Learning for Drug Target Prediction
- Diffusion Models for Counterfactual Explanations
- DreaMR: Diffusion-driven Counterfactual Explanation for Functional MRI
- Removing input features via a generative model to explain their attributions to classifier's decisions
- Shopping in the Multiverse: A Counterfactual Approach to In-Session Attribution
