Black Box Explanation by Learning Image Exemplars in the Latent Feature Space

01/27/2020
by Riccardo Guidotti, et al.

We present an approach to explain the decisions of black box models for image classification. While using the black box to label images, our explanation method exploits the latent feature space learned through an adversarial autoencoder. The proposed method first generates exemplar images in the latent feature space and learns a decision tree classifier on them. Then, it selects and decodes exemplars that respect the local decision rules. Finally, it visualizes them in a way that shows the user how the exemplars can be modified either to stay within their class or to become counterfactuals by "morphing" into another class. Since we focus on black box decision systems for image classification, the explanation obtained from the exemplars also provides a saliency map highlighting the areas of the image that contribute to its classification and the areas that push it into another class. We present the results of an experimental evaluation on three datasets and two black box models. Besides providing the most useful and interpretable explanations, the proposed method outperforms existing explainers in terms of fidelity, relevance, coherence, and stability.
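The pipeline described above (sample in the latent space, label with the black box, fit a local surrogate, then select exemplars and counter-exemplars) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: `black_box` and `decode` are hypothetical stand-ins for the classifier and the adversarial autoencoder's decoder, the latent space has only two dimensions, and a depth-1 decision stump replaces the decision tree to keep the example dependency-free.

```python
import random

def black_box(image):
    # Hypothetical stand-in classifier: class 1 if the mean pixel exceeds 0.5.
    return int(sum(image) / len(image) > 0.5)

def decode(z):
    # Toy "decoder" (assumption): clamp each latent coordinate into [0, 1],
    # so a 2-D latent point becomes a 2-pixel image.
    return [min(1.0, max(0.0, v)) for v in z]

def sample_neighbours(z, n, scale, rng):
    # Generate candidate points around z in the latent space.
    return [[v + rng.gauss(0.0, scale) for v in z] for _ in range(n)]

def fit_stump(points, labels):
    # Depth-1 surrogate (stand-in for a decision tree): pick the feature,
    # threshold and left-side label with the fewest misclassifications.
    best = None
    for f in range(len(points[0])):
        for t in sorted({p[f] for p in points}):
            for left_label in (0, 1):
                errors = sum(
                    (left_label if p[f] <= t else 1 - left_label) != y
                    for p, y in zip(points, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left_label)
    _, f, t, left_label = best
    return f, t, left_label

def stump_predict(stump, z):
    f, t, left_label = stump
    return left_label if z[f] <= t else 1 - left_label

def explain(z, n=200, scale=0.3, seed=0):
    rng = random.Random(seed)
    neighbours = sample_neighbours(z, n, scale, rng)
    # The black box labels the decoded images, not the latent points.
    labels = [black_box(decode(p)) for p in neighbours]
    stump = fit_stump(neighbours, labels)
    target = black_box(decode(z))
    # Exemplars follow the local rule for the target class and keep the label;
    # counter-exemplars follow the opposite rule and flip the label.
    exemplars, counter = [], []
    for p, y in zip(neighbours, labels):
        rule = stump_predict(stump, p)
        if rule == target and y == target:
            exemplars.append(decode(p))
        elif rule != target and y != target:
            counter.append(decode(p))
    return stump, target, exemplars, counter

stump, target, exemplars, counter = explain([0.6, 0.6])
print("rule (feature, threshold, left label):", stump)
print("exemplars:", len(exemplars), "counter-exemplars:", len(counter))
```

In this sketch the learned rule plays the role of the local decision rule, and the two decoded lists correspond to the images a user would inspect to see what keeps the instance in its class and what moves it to another one.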

