Counterfactual Explanation Based on Gradual Construction for Deep Networks

08/05/2020
by   Sin-Han Kang, et al.

To understand the black-box characteristics of deep networks, counterfactual explanation, which deduces not only the important features of an input space but also how those features should be modified to classify the input as a target class, has gained increasing interest. The patterns that deep networks have learned from a training dataset can be grasped by observing how features vary among classes. However, current approaches modify features to increase the classification probability for the target class irrespective of the internal characteristics of deep networks. This often leads to unclear explanations that deviate from real-world data distributions. To address this problem, we propose a counterfactual explanation method that exploits the statistics learned from a training dataset. Specifically, we gradually construct an explanation by iterating over masking and composition steps. The masking step selects an important feature of the input data that should be changed for the input to be classified as the target class. The composition step then optimizes the previously selected feature so that the resulting output score is close to the logit space of the training data that are classified as the target class. Experimental results show that our method produces human-friendly interpretations on various classification datasets and verify that such interpretations can be achieved with fewer feature modifications.
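The alternation between masking and composition can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy linear classifier (`W`, `b`), a hypothetical selection heuristic (largest gradient magnitude) for the masking step, and a single vector `target_logits` standing in for the logit statistics of training data classified as the target class. The function name and all parameters are illustrative.

```python
import numpy as np

def gradual_counterfactual(x, W, b, target_logits, target, steps=200, lr=0.1):
    """Toy sketch: alternate masking (pick one feature) and composition
    (tune only the selected features) until the target class is predicted."""
    x_cf = x.astype(float).copy()
    mask = np.zeros(x.shape[0], dtype=bool)  # features selected so far
    for _ in range(x.shape[0]):
        logits = W @ x_cf + b
        if np.argmax(logits) == target:
            break  # already classified as the target class
        # Masking step (assumed heuristic): select the unselected feature
        # with the largest gradient of the squared logit distance.
        grad = 2.0 * W.T @ (logits - target_logits)
        scores = np.abs(grad)
        scores[mask] = -np.inf
        mask[np.argmax(scores)] = True
        # Composition step: gradient descent on the selected features only,
        # pulling the logits toward the target-class logit statistics.
        for _ in range(steps):
            logits = W @ x_cf + b
            grad = 2.0 * W.T @ (logits - target_logits)
            x_cf[mask] -= lr * grad[mask]
    return x_cf, mask

# Hypothetical example: 3 features, 2 classes; the third feature is irrelevant.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
b = np.zeros(2)
x = np.array([2.0, 0.0, 5.0])            # originally classified as class 0
target_logits = np.array([0.0, 2.0])     # assumed mean logits of class-1 data
x_cf, mask = gradual_counterfactual(x, W, b, target_logits, target=1)
print(np.argmax(W @ x_cf + b), mask)
```

In this toy setting the loop stops as soon as the target class is reached, so features that were never selected keep their original values, which mirrors the abstract's point about needing fewer feature modifications.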


research
08/08/2021

TDLS: A Top-Down Layer Searching Algorithm for Generating Counterfactual Visual Explanation

Explanation of AI, as well as fairness of algorithms' decisions and the ...
research
11/17/2022

Features Compression based on Counterfactual Analysis

Counterfactual Explanations are becoming a de-facto standard in post-hoc...
research
03/24/2021

Counterfactual Explanation with Multi-Agent Reinforcement Learning for Drug Target Prediction

Motivation: Several accurate deep learning models have been proposed to ...
research
02/20/2023

Why is the prediction wrong? Towards underfitting case explanation via meta-classification

In this paper we present a heuristic method to provide individual explan...
research
11/15/2020

Declarative Approaches to Counterfactual Explanations for Classification

We propose answer-set programs that specify and compute counterfactual i...
research
07/21/2021

A Sparsity Algorithm with Applications to Corporate Credit Rating

In Artificial Intelligence, interpreting the results of a Machine Learni...
research
05/13/2022

DualCF: Efficient Model Extraction Attack from Counterfactual Explanations

Cloud service providers have launched Machine-Learning-as-a-Service (MLa...
