Removing input features via a generative model to explain their attributions to classifier's decisions

10/09/2019
by Chirag Agarwal, et al.

Interpretability methods often measure the contribution of an input feature to an image classifier's decisions by heuristically removing it, e.g. by blurring, adding noise, or graying out, which often produces unrealistic, out-of-distribution samples. Instead, we propose to integrate a generative inpainter into three representative attribution methods to remove an input feature. Compared to the original counterparts, our methods (1) generate more plausible counterfactual samples under the true data-generating process; (2) are more robust to hyperparameter settings; and (3) localize objects more accurately. Our findings were consistent across both ImageNet and Places365 datasets and two different pairs of classifiers and inpainters.
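To make the idea concrete, the following is a minimal sketch (not the paper's released code) of how a generative inpainter can stand in for heuristic removal in an occlusion-style attribution method. The `classifier` and `inpainter` callables and their interfaces are assumptions for illustration: any image classifier returning class logits and any inpainting model that fills a masked region with plausible content would fit this pattern.

```python
import torch

def attribution_map(image, target_class, classifier, inpainter, patch=16, stride=16):
    """Slide a patch over the image; at each location, 'remove' the patch by asking
    the inpainter to fill it in, then record the drop in the target-class score."""
    classifier.eval()
    _, _, H, W = image.shape                      # image: (1, 3, H, W), values in [0, 1]
    with torch.no_grad():
        base_score = classifier(image).softmax(-1)[0, target_class]
    heat = torch.zeros(H, W)

    for y in range(0, H - patch + 1, stride):
        for x in range(0, W - patch + 1, stride):
            mask = torch.ones(1, 1, H, W)         # 1 = keep, 0 = remove
            mask[:, :, y:y + patch, x:x + patch] = 0.0
            with torch.no_grad():
                # Assumed inpainter interface: takes the masked image and the mask,
                # returns a full image whose masked region is filled with plausible,
                # in-distribution content.
                counterfactual = inpainter(image * mask, mask)
                score = classifier(counterfactual).softmax(-1)[0, target_class]
            # A larger drop in confidence means the removed region mattered more.
            heat[y:y + patch, x:x + patch] += (base_score - score)
    return heat
```

With a heuristic baseline, `counterfactual` would instead be something like `image * mask + 0.5 * (1 - mask)` (graying out); the abstract's point is that the inpainted counterfactual stays closer to the true data distribution, so the measured score drop reflects the feature's contribution rather than the artifact introduced by its removal.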


Related research

05/16/2022
Sparse Visual Counterfactual Explanations in Image Space
Visual counterfactual explanations (VCEs) in image space are an importan...

10/22/2021
Double Trouble: How to not explain a text classifier's decisions using counterfactuals synthesized by masked language models?
Explaining how important each input feature is to a classifier's decisio...

10/21/2022
Diffusion Visual Counterfactual Explanations
Visual Counterfactual Explanations (VCEs) are an important tool to under...

11/20/2020
Born Identity Network: Multi-way Counterfactual Map Generation to Explain a Classifier's Decision
There exists an apparent negative correlation between performance and in...

07/06/2019
Generative Counterfactual Introspection for Explainable Deep Learning
In this work, we propose an introspection technique for deep neural netw...

05/19/2022
Towards a Theory of Faithfulness: Faithful Explanations of Differentiable Classifiers over Continuous Data
There is broad agreement in the literature that explanation methods shou...

07/19/2021
Path Integrals for the Attribution of Model Uncertainties
Enabling interpretations of model uncertainties is of key importance in ...
