Can I Trust the Explainer? Verifying Post-hoc Explanatory Methods

10/04/2019
by Oana-Maria Camburu, et al.

For AI systems to garner widespread public acceptance, we must develop methods capable of explaining the decisions of black-box models such as neural networks. In this work, we identify two issues with current explanatory methods. First, we show that two prevalent perspectives on explanations, feature-additivity and feature-selection, lead to fundamentally different instance-wise explanations. In the literature, explainers built on different perspectives are nonetheless compared directly, despite their distinct explanation goals. The second issue is that current post-hoc explainers have only been thoroughly validated on simple models, such as linear regression; when applied to real-world neural networks, they are commonly evaluated under the assumption that the learned models behave reasonably. However, neural networks often rely on unreasonable correlations, even when producing correct decisions. We introduce a verification framework for explanatory methods under the feature-selection perspective. Our framework is built around a non-trivial neural network architecture trained on a real-world task, for which we can provide guarantees on the inner workings. We validate the efficacy of our evaluation by exposing the failure modes of current explainers. We intend this framework to serve as a publicly available, off-the-shelf evaluation whenever the feature-selection perspective on explanations is needed.
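To make the first point concrete, here is a minimal sketch, not taken from the paper: the toy model f, the all-zeros baseline, and both brute-force explainers are illustrative assumptions. On f(x) = x1 OR x2 at the instance x = (1, 1), exact Shapley values (a feature-additive explanation) split the credit equally across both features, while the feature-selection view returns two minimal sufficient subsets, each containing a single feature.

```python
# Toy contrast between the feature-additivity and feature-selection
# perspectives. Everything here (the model f, the baseline, both
# brute-force explainers) is illustrative, not the paper's code.
from itertools import chain, combinations
from math import factorial

def f(x):
    # Toy "black box": predicts 1 if either binary feature is on.
    return int(x[0] == 1 or x[1] == 1)

def shapley_values(f, x, baseline):
    # Feature-additivity view: exact Shapley values distribute the
    # change f(x) - f(baseline) additively across the features.
    n = len(x)
    phi = [0.0] * n
    def value(S):
        z = [x[i] if i in S else baseline[i] for i in range(n)]
        return f(z)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (value(set(S) | {i}) - value(set(S)))
    return phi

def minimal_sufficient_subsets(f, x, baseline):
    # Feature-selection view: the smallest feature subsets that, kept
    # alone (all other features masked to the baseline), preserve f(x).
    n = len(x)
    target = f(x)
    minimal = []
    for S in chain.from_iterable(combinations(range(n), r) for r in range(n + 1)):
        z = [x[i] if i in S else baseline[i] for i in range(n)]
        if f(z) == target and not any(set(m) <= set(S) for m in minimal):
            minimal.append(S)
    return minimal

x, baseline = (1, 1), (0, 0)
print(shapley_values(f, x, baseline))              # [0.5, 0.5]: credit is shared
print(minimal_sufficient_subsets(f, x, baseline))  # [(0,), (1,)]: either feature alone suffices
```

An explainer judged faithful under one perspective can thus look wrong under the other, which is why explainers built on different perspectives should not be compared head-to-head.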
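The verification idea can likewise be sketched in miniature; the gated model, the RELEVANT set, and the pass-rate check below are hypothetical stand-ins, not the paper's actual architecture. If a model is built so that a hard gate provably blocks every feature outside a known set, then an explainer that selects a blocked feature is demonstrably wrong, with no assumption that the learned model behaves reasonably.

```python
# Miniature version of the verification idea: a model whose inner
# workings are guaranteed by construction. RELEVANT, the gated model,
# and the toy explainer are all hypothetical, not the paper's setup.

RELEVANT = {0, 3}  # the only feature indices the model can ever read

def gated_model(x):
    # Hard gate: features outside RELEVANT are zeroed before the
    # prediction, so they provably cannot influence the output.
    g = [v if i in RELEVANT else 0.0 for i, v in enumerate(x)]
    return float(g[0] + 2.0 * g[3] > 1.0)

def verify(explainer, model, instances):
    # Ground-truth check: an explanation fails if it selects any
    # feature the gate is guaranteed to block.
    ok = sum(set(explainer(model, x)) <= RELEVANT for x in instances)
    return ok / len(instances)

# A deliberately naive explainer that just picks the two largest inputs.
def top2_magnitude_explainer(model, x):
    return sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:2]

instances = [(0.2, 5.0, 0.1, 0.9), (1.5, 0.0, 3.0, 0.4)]
print(verify(top2_magnitude_explainer, gated_model, instances))  # 0.0: caught failing
```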


Related research

06/15/2021  On the Objective Evaluation of Post Hoc Explainers
07/27/2023  Verifiable Feature Attributions: A Bridge between Post Hoc Explainability and Inherent Interpretability
05/05/2020  Post-hoc explanation of black-box classifiers using confident itemsets
05/25/2023  Robust Ante-hoc Graph Explainer using Bilevel Optimization
08/01/2023  Copula for Instance-wise Feature Selection and Ranking
09/29/2022  Sequential Attention for Feature Selection
02/22/2021  Shapley values for feature selection: The good, the bad, and the axioms
