On the Objective Evaluation of Post Hoc Explainers

06/15/2021
by Zachariah Carmichael, et al.

Many applications of data-driven models demand transparency of decisions, especially in health care, criminal justice, and other high-stakes environments. Modern trends in machine learning research have led to algorithms that are increasingly intricate, to the degree that they are considered black boxes. In an effort to reduce the opacity of decisions, methods have been proposed to explain the inner workings of such models in a human-comprehensible manner. These post hoc techniques are described as universal explainers, capable of faithfully augmenting decisions with algorithmic insight. Unfortunately, there is little agreement about what constitutes a "good" explanation. Moreover, current methods of explanation evaluation rely on either subjective judgments or proxy measures. In this work, we propose a framework for evaluating post hoc explainers against ground truth derived directly from the additive structure of a model. We demonstrate the framework's efficacy in understanding explainer behavior by evaluating popular explainers on thousands of synthetic and several real-world tasks. The framework reveals that explanations may be accurate overall yet misattribute the importance of individual features.
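To make the evaluation idea concrete, the following is a minimal Python sketch, not the paper's exact protocol: when a model is additive, f(x) = sum_i f_i(x_i), each feature's centered contribution f_i(x_i) - E[f_i(X_i)] is an exact ground-truth attribution (the Shapley value under feature independence) against which a post hoc explainer can be scored. The synthetic effect functions, the choice of KernelSHAP as the explainer, and the mean-absolute-error metric are all illustrative assumptions.

# A minimal sketch (illustrative, not the paper's exact protocol):
# score a post hoc explainer against ground-truth attributions
# derived from the additive structure of a model.
import numpy as np
import shap  # model-agnostic explainer, used here for illustration

rng = np.random.default_rng(0)

# Per-feature effect functions of a synthetic additive model
# f(x) = sin(x_1) + x_2^2 + 0.5 * x_3.
effects = [np.sin, np.square, lambda x: 0.5 * x]

def model(X):
    # Additive model: the prediction is a sum of per-feature effects.
    return sum(f(X[:, i]) for i, f in enumerate(effects))

X_background = rng.normal(size=(100, len(effects)))
x = rng.normal(size=(1, len(effects)))

# Ground truth: under additivity (and feature independence), the exact
# Shapley value of feature i is its effect centered on the background
# mean, f_i(x_i) - E[f_i(X_i)].
truth = np.array([f(x[0, i]) - f(X_background[:, i]).mean()
                  for i, f in enumerate(effects)])

# Post hoc explanation from KernelSHAP (an assumption of this sketch).
explainer = shap.KernelExplainer(model, X_background)
attribution = explainer.shap_values(x, nsamples=500)[0]

# Compare the explanation to ground truth, e.g., mean absolute error.
print("ground truth:", truth)
print("explainer  :", attribution)
print("MAE        :", np.abs(attribution - truth).mean())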


Related research

11/06/2019
How can we fool LIME and SHAP? Adversarial Attacks on Post hoc Explanation Methods
As machine learning black boxes are increasingly being deployed in domai...

10/04/2019
Can I Trust the Explainer? Verifying Post-hoc Explanatory Methods
For AI systems to garner widespread public acceptance, we must develop m...

12/02/2022
Evaluation of FEM and MLFEM AI-explainers in Image Classification tasks with reference-based and no-reference metrics
The most popular methods and algorithms for AI are, for the vast majorit...

06/22/2022
OpenXAI: Towards a Transparent Evaluation of Model Explanations
While several types of post hoc explanation methods (e.g., feature attri...

02/21/2021
Towards the Unification and Robustness of Perturbation and Gradient Based Explanations
As machine learning black boxes are increasingly being deployed in criti...

07/16/2019
Evaluating Explanation Without Ground Truth in Interpretable Machine Learning
Interpretable Machine Learning (IML) has become increasingly important i...

05/24/2021
Reproducibility Report: Contextualizing Hate Speech Classifiers with Post-hoc Explanation
The presented report evaluates Contextualizing Hate Speech Classifiers w...
