Post-Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts

01/25/2022
by Sebastian Bordt, et al.

Existing and planned legislation stipulates various obligations to provide information about machine learning algorithms and their functioning, often interpreted as obligations to "explain". Many researchers suggest using post-hoc explanation algorithms for this purpose. In this paper, we combine legal, philosophical and technical arguments to show that post-hoc explanation algorithms are unsuitable to achieve the law's objectives. Indeed, most situations where explanations are requested are adversarial, meaning that the explanation provider and receiver have opposing interests and incentives, so that the provider might manipulate the explanation for her own ends. We show that this fundamental conflict cannot be resolved because of the high degree of ambiguity of post-hoc explanations in realistic application scenarios. As a consequence, post-hoc explanation algorithms are unsuitable to achieve the transparency objectives inherent to the legal norms. Instead, there is a need to more explicitly discuss the objectives underlying "explainability" obligations as these can often be better achieved through other mechanisms. There is an urgent need for a more open and honest discussion regarding the potential and limitations of post-hoc explanations in adversarial contexts, in particular in light of the current negotiations about the European Union's draft Artificial Intelligence Act.
