Semantics and explanation: why counterfactual explanations produce adversarial examples in deep neural networks

12/18/2020
by   Kieran Browne, et al.

Recent papers in explainable AI have made a compelling case for counterfactual modes of explanation. While counterfactual explanations appear to be extremely effective in some instances, they are formally equivalent to adversarial examples. This presents an apparent paradox for explainability researchers: if these two procedures are formally equivalent, what accounts for the apparent explanatory divide between counterfactual explanations and adversarial examples? We resolve this paradox by placing emphasis back on the semantics of counterfactual expressions. Producing satisfactory explanations for deep learning systems will require that we find ways to interpret the semantics of hidden layer representations in deep neural networks.
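The formal equivalence the abstract refers to can be made concrete: both a counterfactual explanation ("what minimal change to the input would alter the decision?") and an adversarial example are produced by minimally perturbing an input until the model's output flips. The sketch below is illustrative only and is not taken from the paper; the loan-approval model, its weights, and the function names are hypothetical.

```python
# Illustrative sketch (not from the paper): the same minimal-perturbation
# search yields either a counterfactual explanation or an adversarial
# example, depending only on how we interpret the result.

def predict(x, w, b):
    """Linear classifier: returns 1 if w.x + b > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def minimal_flip(x, w, b, step=0.01, max_iters=10000):
    """Nudge x along the gradient of the decision function until the
    predicted class flips. For a linear model, the gradient of
    w.x + b with respect to x is simply w."""
    original = predict(x, w, b)
    x = list(x)
    direction = 1 if original == 0 else -1  # move toward the boundary
    for _ in range(max_iters):
        if predict(x, w, b) != original:
            return x
        x = [xi + direction * step * wi for wi, xi in zip(w, x)]
    return None  # no flip found within the iteration budget

# Hypothetical loan-approval model: features = (income, debt).
w, b = [1.0, -2.0], -1.0
applicant = [2.0, 1.0]                 # classified 0: loan denied
counterfactual = minimal_flip(applicant, w, b)
# Read counterfactual as "increase income and reduce debt by this much
# and the loan is approved" -- or, applied to pixels of an image, the
# identical procedure yields an adversarial example.
```

Whether the perturbed point reads as an explanation or an attack depends on the semantics we attach to the perturbed features, which is precisely the distinction the paper argues matters.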


Related research

09/11/2020 · Counterfactual Explanations and Adversarial Examples – Common Grounds, Essential Differences, and Potential Transfers
It is well known that adversarial examples and counterfactual explanatio...

06/25/2019 · Explaining Deep Learning Models with Constrained Adversarial Examples
Machine learning algorithms generally suffer from a problem of explainab...

06/18/2021 · On the Connections between Counterfactual Explanations and Adversarial Examples
Counterfactual explanations and adversarial examples have emerged as cri...

03/01/2021 · Counterfactual Explanations for Oblique Decision Trees: Exact, Efficient Algorithms
We consider counterfactual explanations, the problem of minimally adjust...

05/26/2020 · Good Counterfactuals and Where to Find Them: A Case-Based Technique for Generating Counterfactuals for Explainable AI (XAI)
Recently, a groundswell of research has identified the use of counterfac...

04/16/2021 · MEG: Generating Molecular Counterfactual Explanations for Deep Graph Networks
Explainable AI (XAI) is a research area whose objective is to increase t...

08/31/2022 · Formalising the Robustness of Counterfactual Explanations for Neural Networks
The use of counterfactual explanations (CFXs) is an increasingly popular...
