Preemptively Pruning Clever-Hans Strategies in Deep Neural Networks

04/12/2023
by Lorenz Linhardt et al.

Explainable AI has become a popular tool for validating machine learning models. Mismatches between the explained model's decision strategy and the user's domain knowledge (e.g. Clever Hans effects) have also been recognized as a starting point for improving faulty models. However, it is less clear what to do when the user and the explanation agree. In this paper, we demonstrate that user acceptance of explanations is not a guarantee that an ML model functions well; in particular, some Clever Hans effects may remain undetected. Such hidden flaws of the model can nevertheless be mitigated, and we demonstrate this by contributing a new method, Explanation-Guided Exposure Minimization (EGEM), which preemptively prunes variations in the ML model that have not been the subject of positive explanation feedback. Experiments on natural image data demonstrate that our approach leads to models that strongly reduce their reliance on hidden Clever Hans strategies and consequently achieve higher accuracy on new data.
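The abstract gives only the high-level idea of EGEM; the actual objective is detailed in the full paper. As a rough intuition, explanation-guided exposure minimization can be pictured as down-weighting model components that receive little explanation relevance on data whose explanations the user has approved. The sketch below illustrates that reading only and is not the authors' method: the relevance proxy (mean absolute activation times gradient, a crude stand-in for an LRP-style score), the quantile threshold, and the helper names `neuron_relevance` and `prune_unexposed` are all assumptions introduced for illustration.

```python
# Hypothetical sketch: prune neurons that receive little explanation
# relevance on user-approved reference data, so the model cannot lean on
# strategies (e.g. hidden Clever Hans features) the feedback never covered.
# The relevance proxy and the quantile threshold are illustrative
# assumptions, not the paper's EGEM formulation.
import torch
import torch.nn as nn


def neuron_relevance(model, layer, x, y):
    """Mean |activation * gradient| per neuron of `layer` on batch (x, y),
    a simple gradient-based stand-in for an LRP-style relevance score."""
    acts = {}
    handle = layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    loss = nn.functional.cross_entropy(model(x), y)
    handle.remove()
    (grad,) = torch.autograd.grad(loss, acts["a"])
    rel = (acts["a"] * grad).abs().mean(dim=0)  # average over the batch
    # Conv outputs keep spatial dims; pool them down to one score per channel.
    return rel.flatten(1).mean(dim=1) if rel.dim() > 1 else rel


def prune_unexposed(layer, relevance, keep_ratio=0.8):
    """Zero the weights of neurons whose relevance falls below a quantile
    threshold; `keep_ratio` is an assumed hyperparameter."""
    thresh = torch.quantile(relevance, 1.0 - keep_ratio)
    mask = (relevance >= thresh).to(layer.weight.dtype)
    with torch.no_grad():
        layer.weight.mul_(mask.view(-1, *([1] * (layer.weight.dim() - 1))))
        if layer.bias is not None:
            layer.bias.mul_(mask)


# Toy usage: random stand-ins for a model and user-approved reference data.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(),
                      nn.Linear(128, 10))
x, y = torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))
rel = neuron_relevance(model, model[1], x, y)
prune_unexposed(model[1], rel, keep_ratio=0.8)
```

In practice one would expect a brief fine-tuning pass on the approved data after such pruning to recover accuracy; the design idea being illustrated is simply that components never "exposed" by positive explanation feedback are treated as untrusted and removed.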
