The future of human-centric eXplainable Artificial Intelligence (XAI) is not post-hoc explanations

by Vinitra Swamy et al.

Explainable Artificial Intelligence (XAI) plays a crucial role in enabling human understanding of and trust in deep learning systems, and is often framed as determining which features are most important to a model's prediction. As models grow larger and more pervasive in daily life, explainability is necessary to avoid or minimize the adverse effects of model mistakes. Unfortunately, current approaches in human-centric XAI (e.g., predictive tasks in healthcare, education, or personalized ads) tend to rely on a single explainer. This trend is particularly concerning given that recent work has identified systematic disagreement among explainability methods applied to the same points and the same underlying black-box models. In this paper, we therefore present a call to action to address the limitations of current state-of-the-art explainers. We propose shifting from post-hoc explainability to designing interpretable neural network architectures, moving away from approximation techniques in human-centric, high-impact applications. We identify five needs of human-centric XAI (real-time, accurate, actionable, human-interpretable, and consistent) and propose two schemes for interpretable-by-design neural network workflows: adaptive routing for interpretable conditional computation, and diagnostic benchmarks for iterative model learning. We postulate that the future of human-centric XAI lies neither in explaining black boxes nor in reverting to traditional, interpretable models, but in neural networks that are intrinsically interpretable.
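The disagreement problem the abstract refers to can be illustrated in a few lines. The sketch below is not from the paper: the toy model and both attribution schemes (occlusion with a zero baseline, and finite-difference gradient times input) are illustrative assumptions, chosen only to show that two common post-hoc explainers can rank the importance of the same features differently on the same point and the same black box.

```python
def model(x):
    # Toy "black box": nonlinear in x[0], linear in x[1].
    return x[0] ** 2 - 4.0 * x[0] + x[1]

def occlusion_attribution(f, x, baseline=0.0):
    # Attribution = drop in output when a feature is replaced by a baseline.
    out = f(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline
        scores.append(out - f(occluded))
    return scores

def gradient_x_input_attribution(f, x, eps=1e-5):
    # Attribution = finite-difference gradient times the input value.
    scores = []
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += eps
        grad = (f(bumped) - f(x)) / eps
        scores.append(grad * x[i])
    return scores

point = [2.0, 1.0]
occ = occlusion_attribution(model, point)
grad = gradient_x_input_attribution(model, point)

# Rank features by absolute attribution, most important first.
rank_occ = sorted(range(len(point)), key=lambda i: -abs(occ[i]))
rank_grad = sorted(range(len(point)), key=lambda i: -abs(grad[i]))
print(rank_occ, rank_grad)  # → [0, 1] [1, 0]
```

At x[0] = 2 the model's local gradient in the first feature is zero, so gradient-times-input calls feature 0 unimportant, while occluding it to the baseline changes the output by 4, so occlusion calls it the most important feature. Both explanations are internally consistent, yet a practitioner relying on a single explainer would reach opposite conclusions.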




