"Why Should I Trust You?": Explaining the Predictions of Any Classifier

02/16/2016
by Marco Tulio Ribeiro, et al.

Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.
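To make the core idea concrete, below is a minimal, hedged sketch of a LIME-style local explanation for a tabular classifier. It is not the paper's reference implementation: the function name explain_instance, the Gaussian perturbation scheme, the kernel width, and the use of ridge regression (rather than the sparse K-LASSO selection the paper describes) are illustrative simplifications, and the iris/random-forest usage is purely hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

def explain_instance(predict_proba, x, num_samples=5000, kernel_width=0.75):
    """LIME-style sketch: fit a weighted linear surrogate locally around x."""
    rng = np.random.default_rng(0)
    # Perturb the instance of interest with Gaussian noise.
    perturbations = x + rng.normal(scale=1.0, size=(num_samples, x.shape[0]))
    # Query the black-box model on the perturbed samples (class-1 probability).
    target = predict_proba(perturbations)[:, 1]
    # Weight samples by proximity to x with an exponential kernel.
    distances = np.linalg.norm(perturbations - x, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))
    # Fit an interpretable (linear) model locally; its coefficients serve
    # as the explanation. The paper uses a sparse linear model; plain ridge
    # regression is used here only to keep the sketch short.
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(perturbations, target, sample_weight=weights)
    return surrogate.coef_

# Hypothetical usage: explain one prediction of a random forest trained on
# two iris classes (a stand-in for "any classifier").
X, y = load_iris(return_X_y=True)
mask = y < 2
rf = RandomForestClassifier(random_state=0).fit(X[mask], y[mask])
explanation = explain_instance(rf.predict_proba, X[mask][0])
print(dict(zip(load_iris().feature_names, explanation.round(3))))
```

The printed coefficients indicate which features locally push the black-box prediction toward or away from the explained class, which is the interpretable-surrogate intuition behind LIME.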

Related research

Contextual Local Explanation for Black Box Classifiers (10/02/2019)
We introduce a new model-agnostic explanation technique which explains t...

"Why Should I Trust Interactive Learners?" Explaining Interactive Queries of Classifiers to Users (05/22/2018)
Although interactive learning puts the user into the loop, the learner r...

To Trust or Not to Trust a Regressor: Estimating and Explaining Trustworthiness of Regression Predictions (04/14/2021)
In hybrid human-AI systems, users need to decide whether or not to trust...

The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets (09/23/2020)
For neural models to garner widespread public trust and ensure fairness,...

Self-explaining deep models with logic rule reasoning (10/13/2022)
We present SELOR, a framework for integrating self-explaining capabiliti...

Explaining Latent Representations with a Corpus of Examples (10/28/2021)
Modern machine learning models are complicated. Most of them rely on con...

To Trust Or Not To Trust A Classifier (05/30/2018)
Knowing when a classifier's prediction can be trusted is useful in many ...
