An Additive Instance-Wise Approach to Multi-class Model Interpretation

07/07/2022
by Vy Vo, et al.

Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system and whether to trust it for high-stakes decisions or large-scale deployment. Existing methods mainly focus on selecting explanatory input features and follow either locally additive or instance-wise approaches. Additive models use heuristically sampled perturbations to learn instance-specific explainers sequentially. The process is thus inefficient and susceptible to poorly conditioned samples. Meanwhile, instance-wise techniques directly learn local sampling distributions and can leverage global information from other inputs. However, they can only interpret single-class predictions and suffer from inconsistency across different settings, due to a strict reliance on a pre-defined number of selected features. This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes. We also propose an adaptive inference strategy to determine the optimal number of features for a specific instance. Our model explainer significantly outperforms additive and instance-wise counterparts on faithfulness while achieving a high level of brevity on various data sets and black-box model architectures.

