Show or Suppress? Managing Input Uncertainty in Machine Learning Model Explanations

01/23/2021
by Danding Wang, et al.

Feature attribution is widely used in interpretable machine learning to explain how influential each measured input feature value is for an output inference. However, measurements can be uncertain, and it is unclear how awareness of input uncertainty affects trust in explanations. We propose and study two approaches to help users manage their perception of uncertainty in a model explanation: 1) transparently show uncertainty in feature attributions so that users can reflect on it, and 2) suppress attribution to features with uncertain measurements and shift attribution to other features by regularizing with an uncertainty penalty. Through simulation experiments, qualitative interviews, and quantitative user evaluations, we identified the benefits of moderately suppressing attribution uncertainty, and concerns regarding showing attribution uncertainty. This work adds to the understanding of handling and communicating uncertainty for model interpretability.
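The second approach, suppressing attribution through an uncertainty penalty, can be illustrated with a toy example. The abstract does not specify the penalty or the model, so everything below is an assumption made for illustration only: a linear model, a per-feature ridge-style penalty weighted by a hypothetical measurement variance `sigma ** 2`, and attributions computed as weight times deviation from a baseline. It is a minimal sketch of the general idea, not the authors' implementation.

```python
import numpy as np

def fit_uncertainty_penalized(X, y, sigma, lam):
    """Least squares with an assumed penalty lam * sum_j sigma_j^2 * w_j^2.

    sigma holds a hypothetical per-feature measurement uncertainty;
    features with larger sigma are pushed toward smaller weights.
    """
    penalty = lam * np.diag(sigma ** 2)
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

def attributions(w, x, baseline):
    """Linear attribution: weight times deviation from a baseline."""
    return w * (x - baseline)

# Synthetic data: feature 1 is a noisy copy of feature 0,
# and the target depends only on feature 0.
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)
x1 = x0 + 0.3 * rng.normal(size=n)
X = np.column_stack([x0, x1])
y = 2.0 * x0 + 0.1 * rng.normal(size=n)

# Assumption: feature 0 is measured with uncertainty, feature 1 is not.
sigma = np.array([1.0, 0.0])

w_plain = fit_uncertainty_penalized(X, y, sigma, lam=0.0)     # no suppression
w_supp = fit_uncertainty_penalized(X, y, sigma, lam=1000.0)   # strong suppression

x = np.array([1.0, 1.0])
baseline = np.zeros(2)
attr_plain = attributions(w_plain, x, baseline)
attr_supp = attributions(w_supp, x, baseline)
print("unpenalized attributions:", attr_plain)
print("suppressed attributions: ", attr_supp)
```

In this sketch, raising `lam` moves attribution away from the uncertain feature and onto its correlated, precisely measured neighbor. Intermediate values of `lam` give partial suppression, which is one way to read the "moderately suppressing" setting mentioned in the abstract.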


research
11/10/2020

Towards Unifying Feature Attribution and Counterfactual Explanations: Different Means to the Same End

To explain a machine learning model, there are two main approaches: feat...
research
07/05/2023

Harmonizing Feature Attributions Across Deep Learning Architectures: Enhancing Interpretability and Consistency

Ensuring the trustworthiness and interpretability of machine learning mo...
research
05/19/2022

Towards a Theory of Faithfulness: Faithful Explanations of Differentiable Classifiers over Continuous Data

There is broad agreement in the literature that explanation methods shou...
research
05/31/2022

Attribution-based Explanations that Provide Recourse Cannot be Robust

Different users of machine learning methods require different explanatio...
research
02/24/2023

Don't be fooled: label leakage in explanation methods and the importance of their quantitative evaluation

Feature attribution methods identify which features of an input most inf...
research
12/22/2022

Impossibility Theorems for Feature Attribution

Despite a sea of interpretability methods that can produce plausible exp...
research
04/27/2019

Working women and caste in India: A study of social disadvantage using feature attribution

Women belonging to the socially disadvantaged caste-groups in India have...
