Stability Guarantees for Feature Attributions with Multiplicative Smoothing

07/12/2023
by Anton Xue, et al.

Explanation methods for machine learning models often provide no formal guarantees and may not reflect the underlying decision-making process. In this work, we analyze stability as a property for reliable feature attribution methods. We prove that relaxed variants of stability are guaranteed if the model is sufficiently Lipschitz with respect to the masking of features. To achieve such a model, we develop a smoothing method called Multiplicative Smoothing (MuS). We show that MuS overcomes theoretical limitations of standard smoothing techniques and can be integrated with any classifier and feature attribution method. We evaluate MuS on vision and language models with a variety of feature attribution methods, such as LIME and SHAP, and demonstrate that MuS endows feature attributions with non-trivial stability guarantees.
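
The abstract does not spell out the construction, so the following is only a minimal sketch of one way smoothing by multiplicative masks can make a classifier Lipschitz with respect to feature masking. It assumes the smoothed score is the average of a base classifier over random binary masks with independent Bernoulli(lam) entries, multiplied elementwise into the masked input. The names `smoothed`, `h`, `alpha`, and `lam` are hypothetical and chosen for illustration; the expectation is computed exactly by enumeration, which is only feasible for a handful of features.

```python
import itertools
import numpy as np

def smoothed(h, x, alpha, lam):
    """Exact E_s[h(x * alpha * s)], where s has i.i.d. Bernoulli(lam) entries.

    Assumption for illustration: h maps a feature vector to a score in [0, 1],
    alpha is a binary mask selecting features, and n = len(x) is small enough
    to enumerate all 2**n noise masks.
    """
    n = len(x)
    total = 0.0
    for bits in itertools.product([0, 1], repeat=n):
        s = np.array(bits, dtype=float)
        k = int(s.sum())
        prob = lam**k * (1.0 - lam) ** (n - k)  # probability of this noise mask
        total += prob * h(x * alpha * s)
    return total

def h(z):
    # Hypothetical base classifier: outputs 1 if the kept features sum above 1.
    return float(z.sum() > 1.0)

x = np.array([0.6, 0.7, 0.2])
lam = 0.25
alpha_a = np.array([1.0, 1.0, 0.0])  # keep features 0 and 1
alpha_b = np.array([1.0, 0.0, 0.0])  # keep feature 0 only

gap = abs(smoothed(h, x, alpha_a, lam) - smoothed(h, x, alpha_b, lam))
bound = lam * np.abs(alpha_a - alpha_b).sum()
print(gap, bound)
```

Under these assumptions, flipping one entry of the feature mask can change the smoothed score only when the corresponding noise entry is 1, which happens with probability `lam`, so the gap is at most `lam` times the number of flipped entries. A smaller `lam` gives a tighter bound at the cost of showing the base classifier fewer features per sample.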

research
05/28/2021

A General Taylor Framework for Unifying and Revisiting Attribution Methods

Attribution methods provide an insight into the decision-making process ...
research
08/21/2020

A Unified Taylor Framework for Revisiting Attribution Methods

Attribution methods have been developed to understand the decision makin...
research
11/23/2022

Evaluating Feature Attribution Methods for Electrocardiogram

The performance of cardiac arrhythmia detection with electrocardiograms(...
research
07/07/2023

On Formal Feature Attribution and Its Approximation

Recent years have witnessed the widespread use of artificial intelligenc...
research
06/18/2021

NoiseGrad: enhancing explanations by introducing stochasticity to model weights

Attribution methods remain a practical instrument that is used in real-w...
research
04/27/2019

Working women and caste in India: A study of social disadvantage using feature attribution

Women belonging to the socially disadvantaged caste-groups in India have...
research
05/19/2022

Towards a Theory of Faithfulness: Faithful Explanations of Differentiable Classifiers over Continuous Data

There is broad agreement in the literature that explanation methods shou...
