Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis

11/07/2021
by   Thomas Fel, et al.
9

We describe a novel attribution method which is grounded in Sensitivity Analysis and uses Sobol indices. Beyond modeling the individual contributions of image regions, Sobol indices provide an efficient way to capture higher-order interactions between image regions and their contributions to a neural network's prediction through the lens of variance. We describe an approach that makes the computation of these indices efficient for high-dimensional problems by using perturbation masks coupled with efficient estimators to handle the high dimensionality of images. Importantly, we show that the proposed method leads to favorable scores on standard benchmarks for vision (and language models) while drastically reducing the computing time compared to other black-box methods – even surpassing the accuracy of state-of-the-art white-box methods which require access to internal representations. Our code is freely available: https://github.com/fel-thomas/Sobol-Attribution-Method

READ FULL TEXT

page 2

page 16

page 17

page 20

research
06/13/2022

Making Sense of Dependence: Efficient Black-box Explanations Using Dependence Measure

This paper presents a new efficient black-box attribution method based o...
research
08/18/2023

On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-box

Attribution methods shed light on the explainability of data-driven appr...
research
03/02/2023

SHAP-IQ: Unified Approximation of any-order Shapley Interactions

Predominately in explainable artificial intelligence (XAI) research, the...
research
07/02/2020

Efficient estimation of the ANOVA mean dimension, with an application to neural net classification

The mean dimension of a black box function of d variables is a convenien...
research
11/03/2022

Data-free Defense of Black Box Models Against Adversarial Attacks

Several companies often safeguard their trained deep models (i.e. detail...
research
02/17/2016

Authorship Attribution Using a Neural Network Language Model

In practice, training language models for individual authors is often ex...
research
07/04/2022

Harmonizer: Learning to Perform White-Box Image and Video Harmonization

Recent works on image harmonization solve the problem as a pixel-wise im...

Please sign up or login with your details

Forgot password? Click here to reset