Deletion and Insertion Tests in Regression Models

05/25/2022
by Naofumi Hama, et al.

A basic task in explainable AI (XAI) is to identify the most important features behind a prediction made by a black box function f. The insertion and deletion tests of <cit.> are used to judge the quality of algorithms that rank pixels from most to least important for a classification. Motivated by regression problems, we establish a formula for their area under the curve (AUC) criteria in terms of certain main effects and interactions in an anchored decomposition of f. We find an expression for the expected value of the AUC under a random ordering of inputs to f and propose an alternative criterion for the regression setting: the area above a straight line. We use this criterion to compare feature importances computed by integrated gradients (IG) to those computed by Kernel SHAP (KS). The cost of exact computation of KS grows exponentially with dimension, while that of IG grows linearly with dimension. In two data sets that include binary variables, we find that KS is superior to IG in insertion and deletion tests, but only by a very small amount. The binary inputs in our comparison problems pose a challenge to IG because it must use values between the possible variable levels. We show that IG will match KS when f is an additive function plus a multilinear function of the variables. This includes a multilinear interpolation over the binary variables that would cause IG to have exponential cost in a naive implementation.
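To make the insertion and deletion tests concrete, here is a minimal Python sketch for a regression function. It is an illustration under stated assumptions, not the authors' implementation: the toy function `f`, the all-zeros baseline, the trapezoid-rule `auc` on an evenly spaced grid, and the helper `area_above_line` (signed area between the curve and the line joining its endpoints) are all hypothetical choices made for this example; the paper's exact definitions may differ.

```python
import numpy as np

def insertion_curve(f, x, baseline, order):
    """f evaluated as features of x are copied into the baseline, one at a time."""
    z = np.array(baseline, dtype=float)
    values = [f(z)]
    for j in order:
        z[j] = x[j]
        values.append(f(z))
    return np.array(values)

def deletion_curve(f, x, baseline, order):
    """f evaluated as features of x are replaced by baseline values, one at a time."""
    z = np.array(x, dtype=float)
    values = [f(z)]
    for j in order:
        z[j] = baseline[j]
        values.append(f(z))
    return np.array(values)

def auc(values):
    """Trapezoid-rule area under the curve on the grid 0, 1/d, ..., 1 (assumed convention)."""
    d = len(values) - 1
    return float((values[:-1] + values[1:]).sum() / (2 * d))

def area_above_line(values):
    """Signed area between the curve and the straight line joining its endpoints."""
    line = np.linspace(values[0], values[-1], len(values))
    return auc(values - line)

# Hypothetical toy regression function with one interaction term.
def f(z):
    return 3.0 * z[0] + z[1] + 2.0 * z[0] * z[2]

x = np.array([1.0, 1.0, 1.0])   # point being explained
baseline = np.zeros(3)          # assumed reference point
order = [0, 2, 1]               # features ranked from most to least important

ins = insertion_curve(f, x, baseline, order)
dele = deletion_curve(f, x, baseline, order)
print("insertion curve:", ins, "AUC:", auc(ins))
print("deletion curve: ", dele, "AUC:", auc(dele))
print("insertion area above line:", area_above_line(ins))
```

With a good ranking, the insertion curve rises quickly (large AUC) and the deletion curve falls quickly (small AUC). Subtracting the straight line removes the part of the area that depends only on the endpoints f(baseline) and f(x), which are the same for every ordering.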
