Maximally Invariant Data Perturbation as Explanation

06/19/2018
by Satoshi Hara, et al.

While several feature scoring methods have been proposed to explain the outputs of complex machine learning models, most of them lack a formal mathematical definition. In this study, we propose a novel definition of the feature score using the maximally invariant data perturbation, inspired by the idea of adversarial examples. In the adversarial setting, one seeks the smallest data perturbation that changes the model's output. Our approach considers the opposite: we seek the maximally invariant data perturbation, i.e., the largest perturbation that does not change the model's output. In this way, important input features can be identified as those that admit only small perturbations. To find the maximally invariant data perturbation, we formulate the problem as a linear program. An experiment on image classification with VGG16 shows that the proposed method effectively identifies the relevant parts of input images.
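To make the linear-programming view concrete, here is a minimal sketch, not the authors' exact formulation: it assumes the model is linearized around an input x via a gradient vector g (so the output change under perturbation e is approximately g @ e), and it maximizes the total nonnegative perturbation subject to that change staying within a tolerance. The gradient g, the tolerance delta, and the per-feature cap upper are illustrative stand-ins.

# Illustrative LP in the spirit of the abstract, assuming a linearized model.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
g = rng.normal(size=20)        # stand-in for the model gradient at input x
delta = 0.1                    # allowed change in the linearized output
upper = 1.0                    # per-feature perturbation cap

# Maximize sum(e) subject to |g @ e| <= delta and 0 <= e <= upper.
# linprog minimizes, so we negate the objective.
res = linprog(
    c=-np.ones_like(g),
    A_ub=np.vstack([g, -g]),   # encodes g @ e <= delta and -(g @ e) <= delta
    b_ub=np.array([delta, delta]),
    bounds=(0.0, upper),
    method="highs",
)

# Features that admit only a small perturbation are flagged as important.
importance = upper - res.x
print(importance.round(3))

Features whose allowable perturbation comes out far below the cap are the ones the linearized model is most sensitive to, matching the intuition in the abstract.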

Related research

- 07/12/2018: Maximizing Invariant Data Perturbation with Stochastic Optimization
- 08/02/2023: A Novel Cross-Perturbation for Single Domain Generalization
- 07/13/2017: Foolbox v0.8.0: A Python toolbox to benchmark the robustness of machine learning models
- 11/19/2015: A Unified Gradient Regularization Family for Adversarial Examples
- 03/27/2021: Improving Model Robustness by Adaptively Correcting Perturbation Levels with Active Queries
- 09/13/2022: Class-Level Logit Perturbation
- 12/19/2019: Mitigating large adversarial perturbations on X-MAS (X minus Moving Averaged Samples)
