Evaluations and Methods for Explanation through Robustness Analysis

05/31/2020
by   Cheng-Yu Hsieh, et al.
15

Among multiple ways of interpreting a machine learning model, measuring the importance of a set of features tied to a prediction is probably one of the most intuitive ways to explain a model. In this paper, we establish the link between a set of features to a prediction with a new evaluation criterion, robustness analysis, which measures the minimum distortion distance of adversarial perturbation. By measuring the tolerance level for an adversarial attack, we can extract a set of features that provides the most robust support for a prediction, and also can extract a set of features that contrasts the current prediction to a target class by setting a targeted adversarial attack. By applying this methodology to various prediction tasks across multiple domains, we observe the derived explanations are indeed capturing the significant feature set qualitatively and quantitatively.

READ FULL TEXT

page 13

page 14

research
07/13/2017

Foolbox v0.8.0: A Python toolbox to benchmark the robustness of machine learning models

Even todays most advanced machine learning models are easily fooled by a...
research
07/16/2022

CARBEN: Composite Adversarial Robustness Benchmark

Prior literature on adversarial attack methods has mainly focused on att...
research
04/22/2022

How Sampling Impacts the Robustness of Stochastic Neural Networks

Stochastic neural networks (SNNs) are random functions and predictions a...
research
08/21/2019

Testing Robustness Against Unforeseen Adversaries

Considerable work on adversarial defense has studied robustness to a fix...
research
05/10/2019

Interpreting and Evaluating Neural Network Robustness

Recently, adversarial deception becomes one of the most considerable thr...
research
04/23/2021

Evaluating Deception Detection Model Robustness To Linguistic Variation

With the increasing use of machine-learning driven algorithmic judgement...
research
12/20/2019

Explainability and Adversarial Robustness for RNNs

Recurrent Neural Networks (RNNs) yield attractive properties for constru...

Please sign up or login with your details

Forgot password? Click here to reset