Decision Explanation and Feature Importance for Invertible Networks

09/30/2019
by   Juntang Zhuang, et al.

Deep neural networks are vulnerable to adversarial attacks and hard to interpret because of their black-box nature. The recently proposed invertible network can accurately reconstruct the inputs to a layer from its outputs, and thus has the potential to unravel the black-box model. An invertible network classifier can be viewed as a two-stage model: (1) an invertible transformation from the input space to the feature space; (2) a linear classifier in the feature space. We can determine the decision boundary of the linear classifier in the feature space; since the transform is invertible, we can map this decision boundary from the feature space back to the input space. Furthermore, we propose to determine the projection of a data point onto the decision boundary, and define the explanation as the difference between the data point and its projection. Finally, we propose to locally approximate a neural network with its first-order Taylor expansion, and define feature importance using the resulting local linear model. We provide the implementation of our method: <https://github.com/juntang-zhuang/explain_invertible>.
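
The pipeline described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation (see the linked repository for that). It assumes a single additive coupling layer as the invertible transform, a fixed linear classifier, projection carried out in the feature space before inverting, and a finite-difference gradient as the first-order Taylor coefficient. All names and values (`A`, `w`, `b`, `forward`, `inverse`, `project_to_boundary`, `local_importance`) are hypothetical.

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))            # weights inside the coupling layer
w = np.array([1.0, -2.0, 0.5, 1.5])    # linear classifier weights (feature space)
b = 0.3                                # linear classifier bias


def forward(x):
    """Stage 1: invertible map from input space to feature space
    (a single additive coupling layer, assumed here as a stand-in)."""
    x1, x2 = x[:2], x[2:]
    return np.concatenate([x1, x2 + np.tanh(A @ x1)])


def inverse(z):
    """Exact inverse of `forward`."""
    z1, z2 = z[:2], z[2:]
    return np.concatenate([z1, z2 - np.tanh(A @ z1)])


def score(x):
    """Stage 2: linear classifier applied to the features."""
    return w @ forward(x) + b


def project_to_boundary(x):
    """Project the features onto the hyperplane w.z + b = 0,
    then invert the projected point back to the input space."""
    z = forward(x)
    z_proj = z - (w @ z + b) / (w @ w) * w
    return inverse(z_proj)


def local_importance(x, eps=1e-6):
    """Coefficients of the first-order Taylor expansion of `score` at x,
    estimated with central finite differences."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (score(x + step) - score(x - step)) / (2 * eps)
    return grad


x = np.array([0.5, -1.0, 2.0, 0.1])
x_proj = project_to_boundary(x)
explanation = x - x_proj          # difference between the data point and its projection
importance = local_importance(x)  # local linear model coefficients
```

In this sketch `score(x_proj)` is zero up to floating-point error, so the projected point lies on the decision boundary in the input space. A real invertible network would stack many such layers and would typically obtain the gradient by automatic differentiation rather than finite differences.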

research
07/23/2019

Invertible Network for Classification and Biomarker Selection for ASD

Determining biomarkers for autism spectrum disorder (ASD) is crucial to ...
research
01/27/2020

Black Box Explanation by Learning Image Exemplars in the Latent Feature Space

We present an approach to explain the decisions of black box models for ...
research
06/19/2018

Defining Locality for Surrogates in Post-hoc Interpretablity

Local surrogate models, to approximate the local decision boundary of a ...
research
10/27/2022

Rethinking the Reverse-engineering of Trojan Triggers

Deep Neural Networks are vulnerable to Trojan (or backdoor) attacks. Rev...
research
04/08/2022

An Adaptive Black-box Backdoor Detection Method for Deep Neural Networks

With the surge of Machine Learning (ML), An emerging amount of intellige...
research
10/17/2022

DE-CROP: Data-efficient Certified Robustness for Pretrained Classifiers

Certified defense using randomized smoothing is a popular technique to p...
research
11/01/2019

Explanation by Progressive Exaggeration

As machine learning methods see greater adoption and implementation in h...
