A Rate-Distortion Framework for Explaining Black-box Model Decisions

10/12/2021
by Stefan Kolek, et al.

We present the Rate-Distortion Explanation (RDE) framework, a mathematically well-founded method for explaining black-box model decisions. The framework is based on perturbations of the target input signal and applies to any differentiable pre-trained model, such as a neural network. Our experiments demonstrate the framework's adaptability to diverse data modalities, particularly images, audio, and physical simulations of urban environments.


