Explaining Deep Neural Networks

10/04/2020
by   Oana-Maria Camburu, et al.
0

Deep neural networks are becoming more and more popular due to their revolutionary success in diverse areas, such as computer vision, natural language processing, and speech recognition. However, the decision-making processes of these models are generally not interpretable to users. In various domains, such as healthcare, finance, or law, it is critical to know the reasons behind a decision made by an artificial intelligence system. Therefore, several directions for explaining neural models have recently been explored. In this thesis, I investigate two major directions for explaining deep neural networks. The first direction consists of feature-based post-hoc explanatory methods, that is, methods that aim to explain an already trained and fixed model (post-hoc), and that provide explanations in terms of input features, such as tokens for text and superpixels for images (feature-based). The second direction consists of self-explanatory neural models that generate natural language explanations, that is, models that have a built-in module that generates explanations for the predictions of the model.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/08/2019

Minimalistic Explanations: Capturing the Essence of Decisions

The use of complex machine learning models can make systems opaque to us...
research
09/23/2020

The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets

For neural models to garner widespread public trust and ensure fairness,...
research
12/14/2020

Learning to Rationalize for Nonmonotonic Reasoning with Distant Supervision

The black-box nature of neural models has motivated a line of research t...
research
10/25/2018

Tackling Sequence to Sequence Mapping Problems with Neural Networks

In Natural Language Processing (NLP), it is important to detect the rela...
research
05/29/2023

Faithfulness Tests for Natural Language Explanations

Explanations of neural models aim to reveal a model's decision-making pr...
research
08/13/2018

Learning Explanations from Language Data

PatternAttribution is a recent method, introduced in the vision domain, ...
research
03/27/2021

Explaining the Road Not Taken

It is unclear if existing interpretations of deep neural network models ...

Please sign up or login with your details

Forgot password? Click here to reset