Explaining Deep Neural Networks

10/04/2020
by Oana-Maria Camburu, et al.

Deep neural networks are becoming more and more popular due to their revolutionary success in diverse areas, such as computer vision, natural language processing, and speech recognition. However, the decision-making processes of these models are generally not interpretable to users. In various domains, such as healthcare, finance, or law, it is critical to know the reasons behind a decision made by an artificial intelligence system. Therefore, several directions for explaining neural models have recently been explored. In this thesis, I investigate two major directions for explaining deep neural networks. The first direction consists of feature-based post-hoc explanatory methods, that is, methods that aim to explain an already trained and fixed model (post-hoc), and that provide explanations in terms of input features, such as tokens for text and superpixels for images (feature-based). The second direction consists of self-explanatory neural models that generate natural language explanations, that is, models that have a built-in module that generates explanations for the predictions of the model.
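
To make the notion of a feature-based post-hoc explanation concrete, here is a minimal sketch in Python. It assumes a leave-one-out (token omission) scheme, which is only one simple member of this family and is not claimed to be the method studied in the thesis. The names `toy_model`, `TOY_WEIGHTS`, and `leave_one_out` are hypothetical, and the "model" is a hand-weighted bag-of-words scorer standing in for an already trained, fixed network.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an already trained and fixed model:
# a bag-of-words scorer with hand-picked weights (illustration only).
TOY_WEIGHTS: Dict[str, float] = {
    "great": 2.0,
    "good": 1.0,
    "boring": -1.5,
    "not": -0.5,
}


def toy_model(tokens: List[str]) -> float:
    """Return a sentiment score; higher means more positive."""
    return sum(TOY_WEIGHTS.get(tok.lower(), 0.0) for tok in tokens)


def leave_one_out(
    model: Callable[[List[str]], float], tokens: List[str]
) -> List[Tuple[str, float]]:
    """Attribute to each token the change in score caused by removing it.

    The model is treated as a black box and is never modified (post-hoc);
    the explanation is expressed over input tokens (feature-based).
    """
    base = model(tokens)
    attributions = []
    for i, tok in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]  # input with token i omitted
        attributions.append((tok, base - model(reduced)))
    return attributions


if __name__ == "__main__":
    sentence = "The plot was great but boring".split()
    for tok, score in leave_one_out(toy_model, sentence):
        print(f"{tok:>8s} {score:+.2f}")
```

Because the toy scorer is linear, each attribution simply recovers the token's weight. For a real neural model, omitting a token can interact with the remaining tokens, so attributions of this kind are at best an approximation of the model's behaviour.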

05/08/2019
Minimalistic Explanations: Capturing the Essence of Decisions
The use of complex machine learning models can make systems opaque to us...

09/23/2020
The Struggles of Feature-Based Explanations: Shapley Values vs. Minimal Sufficient Subsets
For neural models to garner widespread public trust and ensure fairness,...

12/14/2020
Learning to Rationalize for Nonmonotonic Reasoning with Distant Supervision
The black-box nature of neural models has motivated a line of research t...

10/25/2018
Tackling Sequence to Sequence Mapping Problems with Neural Networks
In Natural Language Processing (NLP), it is important to detect the rela...

05/29/2023
Faithfulness Tests for Natural Language Explanations
Explanations of neural models aim to reveal a model's decision-making pr...

05/30/2023
Explaining Hate Speech Classification with Model Agnostic Methods
There have been remarkable breakthroughs in Machine Learning and Artific...

08/13/2018
Learning Explanations from Language Data
PatternAttribution is a recent method, introduced in the vision domain, ...