
On Concept-Based Explanations in Deep Neural Networks

by Chih-Kuan Yeh et al.

Deep neural networks (DNNs) build high-level intelligence on low-level raw features. Understanding this high-level intelligence can be enabled by deciphering the concepts on which the networks base their decisions, akin to human-level thinking. In this paper, we study concept-based explainability for DNNs in a systematic framework. First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is for explaining a model's prediction behavior. Motivated by performance and variability, we propose two definitions that quantify completeness. We show that, under degenerate conditions, our method is equivalent to Principal Component Analysis. Next, we propose a concept discovery method that imposes two additional constraints to encourage the interpretability of the discovered concepts. We use game-theoretic notions to aggregate over concept subsets and define an importance score for each discovered concept, which we call ConceptSHAP. On specifically designed synthetic datasets and on real-world text and image datasets, we validate the effectiveness of our framework in finding concepts that are both complete in explaining the model's decisions and interpretable.
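The game-theoretic aggregation the abstract describes can be illustrated with a minimal sketch: treat the completeness score of a concept subset as a coalition value and compute each concept's exact Shapley value. The `completeness` callable here is a hypothetical stand-in for the paper's completeness metric, not the authors' implementation.

```python
from itertools import combinations
from math import factorial

def concept_shap(num_concepts, completeness):
    """Exact Shapley-value importance for each concept.

    `completeness` is assumed to map a frozenset of concept indices
    to a scalar score (a stand-in for the paper's completeness metric).
    Exact enumeration is only feasible for a small number of concepts.
    """
    m = num_concepts
    scores = []
    for i in range(m):
        others = [j for j in range(m) if j != i]
        s_i = 0.0
        for k in range(m):
            for subset in combinations(others, k):
                S = frozenset(subset)
                # Shapley weight for a coalition of size k out of m players
                weight = factorial(k) * factorial(m - k - 1) / factorial(m)
                # Marginal gain in completeness from adding concept i
                s_i += weight * (completeness(S | {i}) - completeness(S))
        scores.append(s_i)
    return scores

# Toy usage: with an additive completeness function, each concept's
# ConceptSHAP score recovers its additive contribution.
weights = [0.5, 0.3, 0.2]
scores = concept_shap(3, lambda S: sum(weights[j] for j in S))
```

For an additive completeness function the Shapley values equal the per-concept weights; in the paper's setting the scores instead reflect each concept's average marginal contribution to explaining the model.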




Related Research

Concept-Based Explanations for Tabular Data

The interpretability of machine learning models has been an essential ar...

Concept-based Explanations for Out-Of-Distribution Detectors

Out-of-distribution (OOD) detection plays a crucial role in ensuring the...

Explaining Deep Neural Networks using Unsupervised Clustering

We propose a novel method to explain trained deep neural networks (DNNs)...

Cause and Effect: Concept-based Explanation of Neural Networks

In many scenarios, human decisions are explained based on some high-leve...

Multi-dimensional concept discovery (MCD): A unifying framework with completeness guarantees

The completeness axiom renders the explanation of a post-hoc XAI method ...

Hierarchical Semantic Tree Concept Whitening for Interpretable Image Classification

With the popularity of deep neural networks (DNNs), model interpretabili...

Provable concept learning for interpretable predictions using variational inference

In safety critical applications, practitioners are reluctant to trust ne...

Code Repositories


PyTorch Transformer-based Language Model Implementation of ConceptSHAP
