Towards Faithful and Meaningful Interpretable Representations

08/16/2020
by Kacper Sokol, et al.

Interpretable representations are the backbone of many black-box explainers. They translate the low-level data representation necessary for good predictive performance into high-level, human-intelligible concepts used to convey the explanation. Notably, the explanation type and its cognitive complexity are directly controlled by the interpretable representation, which makes it possible to target a particular audience and use case. However, many explainers that rely on interpretable representations overlook their merit and fall back on default solutions, which may introduce implicit assumptions and thereby degrade the explanatory power of such techniques. To address this problem, we study properties of interpretable representations that encode the presence and absence of human-comprehensible concepts. We show how they are operationalised for tabular, image and text data, and discuss their strengths and weaknesses. Finally, we analyse their explanatory properties in the context of tabular data, where a linear model is used to quantify the importance of interpretable concepts.
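
To make the tabular setting more concrete, the following is a minimal sketch of a binary interpretable representation paired with a linear surrogate. It is an illustration under stated assumptions, not the authors' implementation: the quartile discretisation, the resampling of rows from the data set, the ordinary least-squares fit, the toy `black_box` function and the helper names `to_binary_representation` and `concept_importance` are all hypothetical choices made for this example.

```python
import numpy as np


def to_binary_representation(samples, instance, bin_edges):
    """Encode each feature as 1 if a sample falls into the same bin
    as the explained instance, and 0 otherwise."""
    columns = []
    for j, edges in enumerate(bin_edges):
        sample_bins = np.digitize(samples[:, j], edges)
        instance_bin = np.digitize(instance[j], edges)
        columns.append((sample_bins == instance_bin).astype(float))
    return np.column_stack(columns)


def concept_importance(black_box, instance, data, n_samples=1000, seed=0):
    """Fit a linear model on the binary representation and return the
    representation together with one weight per interpretable concept."""
    rng = np.random.default_rng(seed)
    # Assumption: quartile edges define the bins of every feature.
    bin_edges = [np.quantile(data[:, j], [0.25, 0.5, 0.75])
                 for j in range(data.shape[1])]
    # Assumption: the sample is drawn by resampling rows of the data set.
    samples = data[rng.integers(0, len(data), size=n_samples)]
    binary = to_binary_representation(samples, instance, bin_edges)
    targets = black_box(samples)
    # Ordinary least squares with an intercept column.
    design = np.column_stack([np.ones(n_samples), binary])
    coefficients, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return binary, coefficients[1:]


# Toy black box (hypothetical): the prediction depends only on feature 0.
def black_box(X):
    return (X[:, 0] > 0).astype(float)


data = np.random.default_rng(42).normal(size=(500, 2))
instance = np.array([1.0, 0.0])
binary, weights = concept_importance(black_box, instance, data)
print(weights)
```

In this toy setup each weight estimates how much the black-box output changes when a sample shares the explained instance's bin for that feature, so the concept built on feature 0 should receive a much larger weight than the one built on feature 1. Different discretisation or sampling choices would change these weights, which is the kind of implicit assumption the abstract warns about.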


