ELUDE: Generating interpretable explanations via a decomposition into labelled and unlabelled features

06/15/2022
by Vikram V. Ramaswamy, et al.

Deep learning models have achieved remarkable success in different areas of machine learning over the past decade; however, the size and complexity of these models make them difficult to understand. In an effort to make them more interpretable, several recent works focus on explaining parts of a deep neural network through human-interpretable, semantic attributes. However, it may be impossible to completely explain complex models using only semantic attributes. In this work, we propose to augment these attributes with a small set of uninterpretable features. Specifically, we develop a novel explanation framework ELUDE (Explanation via Labelled and Unlabelled DEcomposition) that decomposes a model's prediction into two parts: one that is explainable through a linear combination of the semantic attributes, and another that is dependent on the set of uninterpretable features. By identifying the latter, we are able to analyze the "unexplained" portion of the model, obtaining insights into the information used by the model. We show that the set of unlabelled features can generalize to multiple models trained with the same feature space and compare our work to two popular attribute-oriented methods, Interpretable Basis Decomposition and Concept Bottleneck, and discuss the additional insights ELUDE provides.
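The decomposition described above can be pictured with a small sketch: fit a sparse linear model on the labelled semantic attributes to explain as much of the prediction as possible, then summarise whatever remains with a handful of unlabelled features. The code below is only an illustration of that idea, not the authors' implementation; the variable names, the synthetic data, and the Lasso/PCA choices are all assumptions made for the example.

```python
# Hedged sketch of an ELUDE-style decomposition (illustrative only).
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Stand-in data: per-image network features, human-labelled semantic
# attributes, and the model's scalar prediction (e.g. a class logit).
n_images, n_feats, n_attrs = 1000, 512, 40
features = rng.normal(size=(n_images, n_feats))            # penultimate-layer features
attributes = rng.integers(0, 2, size=(n_images, n_attrs))  # labelled attributes
prediction = features @ rng.normal(size=n_feats)           # model output to explain

# Step 1: explain the prediction with a sparse linear combination
# of the labelled semantic attributes.
labelled_model = Lasso(alpha=0.01).fit(attributes, prediction)
explained = labelled_model.predict(attributes)
residual = prediction - explained

# Step 2: capture the unexplained part with a small set of unlabelled
# features, here a low-rank projection of the network features fit to
# the residual (one of several reasonable choices, not the paper's exact one).
n_unlabelled = 8
basis = PCA(n_components=n_unlabelled).fit(features)
unlabelled_feats = basis.transform(features)
residual_coefs = np.linalg.lstsq(unlabelled_feats, residual, rcond=None)[0]

approx = explained + unlabelled_feats @ residual_coefs
print("variance explained by attributes alone:",
      1 - residual.var() / prediction.var())
print("variance explained by attributes + unlabelled features:",
      1 - (prediction - approx).var() / prediction.var())
```

Comparing the two printed ratios mirrors the analysis in the paper: the gap between them indicates how much of the model's behaviour cannot be attributed to the labelled semantic attributes and must instead be assigned to the unlabelled features.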


Related research

12/11/2020
Dependency Decomposition and a Reject Option for Explainable Models
Deploying machine learning models in safety-related domains (e.g. auton...

05/26/2019
Why do These Match? Explaining the Behavior of Image Similarity Models
Explaining a deep learning model can help users understand its behavior ...

07/12/2021
Interpretable Mammographic Image Classification using Case-Based Reasoning and Deep Learning
When we deploy machine learning models in high-stakes medical settings, ...

06/23/2016
Identifying individual facial expressions by deconstructing a neural network
This paper focuses on the problem of explaining predictions of psycholog...

04/27/2021
Explaining in Style: Training a GAN to explain a classifier in StyleSpace
Image classification models can depend on multiple different semantic at...

01/23/2023
Feature construction using explanations of individual predictions
Feature construction can contribute to comprehensibility and performance...

07/12/2022
eX-ViT: A Novel eXplainable Vision Transformer for Weakly Supervised Semantic Segmentation
Recently, vision transformer models have become prominent models for a ra...
