Unsupervised Interpretable Basis Extraction for Concept-Based Visual Explanations

03/19/2023
by Alexandros Doumanoglou, et al.

An important line of research attempts to explain CNN image-classifier predictions and intermediate-layer representations in terms of human-understandable concepts. In this work, we expand on previous works that use annotated concept datasets to extract interpretable feature-space directions, and we propose an unsupervised post-hoc method that extracts a disentangling interpretable basis by searching for the rotation of the feature space that explains sparse, one-hot, thresholded transformed representations of pixel activations. We experiment with popular existing CNNs and demonstrate the effectiveness of our method in extracting an interpretable basis across network architectures and training datasets. We extend the existing basis-interpretability metrics in the literature and show that intermediate-layer representations become more interpretable when transformed to the bases extracted with our method. Finally, using these basis-interpretability metrics, we compare the bases extracted with our method to bases derived with a supervised approach and find that, in one aspect, the proposed unsupervised approach has a strength that constitutes a limitation of the supervised one, and we give potential directions for future research.
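To make the core idea concrete, the sketch below (not the authors' implementation; all names, the soft-threshold loss, and hyper-parameters are illustrative assumptions) shows one way to learn an orthogonal rotation of an intermediate layer's feature space so that each pixel's rotated, thresholded activation vector concentrates on a single basis direction:

```python
# Minimal sketch, assuming PyTorch: learn an orthogonal rotation R of the
# feature space such that rotated, thresholded pixel activations become
# sparse / close to one-hot. Loss and hyper-parameters are assumptions,
# not taken from the paper.

import torch
import torch.nn as nn

class RotationBasis(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        # Orthogonal parametrization keeps the learned matrix a rotation-like basis.
        self.rot = nn.utils.parametrizations.orthogonal(
            nn.Linear(dim, dim, bias=False)
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (num_pixels, dim) pixel activations from an intermediate CNN layer.
        return self.rot(feats)

def one_hot_sparsity_loss(rotated: torch.Tensor, tau: float = 0.0) -> torch.Tensor:
    # Illustrative objective: soft-threshold the rotated activations, normalize
    # them per pixel, and minimize their entropy so each pixel is explained by
    # (ideally) a single basis direction.
    active = torch.relu(rotated - tau)
    probs = active / (active.sum(dim=1, keepdim=True) + 1e-8)
    entropy = -(probs * (probs + 1e-8).log()).sum(dim=1)
    return entropy.mean()

# Usage sketch: feats would be flattened spatial activations of a chosen layer.
dim = 512
model = RotationBasis(dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
feats = torch.randn(1024, dim)  # placeholder for real pixel activations
for _ in range(100):
    opt.zero_grad()
    loss = one_hot_sparsity_loss(model(feats))
    loss.backward()
    opt.step()
```

After training, the rows of the learned orthogonal matrix play the role of candidate interpretable basis directions, which can then be scored with basis-interpretability metrics.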


