Unsupervised discovery of Interpretable Visual Concepts

Providing non-experts with interpretability of deep-learning models, while fundamental for responsible real-world usage, is challenging. Attribution maps from xAI techniques, such as Integrated Gradients, are a typical example of a visualization that conveys a high level of information but is difficult to interpret. In this paper, we propose two methods, Maximum Activation Groups Extraction (MAGE) and Multiscale Interpretable Visualization (Ms-IV), to explain the model's decision and enhance global interpretability. MAGE finds, for a given CNN, combinations of features that, globally, carry semantic meaning, which we call concepts. We group these similar feature patterns by clustering into concepts, which we visualize through Ms-IV. This latter method is inspired by Occlusion and Sensitivity analysis (incorporating causality) and uses a novel metric, called Class-aware Order Correlation (CaOC), to globally evaluate the most important image regions according to the model's decision space. We compare our approach to xAI methods such as LIME and Integrated Gradients. Experimental results show that Ms-IV achieves higher localization and faithfulness values. Finally, a qualitative evaluation of MAGE and Ms-IV combined demonstrates that, based on the visualizations, humans can agree on the concepts represented by the clusters and can detect, among a given set of networks, the existence of bias.
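The sketch below is not the paper's algorithm; it only illustrates, under assumed interfaces, the two ideas the abstract describes: grouping similar feature-activation patterns into concepts by clustering (the MAGE idea), and an occlusion-style score that rates an image region as important when masking it reorders the model's outputs over a set of images (a rank-based score in the spirit of Ms-IV and CaOC). The functions `group_features_into_concepts`, `region_importance_by_rank_shift`, and the `score_fn` interface are hypothetical names introduced here for illustration.

```python
# Hedged sketch, not the published method: clustering channel signatures into
# concept groups, and occlusion-based region importance measured by how much
# masking a region perturbs the ranking of the model's scores.
import numpy as np
from scipy.stats import kendalltau
from sklearn.cluster import KMeans


def group_features_into_concepts(activations, n_concepts=5):
    """Cluster per-channel activation signatures into concept groups.

    activations: array of shape (n_images, n_channels), e.g. the maximum
    activation of each channel per image (assumed interface).
    Returns an array mapping each channel to a concept id.
    """
    # Each channel is described by how it responds across the image set.
    channel_signatures = activations.T  # (n_channels, n_images)
    return KMeans(n_clusters=n_concepts, n_init=10).fit_predict(channel_signatures)


def region_importance_by_rank_shift(score_fn, images, patch=8):
    """Occlusion-style importance: a region matters if hiding it changes the
    ordering of the model's scores over the image set, not just their values.

    score_fn: callable mapping a batch (n, H, W) -> per-image class scores (n,).
    Returns a coarse (H // patch, W // patch) importance map.
    """
    n, H, W = images.shape
    base_ranks = np.argsort(np.argsort(score_fn(images)))
    importance = np.zeros((H // patch, W // patch))
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            occluded = images.copy()
            occluded[:, i:i + patch, j:j + patch] = 0.0  # mask the region
            new_ranks = np.argsort(np.argsort(score_fn(occluded)))
            tau, _ = kendalltau(base_ranks, new_ranks)
            importance[i // patch, j // patch] = 1.0 - tau  # larger = more reordering
    return importance
```

In practice, `score_fn` could wrap a CNN's logit for a chosen class; the resulting coarse map then highlights regions whose occlusion most disturbs the model's decision ordering across the image set.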
