
Improving Interpretability of CNN Models Using Non-Negative Concept Activation Vectors

by Ruihan Zhang, et al.
The University of Melbourne

Convolutional neural network (CNN) models for computer vision are powerful but lack explainability in their most basic form. This deficiency remains a key challenge when applying CNNs in important domains. Recent work on explanations through the feature importance of approximate linear models has moved from input-level features (pixels or segments) to features drawn from mid-layer feature maps, in the form of concept activation vectors (CAVs). CAVs carry concept-level information and can be learnt via clustering. In this work, we rethink the ACE algorithm of Ghorbani et al. and propose an alternative concept-based explanation framework. Based on the requirements of fidelity (the quality of the approximate model) and interpretability (being meaningful to people), we design measurements and evaluate a range of dimensionality reduction methods for alignment with our framework. We find that non-negative concept activation vectors obtained via non-negative matrix factorization provide superior interpretability and fidelity in both computational and human-subject experiments. Our framework provides both local and global concept-level explanations for pre-trained CNN models.
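The core idea can be sketched in a few lines: flatten mid-layer feature maps into a non-negative matrix and factorize it, so each factor row acts as a non-negative concept activation vector. This is an illustrative sketch only, using random stand-in data and scikit-learn's NMF rather than the authors' implementation; the layer shape and number of concepts are assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF

# Stand-in for mid-layer activations of a pre-trained CNN:
# a batch of 8 images with 7x7 spatial maps and 512 channels
# (random non-negative data, mimicking post-ReLU activations).
rng = np.random.default_rng(0)
activations = rng.random((8, 7, 7, 512)).astype(np.float32)

# Flatten every spatial position of every image into one row:
# V has shape (n_positions, n_channels) and is non-negative.
V = activations.reshape(-1, 512)

# Factorize V ~ S @ C with a small number of concepts.
# Rows of C are the non-negative concept activation vectors;
# S scores how strongly each position expresses each concept.
n_concepts = 10
nmf = NMF(n_components=n_concepts, init="nndsvda",
          max_iter=400, random_state=0)
S = nmf.fit_transform(V)   # (n_positions, n_concepts)
C = nmf.components_        # (n_concepts, n_channels)

# Fidelity check: how well the concepts reconstruct the activations.
relative_error = np.linalg.norm(V - S @ C) / np.linalg.norm(V)
print(f"concept matrix: {C.shape}, relative error: {relative_error:.3f}")
```

Because both factors are constrained to be non-negative, each concept contributes additively, which is what makes the resulting vectors easier for people to interpret than signed components from, e.g., PCA.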



Related Papers

Concept-based Explanations using Non-negative Concept Activation Vectors and Decision Tree for CNN Models

This paper evaluates whether training a decision tree based on concepts ...

Robust Semantic Interpretability: Revisiting Concept Activation Vectors

Interpretability methods for image classification assess model trustwort...

Detecting Memorization in ReLU Networks

We propose a new notion of `non-linearity' of a network layer with respe...

PatClArC: Using Pattern Concept Activation Vectors for Noise-Robust Model Debugging

State-of-the-art machine learning models are commonly (pre-)trained on l...

CRAFT: Concept Recursive Activation FacTorization for Explainability

Attribution methods are a popular class of explainability methods that u...

Respond-CAM: Analyzing Deep Models for 3D Imaging Data by Visualizations

The convolutional neural network (CNN) has become a powerful tool for va...

Code Repositories


Invertible Concept-based Explanation (ICE)
