Automating Interpretability: Discovering and Testing Visual Concepts Learned by Neural Networks

02/07/2019
by Amirata Ghorbani, et al.

Interpretability has become an important topic of research as more machine learning (ML) models are deployed and widely used to make important decisions. For high-stakes domains such as medicine, providing intuitive explanations that can be consumed by domain experts without ML expertise is crucial. To meet this demand, concept-based methods (e.g., TCAV) were introduced to provide explanations in terms of user-chosen high-level concepts rather than individual input features. While these methods successfully leverage the rich representations learned by a network to reveal how human-defined concepts relate to its predictions, they require users to select the concepts and to collect labeled examples of them. In this work, we introduce DTCAV (Discovery TCAV), a global concept-based interpretability method that automatically discovers concepts as image segments, along with each concept's estimated importance for a deep neural network's predictions. We validate that the discovered concepts are as coherent to humans as hand-labeled concepts. We also show that the discovered concepts carry significant signal for prediction by analyzing the network's performance when concepts are stitched together, added, or deleted. DTCAV results reveal a number of undesirable correlations (e.g., a basketball player's jersey was a more important concept for predicting the basketball class than the ball itself) and highlight the potentially shallow reasoning of these networks.
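The abstract outlines the pipeline at a high level: segment images from a class, group the segments into candidate concepts using their network representations, and score each concept's importance with TCAV. The sketch below illustrates that flow under simplifying assumptions (SLIC segmentation, k-means clustering, and a difference-of-means stand-in for the concept activation vector); the `embed` function, the toy data, and all parameters are placeholders for illustration, not the authors' released implementation.

```python
# Illustrative ACE/DTCAV-style sketch; names and model details are assumptions.
import numpy as np
from skimage.segmentation import slic
from sklearn.cluster import KMeans

def embed(patches):
    """Stand-in for a bottleneck activation extractor (e.g. a CNN layer).
    Mean RGB statistics are used here only so the sketch runs without a model."""
    return np.stack([p.reshape(-1, 3).mean(axis=0) for p in patches])

def discover_concepts(images, n_segments=15, n_concepts=5):
    """Segment each image, embed the segments, and cluster them.
    Each cluster is a candidate 'concept' (a group of similar segments)."""
    patches = []
    for img in images:
        labels = slic(img, n_segments=n_segments, compactness=10, start_label=1)
        for seg_id in np.unique(labels):
            mask = labels == seg_id
            patches.append(img * mask[..., None])  # keep only this segment
    feats = embed(patches)
    km = KMeans(n_clusters=n_concepts, n_init=10, random_state=0).fit(feats)
    return feats, km.labels_

def tcav_score(concept_feats, random_feats, class_gradients):
    """TCAV-style importance: form a concept direction (difference of means is
    a cheap stand-in for a trained linear CAV), then measure how often the
    class gradient has a positive directional derivative along it."""
    cav = concept_feats.mean(0) - random_feats.mean(0)
    cav /= np.linalg.norm(cav) + 1e-12
    return float((class_gradients @ cav > 0).mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = rng.random((4, 64, 64, 3))                # toy stand-in for one class
    feats, assignments = discover_concepts(images)
    grads = rng.standard_normal((32, feats.shape[1]))  # toy class-logit gradients
    for c in np.unique(assignments):
        score = tcav_score(feats[assignments == c], feats[assignments != c], grads)
        print(f"concept {c}: TCAV score ~= {score:.2f}")
```

In the paper's setting the embeddings and gradients would come from a trained image classifier, and the concept importance would be aggregated over many images of the target class.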


