Concept Gradient: Concept-based Interpretation Without Linear Assumption

08/31/2022
by   Andrew Bai, et al.

Concept-based interpretations of black-box models are often more intuitive for humans to understand. The most widely adopted approach for concept-based interpretation is the Concept Activation Vector (CAV). CAV relies on learning a linear relation between some latent representation of a given model and the concepts. Linear separability is usually implicitly assumed but does not hold in general. In this work, we start from the original intent of concept-based interpretation and propose Concept Gradient (CG), which extends concept-based interpretation beyond linear concept functions. We show that for a general (potentially non-linear) concept, we can mathematically evaluate how a small change in a concept affects the model's prediction, which extends gradient-based interpretation to the concept space. We demonstrate empirically that CG outperforms CAV on both toy examples and real-world datasets.
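The idea of chaining a model's input-space gradient through a (possibly non-linear) concept function can be sketched as follows. This is an illustrative toy, not the authors' exact formulation: the functions `f`, `g`, the finite-difference `jacobian` helper, and the use of a Jacobian pseudo-inverse are all assumptions made for the sake of a self-contained example.

```python
import numpy as np

def jacobian(fn, x, eps=1e-6):
    """Numerical Jacobian of fn at x via central differences (toy helper)."""
    x = np.asarray(x, dtype=float)
    y0 = np.atleast_1d(fn(x))
    J = np.zeros((y0.size, x.size))
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        J[:, i] = (np.atleast_1d(fn(x + step)) - np.atleast_1d(fn(x - step))) / (2 * eps)
    return J

def concept_gradient(f, g, x):
    """Sketch: d f / d c  ~  (J_g^+)^T grad_x f, i.e. the chain rule
    pushed through the concept function g via a pseudo-inverse."""
    grad_f = jacobian(f, x).ravel()        # (d,) gradient of the model
    J_g = jacobian(g, x)                   # (m, d) Jacobian of the concepts
    return np.linalg.pinv(J_g).T @ grad_f  # (m,) one value per concept

# Toy check: concepts c = (x0, x1), model f = 2*c0 + 3*c1,
# so the concept gradient should recover (2, 3).
f = lambda x: 2 * x[0] + 3 * x[1]
g = lambda x: np.array([x[0], x[1]])
cg = concept_gradient(f, g, np.array([0.5, -1.0, 2.0]))
print(np.round(cg, 4))  # → [2. 3.]
```

In this linear toy the pseudo-inverse chain rule recovers the exact concept sensitivities; the point of CG is that the same recipe applies when `g` is non-linear, where a single CAV-style linear probe would not.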

