Consistent Explanations by Contrastive Learning

10/01/2021
by Vipin Pillai, et al.

Understanding and explaining the decisions of neural networks is critical to building trust in them, rather than relying on them as black-box algorithms. Post-hoc evaluation techniques, such as Grad-CAM, enable humans to inspect the spatial regions responsible for a particular network decision. However, such explanations are not always consistent with human priors, such as the expectation that explanations remain consistent across image transformations. Given an interpretation algorithm, e.g., Grad-CAM, we introduce a novel training method that trains the model to produce more consistent explanations. Since obtaining the ground truth for a desired model interpretation is not a well-defined task, we adopt ideas from contrastive self-supervised learning and apply them to the interpretations of the model rather than to its embeddings. Explicitly training the network to produce more reasonable interpretations, and subsequently evaluating those interpretations, will enhance our ability to trust the network. We show that our method, Contrastive Grad-CAM Consistency (CGC), results in Grad-CAM interpretation heatmaps that are consistent with human annotations while still achieving comparable classification accuracy. Moreover, since our method can be seen as a form of regularization, it outperforms the baseline classification accuracy in limited-data fine-grained classification settings on the Caltech-Birds, Stanford Cars, VGG Flowers, and FGVC-Aircraft datasets. In addition, because our method does not rely on annotations, it allows unlabeled data to be incorporated into training, which enables better generalization of the model. Our code is publicly available.
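The abstract describes applying a contrastive objective to the model's Grad-CAM heatmaps rather than to its embeddings, so that explanations of two transformed views of the same image agree with each other. Below is a minimal PyTorch sketch of that idea, given only the abstract: it is an illustration, not the authors' released implementation, and the names grad_cam and cgc_loss, as well as the choice of horizontal flip as the image transformation, are hypothetical.

```python
# Illustrative sketch of a contrastive Grad-CAM consistency loss (assumption:
# based only on the abstract, not the authors' released code).
import torch
import torch.nn.functional as F

def grad_cam(model, feature_layer, images, class_idx):
    """Differentiable Grad-CAM heatmaps for the given class indices."""
    feats = {}
    def hook(_, __, output):
        feats["maps"] = output
    handle = feature_layer.register_forward_hook(hook)
    logits = model(images)
    handle.remove()

    score = logits.gather(1, class_idx.unsqueeze(1)).sum()
    # create_graph=True keeps the heatmap differentiable so the
    # consistency loss can be backpropagated into the model weights.
    grads = torch.autograd.grad(score, feats["maps"], create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)       # channel weights via GAP
    cam = F.relu((weights * feats["maps"]).sum(dim=1))   # (B, H, W)
    cam = cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)
    return cam

def cgc_loss(model, feature_layer, images, labels, temperature=0.1):
    """Contrastive consistency between heatmaps of two views of each image."""
    # Two views related by a known spatial transform (here: horizontal flip),
    # so the expected heatmap of view 2 is the flipped heatmap of view 1.
    view1 = images
    view2 = torch.flip(images, dims=[3])

    cam1 = grad_cam(model, feature_layer, view1, labels)
    cam2 = grad_cam(model, feature_layer, view2, labels)
    target = torch.flip(cam1, dims=[2])                   # align view-1 CAM to view 2

    z_a = F.normalize(cam2.flatten(1), dim=1)
    z_b = F.normalize(target.flatten(1), dim=1)
    logits = z_a @ z_b.t() / temperature                  # (B, B) similarity matrix
    targets = torch.arange(images.size(0), device=images.device)
    # Positive pair: the two heatmaps of the same image;
    # negatives: heatmaps of other images in the batch.
    return F.cross_entropy(logits, targets)
```

In training, such a consistency term would presumably be added to the standard classification cross-entropy loss with a weighting coefficient, which is consistent with the abstract's framing of the method as a regularizer.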


Related research

05/18/2023
BELLA: Black box model Explanations by Local Linear Approximations
In recent years, understanding the decision-making process of black-box ...

07/02/2023
CLIMAX: An exploration of Classifier-Based Contrastive Explanations
Explainable AI is an evolving area that deals with understanding the dec...

12/27/2020
Explaining NLP Models via Minimal Contrastive Editing (MiCE)
Humans give contrastive explanations that explain why an observed event ...

07/21/2020
SUBPLEX: Towards a Better Understanding of Black Box Model Explanations at the Subpopulation Level
Understanding the interpretation of machine learning (ML) models has bee...

11/17/2020
Dual-stream Multiple Instance Learning Network for Whole Slide Image Classification with Self-supervised Contrastive Learning
Whole slide images (WSIs) have large resolutions and usually lack locali...

04/29/2019
Why should you trust my interpretation? Understanding uncertainty in LIME predictions
Methods for interpreting machine learning black-box models increase the ...

09/12/2021
The Logic Traps in Evaluating Post-hoc Interpretations
Post-hoc interpretation aims to explain a trained model and reveal how t...
