Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

10/07/2016
by   Ramprasaath R. Selvaraju, et al.
We propose a technique for producing "visual explanations" for decisions from a large class of CNN-based models, making them more transparent. Our approach, Gradient-weighted Class Activation Mapping (Grad-CAM), uses the gradients of any target concept flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept. Unlike previous approaches, Grad-CAM is applicable to a wide variety of CNN model families without any architectural changes or re-training: (1) CNNs with fully-connected layers (e.g. VGG), (2) CNNs used for structured outputs (e.g. captioning), and (3) CNNs used in tasks with multimodal inputs (e.g. VQA) or reinforcement learning. We combine Grad-CAM with fine-grained visualizations to create a high-resolution, class-discriminative visualization and apply it to off-the-shelf image classification, captioning, and visual question answering (VQA) models, including ResNet-based architectures. In the context of image classification models, our visualizations (a) lend insight into their failure modes (showing that seemingly unreasonable predictions have reasonable explanations), (b) are robust to adversarial images, (c) outperform previous methods on weakly-supervised localization, (d) are more faithful to the underlying model, and (e) help achieve model generalization by identifying dataset bias. For captioning and VQA, our visualizations show that even non-attention-based models can localize inputs. Finally, we conduct human studies to measure whether Grad-CAM explanations help users establish trust in predictions from deep networks, and show that Grad-CAM helps untrained users successfully discern a "stronger" deep network from a "weaker" one. Our code is available at https://github.com/ramprs/grad-cam. A demo and a video of the demo can be found at http://gradcam.cloudcv.org and youtu.be/COjUB9Izk6E.
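The core computation described above, global-average-pooling the gradients to weight the feature maps, then applying a ReLU, can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' released code: it assumes the activations and gradients of the final convolutional layer have already been extracted (e.g. via backpropagation hooks in a deep learning framework).

```python
import numpy as np

def grad_cam(activations, gradients):
    """Core Grad-CAM computation (illustrative sketch).

    activations: (K, H, W) feature maps from the final conv layer.
    gradients:   (K, H, W) gradients of the target class score with
                 respect to those feature maps (from backprop).
    Returns a (H, W) coarse localization map.
    """
    # Per-channel importance weights: global-average-pool the gradients.
    alpha = gradients.mean(axis=(1, 2))                      # shape (K,)
    # Weighted combination of feature maps, then ReLU so that only
    # features with a positive influence on the target class remain.
    cam = np.maximum((alpha[:, None, None] * activations).sum(axis=0), 0)
    return cam
```

In practice the resulting map is upsampled to the input-image resolution and overlaid as a heatmap; because it only needs activations and gradients, the same function applies unchanged to classification, captioning, or VQA backbones.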
