DeepAI AI Chat
Log In Sign Up

Multimodal Explanations: Justifying Decisions and Pointing to the Evidence

02/15/2018
by   Dong Huk Park, et al.
1

Deep models that are both effective and explainable are desirable in many settings; prior explainable models have been unimodal, offering either image-based visualization of attention weights or text-based generation of post-hoc justifications. We propose a multimodal approach to explanation, and argue that the two modalities provide complementary explanatory strengths. We collect two new datasets to define and evaluate this task, and propose a novel model which can provide joint textual rationale generation and attention visualization. Our datasets define visual and textual justifications of a classification decision for activity recognition tasks (ACT-X) and for visual question answering tasks (VQA-X). We quantitatively show that training with the textual explanations not only yields better textual justification models, but also better localizes the evidence that supports the decision. We also qualitatively show cases where visual explanation is more insightful than textual explanation, and vice versa, supporting our thesis that multimodal explanation models offer significant benefits over unimodal approaches.

READ FULL TEXT

page 1

page 4

page 7

page 8

page 9

11/17/2017

Attentive Explanations: Justifying Decisions and Pointing to the Evidence (Extended Abstract)

Deep models are the defacto standard in visual decision problems due to ...
04/29/2021

A First Look: Towards Explainable TextVQA Models via Visual and Textual Explanations

Explainable deep learning models are advantageous in many situations. Pr...
12/14/2016

Attentive Explanations: Justifying Decisions and Pointing to the Evidence

Deep models are the defacto standard in visual decision models due to th...
12/18/2017

Visual Explanations from Hadamard Product in Multimodal Deep Networks

The visual explanation of learned representation of models helps to unde...
08/07/2019

Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks

To verify and validate networks, it is essential to gain insight into th...
12/11/2022

Multimodal and Explainable Internet Meme Classification

Warning: this paper contains content that may be offensive or upsetting....
08/25/2021

Inducing Semantic Grouping of Latent Concepts for Explanations: An Ante-Hoc Approach

Self-explainable deep models are devised to represent the hidden concept...