NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks

03/09/2022
by Fawaz Sammani, et al.
Vrije Universiteit Brussel

Natural language explanation (NLE) models aim to explain the decision-making process of a black-box system by generating natural language sentences that are human-friendly, high-level and fine-grained. Current NLE models explain the decision-making process of a vision or vision-language model (the task model), e.g., a VQA model, via a language model (the explanation model), e.g., GPT. Besides the additional memory and inference time that the task model requires, the task and explanation models are completely independent, which dissociates the explanation from the reasoning process used to predict the answer. We introduce NLX-GPT, a general, compact and faithful language model that can simultaneously predict an answer and explain it. We first pre-train on large-scale image-caption data for general understanding of images, and then formulate the answer as a text prediction task along with the explanation. Without region proposals or a task model, the resulting framework attains better evaluation scores, contains far fewer parameters and is 15× faster than the current state-of-the-art model. We then address the problem of evaluating explanations, which are often generic and data-biased and can come in several forms. We therefore design two new evaluation measures: (1) explain-predict and (2) retrieval-based attack, a self-evaluation framework that requires no labels. Code is at: https://github.com/fawazsammani/nlxgpt.
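
To make "formulate the answer as a text prediction task along with the explanation" concrete, here is a minimal sketch of how an answer and its explanation could be packed into one target string for a language model and then split apart again after generation. The template wording ("the answer is ... because ...") and the helper names `build_target` and `split_prediction` are assumptions chosen for illustration; the abstract does not specify them, and the sketch does not involve the model itself.

```python
from typing import Tuple

# Hypothetical template: the abstract only states that the answer and the
# explanation are predicted together as text. This wording is an assumption.
ANSWER_PREFIX = "the answer is "
SEPARATOR = " because "


def build_target(answer: str, explanation: str) -> str:
    """Join an answer and its explanation into a single text sequence."""
    return f"{ANSWER_PREFIX}{answer.strip()}{SEPARATOR}{explanation.strip()}"


def split_prediction(text: str) -> Tuple[str, str]:
    """Recover (answer, explanation) from generated text.

    The explanation is an empty string when the separator is missing.
    """
    body = text.strip()
    if body.startswith(ANSWER_PREFIX):
        body = body[len(ANSWER_PREFIX):]
    # Split on the first occurrence of the separator only.
    answer, _, explanation = body.partition(SEPARATOR)
    return answer.strip(), explanation.strip()


if __name__ == "__main__":
    target = build_target("yes", "the man is holding a racket")
    print(target)
    print(split_prediction(target))
```

Because both parts come out of a single generated sequence, no separate task model is needed to produce the answer, which is the property the abstract emphasizes.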

Code Repositories

nlxgpt

NLX-GPT: A Model for Natural Language Explanations in Vision and Vision-Language Tasks, CVPR 2022

