DeepAI
Log In Sign Up

KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation

01/02/2021
by   Yiran Xing, et al.
0

We present Knowledge Enhanced Multimodal BART (KM-BART), which is a Transformer-based sequence-to-sequence model capable of reasoning about commonsense knowledge from multimodal inputs of images and texts. We extend the popular BART architecture to a multi-modal model. We design a new pretraining task to improve the model performance on Visual Commonsense Generation task. Our pretraining task improves the Visual Commonsense Generation performance by leveraging knowledge from a large language model pretrained on an external knowledge graph. To the best of our knowledge, we are the first to propose a dedicated task for improving model performance on Visual Commonsense Generation. Experimental results show that by pretraining, our model reaches state-of-the-art performance on the Visual Commonsense Generation task.

READ FULL TEXT
12/16/2021

Commonsense Knowledge-Augmented Pretrained Language Models for Causal Reasoning Classification

Commonsense knowledge can be leveraged for identifying causal relations ...
06/04/2021

MERLOT: Multimodal Neural Script Knowledge Models

As humans, we understand events in the visual world contextually, perfor...
12/01/2020

An Enhanced Knowledge Injection Model for Commonsense Generation

Commonsense generation aims at generating plausible everyday scenario de...
08/08/2021

Leveraging Commonsense Knowledge on Classifying False News and Determining Checkworthiness of Claims

Widespread and rapid dissemination of false news has made fact-checking ...
10/10/2020

Beyond Language: Learning Commonsense from Images for Reasoning

This paper proposes a novel approach to learn commonsense from images, i...
10/12/2020

Social Commonsense Reasoning with Multi-Head Knowledge Attention

Social Commonsense Reasoning requires understanding of text, knowledge a...
10/10/2022

Do Children Texts Hold The Key To Commonsense Knowledge?

Compiling comprehensive repositories of commonsense knowledge is a long-...