KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation

01/02/2021
by Yiran Xing, et al.

We present Knowledge Enhanced Multimodal BART (KM-BART), a Transformer-based sequence-to-sequence model capable of reasoning about commonsense knowledge from multimodal inputs of images and text. We extend the popular BART architecture to a multimodal model. We design a new pretraining task to improve model performance on the Visual Commonsense Generation (VCG) task. This pretraining task improves VCG performance by leveraging knowledge from a large language model pretrained on an external knowledge graph. To the best of our knowledge, we are the first to propose a dedicated task for improving model performance on VCG. Experimental results show that, with pretraining, our model reaches state-of-the-art performance on the VCG task.

