Cognitive Visual Commonsense Reasoning Using Dynamic Working Memory

07/04/2021
by Xuejiao Tang, et al.

Given a question-image input, Visual Commonsense Reasoning (VCR) requires predicting an answer together with a corresponding rationale. VCR is a recently introduced visual scene understanding task with a wide range of applications, including visual question answering, automated vehicle systems, and clinical decision support. Previous approaches to the VCR task generally rely on pre-training or on memory built from models that encode long-range dependencies. However, these approaches lack generalizability and prior knowledge. In this paper, we propose a cognitive VCR network based on dynamic working memory, which stores commonsense accumulated between sentences to provide prior knowledge for inference. Extensive experiments show that the proposed model yields significant improvements over existing methods on the benchmark VCR dataset. Moreover, the proposed model provides an intuitive interpretation of visual commonsense reasoning. A Python implementation of our mechanism is publicly available at https://github.com/tanjatang/DMVCR
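
To make the idea of a working memory that accumulates knowledge and supplies it at inference time more concrete, the following is a minimal sketch of a generic attention-based memory read and write. It is not the mechanism from the paper or its repository: the function names (`read_memory`, `write_memory`), the dot-product scoring, the blending `rate`, and the toy two-dimensional vectors are all assumptions chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read_memory(memory, query):
    """Soft read: attend over memory slots with dot-product scores.

    Hypothetical helper; returns the attention-weighted slot average
    and the attention weights themselves.
    """
    scores = [sum(q * s for q, s in zip(query, slot)) for slot in memory]
    weights = softmax(scores)
    dim = len(query)
    read = [sum(w * slot[i] for w, slot in zip(weights, memory))
            for i in range(dim)]
    return read, weights


def write_memory(memory, query, weights, rate=0.5):
    """Soft write: move each slot toward the query in proportion to its weight.

    The `rate` parameter is an assumed blending factor, not a value
    taken from the paper.
    """
    return [[s + rate * w * (q - s) for s, q in zip(slot, query)]
            for slot, w in zip(memory, weights)]


# Toy example: two memory slots and one sentence encoding as the query.
memory = [[1.0, 0.0], [0.0, 1.0]]
query = [2.0, 0.0]

read, weights = read_memory(memory, query)      # prior knowledge for inference
memory = write_memory(memory, query, weights)   # accumulate the new sentence
```

In this sketch, the slot most similar to the query receives the largest attention weight, so it both contributes most to the read vector and is updated most strongly by the write. A trained model would learn the slot contents and scoring function rather than fixing them by hand.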

research
04/17/2022

Attention Mechanism based Cognition-level Scene Understanding

Given a question-image input, the Visual Commonsense Reasoning (VCR) mod...
research
08/06/2021

Interpretable Visual Understanding with Cognitive Attention Network

While image understanding on recognition-level has achieved remarkable a...
research
05/30/2022

From Representation to Reasoning: Towards both Evidence and Commonsense Reasoning for Video Question-Answering

Video understanding has achieved great success in representation learnin...
research
03/01/2018

Yuanfudao at SemEval-2018 Task 11: Three-way Attention and Relational Knowledge for Commonsense Machine Comprehension

This paper describes our system for SemEval-2018 Task 11: Machine Compre...
research
07/27/2019

A Hybrid Neural Network Model for Commonsense Reasoning

This paper proposes a hybrid neural network (HNN) model for commonsense ...
research
01/14/2018

Top k Memory Candidates in Memory Networks for Common Sense Reasoning

Successful completion of reasoning task requires the agent to have relev...
research
11/17/2016

Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic

In this work, we explore a genre of puzzles ("image riddles") which invo...
