Unsupervised Keyword Extraction for Full-sentence VQA

11/23/2019
by Kohei Uehara, et al.

In existing studies on Visual Question Answering (VQA), which aims to train an intelligent system to answer questions about images, the answers to the questions consist of short, often single-word, phrases. However, in natural conversation with humans, the answers are more likely to be full sentences rather than single words. In such a situation, the system needs to focus on a keyword, i.e., the most important word in the sentence, to answer the question. Therefore, we propose a novel keyword extraction method for VQA. Because collecting keyword and full-sentence annotations for VQA can be highly costly, we perform keyword extraction in an unsupervised manner. Our key insight is that a full-sentence answer can be decomposed into two parts: the part that contains new information for the question, and the part that contains only information already included in the question. Since the keyword is the part of the answer that carries new information, we need to identify which words in the full-sentence answer convey new information and which do not. To achieve this decomposition, we extract two features from the full-sentence answers and design discriminative decoders that make each feature encode the information of the question and the answer, respectively. We conducted experiments on existing VQA datasets that contain full-sentence annotations, and show that our proposed model correctly extracts keywords without any keyword annotations.
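The decomposition idea can be illustrated with a purely lexical sketch (this is not the paper's learned model, which uses feature extractors and discriminative decoders; the stop-word list and function names below are our own simplifying assumptions): answer words that already appear in the question carry no new information, so the remaining content words are keyword candidates.

```python
import re

# Hypothetical, minimal stop-word list for the illustration only.
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "were", "of",
              "in", "on", "it", "to", "and", "by", "with", "for"}

def tokenize(text):
    """Lowercase the text and split on non-alphabetic characters."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def extract_keywords(question, answer):
    """Return answer words that are neither stop words nor already
    present in the question, i.e., the 'new information' part."""
    known = set(tokenize(question)) | STOP_WORDS
    return [w for w in tokenize(answer) if w not in known]

print(extract_keywords("What is the color of the cat?",
                       "The color of the cat is gray."))
# -> ['gray']
```

This lexical-overlap baseline fails when the answer paraphrases the question, which is precisely why the paper learns the split from features rather than relying on exact word matches.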

