Zero-shot Visual Question Answering using Knowledge Graph

07/12/2021
by Zhuo Chen, et al.

Incorporating external knowledge into Visual Question Answering (VQA) has become a vital practical need. Existing methods mostly adopt pipeline approaches with separate components for knowledge matching and extraction, feature learning, etc. However, such pipelines suffer when any component performs poorly, which leads to error propagation and poor overall performance. Furthermore, the majority of existing approaches ignore the answer bias issue: in real-world applications, many answers may never have appeared during training (i.e., unseen answers). To bridge these gaps, we propose a Zero-shot VQA algorithm that uses knowledge graphs and a mask-based learning mechanism to better incorporate external knowledge, and we present new answer-based Zero-shot VQA splits for the F-VQA dataset. Experiments show that our method achieves state-of-the-art performance in Zero-shot VQA with unseen answers, while also dramatically enhancing existing end-to-end models on the normal F-VQA task.
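
To make the idea of an answer-based zero-shot split concrete, the following is a minimal sketch of one way to partition question-answer samples so that no test answer ever occurs in training. It is an illustration only and not the authors' procedure for F-VQA: the function name `answer_disjoint_split`, the `unseen_fraction` and `seed` parameters, and the dictionary sample layout are all assumptions introduced here.

```python
import random


def answer_disjoint_split(samples, unseen_fraction=0.5, seed=0):
    """Split samples so that train and test share no answers.

    Hypothetical helper: `samples` is assumed to be a list of dicts,
    each with an "answer" key. This is not the paper's split code.
    """
    # Collect the distinct answers in a deterministic order.
    answers = sorted({s["answer"] for s in samples})

    # Shuffle reproducibly, then reserve a fraction as "unseen" answers.
    rng = random.Random(seed)
    rng.shuffle(answers)
    n_unseen = max(1, int(len(answers) * unseen_fraction))
    unseen = set(answers[:n_unseen])

    # Samples whose answer is unseen go to test; all others go to train.
    train = [s for s in samples if s["answer"] not in unseen]
    test = [s for s in samples if s["answer"] in unseen]
    return train, test


if __name__ == "__main__":
    demo = [
        {"question": "What can this animal produce?", "answer": "milk"},
        {"question": "What is this used for?", "answer": "cutting"},
        {"question": "What does this animal give?", "answer": "milk"},
        {"question": "Which object is a fruit?", "answer": "banana"},
        {"question": "What is this vehicle used for?", "answer": "transport"},
    ]
    train, test = answer_disjoint_split(demo)
    print(len(train), "training samples,", len(test), "test samples")
```

Splitting by answer rather than by sample is what distinguishes this setting from a random split: a model that only classifies over answers seen in training cannot score on the test portion, so it has to draw on external knowledge.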
