Zero-Shot Visual Question Answering

11/17/2016
by Damien Teney, et al.

Part of the appeal of Visual Question Answering (VQA) is its promise to answer new questions about previously unseen images. Most current methods demand training questions that illustrate every possible concept, and will therefore never achieve this capability, since the volume of required training data would be prohibitive. Answering general questions about images requires methods capable of Zero-Shot VQA, that is, methods able to answer questions beyond the scope of the training questions. We propose a new evaluation protocol for VQA methods which measures their ability to perform Zero-Shot VQA, and in doing so highlights significant practical deficiencies of current approaches, some of which are masked by the biases in current datasets. We propose and evaluate several strategies for achieving Zero-Shot VQA, including methods based on pretrained word embeddings, object classifiers with semantic embeddings, and test-time retrieval of example images. Our extensive experiments are intended to serve as baselines for Zero-Shot VQA, and they also achieve state-of-the-art performance in the standard VQA evaluation setting.

