Visual Question Answering: Datasets, Algorithms, and Future Challenges

10/05/2016
by Kushal Kafle, et al.
Visual Question Answering (VQA) is a recent problem in computer vision and natural language processing that has garnered a large amount of interest from the deep learning, computer vision, and natural language processing communities. In VQA, an algorithm needs to answer text-based questions about images. Since the release of the first VQA dataset in 2014, additional datasets have been released and many algorithms have been proposed. In this review, we critically examine the current state of VQA in terms of problem formulation, existing datasets, evaluation metrics, and algorithms. In particular, we discuss the limitations of current datasets with regard to their ability to properly train and assess VQA algorithms. We then exhaustively review existing algorithms for VQA. Finally, we discuss possible future directions for VQA and image understanding research.
