Visual Question Answering using Deep Learning: A Survey and Performance Analysis

08/27/2019
by   Yash Srivastava, et al.
0

The Visual Question Answering (VQA) task combines challenges for processing data with both Visual and Linguistic processing, to answer basic `common sense' questions about given images. Given an image and a question in natural language, the VQA system tries to find the correct answer to it using visual elements of the image and inference gathered from textual questions. In this survey, we cover and discuss the recent datasets released in the VQA domain dealing with various types of question-formats and enabling robustness of the machine-learning models. Next, we discuss about new deep learning models that have shown promising results over the VQA datasets. At the end, we present and discuss some of the results computed by us over the vanilla VQA models, Stacked Attention Network and the VQA Challenge 2017 winner model. We also provide the detailed analysis along with the challenges and future research directions.

READ FULL TEXT
research
05/10/2017

Survey of Visual Question Answering: Datasets and Techniques

Visual question answering (or VQA) is a new and exciting problem that co...
research
11/19/2021

Medical Visual Question Answering: A Survey

Medical Visual Question Answering (VQA) is a combination of medical arti...
research
05/18/2023

Visual Question Answering: A Survey on Techniques and Common Trends in Recent Literature

Visual Question Answering (VQA) is an emerging area of interest for rese...
research
07/19/2023

A reinforcement learning approach for VQA validation: an application to diabetic macular edema grading

Recent advances in machine learning models have greatly increased the pe...
research
04/16/2016

Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering

This paper proposes deep convolutional network models that utilize local...
research
07/21/2023

Robust Visual Question Answering: Datasets, Methods, and Future Challenges

Visual question answering requires a system to provide an accurate natur...
research
06/23/2016

Analyzing the Behavior of Visual Question Answering Models

Recently, a number of deep-learning based models have been proposed for ...

Please sign up or login with your details

Forgot password? Click here to reset