A Survey on VQA: Datasets and Approaches

05/02/2021
by Yeyun Zou, et al.

Visual question answering (VQA) is a task that combines techniques from computer vision and natural language processing. It requires models to answer a text-based question based on the information contained in an image. In recent years, the VQA research field has expanded: work examining reasoning ability and VQA on scientific diagrams has received growing attention, and more multimodal feature fusion mechanisms have been proposed. This paper reviews and analyzes existing datasets, metrics, and models proposed for the VQA task.
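To make the task concrete, a common baseline formulation fuses an image feature vector with a question embedding and treats answering as classification over a fixed answer vocabulary. The sketch below illustrates one simple fusion mechanism (element-wise product of projected features); the dimensions, weights, and function names are illustrative assumptions, not taken from the surveyed paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 2048-d image feature (e.g. from a CNN),
# 512-d question embedding, 1000 candidate answers.
IMG_DIM, Q_DIM, HID, N_ANSWERS = 2048, 512, 256, 1000

# Random projections stand in for learned parameters.
W_img = rng.standard_normal((IMG_DIM, HID)) * 0.01
W_q = rng.standard_normal((Q_DIM, HID)) * 0.01
W_out = rng.standard_normal((HID, N_ANSWERS)) * 0.01

def fuse_and_classify(img_feat, q_feat):
    """Element-wise-product fusion followed by a linear answer classifier."""
    h = np.tanh(img_feat @ W_img) * np.tanh(q_feat @ W_q)  # joint embedding
    logits = h @ W_out
    return int(np.argmax(logits))  # index of the predicted answer

img_feat = rng.standard_normal(IMG_DIM)   # stand-in image feature
q_feat = rng.standard_normal(Q_DIM)       # stand-in question embedding
answer_id = fuse_and_classify(img_feat, q_feat)
```

More elaborate fusion mechanisms (bilinear pooling, co-attention, transformer-based joint encoders) replace the element-wise product, but the overall shape of the pipeline, encode each modality, fuse, classify, is the same.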


Related research

10/05/2016 | Visual Question Answering: Datasets, Algorithms, and Future Challenges
Visual Question Answering (VQA) is a recent problem in computer vision a...

06/03/2018 | On the Flip Side: Identifying Counterexamples in Visual Question Answering
Visual question answering (VQA) models respond to open-ended natural lan...

03/01/2019 | Answer Them All! Toward Universal Visual Question Answering Models
Visual Question Answering (VQA) research is split into two camps: the fi...

01/15/2021 | Recent Advances in Video Question Answering: A Review of Datasets and Methods
Video Question Answering (VQA) is a recent emerging challenging task in ...

02/12/2020 | Component Analysis for Visual Question Answering Architectures
Recent research advances in Computer Vision and Natural Language Process...

11/29/2018 | Visual Question Answering as Reading Comprehension
Visual question answering (VQA) demands simultaneous comprehension of bo...

09/24/2021 | How to find a good image-text embedding for remote sensing visual question answering?
Visual question answering (VQA) has recently been introduced to remote s...
