Visual Question Answering: which investigated applications?

03/04/2021
by Silvio Barra, et al.

Visual Question Answering (VQA) is an extremely stimulating and challenging research area where Computer Vision (CV) and Natural Language Processing (NLP) have recently met. In image captioning and video summarization, the semantic information is entirely contained in still images or video dynamics, and it only has to be mined and expressed in a human-consistent way. In VQA, by contrast, the semantic information in the same media must be compared with the semantics implied by a question expressed in natural language, doubling the artificial intelligence-related effort. Some recent surveys of VQA approaches have focused on the methods underlying either the image-related or the language-related processing, or on how to consistently fuse the information each conveys. In those surveys, possible applications are only suggested; in fact, most of the cited works rely on general-purpose datasets that are used to assess the building blocks of a VQA system. This paper instead considers proposals that focus on real-world applications, possibly benchmarked on data specific to the application domain. It also reports on some recent challenges in VQA research.

research
10/05/2016

Visual Question Answering: Datasets, Algorithms, and Future Challenges

Visual Question Answering (VQA) is a recent problem in computer vision a...
research
01/15/2021

Recent Advances in Video Question Answering: A Review of Datasets and Methods

Video Question Answering (VQA) is a recent emerging challenging task in ...
research
02/12/2020

Component Analysis for Visual Question Answering Architectures

Recent research advances in Computer Vision and Natural Language Process...
research
01/31/2020

Augmenting Visual Question Answering with Semantic Frame Information in a Multitask Learning Approach

Visual Question Answering (VQA) concerns providing answers to Natural La...
research
09/06/2021

Improved RAMEN: Towards Domain Generalization for Visual Question Answering

Currently nearing human-level performance, Visual Question Answering (VQ...
research
06/08/2018

CS-VQA: Visual Question Answering with Compressively Sensed Images

Visual Question Answering (VQA) is a complex semantic task requiring bot...
research
08/09/2017

Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge

This paper presents a state-of-the-art model for visual question answeri...
