Latent Variable Models for Visual Question Answering

01/16/2021
by Zixu Wang, et al.

Conventional models for Visual Question Answering (VQA) explore deterministic approaches with various types of image features, question features, and attention mechanisms. However, modalities beyond image and question pairs can bring extra information to the models. In this work, we propose latent variable models for VQA in which extra information (e.g. captions and answer categories) is incorporated as latent variables to improve inference, which in turn benefits question-answering performance. Experiments on the VQA v2.0 benchmark dataset demonstrate the effectiveness of the proposed models: they improve over strong baselines, especially those that do not rely on extensive language-vision pre-training.
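
As a rough illustration of what "incorporated as latent variables" can mean, the following is a generic conditional latent variable formulation, not necessarily the exact objective used in the paper. The symbols are assumptions introduced here: I is the image, Q the question, A the answer, z the latent variable (for example a caption or an answer category), p_\theta the model, and r_\phi a hypothetical approximate posterior.

```latex
% Marginal likelihood of the answer, integrating out the latent variable z
p_\theta(A \mid I, Q) = \int p_\theta(A \mid I, Q, z)\, p_\theta(z \mid I, Q)\, dz

% Standard evidence lower bound (ELBO) for this conditional model,
% with r_\phi(z \mid I, Q, A) as the assumed approximate posterior
\log p_\theta(A \mid I, Q) \;\geq\;
  \mathbb{E}_{r_\phi(z \mid I, Q, A)}\big[\log p_\theta(A \mid I, Q, z)\big]
  - \mathrm{KL}\big(r_\phi(z \mid I, Q, A) \,\|\, p_\theta(z \mid I, Q)\big)
```

Under this reading, the extra information shapes z during training, while prediction at test time would draw z from the prior p_\theta(z \mid I, Q), which conditions only on the image and the question.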

09/21/2017

Visual Question Generation as Dual Task of Visual Question Answering

Recently visual question answering (VQA) and visual question generation ...
03/31/2021

Analysis on Image Set Visual Question Answering

We tackle the challenge of Visual Question Answering in multi-image sett...
01/10/2020

In Defense of Grid Features for Visual Question Answering

Popularized as 'bottom-up' attention, bounding box (or region) based vis...
01/22/2021

Visual Question Answering based on Local-Scene-Aware Referring Expression Generation

Visual question answering requires a deep understanding of both images a...
12/30/2021

VisQA: Quantifying Information Visualisation Recallability via Question Answering

Despite its importance for assessing the effectiveness of communicating ...
04/30/2021

Chop Chop BERT: Visual Question Answering by Chopping VisualBERT's Heads

Vision-and-Language (VL) pre-training has shown great potential on many ...
12/19/2019

Deep Exemplar Networks for VQA and VQG

In this paper, we consider the problem of solving semantic tasks such as...