What's in a Question: Using Visual Questions as a Form of Supervision

04/12/2017
by Siddha Ganju, et al.

Collecting fully annotated image datasets is challenging and expensive. Many types of weak supervision have been explored: weak manual annotations, web search results, temporal continuity, ambient sound and others. We focus on one particular unexplored mode: visual questions that are asked about images. The key observation that inspires our work is that the question itself provides useful information about the image (even without the answer being available). For instance, the question "what is the breed of the dog?" informs the AI that the animal in the scene is a dog and that there is only one dog present. We make three contributions: (1) providing an extensive qualitative and quantitative analysis of the information contained in human visual questions, (2) proposing two simple but surprisingly effective modifications to the standard visual question answering models that allow them to make use of weak supervision in the form of unanswered questions associated with images and (3) demonstrating that a simple data augmentation strategy inspired by our insights results in a 7.1
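The key observation, that a question alone can reveal what is in the image, can be illustrated with a minimal sketch. This is not the authors' method: the category vocabulary `CATEGORIES`, the function name `mentioned_categories`, and the naive keyword matching are all assumptions made here for illustration only.

```python
import re

# Hypothetical category vocabulary (an assumption; the abstract names none).
CATEGORIES = {"dog", "cat", "person", "car", "pizza"}


def mentioned_categories(question: str) -> set:
    """Return the vocabulary categories named in an unanswered question."""
    tokens = re.findall(r"[a-z]+", question.lower())
    # Naive plural handling: drop a trailing "s" only if that yields a category.
    normalized = {
        t[:-1] if t.endswith("s") and t[:-1] in CATEGORIES else t
        for t in tokens
    }
    return normalized & CATEGORIES


# The question from the abstract yields a weak image-level label without any answer.
print(mentioned_categories("What is the breed of the dog?"))
```

Under these assumptions, the example question produces the label set containing only `dog`. Keyword matching of this kind cannot capture the count information the abstract mentions (that only one dog is present).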

research
08/27/2020

Visual Question Answering on Image Sets

We introduce the task of Image-Set Visual Question Answering (ISVQA), wh...
research
08/09/2017

Learning to Disambiguate by Asking Discriminative Questions

The ability to ask questions is a powerful tool to gather information in...
research
11/13/2019

Neural Duplicate Question Detection without Labeled Training Data

Supervised training of neural models to duplicate question detection in ...
research
07/01/2019

Weak Supervision Enhanced Generative Network for Question Generation

Automatic question generation according to an answer within the given pa...
research
12/08/2022

Successive Prompting for Decomposing Complex Questions

Answering complex questions that require making latent decisions is a ch...
research
04/16/2016

Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering

This paper proposes deep convolutional network models that utilize local...
research
05/15/2020

C3VQG: Category Consistent Cyclic Visual Question Generation

Visual Question Generation (VQG) is the task of generating natural quest...
