C3VQG: Category Consistent Cyclic Visual Question Generation

by Shagun Uppal, et al.

Visual Question Generation (VQG) is the task of generating natural questions from an image. Popular approaches train image-to-sequence architectures with maximum likelihood, which often yields generic questions. While generative models exploit more of the concepts in an image, they still require ground-truth questions, answers, and in some cases categories. In this paper, we exploit the diverse visual cues and concepts in an image to generate questions with a variational autoencoder (VAE), without requiring ground-truth answers. We thereby address two shortcomings of current VQG approaches: we minimize the level of supervision, and we replace generic questions with category-relevant generations. Eliminating expensive answer annotations weakens the supervision required for this task; question categories are used instead, and since inference requires only the image and a category, varying the category lets us exploit different concepts. We maximize the mutual information between the image, the question, and the question category in the latent space of our VAE. We also propose a novel category-consistent cyclic loss that encourages the model to generate predictions consistent with the conditioning question category, reducing redundancies and irregularities. Additionally, we impose supplementary constraints on the latent space of our generative model to provide category-based structure and to enhance generalization by encapsulating decorrelated features within each dimension. Finally, we compare our qualitative as well as quantitative results to the state of the art in VQG.
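The cyclic consistency idea above can be illustrated with a minimal sketch: a classifier predicts the question category back from a generated question, and the loss penalizes disagreement with the category the generator was conditioned on. The function names and the toy logits below are illustrative assumptions, not the paper's actual implementation.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def category_cycle_loss(batch_logits, category_ids):
    """Mean cross-entropy between the category predicted from each
    generated question (batch_logits) and the category the generator
    was conditioned on (category_ids)."""
    total = 0.0
    for logits, cat in zip(batch_logits, category_ids):
        probs = softmax(logits)
        total += -math.log(probs[cat] + 1e-12)
    return total / len(batch_logits)

# Toy check: logits that agree with the conditioning categories
# should incur a lower cyclic loss than logits that contradict them.
consistent = [[5.0, 0.0], [0.0, 5.0]]
inconsistent = [[0.0, 5.0], [5.0, 0.0]]
cats = [0, 1]
low = category_cycle_loss(consistent, cats)
high = category_cycle_loss(inconsistent, cats)
```

Minimizing this term pushes the generator toward questions whose content is recognizably of the requested category, which is the consistency pressure the cyclic loss provides.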


Information Maximizing Visual Question Generation

Though image-to-sequence generation models have become overwhelmingly po...

Guiding Visual Question Generation

In traditional Visual Question Generation (VQG), most images have multip...

ParaQG: A System for Generating Questions and Answers from Paragraphs

Generating syntactically and semantically valid and relevant questions f...

K-VQG: Knowledge-aware Visual Question Generation for Common-sense Acquisition

Visual Question Generation (VQG) is a task to generate questions from im...

A Probabilistic Generative Model of Free Categories

Applied category theory has recently developed libraries for computing w...

What's in a Question: Using Visual Questions as a Form of Supervision

Collecting fully annotated image datasets is challenging and expensive. ...

Learning Compositional Visual Concepts with Mutual Consistency

Compositionality of semantic concepts in image synthesis and analysis is...
