Customized Image Narrative Generation via Interactive Visual Question Generation and Answering

04/27/2018
by Andrew Shin, et al.

The image description task has invariably been examined in a static manner, with qualitative presumptions held to be universally applicable regardless of the scope or target of the description. In practice, however, different viewers may pay attention to different aspects of an image and yield different descriptions or interpretations under various contexts. Such diversity in perspectives is difficult to derive with conventional image description techniques. In this paper, we propose a customized image narrative generation task, in which users are interactively engaged in the generation process by providing answers to questions. We further attempt to learn the user's interest by repeating such interactive stages, and to automatically reflect that interest in descriptions for new images. Experimental results demonstrate that our model can generate a variety of descriptions from a single image that cover a wider range of topics than conventional models, while being customizable to the target user of the interaction.

research
09/11/2018

Unsupervised Stylish Image Description Generation via Domain Layer Norm

Most of the existing works on image description focus on generating expr...
research
05/31/2015

Visual Madlibs: Fill in the blank Image Generation and Question Answering

In this paper, we introduce a new dataset consisting of 360,001 focused ...
research
07/25/2022

Is GPT-3 all you need for Visual Question Answering in Cultural Heritage?

The use of Deep Learning and Computer Vision in the Cultural Heritage do...
research
11/07/2021

NarrationBot and InfoBot: A Hybrid System for Automated Video Description

Video accessibility is crucial for blind and low vision users for equita...
research
06/05/2023

User-friendly Image Editing with Minimal Text Input: Leveraging Captioning and Injection Techniques

Recent text-driven image editing in diffusion models has shown remarkabl...
research
09/01/2016

How Much is 131 Million Dollars? Putting Numbers in Perspective with Compositional Descriptions

How much is 131 million US dollars? To help readers put such numbers in ...
research
09/13/2021

Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation

Have you ever looked at a painting and wondered what is the story behind...
