Visual Storytelling

04/13/2016
by Ting-Hao (Kenneth) Huang, et al.

We introduce the first dataset for sequential vision-to-language and explore how this data may be used for the task of visual storytelling. The first release of this dataset, SIND v.1, includes 81,743 unique photos in 20,211 sequences, aligned to both descriptive (caption) and story language. We establish several strong baselines for the storytelling task and motivate an automatic metric to benchmark progress. Modelling concrete description as well as figurative and social language, as provided in this dataset and the storytelling task, has the potential to move artificial intelligence from a basic understanding of typical visual scenes towards a more human-like understanding of grounded event structure and subjective expression.
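The abstract contrasts descriptive (caption) language with story language, aligned photo by photo within a sequence. As a rough illustration of what one such aligned record might look like in memory, here is a minimal Python sketch; the class and field names (StorySequence, photo_id, caption, story_text, and so on) are illustrative assumptions, not the schema of the released SIND files.

    # A minimal sketch of one SIND-style record. All names here are
    # illustrative assumptions, not the dataset's actual schema.
    from dataclasses import dataclass
    from typing import List

    @dataclass
    class StorySentence:
        photo_id: str    # unique photo identifier
        order: int       # position of the photo within the sequence
        caption: str     # literal, descriptive caption language
        story_text: str  # narrative story-in-sequence sentence

    @dataclass
    class StorySequence:
        story_id: str
        sentences: List[StorySentence]

        def story(self) -> str:
            """Join the story sentences in photo order."""
            ordered = sorted(self.sentences, key=lambda s: s.order)
            return " ".join(s.story_text for s in ordered)

    # Usage: a toy two-photo sequence contrasting the two language styles.
    seq = StorySequence(
        story_id="0001",
        sentences=[
            StorySentence("p1", 0, "A man sits at a table.",
                          "John had been waiting all morning."),
            StorySentence("p2", 1, "A woman walks through a door.",
                          "Finally, Mary arrived."),
        ],
    )
    print(seq.story())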

Related Research

06/05/2019 | Visual Story Post-Editing
We introduce the first dataset for human edits of machine-generated visu...

08/09/2017 | Hierarchically-Attentive RNN for Album Summarization and Storytelling
We address the problem of end-to-end visual storytelling. Given a photo ...

04/19/2019 | Challenges and Prospects in Vision and Language Research
Language grounded image understanding tasks have often been proposed as ...

04/29/2022 | EndoMapper dataset of complete calibrated endoscopy procedures
Computer-assisted systems are becoming broadly used in medicine. In endo...

09/26/2019 | A Hierarchical Approach for Visual Storytelling Using Image Description
One of the primary challenges of visual storytelling is developing techn...

05/07/2020 | DramaQA: Character-Centered Video Story Understanding with Hierarchical QA
Despite recent progress on computer vision and natural language processi...

05/30/2018 | Visual Referring Expression Recognition: What Do Systems Actually Learn?
We present an empirical analysis of the state-of-the-art systems for ref...
