The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives

by Mohit Iyyer, et al.

Visual narrative is often a combination of explicit information and judicious omissions, relying on the viewer to supply missing details. In comics, most movements in time and space are hidden in the "gutters" between panels. To follow the story, readers logically connect panels together by inferring unseen actions through a process called "closure". While computers can now describe what is explicitly depicted in natural images, in this paper we examine whether they can understand the closure-driven narratives conveyed by stylized artwork and dialogue in comic book panels. We construct a dataset, COMICS, that consists of over 1.2 million panels (120 GB) paired with automatic textbox transcriptions. An in-depth analysis of COMICS demonstrates that neither text nor image alone can tell a comic book story, so a computer must understand both modalities to keep up with the plot. We introduce three cloze-style tasks that ask models to predict narrative and character-centric aspects of a panel given n preceding panels as context. Various deep neural architectures underperform human baselines on these tasks, suggesting that COMICS contains fundamental challenges for both vision and language.
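To make the cloze-style setup concrete, the following is a minimal sketch of how one instance could be assembled: the n panels before a held-out panel form the context, and the model must pick the true panel from a set of candidates. The abstract does not specify how candidates are chosen or how many are used, so `ClozeInstance`, `make_cloze_instance`, the distractor pool, and the default of three candidates are all hypothetical choices for illustration, not the authors' procedure. Panels are represented as plain strings here; in practice each would carry an image and its transcribed text.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class ClozeInstance:
    context: List[str]     # the n panels preceding the held-out panel
    candidates: List[str]  # the true panel mixed with distractors
    answer_index: int      # position of the true panel in candidates


def make_cloze_instance(
    panels: Sequence[str],
    distractor_pool: Sequence[str],
    target: int,
    n: int = 3,
    num_candidates: int = 3,  # assumption: not stated in the abstract
    rng: Optional[random.Random] = None,
) -> ClozeInstance:
    """Build one cloze instance from a sequence of panels (illustrative only)."""
    if target < n or target >= len(panels):
        raise ValueError("target needs n preceding panels and must be in range")

    true_panel = panels[target]
    # Exclude the true panel from the pool so the answer is unambiguous.
    pool = [p for p in distractor_pool if p != true_panel]
    if len(pool) < num_candidates - 1:
        raise ValueError("not enough distractors in the pool")

    rng = rng or random.Random()
    context = list(panels[target - n:target])
    candidates = rng.sample(pool, num_candidates - 1) + [true_panel]
    rng.shuffle(candidates)
    return ClozeInstance(context, candidates, candidates.index(true_panel))
```

A model is then scored on whether it selects `candidates[answer_index]` given `context`; the same skeleton applies whether the candidates are compared by their text, their artwork, or both.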






