Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books

06/22/2015
by Yukun Zhu, et al.

Books are a rich source of both fine-grained information (what a character, an object, or a scene looks like) and high-level semantics (what someone is thinking and feeling, and how these states evolve through a story). This paper aims to align books with their movie releases in order to provide rich descriptive explanations of visual content that go semantically far beyond the captions available in current datasets. To align movies and books, we exploit a neural sentence embedding trained in an unsupervised way on a large corpus of books, as well as a video-text neural embedding for computing similarities between movie clips and sentences in the book. We propose a context-aware CNN to combine information from multiple sources. We demonstrate good quantitative performance for movie/book alignment and show several qualitative examples that showcase the diversity of tasks our model can be used for.
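As a rough illustration of the similarity step mentioned in the abstract, the sketch below scores every movie clip against every book sentence by cosine similarity in a shared embedding space and picks the best-scoring sentence for each clip. It is a minimal sketch under stated assumptions, not the paper's method: the vectors are hand-made toy values, the function names (`cosine`, `similarity_matrix`, `best_matches`) are hypothetical, and the context-aware CNN that combines several similarity sources is not modeled at all.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(clip_vecs, sentence_vecs):
    """Rows index clips, columns index sentences."""
    return [[cosine(c, s) for s in sentence_vecs] for c in clip_vecs]


def best_matches(sim):
    """For each clip (row), return the index of the highest-scoring sentence."""
    return [max(range(len(row)), key=row.__getitem__) for row in sim]


# Toy embeddings: assumed values for illustration only, not taken from the paper.
clip_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
sentence_vecs = [[0.0, 2.0, 2.0], [0.5, 0.0, 0.0], [0.0, 0.0, 1.0]]

sim = similarity_matrix(clip_vecs, sentence_vecs)
matches = best_matches(sim)
print(matches)
```

Picking each clip's best sentence independently ignores the fact that neighboring clips tend to match neighboring sentences; the abstract's context-aware model is presumably meant to address that kind of limitation.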


research
05/08/2020

Condensed Movies: Story Based Retrieval with Contextual Embeddings

Our objective in this work is the long range understanding of the narrat...
research
08/16/2017

mAnI: Movie Amalgamation using Neural Imitation

Cross-modal data retrieval has been the basis of various creative tasks ...
research
05/17/2023

Personality Understanding of Fictional Characters during Book Reading

Comprehending characters' personalities is a crucial aspect of story rea...
research
03/11/2022

Synopses of Movie Narratives: a Video-Language Dataset for Story Understanding

Despite recent advances of AI, story understanding remains an open and u...
research
05/19/2020

A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks

Many high-level procedural tasks can be decomposed into sequences of ins...
research
12/02/2022

Sonus Texere! Automated Dense Soundtrack Construction for Books using Movie Adaptations

Reading, much like music listening, is an immersive experience that tran...
research
11/16/2016

The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives

Visual narrative is often a combination of explicit information and judi...
