Movie101: A New Movie Understanding Benchmark

05/20/2023
by Zihao Yue, et al.

To help the visually impaired enjoy movies, automatic movie narrating systems are expected to narrate accurate, coherent, and role-aware plots during stretches where the actors have no speaking lines. Existing works benchmark this challenge as an ordinary video captioning task by making simplifications, such as removing role names and evaluating narrations with n-gram-based metrics, which makes it difficult for automatic systems to meet the needs of real application scenarios. To narrow this gap, we construct a large-scale Chinese movie benchmark named Movie101. Closer to real scenarios, the Movie Clip Narrating (MCN) task in our benchmark asks models to generate role-aware narration paragraphs for complete movie clips in which no actors are speaking. External knowledge, such as role information and movie genres, is also provided for better movie understanding. In addition, we propose a new metric, the Movie Narration Score (MNScore), for evaluating movie narration; it achieves the best correlation with human evaluation. Our benchmark also supports the Temporal Narration Grounding (TNG) task, which investigates clip localization given text descriptions. For both tasks, our proposed methods effectively leverage external knowledge and outperform carefully designed baselines. The dataset and code are released at https://github.com/yuezih/Movie101.
