Effectively leveraging Multi-modal Features for Movie Genre Classification

03/24/2022
by Zhongping Zhang, et al.

Movie genre classification has been widely studied in recent years due to its applications in video editing, summarization, and recommendation. Prior work has typically addressed this task by predicting genres based solely on visual content. As a result, these methods often perform poorly on genres such as documentary or musical, where non-visual modalities like audio or language play an important role in correct classification. In addition, analyzing long videos at the frame level incurs high computational cost, which makes prediction less efficient. To address these two issues, we propose MMShot, a multi-modal approach that leverages shot information to classify video genres efficiently and effectively. We evaluate our method on MovieNet and Condensed Movies for genre classification, achieving a gain of 17 in mean Average Precision (mAP) over the state of the art. Extensive experiments demonstrate the ability of MMShot to analyze long videos and uncover the correlations between genres and multiple movie elements. We also demonstrate our approach's ability to generalize by evaluating it on the scene boundary detection task, achieving a gain of 1.1 over the state of the art.
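The abstract reports results in mAP for multi-label genre classification. As a minimal sketch of how such a metric is commonly computed, the Python below implements a macro-averaged mAP over genres. It is a generic definition and not necessarily the exact evaluation protocol used in the paper; the function names `average_precision` and `mean_average_precision` and the toy inputs are illustrative assumptions.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AP for one genre: mean of precision@k over the ranks k that hold a positive."""
    # Rank movies by predicted score, highest first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    # A genre with no positives has no defined AP; return 0.0 by convention here.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    score_matrix: List[List[float]], label_matrix: List[List[int]]
) -> float:
    """Macro mAP: average the per-genre AP over genres with at least one positive."""
    num_genres = len(score_matrix[0])
    per_genre = []
    for g in range(num_genres):
        genre_scores = [row[g] for row in score_matrix]
        genre_labels = [row[g] for row in label_matrix]
        if any(genre_labels):
            per_genre.append(average_precision(genre_scores, genre_labels))
    return sum(per_genre) / len(per_genre) if per_genre else 0.0


# Toy example (made-up numbers): 3 movies, 2 genres.
toy_scores = [[0.9, 0.2], [0.8, 0.7], [0.1, 0.6]]
toy_labels = [[1, 0], [0, 1], [1, 1]]
toy_map = mean_average_precision(toy_scores, toy_labels)
```

Rows are movies and columns are genres, so a movie can carry several genre labels at once. Averaging per genre rather than per movie means rare genres count as much as common ones.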


