MovieCuts: A New Dataset and Benchmark for Cut Type Recognition

09/12/2021
by Alejandro Pardo, et al.
King Abdullah University of Science and Technology
Adobe

Understanding movies and their structural patterns is a crucial task for decoding the craft of video editing. While previous works have developed tools for general analysis, such as detecting characters or recognizing cinematography properties at the shot level, less effort has been devoted to understanding the most basic video edit, the Cut. This paper introduces the cut type recognition task, which requires modeling multi-modal information. To ignite research in this new task, we construct a large-scale dataset called MovieCuts, which contains more than 170K video clips labeled among ten cut types. We benchmark a series of audio-visual approaches, including some that deal with the problem's multi-modal and multi-label nature. Our best model achieves 45.7% mAP, which suggests that the task is challenging and that attaining highly accurate cut type recognition is an open research problem.
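Because a clip can carry more than one of the ten cut types, the task is multi-label, and a common way to score multi-label recognition is mean average precision (mAP): compute average precision per class from ranked scores, then average over classes. The sketch below illustrates that metric in plain Python. It is not the authors' evaluation code; the function names, the choice to skip classes without positives, and the toy inputs in the usage note are assumptions made for illustration.

```python
from typing import List, Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision for one class.

    Ranks clips by descending score and averages precision@k over the
    ranks k at which a positive clip appears. Returns 0.0 if the class
    has no positives.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precisions: List[float] = []
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0


def mean_average_precision(
    y_true: Sequence[Sequence[int]],
    y_score: Sequence[Sequence[float]],
) -> float:
    """Mean of per-class AP for multi-label predictions.

    y_true[i][c] is 1 if clip i has (hypothetical) class c, else 0.
    y_score[i][c] is the model's score for class c on clip i.
    Classes with no positive clips are skipped (an assumption of this sketch).
    """
    num_classes = len(y_true[0])
    aps: List[float] = []
    for c in range(num_classes):
        col_true = [row[c] for row in y_true]
        col_score = [row[c] for row in y_score]
        if sum(col_true) == 0:
            continue
        aps.append(average_precision(col_true, col_score))
    return sum(aps) / len(aps) if aps else 0.0
```

For example, calling `mean_average_precision` on a 4-clip, 3-class toy problem takes two lists of rows, one of 0/1 labels and one of scores, and returns a single number between 0 and 1.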

09/23/2021

Pairwise Emotional Relationship Recognition in Drama Videos: Dataset and Benchmark

Recognizing the emotional state of people is a basic but challenging tas...
09/03/2018

LRS3-TED: a large-scale dataset for visual speech recognition

This paper introduces a new multi-modal dataset for visual and audio-vis...
06/15/2021

Is this Harmful? Learning to Predict Harmfulness Ratings from Video

Automatically identifying harmful content in video is an important task ...
08/09/2021

Learning to Cut by Watching Movies

Video content creation keeps growing at an incredible pace; yet, creatin...
09/16/2021

Overview of Tencent Multi-modal Ads Video Understanding Challenge

Multi-modal Ads Video Understanding Challenge is the first grand challen...
04/02/2021

Language-based Video Editing via Multi-Modal Multi-Level Transformer

Video editing tools are widely used nowadays for digital design. Althoug...
07/28/2020

Families In Wild Multimedia (FIW-MM): A Multi-Modal Database for Recognizing Kinship

Recognizing kinship - a soft biometric with vast applications - in photo...