Video Ads Content Structuring by Combining Scene Confidence Prediction and Tagging

08/20/2021
by   Tomoyuki Suzuki, et al.
0

Video ads segmentation and tagging is a challenging task due to two main reasons: (1) the video scene structure is complex and (2) it includes multiple modalities (e.g., visual, audio, text.). While previous work focuses mostly on activity videos (e.g. "cooking", "sports"), it is not clear how they can be leveraged to tackle the task of video ads content structuring. In this paper, we propose a two-stage method that first provides the boundaries of the scenes, and then combines a confidence score for each segmented scene and the tag classes predicted for that scene. We provide extensive experimental results on the network architectures and modalities used for the proposed method. Our combined method improves the previous baselines on the challenging "Tencent Advertisement Video" dataset.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/07/2023

MOSO: Decomposing MOtion, Scene and Object for Video Prediction

Motion, scene and object are three primary visual components of a video....
research
05/28/2021

Audio-visual scene classification: analysis of DCASE 2021 Challenge submissions

This paper presents the details of the Audio-Visual Scene Classification...
research
03/07/2022

A study on joint modeling and data augmentation of multi-modalities for audio-visual scene classification

In this paper, we propose two techniques, namely joint modeling and data...
research
07/18/2022

Visual Representations of Physiological Signals for Fake Video Detection

Realistic fake videos are a potential tool for spreading harmful misinfo...
research
03/15/2023

Micro-video Tagging via Jointly Modeling Social Influence and Tag Relation

The last decade has witnessed the proliferation of micro-videos on vario...
research
08/22/2023

MEGA: Multimodal Alignment Aggregation and Distillation For Cinematic Video Segmentation

Previous research has studied the task of segmenting cinematic videos in...
research
08/02/2021

Multimodal Feature Fusion for Video Advertisements Tagging Via Stacking Ensemble

Automated tagging of video advertisements has been a critical yet challe...

Please sign up or login with your details

Forgot password? Click here to reset